-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Initial implementation of cross-project ref #7276
Conversation
Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide. |
c445ec0
to
0daea9f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to break up this PR?
I'm not certain how to review a 7k+ line PR.
@@ -640,6 +653,9 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin): | |||
source_patches: MutableMapping[SourceKey, SourcePatch] = field(default_factory=dict) | |||
disabled: MutableMapping[str, List[GraphMemberNode]] = field(default_factory=dict) | |||
env_vars: MutableMapping[str, str] = field(default_factory=dict) | |||
public_nodes: MutableMapping[str, PublicModel] = field(default_factory=dict) | |||
project_dependencies: Optional[ProjectDependencies] = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for follow-up consideration: potentially should be moved to RuntimeConfig. Think it could be addressed as part of #7469
for unique_id in publication.public_model_ids: | ||
for child_id in manifest.child_map[unique_id]: | ||
node = manifest.expect(child_id) | ||
if hasattr(node.depends_on, "public_nodes"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not blocking, for my own understanding - which nodes dont have depends_on.public_nodes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SeedNodes, Macro, SourceDefinition
# Load the dependencies from the dependencies.yml file | ||
dependencies_filepath = resolve_path_from_base( | ||
DEPENDENCIES_FILE_NAME, self.root_project.project_root | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not blocking: we may need to rework this to implement #7484
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep. I do wonder if moving away from "get_full_manifest" to more directly constructing the ManifestLoader might be better if we're going to want to pass a lot of things in to get_full_manifest.
@aranke - Technically this could be split up into (1) writing the publication artifact and (2) reading it and using it to resolve references. However most of the code changes are coming from autogenerated code (event types (~1k), new manifest version (~5k)). So splitting it up into those functional components wouldn't help much to reduce the size of the PR. Also, seeing both parts has been personally helpful as a reviewer as they inform each others implementation. It is still a large and foundational PR for follow-on tasks. @gshank, perhaps a high-level overview of which files / methods are most pertinent for each side of things (writing vs reading + resolving) in the PR description would be helpful for reviewers. |
@aranke There's a video walkthrough of the pr that's linked in the language channel. |
I added some more information on what changes are mostly noise and which are pertinent. |
Added one more follow-on item to consider graph selection / ls: #7496 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great. This is a big lift and covers the majority of our considerations for cross-project refs in dbt-core + sets a nice foundation for the identified follow-on work. Thank you for the all the in-depth spiking / exploration work leading up to this! 🚀
resolves #7227
Description
Initial implementation of cross project refs, including writing out publication artifacts, ingesting publication artifacts from other projects, constructing public nodes and resolving references to public nodes at runtime.
Checklist
changie new
to create a changelog entryThere's a bunch of unavoidable noise in this pr which can be skipped over for code review purposes.
The new publication objects are in core/dbt/contracts/publication.py. There's a new Protocol to make a ModelNode and a ManifestNode look the same to the lookup code, plus the addition of "public_nodes" to DependsOn which is in core/dbt/contracts/graph/nodes.py. There are changes to fields in the Manifest and WritableManifest in core/dbt/contracts/graph/manifest.py. There are updates to the RefResolver code in core/dbt/context/providers.py which resolves the public refs at execution time.
Most of the functional changes are in core/dbt/parser/manifest, including "build_public_nodes" and "write_publication_artifact".