-
Notifications
You must be signed in to change notification settings - Fork 94
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dataflow plan for custom granularities #1409
Conversation
889ba84
to
728e2c8
Compare
0c12518
to
a01653f
Compare
a01653f
to
f636c63
Compare
728e2c8
to
3ed4cbc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@theyostalservice and I are both a bit confused by the conversion operations we're doing on the TimeDimensionSpecs (see inlines for where the confusion arises).
It feels like the custom granularity specs should be checkable via simple equality in their own right, making the conversion to base grain unnecessary. Then when we encounter one we should be able to get the base grain we need to construct the join just at that moment in the DataflowToSQLQueryPlanConverter logic.
If that isn't the case it seems like there's some construct we're missing someplace. It's possible I'm the one missing something, though, so please do let us know. If it's easier to explain in real time let's jump on a call.
for linkable_spec in linkable_specs.as_tuple: | ||
if isinstance(linkable_spec, TimeDimensionSpec) and linkable_spec.time_granularity.is_custom_granularity: | ||
linkable_spec_with_base_grain = linkable_spec.with_grain( | ||
ExpandedTimeGranularity.from_time_granularity(linkable_spec.time_granularity.base_granularity) | ||
) | ||
linkable_specs_set_with_base_granularities.add(linkable_spec_with_base_grain) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not clear to me why we're doing these granularity transformations here and below (and I think maybe in another place in the overall stack).
730cc49
to
39cf91d
Compare
3ed4cbc
to
85f96c1
Compare
@tlento @theyostalservice we should have a huddle about this one today, if possible! |
linkable_specs_set_with_base_granularities: Set[LinkableInstanceSpec] = set() | ||
# TODO: Add support for no-metrics queries for custom grains without a join (i.e., select directly from time spine). | ||
for linkable_spec in linkable_specs.as_tuple: | ||
if isinstance(linkable_spec, TimeDimensionSpec) and linkable_spec.time_granularity.is_custom_granularity: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: remove type refinement (here and elsewhere)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use helper to get all custom time dims from linkable spec set
@@ -406,6 +407,10 @@ def evaluate_node( | |||
logger.debug(f"Candidate spec set is:\n{mf_pformat(candidate_spec_set)}") | |||
|
|||
data_set_linkable_specs = candidate_spec_set.linkable_specs | |||
# Look for which nodes can satisfy the linkable specs at their base grains. Custom grains will be joined later. | |||
required_linkable_specs_with_base_grains = [ | |||
spec.with_base_grain if isinstance(spec, TimeDimensionSpec) else spec for spec in required_linkable_specs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update this function to take a LinkableSpecSet instead of a Sequence
@@ -1020,10 +1044,12 @@ def _find_dataflow_recipe( | |||
metric_time_dimension_reference=self._metric_time_dimension_reference, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update this function to do the conversion from custom to base grain, keep that change contained inside it
and remove related logic from elsewhere
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK! Per earlier discussion:
- We'll move the custom grain -> base grain conversion logic into _find_dataflow_recipe, since that's where it's needed, and remove the inlined bits from the node evaluator and other locations
- We'll use the LinkableSpecSet for managing the relevant type refinement instead of inline isinstance checks
- The calls to do these can have comments explaining why they're happening as appropriate. I think it's most important in the doc block for _find_dataflow_recipe, since we'll be doing these operations at the entry point to that method.
As discussed @courtneyholcomb will make these changes at the top of the stack.
InstanceSpecSet.create_from_specs(required_linkable_specs.as_tuple), | ||
InstanceSpecSet.create_from_specs( | ||
[ | ||
spec.with_base_grain() if isinstance(spec, TimeDimensionSpec) else spec |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm assuming this will also get consolidated like the other sections, so it'll be based on the LinkableSpecSet operation rather than this inline type refinement.
This change renders the include_base_grain property useless, so I removed that too.
3050414
to
704e79a
Compare
Based on feedback in #1409, this PR makes some updates to the dataflow plan builder. These are only refactoring changes, there should be no behavioral changes. Changes include renaming some things and simplifying the code related to converting custom granularities to base granularities in source node resolution. Reviewing by commit is recommended - full details in commit descriptions.
Update the
DataflowPlanBuilder
to support custom granularities. Steps included:JoinToCustomGranularityNode
to theDataflowPlan
.