Introduce MetricsMaxInferredColumnDefaultsStrategy #13039
These changes address issue #11253 by allowing a new default strategy to be set, one that considers the total number of field metrics rather than just the number of top-level columns.
The new property is
write.metadata.metrics.max-inferred-column-defaults.strategy
and the valid values are
original, depth, breadth
This PR keeps the original behavior as the default, since changing
the default would be more disruptive and could lead to unexpected
performance regressions. A breadth-first strategy would likely be the most
compatible with the original strategy, so it would be a safer future
default than the depth strategy. The original strategy could then be
deprecated and removed in the future.
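
To make the difference between the three values concrete, here is a minimal, self-contained sketch. It is not the code in this PR: `Field`, `InferredDefaultsSketch`, and the method names are hypothetical stand-ins, and the selection rules are my reading of the description above (original counts only top-level columns, depth and breadth count individual leaf fields in depth-first and level order respectively).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Illustrative sketch only: Field is a hypothetical stand-in, not Iceberg's Types.NestedField.
final class InferredDefaultsSketch {

  static final class Field {
    final String name;
    final List<Field> children;

    Field(String name, Field... children) {
      this.name = name;
      this.children = Arrays.asList(children);
    }
  }

  // Appends the dotted path of every leaf under f, in depth-first order.
  static void collectLeaves(Field f, String path, List<String> out) {
    if (f.children.isEmpty()) {
      out.add(path);
      return;
    }
    for (Field child : f.children) {
      collectLeaves(child, path + "." + child.name, out);
    }
  }

  // Assumed "original" behavior: the limit counts top-level columns only,
  // so every leaf under the first `limit` columns is selected.
  static List<String> original(List<Field> columns, int limit) {
    List<String> out = new ArrayList<>();
    for (Field column : columns.subList(0, Math.min(limit, columns.size()))) {
      collectLeaves(column, column.name, out);
    }
    return out;
  }

  // Assumed "depth" behavior: the first `limit` leaves in depth-first order.
  static List<String> depth(List<Field> columns, int limit) {
    List<String> all = new ArrayList<>();
    for (Field column : columns) {
      collectLeaves(column, column.name, all);
    }
    return new ArrayList<>(all.subList(0, Math.min(limit, all.size())));
  }

  // Assumed "breadth" behavior: the first `limit` leaves in level order,
  // so shallow fields are selected before deeply nested ones.
  static List<String> breadth(List<Field> columns, int limit) {
    List<String> out = new ArrayList<>();
    Deque<Field> fields = new ArrayDeque<>(columns);
    Deque<String> paths = new ArrayDeque<>();
    for (Field column : columns) {
      paths.add(column.name);
    }
    while (!fields.isEmpty() && out.size() < limit) {
      Field f = fields.poll();
      String path = paths.poll();
      if (f.children.isEmpty()) {
        out.add(path);
      } else {
        // Queue nested fields behind everything already waiting at this level.
        for (Field child : f.children) {
          fields.add(child);
          paths.add(path + "." + child.name);
        }
      }
    }
    return out;
  }

  // Sample schema: id, point{x, y}, name.
  static List<Field> sampleSchema() {
    return Arrays.asList(
        new Field("id"),
        new Field("point", new Field("x"), new Field("y")),
        new Field("name"));
  }
}
```

With a limit of 2 on the sample schema, original selects id, point.x, and point.y (three field metrics despite the limit of two), depth selects id and point.x, and breadth selects id and name. Under these assumptions breadth, like original, favors top-level columns, which is why it looks like the less surprising replacement.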
This could also easily support a previously discussed feature:
reversing the order of field ids when considering defaults. That
is not included in this PR, though.
I'm inclined to remove the depth strategy unless there is a strong
desire to keep it.