feat: add to concat different data types error message the data types #7166
I wonder if we need to incorporate some sort of cardinality limit here, e.g. similar to what we do when printing long arrays. I think this could potentially lead to long error messages, which in turn can lead to application hangs that are hard to diagnose. WDYT?
arrow-select/src/concat.rs
.map(|dt| format!("{dt}"))
.collect::<Vec<_>>();

// Only sort in tests to make the error message is deterministic
Perhaps we could just use a BTreeSet? It will be slightly slower, but having non-deterministic error messages I think would be surprising for people.
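For illustration, a minimal sketch of the BTreeSet idea. It assumes the data types have already been formatted to strings, and `sorted_unique_types` is a hypothetical helper name, not something in arrow-select:

```rust
use std::collections::BTreeSet;

/// Hypothetical helper (not part of arrow-select): returns the distinct
/// type names in sorted order, so the resulting message is deterministic.
fn sorted_unique_types(type_names: &[&str]) -> Vec<String> {
    type_names
        .iter()
        .map(|name| name.to_string())
        // BTreeSet removes duplicates and iterates in ascending order.
        .collect::<BTreeSet<_>>()
        .into_iter()
        .collect()
}
```

With this, `sorted_unique_types(&["Utf8", "Int32", "Utf8"])` yields `["Int32", "Utf8"]` regardless of input order. The trade-off is that the output no longer reflects where each type appeared in the input.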
I kept the HashSet, but only for tracking unique values. The error message now lists the data types in the same order as the input arrays, which is deterministic and makes debugging easier, since people can get a sense of where each type appears in the input.
I've also added a limit of 10 data types in the message.
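Roughly, the approach can be sketched as below. This is a simplified stand-in rather than the actual code in `arrow-select/src/concat.rs`: it works on already-formatted type names, and the function name, constant name, message wording, and the `...` truncation marker are all assumptions made for the example.

```rust
use std::collections::HashSet;

/// Assumed cap on how many distinct types are listed in the message.
const MAX_TYPES_IN_MESSAGE: usize = 10;

/// Hypothetical helper: builds a mismatch message that lists the distinct
/// type names in first-seen (input) order, capped at MAX_TYPES_IN_MESSAGE.
fn concat_type_mismatch_message(type_names: &[&str]) -> String {
    // The HashSet is used only to detect duplicates; ordering comes from
    // the input slice, so the result is deterministic.
    let mut seen = HashSet::new();
    let unique: Vec<&str> = type_names
        .iter()
        .copied()
        .filter(|name| seen.insert(*name)) // insert returns false for repeats
        .collect();

    let mut shown = unique
        .iter()
        .take(MAX_TYPES_IN_MESSAGE)
        .copied()
        .collect::<Vec<_>>()
        .join(", ");
    if unique.len() > MAX_TYPES_IN_MESSAGE {
        // Signal that some types were omitted to keep the message short.
        shown.push_str(", ...");
    }

    // Illustrative wording only; the real message text may differ.
    format!("cannot concatenate arrays of different data types ({shown})")
}
```

For example, `concat_type_mismatch_message(&["Int32", "Utf8", "Int32"])` lists `Int32, Utf8` in that order, and an input with more than 10 distinct types is cut off after the tenth.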
Which issue does this PR close?
N/A
Rationale for this change
Better debugging experience
What changes are included in this PR?
Added the unique data types to the concat error message and updated the tests.
Are there any user-facing changes?
Yes, users will see a more helpful error message.