Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix join on arrays of unhashable types and allow hash join on all types supported at run-time #13388

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

findepi
Copy link
Member

@findepi findepi commented Nov 13, 2024

Which issue does this PR close?

None

Rationale for this change

fix potential query failure due to incorrect can_hash result

What changes are included in this PR?

  • update can_hash
  • cleanup compute_hashes

Are these changes tested?

Yes, unit tests.
query-based tests are hard to write -- it is unclear how to construct Union or RunEndEncoded types in SQL

Are there any user-facing changes?

yes

The `downcast_primitive_array!` macro handles all primitive types
and only then delegates to fallbacks. It handles Decimal128 and
Decimal256 internally.
@github-actions github-actions bot added logical-expr Logical plan and expressions common Related to common crate labels Nov 13, 2024
@alamb
Copy link
Contributor

alamb commented Nov 13, 2024

query-based tests are hard to write -- it is unclear how to construct Union or RunEndEncoded types in SQL

One way to do so would be via arrow_cast

However, those types don't seem to be supported via arrow_cast yet:

> select arrow_cast(1, 'Union(Union))');
Error during planning: Unsupported type 'Union(Union))'. Must be a supported arrow type name such as 'Int32' or 'Timestamp(Nanosecond, None)'. Error unrecognized word: Union

@alamb
Copy link
Contributor

alamb commented Nov 13, 2024

The other way to do it is to programatically setup the sqllogictest, as is done for map.slt (which didn't have any way to create MapArrays when it was written):

@alamb alamb marked this pull request as draft November 15, 2024 18:45
@alamb
Copy link
Contributor

alamb commented Nov 15, 2024

Converting to draft as I think it is waiting on some tests. Please let me know if that isn't right and this is ready for another review

@findepi findepi changed the title Fix join on arrays of unhashable types Fix join on arrays of unhashable types and allow hash join on all types supported at run-time Nov 16, 2024
@findepi
Copy link
Member Author

findepi commented Nov 16, 2024

This change has two aspects

  1. fixing hash joins on types that are not hashable, but can_hash considered them hashable. This is eg List of something unhashable. I couldn't write a test for this, since types such as listview or ree hit other problems.
  2. unblocking hash joins on types that are actually hashable, but can_hash considered them not hashable. This is not as edge-case'y, for example binary is hashable, but we didn't use hash joins when joining on binary. I added a plan test for this.

@findepi findepi marked this pull request as ready for review November 16, 2024 17:17
@github-actions github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Nov 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
common Related to common crate logical-expr Logical plan and expressions sqllogictest SQL Logic Tests (.slt)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants