[query/vds] Actually use ref_block_max_length in to_dense_mt #14499

Open
chrisvittal opened this issue Apr 23, 2024 · 0 comments
chrisvittal commented Apr 23, 2024

Right now, we perform a full scan in to_dense_mt, but we already have enough information to do less work and densify in a single pass.

  • Expose partitioning in Python.
  • For each partition of the variants table, use ref_block_max_length to determine the full reference interval needed to densify that partition.
  • Use map_partitions on the variants and query_table on the reference to get two streams with all the information necessary to densify.
  • Join the streams and use the current algorithm/scan to do the work.
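
The interval computation in the second step could look roughly like the sketch below. It is plain Python rather than Hail code, and the `Interval` type, the `reference_interval_for` helper, and the partition bounds are all hypothetical names chosen for illustration. It assumes closed, 1-based intervals on a single contig and that a reference block's length counts every position it covers, so partitions that span contigs would need extra handling.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """Closed interval [start, end] of 1-based positions on one contig (hypothetical type)."""
    contig: str
    start: int
    end: int


def reference_interval_for(variant_part: Interval, ref_block_max_length: int) -> Interval:
    """Widen a variants-partition interval to include every reference block
    that could overlap it.

    A block of length L starting at position p covers p .. p + L - 1, so a
    block that covers `variant_part.start` can begin no earlier than
    start - (ref_block_max_length - 1).
    """
    if ref_block_max_length < 1:
        raise ValueError("ref_block_max_length must be at least 1")
    # Clamp at 1 so the widened interval never starts before the contig does.
    widened_start = max(1, variant_part.start - (ref_block_max_length - 1))
    return Interval(variant_part.contig, widened_start, variant_part.end)


# Hypothetical partition bounds for the variants table (assumption, not real data).
variant_partitions = [
    Interval("chr1", 1, 5_000),
    Interval("chr1", 5_001, 12_000),
]

reference_intervals = [
    reference_interval_for(p, ref_block_max_length=1_000)
    for p in variant_partitions
]
```

Each widened interval is what would be passed to the reference lookup for the matching variants partition. Only the start moves, because a block that begins after the partition's end cannot contribute to densifying it.
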
chrisvittal self-assigned this Apr 23, 2024