Support Parquet key management tools #7256

Closed
adamreeve opened this issue Mar 9, 2025 · 5 comments
Labels: enhancement (Any new improvement worthy of an entry in the changelog), parquet (Changes to the parquet crate)

Comments

@adamreeve
Contributor

Follow-up task to #6637 and #7111, which add support for Parquet modular encryption.

Parquet-java, C++ Parquet and PyArrow provide a higher-level API on top of Parquet modular encryption to simplify integration with a key management system (KMS) and to support encryption best practices. For compatibility with these implementations, and to simplify the use of Parquet encryption, arrow-rs should also support this API.

There's a design document that provides more details on this at https://docs.google.com/document/d/1bEu903840yb95k9q2X-BlsYKuXoygE4VnMDl9xz_zhk/edit?tab=t.0#heading=h.aatf1oymdx11

I have made some progress towards implementing this in a branch (https://github.com/adamreeve/arrow-rs/tree/kmt).
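
To illustrate the kind of KMS integration such an API wraps, here is a minimal sketch of envelope-style key wrapping, where a data key is wrapped with a named master key held by the KMS. Everything in it is an assumption made for illustration: the `KmsClient` trait, its method signatures, and the `InMemoryKms` type are hypothetical and are not taken from arrow-rs, the linked branch, or any released crate. The XOR "wrapping" is a stand-in so the example runs with the standard library only; it is not encryption.

```rust
use std::collections::HashMap;

/// Hypothetical KMS interface (an assumption for illustration, not a real crate API):
/// wraps and unwraps data keys using a master key identified by name.
pub trait KmsClient {
    fn wrap_key(&self, key_bytes: &[u8], master_key_id: &str) -> Result<Vec<u8>, String>;
    fn unwrap_key(&self, wrapped: &[u8], master_key_id: &str) -> Result<Vec<u8>, String>;
}

/// Toy in-memory KMS used only to make the sketch runnable.
/// XOR with the master key is NOT encryption; a real client would call a KMS service.
pub struct InMemoryKms {
    master_keys: HashMap<String, Vec<u8>>,
}

impl InMemoryKms {
    pub fn new() -> Self {
        Self { master_keys: HashMap::new() }
    }

    /// Registers a master key under the given identifier.
    pub fn add_master_key(&mut self, id: &str, key: &[u8]) {
        self.master_keys.insert(id.to_string(), key.to_vec());
    }

    /// XORs `data` with the (repeating) master key bytes.
    fn xor_with_master(&self, data: &[u8], master_key_id: &str) -> Result<Vec<u8>, String> {
        let master = self
            .master_keys
            .get(master_key_id)
            .ok_or_else(|| format!("unknown master key: {master_key_id}"))?;
        if master.is_empty() {
            return Err(format!("empty master key: {master_key_id}"));
        }
        Ok(data
            .iter()
            .zip(master.iter().cycle())
            .map(|(d, m)| d ^ m)
            .collect())
    }
}

impl KmsClient for InMemoryKms {
    fn wrap_key(&self, key_bytes: &[u8], master_key_id: &str) -> Result<Vec<u8>, String> {
        self.xor_with_master(key_bytes, master_key_id)
    }

    fn unwrap_key(&self, wrapped: &[u8], master_key_id: &str) -> Result<Vec<u8>, String> {
        // XOR is its own inverse, so unwrapping repeats the same operation.
        self.xor_with_master(wrapped, master_key_id)
    }
}
```

In this sketch, a writer would generate a random data key per file or column, call `wrap_key` with a master key identifier, and store only the wrapped key alongside the data; a reader would call `unwrap_key` with the same identifier to recover it.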

@AudriusButkevicius

I guess this is not closed because it still needs to be woven into the encryption support?

@adamreeve
Contributor Author

adamreeve commented Apr 2, 2025

Yes, this is not done yet. I should have a PR ready for it soon.

It will be a separate module that builds on top of the recently added encryption feature.

@adamreeve
Contributor Author

See #7387 (comment): it was decided that this should be a third-party crate rather than part of arrow-rs for now, so I'm closing this as not planned. I'll update here with details once a crate is available.

@adamreeve closed this as not planned on Apr 22, 2025
@alamb added the parquet (Changes to the parquet crate) label on May 8, 2025
@alamb
Contributor

alamb commented May 8, 2025

label_issue.py automatically added labels {'parquet'} from #7286

@adamreeve
Contributor Author

We (G-Research) have now released a new parquet-key-management crate to implement this functionality.
