
Verify footer tags when reading encrypted Parquet files with plaintext footers #7459

Conversation

rok
Member

@rok rok commented Apr 29, 2025

Which issue does this PR close?

From #7255

Follow up task to #6637, which adds initial support for reading files that use Parquet modular encryption.

The Parquet format allows encrypting some or all column data while keeping footers in plaintext for compatibility with readers that don't support encryption. Readers that support encryption can still verify the integrity of the footer, because a 28-byte signature (a 12-byte nonce followed by a 16-byte GCM tag) is written after the plaintext footer metadata (see https://github.com/apache/parquet-format/blob/master/Encryption.md#55-plaintext-footer-mode).
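As a rough illustration of the layout described above, the sketch below splits a footer buffer into the plaintext metadata and the trailing 28-byte signature. It assumes the signature is a 12-byte nonce followed by a 16-byte GCM tag, as described in the linked spec section. The names `split_signed_footer` and `FooterParts` are hypothetical and are not the arrow-rs API. The actual verification step (recomputing the GCM tag over the metadata with the footer signing key and comparing it) is not shown.

```rust
// Assumed sizes: a 12-byte nonce followed by a 16-byte GCM tag (28 bytes total).
const NONCE_LEN: usize = 12;
const TAG_LEN: usize = 16;
const SIGNATURE_LEN: usize = NONCE_LEN + TAG_LEN;

/// Hypothetical view over a plaintext footer and its trailing signature.
struct FooterParts<'a> {
    metadata: &'a [u8],
    nonce: &'a [u8],
    tag: &'a [u8],
}

/// Splits `buf` (serialized footer metadata followed by the signature) into
/// its parts. Returns `None` if the buffer is too short to hold a signature.
fn split_signed_footer(buf: &[u8]) -> Option<FooterParts<'_>> {
    if buf.len() < SIGNATURE_LEN {
        return None;
    }
    // Everything before the last 28 bytes is the plaintext metadata.
    let (metadata, signature) = buf.split_at(buf.len() - SIGNATURE_LEN);
    // The signature is the nonce first, then the tag.
    let (nonce, tag) = signature.split_at(NONCE_LEN);
    Some(FooterParts { metadata, nonce, tag })
}
```

A reader that verifies the footer would recompute the tag from `metadata` and `nonce` and reject the file if it does not match `tag`.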

This should be supported in arrow-rs to allow readers to verify the integrity of plaintext footers.

This should probably be optional; for example, C++ Parquet has a FileDecryptionProperties::Builder::disable_footer_signature_verification method that disables it.

Closes #7255.

Rationale for this change

This adds a mechanism that will detect tampering with footer metadata.

What changes are included in this PR?

This adds read-time integrity verification of the footer metadata of the file being read.

Are there any user-facing changes?

By default, the footer integrity check runs automatically, and reading fails with an error if verification fails. Users can opt out by calling the FileDecryptionProperties::Builder::disable_footer_signature_verification method.

@github-actions github-actions bot added the parquet Changes to the parquet crate label Apr 29, 2025
@rok rok force-pushed the GH-7255-verify-footer-tags-when-reading-encrypted-parquet-with-plaintext-footer branch 2 times, most recently from e7cef64 to 2ee3c82 Compare April 29, 2025 22:10
@rok rok marked this pull request as ready for review May 1, 2025 14:52
@rok rok force-pushed the GH-7255-verify-footer-tags-when-reading-encrypted-parquet-with-plaintext-footer branch from 7d6d3f7 to e15517f Compare May 1, 2025 19:20
@rok rok force-pushed the GH-7255-verify-footer-tags-when-reading-encrypted-parquet-with-plaintext-footer branch from e15517f to a00efec Compare May 5, 2025 22:26
@rok rok requested review from adamreeve and alamb May 5, 2025 22:44
@rok
Member Author

rok commented May 5, 2025

@adamreeve I've addressed your feedback. Do you think this is ready for a final pass by @alamb?

Contributor

@adamreeve adamreeve left a comment

This is looking pretty good to me although I've got one further comment.

@rok rok force-pushed the GH-7255-verify-footer-tags-when-reading-encrypted-parquet-with-plaintext-footer branch from 87097f5 to e87ccff Compare May 6, 2025 10:47
@rok rok requested a review from adamreeve May 6, 2025 11:05
Contributor

@adamreeve adamreeve left a comment

LGTM thanks Rok

@rok
Member Author

rok commented May 7, 2025

@alamb this and #7439 are ready for review (in this order) when you can spare some time.

Contributor

@alamb alamb left a comment

Looks reasonable to me -- thank you @rok and @adamreeve

@alamb alamb merged commit fb72b8f into apache:main May 7, 2025
16 checks passed