Skip to content

Refactor Parquet DecryptionPropertiesBuilder #7477

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
May 8, 2025

Conversation

adamreeve
Copy link
Contributor

@adamreeve adamreeve commented May 7, 2025

Which issue does this PR close?

Closes #7476

Rationale for this change

Tidies up code to make it more maintainable and also makes invalid uses of the builder (ie. setting column keys directly when using a key retriever) cause a compilation error rather than a runtime error.

#7286 (comment) suggested changing DecryptionPropertiesBuilder to be an enum, but that would require changing DecryptionPropertiesBuilder::with_column_key to return a Result, and I figured it would be better to keep the API mostly source compatible and make it a compilation error to call this when a key retriever is used rather than a runtime error.

What changes are included in this PR?

Refactors how FileDecryptionProperties are built when using a key retriever by adding a new DecryptionPropertiesBuilderWithRetriever type.

Are there any user-facing changes?

Yes, this is a breaking change to the Parquet encryption API, but should be source compatible unless users were directly calling DecryptionPropertiesBuilder::new_with_key_retriever rather than FileDecryptionProperties::with_key_retriever (all tests used the latter method so didn't need to be changed). Parquet encryption is also a recently added experimental feature so I don't imagine this will affect many users.

@github-actions github-actions bot added the parquet Changes to the parquet crate label May 7, 2025
@adamreeve adamreeve requested a review from rok May 7, 2025 04:20
Copy link
Member

@rok rok left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks much cleaner now.
It would be nice to test this even with a mock key retriever.

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @adamreeve -- this looks great.

I agree with your assesment on the API impact -- while it is technically a breaking chance, since the encryption feature is marked as experimental I think this is ok

@alamb
Copy link
Contributor

alamb commented May 7, 2025

Looks to me like there is a conflict that needs to be resolved, but after that it should be good to go

@adamreeve
Copy link
Contributor Author

Looks to me like there is a conflict that needs to be resolved, but after that it should be good to go

This is resolved now, I had to account for the new footer_signature_verification option that was added.

It would be nice to test this even with a mock key retriever.

There are already some tests that cover this, eg:

FileDecryptionProperties::with_key_retriever(Arc::new(key_retriever))

Given this PR is only a refactor I don't think there are any extra tests needed.

@alamb alamb merged commit 1dc9760 into apache:main May 8, 2025
16 checks passed
@alamb
Copy link
Contributor

alamb commented May 8, 2025

Thanks again @adamreeve and @rok

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
parquet Changes to the parquet crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Refactor Parquet DecryptionPropertiesBuilder to fix use of unreachable
3 participants