Add Parquet Modular encryption support (write) #7111
Merged
101 commits
11e083e Start encryption (ggershinsky)
0b31469 Work (rok)
d2a73ef Work (rok)
3b87a04 Pass Encryptor to SerializedRowGroupWriter (rok)
5c40e40 Expand test, pass FileEncryptionProperties instead of FileEncryptor t… (rok)
6655d21 Add encrypt_object helper (rok)
804e903 Implement serialization of column crypto metadata (adamreeve)
d328396 Fix writing Parquet magic bytes (adamreeve)
a5ffb49 Add key metadata to file encryption properties (adamreeve)
ae42dd4 Generate unique file aad and add prefix if set (adamreeve)
9d5db21 Add aad param to encrypt_object and don't require TrackedWrite (adamreeve)
c92559d Write file crypto metadata (adamreeve)
bead72a Set column crypto metadata (adamreeve)
27e095b Store file_aad and aad_file_unique in FileEncryptor (adamreeve)
c3a3ac7 Work towards using correct AADs (adamreeve)
5c04f1c Fix writing ciphertext length (adamreeve)
84b4d50 Add check of ciphertext length (adamreeve)
a609111 Ugly workaround for setting compressed page size (adamreeve)
8c64215 Fix test logic (adamreeve)
3a79993 Add page_ordinal (rok)
07dc037 Add some feature flags (rok)
33ffb97 Add page_ordinal, row_group_ordinal and column_ordinal to SerializedP… (rok)
73ee0fa minor changes (rok)
3a867fb clippy fixes (rok)
24e73ed Encapsulate page encryption context in a PageEncryptor struct (adamreeve)
dd59b46 SerializedRowGroupWriter.column_index starts at 1 not 0 (rok)
bbd3587 Fix handling dictionary pages and update test (adamreeve)
3f43fee Fix clippy issues (rok)
877805a clippy (rok)
e390d7b Use PageEncryptor in ArrowPageWriter (adamreeve)
8facaca Tidy up feature handling and reduce duplication (adamreeve)
dc0ce9e Test fixes (adamreeve)
a14fdf4 Fix setting Arrow page writer for byte typed columns (adamreeve)
08958af WIP Add per-column encryption keys (adamreeve)
69f557a Add test_non_uniform_encryption (rok)
8eb841a lint (rok)
96d09a7 lint (rok)
69969dc Add SchemaRef ArrowColumnWriterFactory to get column_path via column_… (rok)
65fc410 Get column path from descriptor rather than Arrow schema (adamreeve)
3f8907a Fix writing multiple encrypted pages with ArrowPageWriter (adamreeve)
7a4d6c0 Return encryptors as a Result<Box<dyn BlockEncryptor>> (adamreeve)
36f3163 Get per-column encryption working and various tidy ups (adamreeve)
b3f24e3 Handle non-encrypted columns (adamreeve)
cbf7d74 Tidy up some duplication (adamreeve)
d862c63 Add encryption_util module for tests (adamreeve)
a384663 Add uniform encryption test (rok)
0f8d6f8 lint (rok)
3e7efe4 post rebase (rok)
854b257 Check if columns to encrypt are in schema (rok)
ea2db66 Apply suggestions from code review (rok)
2b2a3ef Move tests to tests/. Post rebase fixes. (rok)
22135ba Review feedback (rok)
32673c1 Review feedback (rok)
f479df1 Minor changes (rok)
8ac232b Raise if writing plaintext footer (rok)
75b98cd Docs for crypto methods (rok)
0460424 More practical key API (rok)
4032b4f Refactor PageEncryptor use (adamreeve)
e2790dd Simplify with_new_compressed_buffer method (adamreeve)
6991943 Apply suggestions from code review (rok)
b01b285 Review feedback (rok)
ffd2fc9 Lint and remove redundant test. (rok)
a522ab1 Docs (rok)
70e7190 Add async writer test for encrypted data (rok)
a4fbaf2 Test struct array encryption, column name with '.' (rok)
c46a259 Review feedback (rok)
3d2054c First round of changes, add accessors and return result for encryptio… (corwinjoy)
4ddbc4c Add (corwinjoy)
159b3df Update parquet/src/encryption/encrypt.rs (rok)
ce8d2a9 Move encryption tests (rok)
4f15a96 Backout change to arrow/async_reader/mod.rs. TODO, put this in a sepa… (corwinjoy)
cf57871 Update notes on changes to writer.rs (corwinjoy)
8baa777 Merge branch 'encryption-basics-fork' into encryption-basics-fork-pr-cj (corwinjoy)
3e02f74 Merge pull request #3 from rok/encryption-basics-fork-pr-cj (corwinjoy)
614d5e8 Fix struct array encryption (rok)
b1db051 Minor fixes (rok)
ae4c089 Lint (rok)
6240644 Fix test (rok)
df640cd Add '.' to struct array name (rok)
cdc3246 Fix required features for encryption tests (adamreeve)
05f5982 Fix reading encrypted struct columns and writer test (adamreeve)
9811c76 Tidy ups (adamreeve)
a15816d Remove unnecessary clone of all row group metadata in unencrypted case (adamreeve)
0ccbc03 Remove overly broad error remapping (adamreeve)
2a3e905 Tidy up duplicated test function (adamreeve)
5a6fac8 Suppress unused mut error (adamreeve)
9e79d75 Re-use block encryptors in PageEncryptor (adamreeve)
b55a1b3 Merge remote-tracking branch 'upstream/main' into encryption-basics-fork (adamreeve)
5600e61 Slightly update error message for missing column key. (corwinjoy)
c7181b8 Refactor PageEncryptor to reduce use of cfg(feature) (adamreeve)
6c4d5b3 Refactor PageEncryptor construction in SerializedPageWriter (adamreeve)
78ac7b2 Reduce use of inline #[cfg(feature = "encryption")] (adamreeve)
fd6c30b Refactor ThriftMetadataWriter to reduce use of feature checks within … (adamreeve)
26279eb Tidy ups (adamreeve)
4e9e157 Refactor ArrowRowGroupWriter creation (adamreeve)
c11dc01 Make pub(crate) more explicit on some structs (adamreeve)
b9b860b Check for length mismatch in with_column_keys (adamreeve)
754e8bd Comment and error message tidy ups (adamreeve)
bfe49c8 Add test to verify column statistics are usable after write (adamreeve)
70d3a9b Merge remote-tracking branch 'apache/main' into encryption-basics-fork (alamb)
04a8dad Merge remote-tracking branch 'apache/main' into encryption-basics-fork (alamb)