Encryption basics #1

Draft: wants to merge 73 commits into base: main
Commits (73), showing changes from 1 commit:
9c2ee0e
first commit
ggershinsky Mar 21, 2024
cd05623
Use ParquetMetaDataReader
rok Nov 23, 2024
a63b375
Fix CI
rok Nov 23, 2024
89aab9f
test
rok Dec 3, 2024
5412f89
save progress
rok Dec 11, 2024
36a2e3b
work
rok Dec 16, 2024
39f16df
Review feedback
rok Dec 17, 2024
513425f
page decompression issue
rok Dec 17, 2024
d2e67b0
add update_aad
rok Dec 17, 2024
b5036a4
Change encrypt and decrypt to return Results
adamreeve Dec 17, 2024
8e9b539
Use correct page ordinal and module type in AADs
adamreeve Dec 18, 2024
9d09a61
Tidy up ordinal types
adamreeve Dec 18, 2024
8e3fbda
Lint
rok Dec 18, 2024
c09f9bd
Fix regular deserialization path
rok Dec 18, 2024
48cc9ae
cleaning
rok Dec 18, 2024
e96e519
Update data checks in test
adamreeve Dec 19, 2024
ba4f5b3
start non-uniform decryption
rok Dec 19, 2024
e3d6b7b
Add missing doc comments
adamreeve Dec 19, 2024
82cfca7
Make encryption an optional feature
adamreeve Dec 20, 2024
5c5d8d9
Handle when a file is encrypted but encryption is disabled or no decr…
adamreeve Dec 20, 2024
cec60c8
Allow for plaintext footer
rok Dec 22, 2024
baeef93
work
rok Dec 23, 2024
6638e12
Fix method name
adamreeve Dec 22, 2024
bec2f5d
work
rok Jan 4, 2025
afe0f36
Minor
rok Jan 6, 2025
a3d3911
work
rok Jan 7, 2025
1ef5dff
work
rok Jan 9, 2025
bbaef12
work
rok Jan 20, 2025
cb91a21
Fix reading to end of file
adamreeve Jan 21, 2025
5b72569
Refactor tests
adamreeve Jan 21, 2025
322b4b7
Fix non-uniform encryption configuration
adamreeve Jan 21, 2025
9947f6c
Don't use footer key for non-encrypted columns
adamreeve Jan 21, 2025
6c9f5c5
Rebase and cleanup
rok Jan 21, 2025
782cd85
Cleanup
rok Jan 21, 2025
5ead306
Cleanup
rok Jan 21, 2025
d70f44e
Cleanup
rok Jan 21, 2025
8a8d99f
Cleanup
rok Jan 21, 2025
2e52d13
Cleanup
rok Jan 21, 2025
c2497e6
Cleanup
rok Jan 21, 2025
951f2fa
lint
rok Jan 21, 2025
e6006bf
Remove encryption setup
rok Jan 22, 2025
669df5e
Fix building with ring on wasm
rok Jan 22, 2025
27f4112
file_decryptor into a seperate module
rok Jan 22, 2025
6acb984
lint
rok Jan 22, 2025
c4860da
FileDecryptionProperties should have at least one key
rok Jan 22, 2025
23375d1
Move cyphertext reading into decryptor
rok Jan 23, 2025
d44c409
More tidy up of footer key handling
adamreeve Jan 23, 2025
5e8394c
Get column decryptors as RingGcmBlockDecryptor
adamreeve Jan 23, 2025
11d2037
Use Arc<dyn BlockDecryptor>
adamreeve Jan 24, 2025
06cfe65
Fix file metadata tests
adamreeve Jan 24, 2025
177d826
Handle reading plaintext footer files without decryption properties
adamreeve Jan 24, 2025
0998b13
Split up encryption modules further
adamreeve Jan 24, 2025
6de6c35
Error instead of panic for AES-GCM-CTR
adamreeve Jan 24, 2025
925d86f
load_async
rok Feb 5, 2025
729603a
new_with_options
rok Feb 5, 2025
05a06f7
Add tests
rok Feb 5, 2025
83fd6ca
get_metadata
rok Feb 5, 2025
45e1007
Add CryptoContext to async_reader
rok Feb 7, 2025
48ebfda
Add row_group_ordinal to InMemoryRowGroup
rok Feb 7, 2025
0b55c8d
Adjust docstrings
rok Feb 7, 2025
b960466
Apply suggestions from code review
rok Feb 10, 2025
b18b099
Review feedback
rok Feb 10, 2025
5a82128
move file_decryption_properties into ArrowReaderOptions
rok Feb 10, 2025
df99b9a
make create_page_aad method of CryptoContext
rok Feb 10, 2025
f411465
make create_page_aad method of CryptoContext
rok Feb 10, 2025
2085557
first commit
ggershinsky Mar 21, 2024
2bd3069
Add FileEncryptionProperties to WriterProperties
rok Jan 24, 2025
b6ef006
Update lib.rs
corwinjoy Jan 30, 2025
5de0a55
Update paths. Add notes on where encryption is needed.
corwinjoy Feb 1, 2025
27c578b
Attempt to add decryption to ArrowReaderMetadata::load_async
corwinjoy Feb 4, 2025
f6cd160
Update async_reader with latest from Rok
corwinjoy Feb 6, 2025
1b385c8
Add changes needed to access encryption from datafusion.
corwinjoy Feb 7, 2025
8f94e2c
post rebase
rok Feb 10, 2025
Don't use footer key for non-encrypted columns
adamreeve authored and rok committed Jan 21, 2025
commit 9947f6cbd2add8a8fb5d8e0c95f5fad28ada3b4a
26 changes: 15 additions & 11 deletions parquet/src/arrow/arrow_reader/mod.rs
@@ -716,18 +716,22 @@ impl<T: ChunkReader + 'static> Iterator for ReaderPageIterator<T> {
                 .schema_descr()
                 .column(self.column_idx);

-            let file_decryptor = self
-                .metadata
-                .file_decryptor()
-                .clone()
-                .unwrap()
-                .get_column_decryptor(column_name.name().as_bytes());
-            let data_decryptor = Arc::new(file_decryptor.clone());
-            let metadata_decryptor = Arc::new(file_decryptor.clone());
+            if self.metadata.file_decryptor().as_ref().unwrap().is_column_encrypted(column_name.name().as_bytes()) {
+                let file_decryptor = self
+                    .metadata
+                    .file_decryptor()
+                    .clone()
+                    .unwrap()
+                    .get_column_decryptor(column_name.name().as_bytes());
+                let data_decryptor = Arc::new(file_decryptor.clone());
+                let metadata_decryptor = Arc::new(file_decryptor.clone());

-            let crypto_context =
-                CryptoContext::new(rg_idx, self.column_idx, data_decryptor, metadata_decryptor);
-            Some(Arc::new(crypto_context))
+                let crypto_context =
+                    CryptoContext::new(rg_idx, self.column_idx, data_decryptor, metadata_decryptor);
+                Some(Arc::new(crypto_context))
+            } else {
+                None
+            }
         } else {
             None
         };
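
The hunk above only builds a crypto context when the column is actually encrypted. A minimal sketch of the same decision, written with `Option` combinators instead of nested `if`/`else`, might look like the following. It assumes the outer branch (not visible in this hunk) checks that a file decryptor is present. `FileDecryptor`, `CryptoContext` and `crypto_context_for` here are simplified, hypothetical stand-ins and not the types from this PR.

```rust
use std::sync::Arc;

/// Hypothetical stand-in: `None` means uniform encryption (every column
/// is encrypted), `Some(names)` lists the columns that have their own key.
struct FileDecryptor {
    keyed_columns: Option<Vec<String>>,
}

impl FileDecryptor {
    fn is_column_encrypted(&self, column_name: &str) -> bool {
        match &self.keyed_columns {
            None => true,
            Some(names) => names.iter().any(|n| n == column_name),
        }
    }
}

/// Hypothetical stand-in that records only the ordinals.
#[derive(Debug, PartialEq)]
struct CryptoContext {
    row_group_idx: usize,
    column_idx: usize,
}

/// Returns a context only when a decryptor exists AND the column is encrypted.
fn crypto_context_for(
    file_decryptor: Option<&FileDecryptor>,
    column_name: &str,
    rg_idx: usize,
    column_idx: usize,
) -> Option<Arc<CryptoContext>> {
    file_decryptor
        // Plaintext columns in a non-uniformly encrypted file get no context.
        .filter(|d| d.is_column_encrypted(column_name))
        .map(|_| {
            Arc::new(CryptoContext {
                row_group_idx: rg_idx,
                column_idx,
            })
        })
}
```

Folding both conditions into one expression also avoids the `unwrap()` calls, so a missing decryptor yields `None` rather than a panic.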
5 changes: 5 additions & 0 deletions parquet/src/encryption/ciphers.rs
@@ -377,6 +377,11 @@ impl FileDecryptor {
     pub(crate) fn has_footer_key(&self) -> bool {
         self.decryption_properties.has_footer_key()
     }
+
+    pub(crate) fn is_column_encrypted(&self, column_name: &[u8]) -> bool {
+        // Column is encrypted if either uniform encryption is used or an encryption key is set for the column
+        self.decryption_properties.column_keys.is_none() || self.has_column_key(column_name)
+    }
 }

 #[derive(Debug, Clone)]
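
To illustrate the rule in the `is_column_encrypted` comment, here is a small self-contained sketch of the predicate. `DecryptionProperties` and its fields are hypothetical and modelled loosely on the names in the diff. The real struct in this PR may differ, and the body of `has_column_key` is an assumption, since the diff does not show it.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the PR's decryption properties.
struct DecryptionProperties {
    /// `None` models uniform encryption: no per-column keys are configured.
    column_keys: Option<HashMap<Vec<u8>, Vec<u8>>>,
}

impl DecryptionProperties {
    /// Assumed behaviour: true only if a key is registered for this column.
    fn has_column_key(&self, column_name: &[u8]) -> bool {
        self.column_keys
            .as_ref()
            .map_or(false, |keys| keys.contains_key(column_name))
    }

    /// Mirrors the rule from the diff: a column counts as encrypted when
    /// no column keys are set at all (uniform encryption) or when it has
    /// its own key.
    fn is_column_encrypted(&self, column_name: &[u8]) -> bool {
        self.column_keys.is_none() || self.has_column_key(column_name)
    }
}

/// Helper: properties with uniform encryption.
fn uniform() -> DecryptionProperties {
    DecryptionProperties { column_keys: None }
}

/// Helper: properties with a dummy key for each named column.
fn non_uniform(columns: &[&str]) -> DecryptionProperties {
    let keys = columns
        .iter()
        .map(|c| (c.as_bytes().to_vec(), vec![0u8; 16]))
        .collect();
    DecryptionProperties { column_keys: Some(keys) }
}
```

Under these assumptions, `uniform().is_column_encrypted(b"x")` is true for any column, while `non_uniform(&["a"])` reports only `a` as encrypted. That distinction is what keeps the reader from applying the footer key to plaintext columns.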