Update Default Parquet Write Compression (#7692)
* update compression default

* fix tests

---------

Co-authored-by: Andrew Lamb <[email protected]>
devinjdangelo and alamb committed Sep 30, 2023
1 parent 85f3578 commit 692ea24
Showing 3 changed files with 3 additions and 3 deletions.
2 changes: 1 addition & 1 deletion datafusion/common/src/config.rs
@@ -307,7 +307,7 @@ config_namespace! {
/// lzo, brotli(level), lz4, zstd(level), and lz4_raw.
/// These values are not case sensitive. If NULL, uses
/// default parquet writer setting
- pub compression: Option<String>, default = None
+ pub compression: Option<String>, default = Some("zstd(3)".into())

/// Sets if dictionary encoding is enabled. If NULL, uses
/// default parquet writer setting
2 changes: 1 addition & 1 deletion datafusion/sqllogictest/test_files/information_schema.slt
@@ -156,7 +156,7 @@ datafusion.execution.parquet.bloom_filter_enabled false
datafusion.execution.parquet.bloom_filter_fpp NULL
datafusion.execution.parquet.bloom_filter_ndv NULL
datafusion.execution.parquet.column_index_truncate_length NULL
- datafusion.execution.parquet.compression NULL
+ datafusion.execution.parquet.compression zstd(3)
datafusion.execution.parquet.created_by datafusion
datafusion.execution.parquet.data_page_row_count_limit 18446744073709551615
datafusion.execution.parquet.data_pagesize_limit 1048576
2 changes: 1 addition & 1 deletion docs/source/user-guide/configs.md
@@ -58,7 +58,7 @@ Environment variables are read during `SessionConfig` initialisation so they mus
| datafusion.execution.parquet.data_pagesize_limit | 1048576 | Sets best effort maximum size of data page in bytes |
| datafusion.execution.parquet.write_batch_size | 1024 | Sets write_batch_size in bytes |
| datafusion.execution.parquet.writer_version | 1.0 | Sets parquet writer version valid values are "1.0" and "2.0" |
- | datafusion.execution.parquet.compression | NULL | Sets default parquet compression codec Valid values are: uncompressed, snappy, gzip(level), lzo, brotli(level), lz4, zstd(level), and lz4_raw. These values are not case sensitive. If NULL, uses default parquet writer setting |
+ | datafusion.execution.parquet.compression | zstd(3) | Sets default parquet compression codec Valid values are: uncompressed, snappy, gzip(level), lzo, brotli(level), lz4, zstd(level), and lz4_raw. These values are not case sensitive. If NULL, uses default parquet writer setting |
| datafusion.execution.parquet.dictionary_enabled | NULL | Sets if dictionary encoding is enabled. If NULL, uses default parquet writer setting |
| datafusion.execution.parquet.dictionary_page_size_limit | 1048576 | Sets best effort maximum dictionary page size, in bytes |
| datafusion.execution.parquet.statistics_enabled | NULL | Sets if statistics are enabled for any column Valid values are: "none", "chunk", and "page" These values are not case sensitive. If NULL, uses default parquet writer setting |
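
For readers unfamiliar with the value format, here is a minimal sketch of how a setting such as `zstd(3)` could be split into a codec name and a level. It is not DataFusion's parser: `CompressionSetting` and `parse_compression` are hypothetical names introduced for illustration. It also assumes that `gzip`, `brotli`, and `zstd` always require a level and that the other codecs never take one. The doc comment only tells us the list of valid values and that they are not case sensitive.

```rust
/// Hypothetical container for a parsed `compression` value (not a DataFusion type).
#[derive(Debug, PartialEq, Eq)]
pub struct CompressionSetting {
    /// Lower-cased codec name, e.g. "zstd".
    pub codec: String,
    /// Level given in parentheses, if any.
    pub level: Option<u32>,
}

/// Illustrative parser for values such as "zstd(3)" or "SNAPPY".
/// Assumption: level-taking codecs must carry a level; the rest must not.
pub fn parse_compression(value: &str) -> Result<CompressionSetting, String> {
    // The option's doc comment says values are not case sensitive.
    let lowered = value.trim().to_ascii_lowercase();

    // Split "name(level)" into its two parts when parentheses are present.
    let (codec, level) = match lowered.split_once('(') {
        Some((name, rest)) => {
            let digits = rest
                .strip_suffix(')')
                .ok_or_else(|| format!("missing ')' in '{value}'"))?;
            let level = digits
                .trim()
                .parse::<u32>()
                .map_err(|e| format!("invalid level in '{value}': {e}"))?;
            (name.trim().to_string(), Some(level))
        }
        None => (lowered.clone(), None),
    };

    // Codec names are taken from the list in the doc comment.
    let takes_level = match codec.as_str() {
        "gzip" | "brotli" | "zstd" => true,
        "uncompressed" | "snappy" | "lzo" | "lz4" | "lz4_raw" => false,
        other => return Err(format!("unknown codec '{other}'")),
    };

    if takes_level != level.is_some() {
        return Err(format!("codec '{codec}' has a missing or unexpected level"));
    }
    Ok(CompressionSetting { codec, level })
}
```

Under these assumptions the new default `zstd(3)` parses to codec `zstd` with level 3, and `ZSTD(3)` yields the same result.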