Regression: write_parquet no longer supports compression options #7433

Closed
andygrove opened this issue Aug 28, 2023 · 1 comment · Fixed by #7435
Labels
bug Something isn't working

Comments

@andygrove
Member

Describe the bug

After upgrading the DataFusion Python bindings to version 30, I saw some test failures for writing compressed Parquet files.

apache/datafusion-python#464 (comment)

It looks like this feature was removed from DataFusion in this commit. Note the comment // TODO implement options.

To Reproduce

Try writing a Parquet file and specify a compression type in the writer properties:

        let writer_properties = WriterProperties::builder()
            .set_compression(compression_type)
            .build();

Expected behavior

The option should not be ignored.
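One way to pin this expectation down in a test is a round-trip check: write with an explicit codec, read back what the writer recorded, and compare. The sketch below is a std-only toy model of that check, not DataFusion's or the parquet crate's API. `Codec`, `WriterOptions`, `WrittenFile`, `write_ignoring_options`, `write_with_options`, and `honors_codec` are all hypothetical names made up for illustration.

```rust
/// Hypothetical stand-in for a Parquet compression setting
/// (not the real `parquet` crate type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Codec {
    Uncompressed,
    Snappy,
    Zstd,
}

/// Hypothetical stand-in for writer properties.
#[derive(Debug, Clone, Copy)]
struct WriterOptions {
    codec: Codec,
}

/// Toy "file" that records the codec the writer actually used.
#[derive(Debug)]
struct WrittenFile {
    codec: Codec,
}

/// Models the regression: options are accepted but never applied.
fn write_ignoring_options(_options: Option<WriterOptions>) -> WrittenFile {
    // TODO implement options
    WrittenFile { codec: Codec::Uncompressed }
}

/// Models the expected behavior: use the requested codec,
/// falling back to the default only when no options are given.
fn write_with_options(options: Option<WriterOptions>) -> WrittenFile {
    let codec = options.map(|o| o.codec).unwrap_or(Codec::Uncompressed);
    WrittenFile { codec }
}

/// Round-trip check: did the writer honor the requested codec?
fn honors_codec(write: fn(Option<WriterOptions>) -> WrittenFile, codec: Codec) -> bool {
    write(Some(WriterOptions { codec })).codec == codec
}

fn main() {
    for codec in [Codec::Snappy, Codec::Zstd] {
        println!(
            "{:?}: options-ignoring writer honors it = {}, options-aware writer honors it = {}",
            codec,
            honors_codec(write_ignoring_options, codec),
            honors_codec(write_with_options, codec),
        );
    }
}
```

Against a real Parquet file, the equivalent check would presumably read the compression back from the file's column chunk metadata rather than from a toy struct. Note that a check using only the default codec would pass even with the options ignored, so the test needs a non-default codec to catch this kind of regression.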

Additional context

No response

@devinjdangelo
Contributor

I am working on adding support back in for these options in #7390 and #7435. The API will still be slightly different after these PRs though, requiring an additional argument. Apologies for not flagging these breaking changes more clearly prior to the 30.0.0 release!

In the meantime, I discussed a possible workaround in #7423: use SessionContext::write_parquet, which still uses the previous write implementation.
