fix(python): improved xlsx2csv defaults for `read_excel` #12081

alexander-beedie · 2023-10-28T09:45:53Z

Closes #12052 and closes #12054. If the user hasn't explicitly set the relevant xlsx2csv options themselves, we should default them to sensible values; otherwise we can silently lose data and/or precision that is available in the target worksheet.

Updated defaults

Applies to the "xlsx2csv" engine (i.e. to the conversion step that runs before we read the resulting CSV data with our own parser):

skip_hidden_rows → False
floatformat → "%f"

(The other two defaults do not change the existing behaviour; they just codify the expected values.)
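As a rough illustration of "default only what the user has not set", here is a minimal sketch. The helper name `apply_xlsx2csv_defaults` and the constant `XLSX2CSV_DEFAULTS` are hypothetical and are not taken from the Polars source; only the two option names and values come from the list above.

```python
from typing import Any, Dict, Optional

# The two changed defaults named in this PR description. The two unchanged
# defaults are not listed here because the description does not name them.
XLSX2CSV_DEFAULTS: Dict[str, Any] = {
    "skip_hidden_rows": False,
    "floatformat": "%f",
}


def apply_xlsx2csv_defaults(user_options: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Hypothetical helper: fill in defaults without overriding user choices."""
    # Copy so the caller's dict is never mutated.
    options = dict(user_options or {})
    for key, value in XLSX2CSV_DEFAULTS.items():
        # setdefault only writes the key if the user did not supply it.
        options.setdefault(key, value)
    return options
```

With this shape, a caller who passes `{"floatformat": "%.2f"}` keeps their own float format and still picks up `skip_hidden_rows=False`.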

Additional fix

read_excel now supports more than one engine, but "read_csv_options" and "xlsx2csv_options" only apply when using the xlsx2csv engine. We now check that these options are not mis-specified alongside the openpyxl engine (will think about cleaning this up further to make it more generic later 🤔)
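The kind of check described here could look roughly like the sketch below. The function name `check_engine_options`, the use of `ValueError`, and the message wording are all assumptions for illustration; the actual check in Polars may differ.

```python
from typing import Any, Dict, Optional


def check_engine_options(
    engine: str,
    xlsx2csv_options: Optional[Dict[str, Any]] = None,
    read_csv_options: Optional[Dict[str, Any]] = None,
) -> None:
    """Hypothetical check: reject xlsx2csv-only options for other engines."""
    if engine == "xlsx2csv":
        return  # both option dicts are meaningful for this engine
    # Report every mis-specified option rather than just the first one found.
    misplaced = [
        name
        for name, value in (
            ("xlsx2csv_options", xlsx2csv_options),
            ("read_csv_options", read_csv_options),
        )
        if value
    ]
    if misplaced:
        raise ValueError(
            f"{', '.join(misplaced)} cannot be used with engine={engine!r}; "
            "these options only apply to the 'xlsx2csv' engine"
        )
```

Failing early like this surfaces the mistake to the caller instead of silently ignoring options that have no effect on the chosen engine.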

alexander-beedie requested review from ritchie46 and stinodego as code owners October 28, 2023 09:45

github-actions bot added the fix (Bug fix) and python (Related to Python Polars) labels Oct 28, 2023

alexander-beedie force-pushed the read-excel-defaults branch 4 times, most recently from ff2af61 to eafe80d Compare October 28, 2023 11:08

fix(python): improved xlsx2csv defaults for read_excel

d531aa4

alexander-beedie force-pushed the read-excel-defaults branch from eafe80d to d531aa4 Compare October 28, 2023 11:32

additional checks for engine-specific options

5918fc8

ritchie46 approved these changes Oct 28, 2023


ritchie46 merged commit 3b6aa8f into pola-rs:main Oct 28, 2023
13 checks passed

alexander-beedie deleted the read-excel-defaults branch October 28, 2023 14:55

alexander-beedie mentioned this pull request Oct 28, 2023

pl.read_excel only reads visible number of decimals #12052

Closed

2 tasks
