Release Python Polars 0.20.7 · pola-rs/polars

⚠️ Deprecations

Rename threadpool_size to thread_pool_size (#14236)
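
A rename like this is commonly shipped by keeping the old name as a thin alias that warns and forwards to the new one. The sketch below illustrates that general pattern only; it is not Polars' code, and the function bodies (including the fixed return value) are stand-ins assumed for illustration.

```python
import warnings


def thread_pool_size() -> int:
    """New name. Stand-in body: returns a fixed, made-up value."""
    return 4


def threadpool_size() -> int:
    """Old name, kept as a deprecated alias that forwards to the new one."""
    warnings.warn(
        "`threadpool_size` is deprecated; use `thread_pool_size` instead.",
        DeprecationWarning,
        stacklevel=2,
    )
    return thread_pool_size()


# Calling the old name still works but emits a DeprecationWarning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_result = threadpool_size()
```

Callers that migrate to the new name get the same result without the warning.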

🚀 Performance improvements

prune parquet row groups when is_not_null is used (#14260)
Avoid unnecessary copies in Series.to_numpy for boolean/temporal types (#14261)
use is_between to skip parquet row groups (#14244)
Use a compression API that is designed for this use case (#11699) (#14194)
Use UnitVec in polars-plan traversal (#14199)
use UnitVec in streaming joins (#14197)
improve ChunkId (#14175)
improve iteration performance (#14126)
elide unneeded work in window? (#14108)
run window functions more in parallel (#14095)
improve skip row group using statistics condition (#14056)
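
Several entries above (#14260, #14244, #14056) concern skipping Parquet row groups based on their statistics. As a rough model of the idea, and not Polars' actual implementation, a reader can compare a predicate against each row group's min/max and null count and skip any group that cannot contain a match. The names `RowGroupStats`, `may_match_between`, and `may_match_not_null` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RowGroupStats:
    """Hypothetical per-column statistics for one row group."""

    min_value: Optional[int]
    max_value: Optional[int]
    null_count: int
    num_rows: int


def may_match_not_null(stats: RowGroupStats) -> bool:
    """False only when every value in the row group is null."""
    return stats.null_count < stats.num_rows


def may_match_between(stats: RowGroupStats, lower: int, upper: int) -> bool:
    """False only when no row can satisfy lower <= x <= upper."""
    if not may_match_not_null(stats):
        # An all-null group can never satisfy a range predicate.
        return False
    if stats.min_value is None or stats.max_value is None:
        # Without min/max statistics the group must be read to be safe.
        return True
    # The group's [min, max] interval must overlap [lower, upper].
    return stats.max_value >= lower and stats.min_value <= upper


groups = [
    RowGroupStats(min_value=0, max_value=9, null_count=0, num_rows=10),
    RowGroupStats(min_value=10, max_value=19, null_count=0, num_rows=10),
    RowGroupStats(min_value=None, max_value=None, null_count=10, num_rows=10),
]

# Indices of the row groups that still have to be read for each predicate.
to_read_between = [i for i, g in enumerate(groups) if may_match_between(g, 12, 15)]
to_read_not_null = [i for i, g in enumerate(groups) if may_match_not_null(g)]
```

The check is deliberately conservative: a group is skipped only when the statistics prove that no row can match, so missing statistics always mean "read it".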

✨ Enhancements

add u8/i8/u16/i16 parsers to CSV reader (#14241)
move F-order data in and out of numpy to polars zero copy (#14259)
read arrow-c-interface without requiring pyarrow (#14254)
Implements list.gather_every (#14253)
Implements prefix/suffix_fields (#14251)
Change Series.to_numpy to return f64 for Int32/UInt32 Series with nulls instead of f32 (#14240)
Polish decimal arithmetic (#14172)
improved read_excel format detection, and support for excel 97-2004 workbooks (#14234)
Introduce arr.to_struct (#14202)
Support mapping over the field names of a struct (#14203)
make IdxVec generic as UnitVec (#14196)
add new arithmetic kernels (#14026)
Supports unique and hash_rows for null column (#14111)
Implement arithmetic operations for Null columns (#14107)
support pd.Index in from_pandas and elsewhere (#14087)
Allow renaming expressions with keyword syntax in group_by (#14071)
raise more informative error message if someone lands on Expr.__bool__ (#14067)
Adapt extend_constant to function expr architecture and expressify it (#14058)
add integer negation (#14049)
list & array measures of dispersion (#13245)
gc binview when writing ipc (#14035)
When calling convert_time_zone on time-zone-naive datetime, convert as if converting from UTC (#13960)
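
The last entry (#13960) says a time-zone-naive datetime is converted as if it were in UTC. The semantics can be modeled with the standard library alone; this is a sketch of the described behavior rather than a Polars call, and `convert_naive_as_utc` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone


def convert_naive_as_utc(naive: datetime, target: timezone) -> datetime:
    """Interpret a naive datetime as UTC, then express it in `target`."""
    if naive.tzinfo is not None:
        raise ValueError("expected a time-zone-naive datetime")
    return naive.replace(tzinfo=timezone.utc).astimezone(target)


# A fixed UTC+02:00 offset keeps the example free of time zone database lookups.
plus_two = timezone(timedelta(hours=2))
converted = convert_naive_as_utc(datetime(2024, 1, 1, 12, 0), plus_two)
```

Under this model, noon with no time zone becomes 14:00 at UTC+02:00: the instant is unchanged and only the wall-clock representation shifts.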

🐞 Bug fixes

deduplicate recursive growables (#14264)
Fix glimpse overload signature (#14258)
allow set operations on list of categoricals (#14110)
Fix incorrect type of any/all_horizontal with a single input (#14256)
Fix loading a numpy array whose values are numpy arrays (issue #14237) (#14238)
Make Series.to_numpy on booleans without nulls return bool type (#14239)
fix ufunc in agg (change __ufunc_array__ so it uses is_elementwise=True parameter) (#14135)
Fix join validation for String types (#14229)
enable windows test coverage for read_excel "calamine" (fastexcel) engine (#14171)
make csv parser more robust to edge cases (#14210)
Fix for set_operations of binary dtype (#14152)
fix read_csv date/datetime inference and parsing (#14113)
don't see files as hive partitions (#14128)
allow eval on list of categoricals (#14132)
Forbid casting from Date to Time and vice versa (#14127)
preserve old naming convention for multi-value pivot (this will change in 1.0 to no longer redundantly have the column name in the middle) (#14120)
Implements gt/lt cmp for null dtype (#14119)
ignore comments at beginning of csv if schema provided (#14115)
fix pivot when multiple columns are passed. Output is now aligned with what tidyverse / pandas.pivot_table would do (#14048)
multiple read_excel updates (#14039)
Fix some temporal conversion errors for datetimes earlier than 1970-01-01 (#14050)
Preserve name when casting from categorical (#14085)
respect Object dtype designation (#14072)
fix cse bug when window function is nested (#14070)
Fix melt panic when there are no value vars (#14057)
json_encode should respect the logical type (#14063)
improve skip row group using statistics condition (#14056)
Raise for .dt.epoch and .dt.timestamp for Duration dtype (#13962)
handle SliceSink with empty data (#14025)
Allow Series.to_pandas for categorical types (#14028)
correct field type schema inference (using read_csv) (#14042)
Use int formatter for unsigned ints (#14043)

📖 Documentation

fix code block in user-guide/lazy/schemas (#14228)
Add visualization page to user guide (#13052)
Fix typo in contributing guide (#14181)
Small improvements Ecosystem page (#14176)
fix code blocks in user-guide/concepts/data-structures (#14146)
Document that Kleene logic is followed in any_horizontal and all_horizontal (#14148)
Fix description of return_dtype parameter for map_elements and map_batches (#14114)
Fix bullet point formatting in CI contributing guide (#14117)
Add documentation on replacement strings to str.replace and str.replace_all (#13382)
Replace alternatives page with more objective comparison (#13784)
Note that only one name operation is allowed per expression (#14075)
Improve deprecation message of dtype_if_empty param (#14068)
fix more docstring bullet points (#14065)
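
One entry above (#14148) documents that any_horizontal and all_horizontal follow Kleene logic. In Kleene (three-valued) logic a null means "unknown", so it only decides the result when no definite value already does. The pure-Python sketch below models that rule with `None` standing in for null; `kleene_any` and `kleene_all` are hypothetical names, not Polars functions.

```python
from typing import Iterable, Optional


def kleene_any(values: Iterable[Optional[bool]]) -> Optional[bool]:
    """True if any value is True; otherwise None if any is None; else False."""
    saw_null = False
    for value in values:
        if value is True:
            return True
        if value is None:
            saw_null = True
    return None if saw_null else False


def kleene_all(values: Iterable[Optional[bool]]) -> Optional[bool]:
    """False if any value is False; otherwise None if any is None; else True."""
    saw_null = False
    for value in values:
        if value is False:
            return False
        if value is None:
            saw_null = True
    return None if saw_null else True
```

For example, `kleene_any([None, True])` is `True` because one known `True` settles the answer, while `kleene_any([None, False])` stays `None` because the unknown value could still be `True`.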

🛠️ Other improvements

Reorganize NumPy interop tests (#14257)
additional dataframe test coverage (#14243)
Remove *args in Series.to_numpy (#14248)
Move metadata utils to meta module (#14230)
remove unused method DataFrame._from_dicts (#14212)
make gather_chunked completely generic (#14195)
Add .cargo directory to .gitignore (#14191)
take_chunked to polars-ops (#14185)
Issue a warning when running doctests on Python 3.11 or lower (#14187)
Run cargo update (#14160)
merge take kernels (#14137)
improve From<Ca> -> Vec (#14123)
hoist boolean -> string cast (#14122)
remove unused argument (#14014)

Thank you to all our contributors for making this release possible!
@JulianCologne, @MarcoGorelli, @Vincenthays, @Wainberg, @alexander-beedie, @apcamargo, @braaannigan, @c-peters, @deanm0000, @dependabot, @dependabot[bot], @dpinol, @edavisau, @eitsupi, @flisky, @grinya007, @ion-elgreco, @itamarst, @lukemanley, @mcrumiller, @orlp, @r-brink, @reswqa, @ritchie46, @stinodego and @taki-mekhalfa
