Python Polars 0.20.20
🚀 Performance improvements
- Fix cross join batch size when one of the DataFrames is tiny (#14347)
- Fix binview growable complexity O(n*m) -> O(n) (#15628)
- Remove extra thread spawn from row group fetcher (#15626)
- Use vertical parallelism if input is chunked for Filter, Select, WithColumns (#15608)
- Refactor CSV serialization to not go through AnyValue (#15576)
- don't use dynamic dispatch in visitors (#15607)
- Improve Bitmap construction performance (#15570)
- join by row-encoding (#15559)
✨ Enhancements
- add Expr.dt.add_business_days and Series.dt.add_business_days (#15595)
- Add str.head and str.tail (#14425)
- Add union/or operator for pl.Enum (#14965)
- Extended BytecodeParser to handle additional math functions, and imports from the global namespace (#15627)
- Push down is_between expressions to Arrow (#15180)
- add holidays argument to business_day_count (#15580)
- change default to write parquet statistics (#15597)
- Expressify to_integer (#15604)
- Optimizer; remove double SORT and redundant projections (#15573)
- Add null_on_oob parameter to expr.array.get (#15426)
- support weekend argument in business_day_count (#15544)
- Enable is_first/last_distinct for not nested non-numeric list (#15552)
- Turn off cse if cache node found (#15554)
- Tag concat list as elementwise (#15545)
🐞 Bug fixes
- Return appropriate data type for time mean and median (#14471)
- Fix issue in write_excel that could lead to incorrect spanning range determination (#15631)
- Output correct dtype for mean_horizontal on a single column (#15118)
- Recompute RowIndex schema after projection pd (#15625)
- Mean of boolean in streaming group_by incorrectly always gave NULL (#15616)
- Include cloud creds in cache key (#15609)
- Fix elementwise-apply if any input is AggregatedScalar (#15606)
- Explode list should take validity into account (#15572)
- use larger recursive stack in debug mode (#15593)
- SQL interface "off-by-one" indexing error with GROUP BY clauses that use position ordinals (#15584)
- Enable missing features in polars-time (#15558)
- Handle quoted identifiers when registering CTEs in the SQL engine (#15564)
- Decompress moved out of schema initialization (#15550)
- Turn off cse if cache node found (#15554)
📖 Documentation
- Add legacy CPU install instructions in user guide (#13676)
- Examples for errors (#13724)
- Add docstring examples for reading json (#14481)
- Add security warning in LazyFrame.deserialize() docstring (#15282)
- Various minor updates to User Guide's SQL intro section (#15557)
🛠️ Other improvements
- Replace most deprecated calls with bounded version (#15632)
- use bound api (#15630)
- Initial PyO3 0.21 support (#15622)
- Don't run streaming group-by in partitionable gb (#15611)
- pref(rust!, python): Unify sort with SortOptions and SortMultipleOptions (#15590)
- Set up CodSpeed (#15537)
Thank you to all our contributors for making this release possible!
@CanglongCl, @ChayimFriedman2, @Fokko, @JamesCE2001, @MarcoGorelli, @NedJWestern, @TrevorWinstral, @alexander-beedie, @deanm0000, @douglas-raillard-arm, @eitsupi, @filabrazilska, @i-aki-y, @itamarst, @leoforney, @mcrumiller, @nameexhaustion, @orlp, @ozgrakkurt, @reswqa, @ritchie46 and @stinodego