v0.17.0
@eitsupi eitsupi released this 04 Jun 03:22

Breaking changes

  • Updated rust-polars to unreleased version (> 0.40.0) (#1104, #1110, #1117, #1124):
    • In $join(), there is a new argument coalesce and the how options now accept "full" instead of "outer" and "outer_coalesce".
    • $top_k() and $bottom_k() gain three arguments nulls_last, maintain_order and multithreaded.
    • All $rolling_*() functions lose the arguments by, closed and warn_if_unsorted. Rolling computations based on by must now be done via the corresponding rolling_*_by() function, e.g. rolling_mean_by() instead of rolling_mean(by =) (#1115).
    • pl$scan_parquet() and pl$read_parquet() gain an argument glob which defaults to TRUE. Set it to FALSE to avoid treating * as a globbing pattern.
    • $is_not_nan() on a null value (NA in R) now returns null. Previously, it returned TRUE.
    • In $reshape(), argument dims is renamed dimensions and there is a new argument nested_type specifying if the output should be of type List or Array.
    • In $value_counts(), all arguments must be named and there is a new argument name to specify the name of the output.
    • In all functions accepting optimization parameters (such as projection_pushdown), there is a new parameter cluster_with_columns to combine sequential independent calls to $with_columns().
    • $str$explode() is removed.
    • The check_sorted argument is removed from $rolling() and $group_by_dynamic(). Sortedness is now verified quickly, so this argument is no longer needed (pola-rs/polars#16494).
    • $name$map() hangs on Linux, so this method is deprecated and its documentation is removed. Please use other methods like <LazyFrame>$rename(<function>) instead (#1123).
  • As warned in v0.16.0, the order of arguments in pl$Series is changed (#1071). The first argument is now name, and the second argument is values.
  • $to_struct() on an Expr is removed. This method is now only available for Series, DataFrame, and in the $list and $arr subnamespaces. For example, pl$col("a", "b", "c")$to_struct() should be replaced with pl$struct(c("a", "b", "c")) (#1092).
  • pl$Struct() now only accepts named inputs and objects of class RPolarsField. For example, pl$Struct(pl$Boolean) doesn't work anymore and should be named like pl$Struct(a = pl$Boolean) (#1053).
  • In $all() and $any(), the argument drop_nulls is renamed ignore_nulls, and this argument must be named (#1050).
  • New method $struct$with_fields() (#1109) and new function pl$field() to be used in expressions in $struct$with_fields() (#1113).
  • New methods for RPolarsDataType: $is_enum(), $is_categorical(), $is_known(), $is_string(), $contains_views(), $contains_categorical() (#1112).
  • In $dt$combine(), the arguments tm and tu are renamed time and time_unit (#1116).
  • The default value of the rechunk argument of pl$concat() is changed from TRUE to FALSE (#1125).
  • In $rename() for LazyFrame and DataFrame, key-value pairs of names are changed to old_name = "new_name" instead of new_name = "old_name" (#1129).
  • In $rename() for LazyFrame and DataFrame, calling the method with no arguments is no longer allowed (#1129).
  • In all $rolling_*() functions, the arguments center and ddof must be named (#1115).
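
A few of the changes listed above can be sketched as follows. This is a minimal migration sketch, not an exhaustive one: the data frames and column names (a, b, c, id) are hypothetical and chosen only for illustration, and the calls follow the descriptions in the notes above.

```r
library(polars)

# Hypothetical example data; the column names are made up for illustration.
df <- pl$DataFrame(a = 1:3, b = c("x", "y", "z"))

# $rename(): pairs are now written as old_name = "new_name".
renamed <- df$rename(a = "id")

# pl$Series(): name comes first, values second; naming both avoids ambiguity.
s <- pl$Series(name = "a", values = 1:3)

# pl$Struct(): fields must now be named.
dtype <- pl$Struct(a = pl$Boolean)

# Expr$to_struct() is removed; build the struct with pl$struct() instead.
structs <- df$select(pl$struct(c("a", "b")))

# $join(): how = "full" replaces "outer"; coalesce is the new argument.
other <- pl$DataFrame(a = 2:4, c = c(TRUE, FALSE, TRUE))
joined <- df$join(other, on = "a", how = "full", coalesce = TRUE)
```

Passing name and values by name in pl$Series() keeps the call readable regardless of argument order.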

New features

  • A function can now be passed to $rename() for LazyFrame and DataFrame. This is equivalent to polars.LazyFrame.rename(mapping: Callable[[str], str]) or polars.DataFrame.rename(mapping: Callable[[str], str]) in Python Polars (#1122, #1129).
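
As a small sketch of the function form of $rename(), assuming the function is applied to the column names as in the Python equivalent (the data frame and its column names are hypothetical):

```r
library(polars)

# Hypothetical example data.
df <- pl$DataFrame(a = 1:3, b = 4:6)

# Pass a function instead of old_name = "new_name" pairs;
# base R's toupper() maps each column name to upper case.
renamed <- df$rename(toupper)

# Inspect the resulting column names.
renamed$columns
```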

Full Changelog: v0.16.4...v0.17.0