feat: Add new Enum categorical data type which allows a fixed set of categories #11822
Hope to get to this one by the end of today.
I know this is a WIP but I spotted some minor things on the Python side - figured I might as well leave a comment.
Waiting on #12091 to fix the failing tests, which are due to another bug.
This looks great! It's going to make a lot of users very happy.
I left a whole bunch of nitpick comments but overall this looks solid, at least on the Python side (I'll leave the Rust side to Ritchie for now).
We have some way to go for better integration of this data type, such as making sure it works with the interchange protocol __dataframe__ method and probably some others, but we can pick those up as we go along.
Can you rebase? Then we can get this in!
Lint failure is my fault (dependency updates) - fixing as we speak. Another rebase is probably needed. EDIT: Rebased.
Relates to #10705
This PR is meant as the start of a sequence of PRs to improve the categoricals in Polars.
It allows users to provide a fixed list of categories when initializing or casting to a categorical.
Specifying a value outside of the provided list will raise an error.
Note that this is in addition to the existing Categorical type. The distinction is that Categorical is flexible and new categories get added on the fly, while with Enum the categories are fixed and don't change.
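To make the distinction concrete, here is a minimal conceptual sketch in plain Python. The two classes (`GrowingCategories`, `FixedCategories`) and their `encode` method are hypothetical names invented for illustration; they are not Polars' API or its Rust implementation, and the exception type Polars raises may differ. The sketch only mirrors the behaviour described above: one encoder appends unseen values on the fly, the other rejects anything outside its fixed list.

```python
# Conceptual sketch only: hypothetical helper classes, not Polars' implementation.


class GrowingCategories:
    """Categorical-like encoder: unseen values are appended on the fly."""

    def __init__(self):
        self.categories = []  # category strings, in first-seen order
        self._codes = {}      # category string -> integer code

    def encode(self, values):
        codes = []
        for value in values:
            if value not in self._codes:
                # New category: assign the next free integer code.
                self._codes[value] = len(self.categories)
                self.categories.append(value)
            codes.append(self._codes[value])
        return codes


class FixedCategories:
    """Enum-like encoder: the category list is set once and never changes."""

    def __init__(self, categories):
        if len(set(categories)) != len(categories):
            raise ValueError("categories must be unique")
        self.categories = list(categories)
        self._codes = {c: i for i, c in enumerate(self.categories)}

    def encode(self, values):
        codes = []
        for value in values:
            if value not in self._codes:
                # Values outside the fixed list are an error, never a new category.
                raise ValueError(
                    f"{value!r} is not one of the fixed categories {self.categories}"
                )
            codes.append(self._codes[value])
        return codes
```

With the growing encoder, encoding a previously unseen string simply extends `categories`. With the fixed encoder, the same call raises an error and `categories` stays as it was, which is the guarantee the Enum type is meant to give.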
Todo in future PRs
(Categorical with pre-defined categories #10705)