-
Notifications
You must be signed in to change notification settings - Fork 120
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. Weβll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Datetime(time_unit, time_zone)
and Duration(time_unit)
types
#960
Conversation
@@ -75,13 +75,13 @@ def test_cast_date_datetime_pandas() -> None: | |||
df = df.select(nw.col("a").cast(nw.Datetime)) | |||
result = nw.to_native(df) | |||
expected = pd.DataFrame({"a": [datetime(2020, 1, 1), datetime(2020, 1, 2)]}).astype( | |||
{"a": "timestamp[ns][pyarrow]"} | |||
{"a": "timestamp[us][pyarrow]"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes the default to the polars one
nw.col("date") | ||
.cast(nw.Datetime("ms", time_zone="Europe/Rome")) | ||
.cast(nw.String()) | ||
.str.slice(offset=0, length=19) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
19: number of characters of "2024-01-01 01:00:00"
. The format right after that is different for each backend
narwhals/_pandas_like/utils.py
Outdated
pd_datetime_rgx = ( | ||
r"^datetime64\[(?P<time_unit>ms|us|ns)(?:, (?P<time_zone>[a-zA-Z\/]+))?\]$" | ||
) | ||
pa_datetime_rgx = r"^timestamp\[(?P<time_unit>ms|us|ns)(?:, tz=(?P<time_zone>[a-zA-Z\/]+))?\]\[pyarrow\]$" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please try to break these π
narwhals/_pandas_like/utils.py
Outdated
# Pandas does not support "ms" or "us" time units before version 1.5.0 | ||
# Let's overwrite with "ns" | ||
if implementation is Implementation.PANDAS and backend_version < ( | ||
1, | ||
5, | ||
0, | ||
): # pragma: no cover | ||
time_unit = "ns" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we can do much else here
@MarcoGorelli any experience on how to locate timezone database at "C:\Users\runneradmin\Downloads\tzdata" ? π |
Datetime
typeDatetime(time_unit, time_zone)
and Duration(time_unit)
types
wow, nice one! i'll try breaking it a bit, but this is amazing, been wanting to do this for a while π |
I think the dtype comparison isn't quite right: In [17]: s
Out[17]:
shape: (1,)
Series: '' [datetime[ΞΌs, Asia/Kathmandu]]
[
2019-12-31 18:15:00 +0545
]
In [18]: nw.from_native(s, allow_series=True).dtype == nw.Datetime('us')
Out[18]: True
In [19]: s.dtype == pl.Datetime('us')
Out[19]: False |
Nice catch! Thanks! Will fix later on π |
π€ interesting, so |
Looks like this might be it π₯³ πΎ can't believe it...this took hours...worth it though. It means we can change the main narwhals namespace, with zero impact on Altair users. as it turns out, the stable api was really worth doing... this is so rewarding πΊ i think the nightly ci failure is unrelated i'll check this again tomorrow, then hopefully we can make a release with this in on Tuesday |
That's awesome! I will take some time this upcoming week to check what happened in detail! However, to use the "edge" dtypes (or in general features), should we + import narwhals as nw
- import narwhals.stable.v1 as nw ? |
narwhals/_pandas_like/utils.py
Outdated
pd_duration_rgx = r"^timedelta64\[(?P<time_unit>ms|us|ns)\]$" | ||
pa_duration_rgx = r"^duration\[(?P<time_unit>ms|us|ns)\]\[pyarrow\]$" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pandas / pyarrow support 'second' time unit, I think that should be allowed to pass through
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just by passing it along or doing manipulation for the user?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think we can just pass it through - adding a commit soon
it's fine to use time_unit / time_zone with stable.v1, it's just probably a good idea to avoid anything which calls |
narwhals/_pandas_like/utils.py
Outdated
du_time_unit = getattr(dtype, "time_unit", "us") | ||
return ( | ||
f"duration[{du_time_unit}][pyarrow]" | ||
if dtype_backend == "pyarrow-nullable" | ||
else f"timedelta64[{du_time_unit}]" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we need to do the same pre-1.5.0 check here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking care of this Marco! But most importantly, this is the cutest gif so far! How did you discover it? |
This is uber cute! |
What type of PR is this? (check all applicable)
Related issues
Checklist
If you have comments or can explain your changes, please do so below.
Introduces time units and time zones in
Datetime
type.