chore: rename time zone tests #1830
if "pyspark" in str(constructor):
    data_ = {
        col_name: [v.replace(tzinfo=timezone.utc) for v in col_values]
        for col_name, col_values in data.items()
    }
This is most likely needed for anyone not based in UTC. Since we set .config("spark.sql.session.timeZone", "UTC") in pyspark_lazy_constructor, the results will not be the same otherwise.
Context: it took me a while to realize why I was getting hours 11 and 1 instead of 12 and 2.
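A minimal sketch of the hour shift described above, using only the standard library. It assumes the naive datetimes are interpreted in the machine's local time zone before being shown in UTC, and it uses a hypothetical UTC+1 local zone and made-up sample values chosen to match the "11 and 1 instead of 12 and 2" observation. It does not call pyspark.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical stand-in for a developer machine running at UTC+1.
local_zone = timezone(timedelta(hours=1))

# Hypothetical sample data: naive datetimes with hours 12 and 2.
data = {"a": [datetime(2020, 1, 1, 12), datetime(2020, 1, 2, 2)]}


def hours_seen_in_utc(values, assumed_zone):
    """Treat naive values as wall-clock time in assumed_zone, then read the hour in UTC."""
    return [
        v.replace(tzinfo=assumed_zone).astimezone(timezone.utc).hour for v in values
    ]


# Naive values interpreted as local time come out one hour earlier in UTC.
naive_hours = hours_seen_in_utc(data["a"], local_zone)

# The workaround from the diff: attach UTC explicitly so nothing is inferred.
data_ = {
    col_name: [v.replace(tzinfo=timezone.utc) for v in col_values]
    for col_name, col_values in data.items()
}
aware_hours = [v.astimezone(timezone.utc).hour for v in data_["a"]]

print(naive_hours)  # [11, 1]
print(aware_hours)  # [12, 2]
```

With tzinfo set explicitly, the hours no longer depend on where the tests run.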
Shall we just set the tzinfo to timezone.utc when we define data=?
I tried, but sadly all pandas-like backends raise warnings/errors.
dates = {"a": [datetime(2001, 1, 1), None, datetime(2001, 1, 3)]}
dates = {"a": [datetime(2001, 1, 1), datetime(2001, 1, 3)]}
None is removed because otherwise pyspark ends up creating a column of struct type. We might need to address this in the constructor?
🤔 Yeah, it might be good to see if we can change the constructor so it doesn't go via pandas; that might be good in general going forward.
awesome thanks!