spark-acid incorrectly reads/writes pre-Gregorian timestamps #95
Comments
Thanks @bersprockets, we will take a look at it.
An update: on write, spark-acid does not write values from after the start of the Gregorian calendar correctly either.
In Spark:
Then in beeline:
If you run everything in UTC, you might not see this issue with post-start-of-Gregorian calendar values (you'll still see issues with pre-Gregorian values).
@bersprockets I ran a few queries to check the behavior for non-ACID tables, and for pre-Gregorian timestamps it is consistent with ACID tables, at least on 2.4.3; i.e., Spark in general is not giving Julian timestamps, at least in 2.4.3. The non-ACID Hive table was created as follows:
`0: jdbc:hive2://0.0.0.0:10001/default> create table ts_normal (ts TIMESTAMP) stored as orc;`
`0: jdbc:hive2://0.0.0.0:10001/default> insert into ts_normal values ('1200-01-01 00:00:00.0');`
`0: jdbc:hive2://0.0.0.0:10001/default> select ts from ts_normal;`
`INFO : OK`
`+------------------------+`
In Spark:
`scala> spark.sql("select ts from ts_normal").show(false)`
`scala>`
So I don't think this has anything to do with the DataSource specifically. Even the Unix timestamp is stored as the same value in both:
Spark:
Hive:
I have not checked it against Spark 3.0.0, as we have not yet upgraded the DataSource for it, so I am not sure whether this issue was fixed by https://issues.apache.org/jira/browse/SPARK-31557. Are you seeing different behavior for non-ACID tables?
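For readers following along: one plausible reason the same stored Unix timestamp can display as different dates is that the two engines use different calendars for old dates. This is an assumption, not something confirmed in this thread. The Python sketch below is only an illustration of that calendar arithmetic. It does not use Spark or Hive, and the function names are hypothetical. It converts a wall-clock date to epoch seconds under a proleptic Gregorian calendar and under a hybrid Julian/Gregorian calendar with the usual 1582-10-15 cutover.

```python
from datetime import date

SECONDS_PER_DAY = 86_400
UNIX_EPOCH_JDN = 2_440_588        # Julian Day Number of 1970-01-01 (Gregorian)
GREGORIAN_CUTOVER = (1582, 10, 15)  # first day of the Gregorian calendar


def julian_calendar_jdn(year: int, month: int, day: int) -> int:
    """Julian Day Number of a date written in the Julian calendar."""
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083


def proleptic_gregorian_epoch_seconds(year: int, month: int, day: int) -> int:
    """Epoch seconds at midnight UTC; Python's date is proleptic Gregorian."""
    days = date(year, month, day).toordinal() - date(1970, 1, 1).toordinal()
    return days * SECONDS_PER_DAY


def hybrid_epoch_seconds(year: int, month: int, day: int) -> int:
    """Epoch seconds at midnight UTC under a hybrid calendar:
    Julian before the cutover, Gregorian from the cutover onward."""
    if (year, month, day) >= GREGORIAN_CUTOVER:
        return proleptic_gregorian_epoch_seconds(year, month, day)
    days = julian_calendar_jdn(year, month, day) - UNIX_EPOCH_JDN
    return days * SECONDS_PER_DAY


if __name__ == "__main__":
    for ymd in [(1200, 1, 1), (2000, 1, 1)]:
        gap = hybrid_epoch_seconds(*ymd) - proleptic_gregorian_epoch_seconds(*ymd)
        print(ymd, "difference in days:", gap // SECONDS_PER_DAY)
```

For the `1200-01-01` value used above, the two interpretations differ by 7 days, while for dates after the cutover they agree. A reader and a writer that disagree on the calendar would therefore agree on the stored number but disagree on the displayed date, or the reverse.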
In beeline:
In Spark:
Conversely, in Spark:
Then, in beeline: