V3: Introduce timestamp_ns and timestamptz_ns #1632

Merged: 14 commits, Mar 23, 2025

Collaborator

@sungwy sungwy commented Feb 10, 2025

Fixes: #1552

  • Add TimestampNanoType and TimestampTzNanoType
  • Add Readers and Writers
  • Enhance Transforms
  • Add String Expressions parsing for nanoseconds timestamps
  • Add format-version compatibility check for each type
  • Run compatibility check on TableMetadata creation
  • Unit tests
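
The per-type format-version check listed above could look roughly like the following. This is a minimal sketch, not the code from this PR: `PrimitiveType`, `min_format_version`, and `check_format_version_compatibility` are hypothetical names chosen for illustration, and the only fact taken from the PR is that the nanosecond timestamp types belong to V3.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class PrimitiveType:
    """Hypothetical stand-in for a type class; not the library's real class."""

    name: str
    min_format_version: int = 1


TIMESTAMP = PrimitiveType("timestamp")
TIMESTAMP_NS = PrimitiveType("timestamp_ns", min_format_version=3)
TIMESTAMPTZ_NS = PrimitiveType("timestamptz_ns", min_format_version=3)


def check_format_version_compatibility(
    fields: Mapping[str, PrimitiveType], format_version: int
) -> None:
    """Raise ValueError if any field's type needs a newer format version."""
    for field_name, field_type in fields.items():
        if field_type.min_format_version > format_version:
            raise ValueError(
                f"{field_type.name} on field '{field_name}' requires format version "
                f"{field_type.min_format_version}, but the table uses {format_version}"
            )


# A V3 table may carry nanosecond timestamps; a V2 table may not.
check_format_version_compatibility({"event_ts": TIMESTAMP_NS}, format_version=3)
```

Running a check like this once at table metadata creation means an incompatible schema is rejected before any data is written.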

Python's native datetime module does not support nanoseconds. We'll need to update our internal datetime representations to use a different library. NumPy? Arrow?

@Fokko
Contributor

Fokko commented Feb 10, 2025

This is looking great @sungwy 🥳

Python's native datetime module does not support nanoseconds. We'll need to update our internal datetime representations to use a different library. NumPy? Arrow?

Ideally, we want to use int as much as possible.
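
To illustrate the integer approach, here is a minimal sketch of turning an ISO-8601 timestamp string into nanoseconds since the Unix epoch using only the standard library. The function name `timestamp_to_nanos` and the accepted string format are assumptions made for this example; it is not the implementation in this PR, and it treats the input as UTC.

```python
import re
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Date, time, and an optional fraction of up to nine digits (nanoseconds).
_ISO_NANOS = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d{1,9}))?$")


def timestamp_to_nanos(timestamp_str: str) -> int:
    """Return nanoseconds since the epoch as a plain int (hypothetical helper)."""
    match = _ISO_NANOS.match(timestamp_str)
    if match is None:
        raise ValueError(f"Unsupported timestamp: {timestamp_str!r}")
    seconds_part, fraction = match.groups()

    # datetime handles whole seconds only; the fraction never passes through it.
    whole = datetime.strptime(seconds_part, "%Y-%m-%dT%H:%M:%S").replace(tzinfo=timezone.utc)
    delta = whole - EPOCH
    seconds = delta.days * 86_400 + delta.seconds

    # Right-pad the fraction so ".5" means 500_000_000 ns, not 5 ns.
    nanos = int((fraction or "").ljust(9, "0"))
    return seconds * 1_000_000_000 + nanos
```

Because Python integers have arbitrary precision, the result never loses the last three digits the way a `datetime` (microsecond resolution) or a float would.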

@sungwy
Collaborator Author

sungwy commented Feb 24, 2025

I think it would make sense to break this PR up into two separate items:

  1. one to introduce timestamp_ns and timestamptz_ns support
  2. one to introduce parsing of nanosecond-precision timestamps from string expressions

This PR will focus on item (1).

@sungwy sungwy requested a review from Fokko February 27, 2025 03:44
@sungwy sungwy marked this pull request as ready for review February 27, 2025 12:19
@sungwy sungwy requested a review from Fokko March 15, 2025 20:30
Contributor

@kevinjqliu kevinjqliu left a comment

Thanks for adding this! Looks great, just have a few comments

@Fokko
Contributor

Fokko commented Mar 19, 2025

Looks like there is an issue with the UnknownType: we need to pass the format version to create_table there:

         tbl = catalog.create_table(
             identifier,
             schema=arrow_table.schema,
+            properties={"format-version": "3"},
         )

in test_reads.py:1008.

I think this PR is ready to go. There are some parts we need to double-check, for example, the downcasting of nanos to micros, and that we don't have to fail for V3 when that isn't set. But since we don't yet support producing V3 metadata, I think that's okay.
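
As a reminder of why the precision conversion deserves a second look, here is a minimal integer-only sketch. The helper names are hypothetical and this is not the conversion code used by the library; it only shows that going from nanoseconds to microseconds discards the last three digits, while the reverse direction is exact.

```python
def nanos_to_micros(nanos: int) -> int:
    """Drop sub-microsecond digits; floor division rounds toward negative infinity."""
    return nanos // 1_000


def micros_to_nanos(micros: int) -> int:
    """Widening to nanoseconds is exact."""
    return micros * 1_000


def loses_precision(nanos: int) -> bool:
    """True when the value cannot be represented exactly in microseconds."""
    return nanos % 1_000 != 0


# A value with sub-microsecond digits does not survive a round trip.
original = 1_742_688_000_123_456_789
round_tripped = micros_to_nanos(nanos_to_micros(original))
assert round_tripped != original and loses_precision(original)
```

A writer that silently applies the lossy direction changes the data, which is why it is usually guarded by an explicit opt-in.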

@Fokko Fokko mentioned this pull request Mar 20, 2025
14 tasks
@sungwy
Collaborator Author

sungwy commented Mar 21, 2025

V3 Tracking issue #1818

I removed this integration test because we can't write V3 metadata yet

@Fokko Fokko requested a review from kevinjqliu March 21, 2025 18:43
Contributor

@kevinjqliu kevinjqliu left a comment

LGTM!

@Fokko Fokko merged commit 9d19ef7 into apache:main Mar 23, 2025
7 checks passed
@Fokko
Contributor

Fokko commented Mar 23, 2025

Thanks for working on this @sungwy, and thanks for the reviews @smaheshwar-pltr and @kevinjqliu 🙌

@kevinjqliu kevinjqliu added the V3 label Mar 26, 2025
Successfully merging this pull request may close these issues.

Add V3 types timestamp_ns and timestamptz_ns
4 participants