[SEDONA-627] Write user specified column to the covering metadata of GeoParquet files #1522

Merged (5 commits) on Jul 16, 2024


Kontinuation
Member

Did you read the Contributor Guide?

Is this PR related to a JIRA ticket?

What changes were proposed in this PR?

This patch adds a new option, geoparquet.covering.<geometryColumnName>, to the geoparquet format. The option value is the name of an existing column that holds the covering (bounding box) of the named geometry column. For example, if the DataFrame being written has a struct column named bbox containing the bounding boxes (xmin, ymin, xmax, ymax) of the geometries in the geometry column, it can be saved as GeoParquet as follows:

df.write.format("geoparquet").option("geoparquet.covering.geometry", "bbox").save("/path/to/geoparquet")

The column metadata of the geometry column will then have the following covering attribute:

"covering": {"bbox": {"xmin": ["bbox", "xmin"], "ymin": ["bbox", "ymin"], "xmax": ["bbox", "xmax"], "ymax": ["bbox", "ymax"]}}
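To make the mapping between the option and the resulting metadata concrete, here is a minimal Python sketch that rebuilds the option key and the covering attribute shown above. The helper names covering_option_key and covering_metadata are hypothetical and are not part of Sedona. The sketch also assumes that the outer "bbox" key names the covering kind and stays fixed, while only the path entries follow the name of the user's column.

```python
import json


def covering_option_key(geometry_column: str) -> str:
    """Build the writer option name for a given geometry column (hypothetical helper)."""
    return f"geoparquet.covering.{geometry_column}"


def covering_metadata(covering_column: str) -> dict:
    """Build the covering attribute for a struct column with xmin/ymin/xmax/ymax fields.

    Assumption: the outer "bbox" key is the covering kind and does not change
    with the column name; each field maps to a [column, field] path.
    """
    fields = ("xmin", "ymin", "xmax", "ymax")
    return {"covering": {"bbox": {f: [covering_column, f] for f in fields}}}


if __name__ == "__main__":
    print(covering_option_key("geometry"))
    print(json.dumps(covering_metadata("bbox")))
```

With a covering column named bbox, the generated structure matches the attribute quoted in this description; with a differently named column, only the first element of each path would change under the assumption above.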

How was this patch tested?

Passing new tests.

Did this PR include necessary documentation updates?

  • Yes, I have updated the documentation.

Kontinuation marked this pull request as ready for review on July 15, 2024 09:37
@jiayuasu
Member

Right now, a user who wants to write the covering column has to assemble it manually, which is tedious.

Does it make sense to automatically generate the covering column? I guess this might be hard given the optional Z axis?

Can we at least explain in the doc how to generate the covering column manually (e.g., with ST_XMin, ST_YMin, ...)?

@Kontinuation
Member Author

This is the first step toward supporting covering column metadata. It targets users who already have covering columns in their DataFrames and don't need to regenerate them when writing. We'll add automatic covering column generation in follow-up PRs.

I've added example code to construct covering columns to the documentation.
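As a rough illustration of what building such a covering column by hand can look like, the Python sketch below assembles a Spark SQL struct expression from the ST_XMin, ST_YMin, ST_XMax and ST_YMax functions mentioned above. The helper names bbox_struct_expr and add_bbox_column are hypothetical, and the sketch assumes a Sedona-enabled Spark session and a geometry column named geometry. It is not necessarily the exact snippet added to the documentation.

```python
def bbox_struct_expr(geometry_column: str = "geometry") -> str:
    """Return a Spark SQL expression that builds an (xmin, ymin, xmax, ymax) struct."""
    pairs = (("XMin", "xmin"), ("YMin", "ymin"), ("XMax", "xmax"), ("YMax", "ymax"))
    parts = [f"ST_{fn}({geometry_column}) AS {name}" for fn, name in pairs]
    return "struct(" + ", ".join(parts) + ")"


def add_bbox_column(df, geometry_column: str = "geometry", bbox_column: str = "bbox"):
    """Add the covering column to a DataFrame (requires pyspark and Sedona functions)."""
    # Imported lazily so the expression builder above works without Spark installed.
    from pyspark.sql.functions import expr

    return df.withColumn(bbox_column, expr(bbox_struct_expr(geometry_column)))


if __name__ == "__main__":
    print(bbox_struct_expr("geometry"))
```

A DataFrame returned by add_bbox_column could then be written with the geoparquet.covering.geometry option set to bbox, as in the PR description. This sketch covers only the X and Y extents and ignores any Z axis.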

jiayuasu added this to the sedona-1.6.1 milestone on Jul 16, 2024
jiayuasu merged commit d51f92e into apache:master on Jul 16, 2024
50 checks passed