Implement python record batch reader #637

Merged: 7 commits into main from kyle/py-record-batch-reader, May 15, 2024

@kylebarron (Member) commented May 15, 2024

Change list

  • Adds Python RecordBatchReader class to store a stream of Arrow record batches
  • Implements __arrow_c_stream__ for exporting a Rust stream of record batches to Python
  • Implements FromPyObject to read from an __arrow_c_stream__
  • Updates geozero-based writers to read from the stream instead of from a materialized table
  • Updates writing Arrow IPC and IPC Stream to not materialize the table

Updating the GeoParquet writer to read from a stream instead of a materialized table is left for future work.

Closes #633

@kylebarron enabled auto-merge (squash) May 15, 2024 03:24
@kylebarron merged commit 4a4fd78 into main May 15, 2024
6 checks passed
@kylebarron deleted the kyle/py-record-batch-reader branch May 15, 2024 03:26
Development

Successfully merging this pull request may close these issues.

Add Python FFI bindings for RecordBatchReader