scan_parquet from io.BytesIO() #10413
Comments
I am pretty sure that this is a duplicate. 🤔
It would also be great if the scan_* and read_* functions had a unified input type for files, bytes, etc.
My application has Parquet embedded as BLOBs in SQL tables, and processes and combines them lazily. I would love to see support for this - at the moment I have to use
A similar use case here. We have a bunch of Parquet files in memory that I want to work with, without having all of them loaded at the same time.
I would be very happy with this improvement. I have about a million Parquet files stored as binaries in Redis and I want to read them as LazyFrames to save memory.
This is still open. Has there been any progress? I need this functionality too.
@HWiese1980 Coincidentally, it was added on main a few hours ago: #18532
Hah! That's quite the timing! :-D Thanks!
Can confirm that everything is working. Thanks @coastalwhite!
Problem description
Add the ability to accept io.BytesIO() as the source parameter for scan_parquet. For now, it accepts only a path to one or more files. This feature may be useful when your program receives Parquet data through a REST API or a socket, directly into memory.