#381 has some great observations on the sequential nature of the HTTP range requests. We have a largish Parquet file (~500 MB) with 34 row groups. Downloading the whole file takes less time than performing sequential range reads on a small subset of columns, even though the range reads transfer less data. I could not find a dedicated open issue for asynchronous reads.
... we're (not yet) fully async during I/O. This has a few far-reaching requirements for the query execution model that haven't been tackled yet.
Right now, we're always sitting in a C++ callstack when doing I/O, which restricts us to single blocking HTTP reads (via XHR).
Threads would offer an escape hatch here but they're immediately bringing up the problems with SharedArrayBuffers and cross-origin-isolation.
I'd love to implement the web filesystem using multiple concurrent fetches but that's not quite possible today.
Originally posted by @ankoh in #381 (comment)
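For illustration, here is a minimal sketch of what "multiple concurrent fetches" could look like in plain TypeScript, outside of the DuckDB-Wasm filesystem. It assumes the byte ranges of the needed column chunks are already known and that the server answers range requests with 206 Partial Content. The names `ByteRange`, `RangeFetcher`, `httpRangeFetcher` and `fetchRanges` are hypothetical and not part of any DuckDB-Wasm API. It does not address the blocking C++ callstack described above; it only shows the access pattern being asked for.

```typescript
// Hypothetical type: an inclusive byte range, as used by the HTTP Range header.
interface ByteRange {
  start: number;
  end: number;
}

// Hypothetical type: anything that can fetch one byte range of a URL.
type RangeFetcher = (url: string, range: ByteRange) => Promise<Uint8Array>;

// Default fetcher built on the standard fetch API.
const httpRangeFetcher: RangeFetcher = async (url, range) => {
  const res = await fetch(url, {
    headers: { Range: `bytes=${range.start}-${range.end}` },
  });
  // A server that honors the range request answers with 206 Partial Content.
  if (res.status !== 206) {
    throw new Error(`expected 206 Partial Content, got ${res.status}`);
  }
  return new Uint8Array(await res.arrayBuffer());
};

// Fetch all ranges with at most `limit` requests in flight.
// Results are returned in the same order as `ranges`.
async function fetchRanges(
  url: string,
  ranges: ByteRange[],
  limit: number = 6, // assumption: a small per-host connection budget
  fetcher: RangeFetcher = httpRangeFetcher,
): Promise<Uint8Array[]> {
  const results: Uint8Array[] = new Array(ranges.length);
  let next = 0;

  // Each worker repeatedly claims the next unfetched range until none are left.
  const worker = async (): Promise<void> => {
    while (next < ranges.length) {
      const i = next++;
      results[i] = await fetcher(url, ranges[i]);
    }
  };

  const workerCount = Math.min(limit, ranges.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

With something like this, the column chunks of several row groups could be requested together instead of one after another, which is where the sequential reads lose time against a single full download.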