OpenSphere currently uses a single GET request to load the entire response into memory. Files are similarly loaded entirely into memory, and are additionally limited because we currently store the file in a single IDB key, which further limits the file size to that of a single IDB value (~104MB).
Loading and parsing large files is problematic in that it spikes memory. For configurations such as Electron (which uses file:// URLs rather than IDB storage), it is fairly trivial to crash the application. Instead, we should stream the file from the source.
Network
For network requests, including file://, we should be able to do the following steps:
1. Make a HEAD request to the URL.
2. Check the Content-Length response header. If it is small enough, we can just load it and run the legacy parsers.
3. If the response contains the Accept-Ranges: bytes header, then we can stream it via range requests. fetch with ReadableStream on Response.body does not buy us much here, as the full response is still being loaded into memory even if we parse it as each chunk is pushed through. The resulting API here should still use ReadableStream. Note that this method of streaming is common in video players which support DASH (and maybe also HLS), so it may be possible to use or adapt some of the network logic from something like Google's Shaka Player (which would play nice with the compiler).
4. For the initial type detection and sample parse for import, tee the stream.
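The steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: `planLoad`, `byteRanges`, `rangeStream`, the `LoadPlan` type and the size threshold are all hypothetical names, not existing OpenSphere APIs, and the sketch issues one range request per stream pull with no retry or backpressure tuning.

```typescript
// All names and thresholds here are hypothetical, not OpenSphere APIs.
type LoadPlan = 'full' | 'ranges' | 'stream';

interface HeaderReader {
  get(name: string): string | null;
}

/** Pick a loading strategy from the HEAD response headers. */
function planLoad(headers: HeaderReader, maxFullBytes: number): LoadPlan {
  const raw = headers.get('content-length');
  const length = raw ? Number(raw) : NaN;
  if (Number.isFinite(length) && length <= maxFullBytes) {
    return 'full'; // small enough for the legacy parsers
  }
  const acceptsRanges = (headers.get('accept-ranges') || '').toLowerCase() === 'bytes';
  // Range requests need a known total size so we know when to stop.
  return acceptsRanges && Number.isFinite(length) ? 'ranges' : 'stream';
}

/** Build the Range header values needed to cover `size` bytes. */
function byteRanges(size: number, chunkSize: number): string[] {
  const ranges: string[] = [];
  for (let start = 0; start < size; start += chunkSize) {
    const end = Math.min(start + chunkSize, size) - 1; // Range ends are inclusive
    ranges.push(`bytes=${start}-${end}`);
  }
  return ranges;
}

/** Expose a range-requested resource as a ReadableStream, one request per pull. */
function rangeStream(url: string, size: number, chunkSize: number,
    fetchFn: typeof fetch = fetch): ReadableStream<Uint8Array> {
  const ranges = byteRanges(size, chunkSize);
  let next = 0;
  return new ReadableStream<Uint8Array>({
    async pull(controller) {
      if (next >= ranges.length) {
        controller.close();
        return;
      }
      const response = await fetchFn(url, {headers: {Range: ranges[next++]}});
      if (response.status !== 206) {
        controller.error(new Error(`Expected 206 Partial Content, got ${response.status}`));
        return;
      }
      controller.enqueue(new Uint8Array(await response.arrayBuffer()));
    }
  });
}
```

For the type detection step, calling `tee()` on the returned stream gives two branches, so one can feed detection and the sample parse while the other is kept for the import itself.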
We will need to move the parsers from full format parsers (e.g. JSON.parse(response) and document parsing) to streaming parsers. It should be possible to do this in a piecemeal/backwards-compatible manner so that we don't just break third-party parsers (Does the parser support streaming? If not then spool up the whole thing and pass it in, but be wary of file size so we don't crash). We already have streaming JSON/XML "parsers" used by file type detection (oboe and xml-lexer).
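One way the backwards-compatible path might look is sketched below. The `LegacyParser` and `StreamingParser` interfaces, `supportsStreaming`, `parseChunks` and the `maxSpoolChars` guard are hypothetical and do not reflect OpenSphere's actual parser API. The source is assumed to arrive as an async iterable of already-decoded text chunks, and results are collected into one array only to keep the sketch short.

```typescript
// Hypothetical interfaces; OpenSphere's real parser API may differ.
interface LegacyParser<T> {
  parse(source: string): T[];
}

interface StreamingParser<T> extends LegacyParser<T> {
  parseChunk(chunk: string): T[];
  finish(): T[];
}

/** Does the parser support streaming? */
function supportsStreaming<T>(parser: LegacyParser<T>): parser is StreamingParser<T> {
  const candidate = parser as Partial<StreamingParser<T>>;
  return typeof candidate.parseChunk === 'function' &&
      typeof candidate.finish === 'function';
}

/**
 * Feed chunks to a streaming parser, or spool them up for a legacy parser
 * while refusing sources larger than `maxSpoolChars`.
 */
async function parseChunks<T>(chunks: AsyncIterable<string>, parser: LegacyParser<T>,
    maxSpoolChars: number): Promise<T[]> {
  if (supportsStreaming(parser)) {
    const results: T[] = [];
    for await (const chunk of chunks) {
      results.push(...parser.parseChunk(chunk));
    }
    results.push(...parser.finish());
    return results;
  }

  // Legacy path: spool the whole source, but stay wary of its size.
  let spooled = '';
  for await (const chunk of chunks) {
    spooled += chunk;
    if (spooled.length > maxSpoolChars) {
      throw new Error('Source too large for a non-streaming parser');
    }
  }
  return parser.parse(spooled);
}
```

A real implementation would more likely hand features to the application as they are produced rather than return them all at the end.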
Note: API requests such as WMS/WMTS/WFS may not have support for byte ranges and as such may benefit from fetch/ReadableStream over just xhr GET.
Note: This demo makes use of fetch/ReadableStream without spooling up all the bytes of the response (so that may be the way to go if that's possible).
Warning: the other thing to be careful of here is browser support for ReadableStream (which should be decent). However, some of the transform streams like TextDecoderStream aren't implemented in current Firefox, so polyfills for those will be needed.
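Where TextDecoderStream is missing, a small stand-in can be built on TextDecoder with the `stream` option, which buffers incomplete multi-byte sequences between calls. The sketch below assumes TransformStream itself is available (natively or via a polyfill); `makeChunkDecoder` and `textDecoderTransform` are hypothetical names.

```typescript
/** Stateful decoder that tolerates characters split across chunk boundaries. */
function makeChunkDecoder(encoding: string = 'utf-8') {
  const decoder = new TextDecoder(encoding);
  return {
    // `stream: true` keeps any trailing partial character for the next call.
    decode: (chunk: Uint8Array): string => decoder.decode(chunk, {stream: true}),
    // Calling decode() with no input flushes whatever is still buffered.
    flush: (): string => decoder.decode()
  };
}

/** A TransformStream that behaves roughly like TextDecoderStream. */
function textDecoderTransform(encoding: string = 'utf-8'): TransformStream<Uint8Array, string> {
  const chunkDecoder = makeChunkDecoder(encoding);
  return new TransformStream<Uint8Array, string>({
    transform(chunk, controller) {
      const text = chunkDecoder.decode(chunk);
      if (text) {
        controller.enqueue(text);
      }
    },
    flush(controller) {
      const tail = chunkDecoder.flush();
      if (tail) {
        controller.enqueue(tail);
      }
    }
  });
}
```

With this in place, something like `response.body.pipeThrough(textDecoderTransform())` would yield text chunks suitable for a streaming parser.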
File
For files loaded from disk (but not in Electron because that just resorts to file:// URLs), the native File should be streamable (if not with a ReadableStream implementation then with Blob.slice()). However, the biggest issue there is that when the application restarts, we no longer have access to that File instance without the user going to pick it from the file browser again. That's why we currently dump files into IDB.
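As a rough illustration of the Blob.slice() fallback, the sketch below reads a File (or any Blob) one slice at a time so that only a single chunk is held in memory. `sliceOffsets` and `readBlobInSlices` are hypothetical names, and the sketch assumes Blob.arrayBuffer() is available; a FileReader could be substituted where it is not.

```typescript
/** Compute [start, end) byte offsets covering `size` bytes. */
function sliceOffsets(size: number, chunkSize: number): Array<[number, number]> {
  const offsets: Array<[number, number]> = [];
  for (let start = 0; start < size; start += chunkSize) {
    offsets.push([start, Math.min(start + chunkSize, size)]);
  }
  return offsets;
}

/** Yield the blob's bytes one slice at a time. */
async function* readBlobInSlices(blob: Blob, chunkSize: number): AsyncGenerator<Uint8Array> {
  for (const [start, end] of sliceOffsets(blob.size, chunkSize)) {
    // slice() is lazy; bytes are only read when arrayBuffer() is awaited.
    yield new Uint8Array(await blob.slice(start, end).arrayBuffer());
  }
}
```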
Some strawman options. We are definitely open to suggestions here:
1. Stop using IDB file storage. Files become usable in the current session only. Offer to upload a file and use its URL if the user wants to keep it between sessions?
2. Expand IDB file storage to multiple keys (which moves the limit from the IDB single-value size to the total available IDB size).
3. A hybrid approach where we continue to store "smaller" files and stream in larger ones, but do not save the larger ones to storage.
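For the multiple-key option, a minimal sketch might look like the following. The `${fileId}:${index}` key scheme, the function names and the chunk size are hypothetical, and the object store is assumed to use out-of-line keys (no keyPath). All writes are queued synchronously so that they stay inside a single transaction.

```typescript
// Hypothetical key scheme; not an existing OpenSphere convention.
function chunkKey(fileId: string, index: number): string {
  return `${fileId}:${index}`;
}

/** Number of IDB values needed to hold `size` bytes. */
function chunkCount(size: number, chunkSize: number): number {
  return Math.ceil(size / chunkSize);
}

/** Store a file as several Blob slices, one per key, in one transaction. */
function storeInChunks(db: IDBDatabase, storeName: string, fileId: string,
    file: Blob, chunkSize: number): Promise<number> {
  return new Promise<number>((resolve, reject) => {
    const tx = db.transaction(storeName, 'readwrite');
    const store = tx.objectStore(storeName);
    const count = chunkCount(file.size, chunkSize);
    for (let i = 0; i < count; i++) {
      // slice() clamps the end offset to the blob size.
      store.put(file.slice(i * chunkSize, (i + 1) * chunkSize), chunkKey(fileId, i));
    }
    tx.oncomplete = () => resolve(count);
    tx.onerror = () => reject(tx.error);
    tx.onabort = () => reject(tx.error);
  });
}
```

Reading the file back would then fetch the keys in index order, which also makes it possible to stream the stored chunks rather than reassemble them in memory.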