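Hey,
I'm handling some legacy game data using struc, and it's awesome :) - here's an example of usage: https://code.ur.gs/lupine/ordoor/src/branch/master/internal/maps/maps.go
Being able to represent the whole file (1.4MiB for the map) in a single struct is very pleasing 👍. I ran into a couple of things that already have pre-existing issues, but I found workarounds, and overall I'm really happy with how the conversion from encoding/binary turned out for this code.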
Now I'm hoping to convert some code that processes image data. The most extreme example I have is a 400MiB file of animations, each of which is RLE-encoded. It's currently processed with encoding/binary and a lot of special-case code like this: https://code.ur.gs/lupine/ordoor/src/branch/master/internal/data/object.go
Even with modern computers, holding the whole 400MiB file in RAM at once isn't great - my low-end personal laptop has 2GiB RAM, and if I read + decode the whole file, it might take up 1GiB all by itself, when I'm only interested in a tiny amount of the total. The original game had system requirements of 16MiB RAM 😅.
To convert this file to struc, I need to be able to specify that the data should be lazily read, which is what happens at the moment with the LoadObjectLazily method.
What do you think to an approach like:
    type Sprite struct {
        // ... various header fields
        CompressedSize int `struc:"uint32,sizeof=Data"`
        // More headers
        Data io.Reader `struc:"lazy"`
    }
The way I see this working is that Unpack* starts to take a ReadSeeker instead of a Reader. It can use that to build objects that will seek-then-read the data when used, and populate the struct with them. For backward compatibility, it could continue to take a Reader, but try to upgrade to a ReadSeeker if a field like this exists.
The readers would only be usable for as long as f (the file passed to Unpack) is valid, and it would be up to the caller to arrange that. If we never read from sprite.Data, then it's never pulled into memory.
Even more ideally, I'd be able to tell struc to automatically do the "wrap it in RLE encoding" bit, but that might be too ambitious 😅
The lazy part could also be useful for slice members generally, although we'd need to provide some way to prompt them to be filled. Maybe one for a follow-up.
On Pack*, we'd io.Copy data from the provided io.Reader into the file, then write the number of bytes into CompressedSize. This allows for efficient serialization, and if you don't read from a particular reader, the unpack-then-pack cycle is easy.
If you do read the values, you have to remember to replace the reader with one that has the content you read, which isn't totally ideal. A rewindable reader of some kind would paper over that, but maybe overcomplicate matters.
What do you think to the idea? Would you be interested in a PR implementing this behaviour, or is it too much of a change to how struc operates?
You should be able to do this with a custom field type without any struc modifications, if you’re confident you will only ever unpack a valid ReaderAt + Seeker.
At this point in the custom type:
https://github.com/lunixbochs/struc/blob/master/custom_test.go#L31
You can maybe type-assert the io.Reader to an io.ReadSeeker here, call Seek(0, io.SeekCurrent) to get the current position (https://stackoverflow.com/a/10901436), then wrap the stream in a SectionReader starting at the current position with the field size (this is where the ReaderAt is needed) and store it on the field, then seek forward by the field size on the ReadSeeker (so the parser will advance without actually reading the bytes).
Then when you want to read the field, have a Get() []byte method that just ioutil.ReadAll()s the field’s SectionReader.
You should also consider what this means for Packing. I’m assuming the trivial implementation of Pack for this field type would just be a deliberate panic.