Allow the data alignment to support zero-copy decoding #248
Conversation
Codecov Report

Attention: Patch coverage is

@@            Coverage Diff             @@
##             main     #248      +/-   ##
==========================================
- Coverage   98.07%   98.00%   -0.08%
==========================================
  Files          16       16
  Lines        1092     1102      +10
  Branches      249      251       +2
==========================================
+ Hits         1071     1080       +9
- Misses         21       22       +1

View full report in Codecov by Sentry.
It sounds perfect to me, for the reasons you mentioned in the description. I'd like to merge this PR, so can you add the benchmark and docs?
Force-pushed from 4976850 to 46af895 (compare)
Hi @gfx
./sample-large.json: (7598 bytes in JSON)
Generated large Float32Array: (24906 bytes in JSON) Not supported by @msgpack/msgpack
Sounds good. Plus, can you compare with the following conditions?:
1 is the baseline. 2 should be much faster than 1. It would be awesome if 3 is faster than 2.
I've added a sample of the array of numbers:
./sample-large.json: (7598 bytes in JSON)
Large array of numbers: (5016 bytes in JSON)
Large Float32Array: (24906 bytes in JSON)
Awesome!! 👏 Thank you for the thorough report. The zero-copy extension is significantly effective, and everyone who uses a large Float32Array (and probably Float64Array as well) should know about this pull request! LGTM. I will release 3.0.0-beta-next today, and will release 3.0.0 in a few weeks.
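For context on why the alignment matters at all (a minimal sketch, not code from this PR): a `Float32Array` view over an `ArrayBuffer` can only be constructed at a byte offset that is a multiple of 4, so zero-copy decoding is impossible when the payload lands at a misaligned offset inside the MessagePack buffer.

```javascript
// A Float32Array view requires byteOffset % 4 === 0; misaligned
// offsets throw a RangeError, forcing a copy instead of a zero-copy view.
const buffer = new ArrayBuffer(16);

// Aligned offset (multiple of 4): a zero-copy view works.
const aligned = new Float32Array(buffer, 4, 3);
console.log(aligned.length); // 3

// Misaligned offset: the constructor throws, so a decoder would
// have to copy the bytes into a fresh, aligned buffer instead.
let misalignedFails = false;
try {
  new Float32Array(buffer, 5, 2);
} catch (e) {
  misalignedFails = e instanceof RangeError;
}
console.log(misalignedFails); // true
```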
cc: @sergeyzenchenko you'll be interested in this pull request as well.
An alternative solution for the data alignment proposed in that PR

In that proposal, the `encode` function of the extension codec could return a function that will be called by `Encoder` at the moment of building the final buffer. The function receives the position of the data in the buffer and returns the data that will then be added to the buffer at that position. Users can add the padding bytes directly into the data at the encoding phase and calculate the data offset at the decoding phase, depending on the chosen padding+data format. That solution has some advantages over the one mentioned above.
The drawback is that, to resolve the ambiguity of the final size of the data, the size field always takes 4 bytes, regardless of how big the actual data is. That could be highlighted in the documentation to warn users. This is only an issue when returning a function instead of the actual data from the `encode` function, so existing users won't be affected.
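The decode-side counterpart of the sketch above might look like this. Again illustrative, assuming the hypothetical padding+data layout from the encode example (one padding-length byte, then padding, then the payload): the decoder skips the padding and takes a zero-copy view when the resulting offset is aligned.

```javascript
// Decode-side sketch: `bytes` is a Uint8Array view over the shared
// message buffer. Read the padding length, skip the padding, and take
// a zero-copy Float32Array view when the offset is 4-byte aligned.
function decodeFloat32ArrayExt(bytes) {
  const padding = bytes[0];
  const offset = bytes.byteOffset + 1 + padding;
  const length = (bytes.byteLength - 1 - padding) / 4;
  if (offset % 4 === 0) {
    // Zero-copy: the view shares memory with the original buffer.
    return new Float32Array(bytes.buffer, offset, length);
  }
  // Fallback: copy into a fresh, aligned buffer.
  return new Float32Array(bytes.slice(1 + padding).buffer);
}

// Build a fake message buffer whose ext payload starts at offset 6.
const msg = new Uint8Array(16);
msg[6] = 1; // padding length byte
new Float32Array(msg.buffer, 8, 2).set([1.5, -2.5]); // payload at offset 8
const view = decodeFloat32ArrayExt(new Uint8Array(msg.buffer, 6, 10));
console.log(view[0], view[1]); // 1.5 -2.5
console.log(view.buffer === msg.buffer); // true (zero-copy, shared buffer)
```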