
Allow the data alignment to support zero-copy decoding #248

Merged: 2 commits into msgpack:main on Feb 4, 2025

Conversation

EddiG
Contributor

@EddiG EddiG commented Jan 27, 2025

This is an alternative solution to the data alignment approach proposed in that PR.

In this proposal, the encode function of the extension codec can return a function that the Encoder calls when building the final buffer. The function receives the position of the data in the buffer and returns the data, which is then written into the buffer at that position. Users can add the padding bytes directly into the data at the encoding phase, and calculate the data offset at the decoding phase depending on the chosen padding+data format.

This solution has several advantages over the one mentioned above.

  • It doesn't change the message pack format. The padding "lives" in the data part.
  • It allows a user to decide how to align the data.
  • It doesn't break the compatibility with the existing formats.

The drawback is that, to resolve the ambiguity about the final size of the data, the size field always takes 4 bytes, regardless of how large the actual data is. This could be highlighted in the documentation to warn users. It only applies when a function (rather than the actual data) is returned from the encode function, so existing users won't be affected.
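To make the padding+data idea concrete, here is a minimal sketch of one possible layout: a single padding-length byte, followed by the padding, followed by the raw Float32Array bytes. The function and constant names below are illustrative only, not the library's actual API:

```typescript
const ALIGNMENT = 4; // Float32Array requires 4-byte alignment

// Called with the absolute position where the ext payload will start
// in the final buffer; returns the padded payload bytes.
function buildAlignedPayload(data: Float32Array, pos: number): Uint8Array {
  // +1 accounts for the padding-length byte itself
  const padding = (ALIGNMENT - ((pos + 1) % ALIGNMENT)) % ALIGNMENT;
  const out = new Uint8Array(1 + padding + data.byteLength);
  out[0] = padding;
  out.set(new Uint8Array(data.buffer, data.byteOffset, data.byteLength), 1 + padding);
  return out;
}

// Zero-copy decode: read the padding length, then create a typed-array
// view over the underlying buffer without copying the float data.
function viewAlignedPayload(bytes: Uint8Array): Float32Array {
  const padding = bytes[0];
  const start = bytes.byteOffset + 1 + padding;
  return new Float32Array(bytes.buffer, start, (bytes.byteLength - 1 - padding) / 4);
}
```

Because the padding was computed against the payload's absolute position at encode time, the float data always lands on a 4-byte boundary of the message buffer, so the decode-side view construction never throws an alignment error.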

@codecov-commenter

codecov-commenter commented Jan 29, 2025

⚠️ Please install the Codecov GitHub app to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 90.90909% with 1 line in your changes missing coverage. Please review.

Project coverage is 98.00%. Comparing base (3b6ef80) to head (1f9e335).
Report is 30 commits behind head on main.

Files with missing lines Patch % Lines
src/Encoder.ts 90.00% 1 Missing ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #248      +/-   ##
==========================================
- Coverage   98.07%   98.00%   -0.08%     
==========================================
  Files          16       16              
  Lines        1092     1102      +10     
  Branches      249      251       +2     
==========================================
+ Hits         1071     1080       +9     
- Misses         21       22       +1     


@gfx
Member

gfx commented Jan 29, 2025

It sounds perfect to me, for the reasons you mentioned in the description.

I'd like to merge this PR, so can you add a benchmark and docs?

  • Benchmark: it would be nice to confirm that this feature actually boosts performance. If the effect is significant, the authors of other MessagePack implementations should be fascinated by it.
  • Docs: please update README.md for this option; it would be nice to include a usage example.

@EddiG EddiG force-pushed the alignment-fn branch 2 times, most recently from 4976850 to 46af895 on February 3, 2025 02:44
@EddiG
Contributor Author

EddiG commented Feb 3, 2025

Hi @gfx,
I've updated the benchmark script to run this particular case, and it shows that the performance boost is quite significant.

./sample-large.json: (7598 bytes in JSON)

(encode) @msgpack/msgpack x 33,075 ops/sec ±0.23% (99 runs sampled)
(encode) @msgpack/msgpack (zero-copy extension) x 32,725 ops/sec ±0.29% (94 runs sampled)
(encode) msgpack-lite x 29,127 ops/sec ±0.24% (96 runs sampled)
(encode) notepack.io x 31,991 ops/sec ±0.26% (100 runs sampled)
(decode) @msgpack/msgpack x 33,384 ops/sec ±0.20% (102 runs sampled)
(decode) @msgpack/msgpack (zero-copy extension) x 33,093 ops/sec ±0.38% (97 runs sampled)
(decode) msgpack-lite x 15,805 ops/sec ±0.14% (100 runs sampled)
(decode) notepack.io x 23,902 ops/sec ±0.13% (97 runs sampled)

Generated large Float32Array: (24906 bytes in JSON)

Not supported by @msgpack/msgpack

(encode) @msgpack/msgpack (zero-copy extension) x 541,817 ops/sec ±0.66% (95 runs sampled)
(encode) msgpack-lite x 503,609 ops/sec ±0.58% (97 runs sampled)
(encode) notepack.io x 14,119 ops/sec ±1.18% (96 runs sampled)
(decode) @msgpack/msgpack (zero-copy extension) x 4,211,620 ops/sec ±0.15% (101 runs sampled)
(decode) msgpack-lite x 1,170,542 ops/sec ±1.11% (97 runs sampled)
(decode) notepack.io x 25,718 ops/sec ±0.22% (98 runs sampled)

@gfx
Member

gfx commented Feb 4, 2025

Sounds good. Plus, can you compare the following conditions?

  1. MessagePack with no extension, for an array of numbers (not typed arrays)
  2. With an extension that does not align, for Float64Array
  3. With an extension that does align, for Float64Array (this PR)

1 is the baseline. 2 should be much faster than 1. It would be awesome if 3 is faster than 2.

@EddiG
Contributor Author

EddiG commented Feb 4, 2025

I've added an array-of-numbers sample. I've also added an extension implementation that adds support for Float32Array by copying.
Note that the extension implementations barely affect the results in the array-of-numbers case. That's expected, as they do not participate in the process.
In the Float32Array case, the encode function of the zero-copy implementation takes more time, as it requires extra calculation. However, its decode function is much more performant, as it doesn't need to allocate additional memory and copy the data.
I didn't use Float64Array, as Float32Array was the one I needed from the beginning. I believe the results would be similar.
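The decode-path difference described above, allocating and copying versus viewing the bytes in place, can be sketched roughly as follows. These are hypothetical helpers for illustration, not the actual extension code in this PR:

```typescript
// Copying decode, as in the plain Float32Array extension: always
// allocates a fresh buffer and copies the payload into it.
// A fresh buffer starts at offset 0, which is always 4-byte aligned.
function decodeByCopy(bytes: Uint8Array): Float32Array {
  const copy = bytes.slice(); // new allocation + copy of the viewed region
  return new Float32Array(copy.buffer, 0, copy.byteLength / 4);
}

// Zero-copy decode, as enabled by this PR: the payload is already
// aligned inside the message buffer, so a view is enough.
function decodeZeroCopy(bytes: Uint8Array): Float32Array {
  // Precondition: bytes.byteOffset is 4-byte aligned, which the
  // padding written at encode time guarantees.
  return new Float32Array(bytes.buffer, bytes.byteOffset, bytes.byteLength / 4);
}
```

The copying variant pays for an allocation and a memcpy per message; the zero-copy variant only constructs a view object, which is why its decode throughput is several times higher in the benchmark.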

./sample-large.json: (7598 bytes in JSON)

(encode) @msgpack/msgpack x 33,101 ops/sec ±0.28% (96 runs sampled)
(encode) @msgpack/msgpack (Float32Array extension) x 32,655 ops/sec ±0.43% (96 runs sampled)
(encode) @msgpack/msgpack (Float32Array with zero-copy extension) x 32,911 ops/sec ±0.27% (99 runs sampled)
(encode) msgpack-lite x 29,584 ops/sec ±0.50% (95 runs sampled)
(encode) notepack.io x 32,052 ops/sec ±0.15% (101 runs sampled)
(decode) @msgpack/msgpack x 33,135 ops/sec ±0.19% (101 runs sampled)
(decode) @msgpack/msgpack (Float32Array extension) x 32,726 ops/sec ±0.38% (98 runs sampled)
(decode) @msgpack/msgpack (Float32Array with zero-copy extension) x 32,965 ops/sec ±0.21% (101 runs sampled)
(decode) msgpack-lite x 15,844 ops/sec ±0.66% (98 runs sampled)
(decode) notepack.io x 23,744 ops/sec ±0.17% (98 runs sampled)

Large array of numbers: (5016 bytes in JSON)

(encode) @msgpack/msgpack x 64,782 ops/sec ±0.30% (98 runs sampled)
(encode) @msgpack/msgpack (Float32Array extension) x 64,592 ops/sec ±0.21% (101 runs sampled)
(encode) @msgpack/msgpack (Float32Array with zero-copy extension) x 64,321 ops/sec ±0.44% (93 runs sampled)
(encode) msgpack-lite x 46,943 ops/sec ±0.29% (98 runs sampled)
(encode) notepack.io x 49,402 ops/sec ±0.15% (98 runs sampled)
(decode) @msgpack/msgpack x 135,173 ops/sec ±0.22% (98 runs sampled)
(decode) @msgpack/msgpack (Float32Array extension) x 135,001 ops/sec ±0.23% (98 runs sampled)
(decode) @msgpack/msgpack (Float32Array with zero-copy extension) x 134,140 ops/sec ±0.26% (99 runs sampled)
(decode) msgpack-lite x 9,254 ops/sec ±0.12% (99 runs sampled)
(decode) notepack.io x 152,005 ops/sec ±0.26% (98 runs sampled)

Large Float32Array: (24906 bytes in JSON)

Not supported by @msgpack/msgpack
(encode) @msgpack/msgpack (Float32Array extension) x 766,230 ops/sec ±0.85% (96 runs sampled)
(encode) @msgpack/msgpack (Float32Array with zero-copy extension) x 558,411 ops/sec ±1.08% (96 runs sampled)
(encode) msgpack-lite x 525,827 ops/sec ±0.40% (98 runs sampled)
(encode) notepack.io x 14,122 ops/sec ±1.32% (95 runs sampled)
(decode) @msgpack/msgpack (Float32Array extension) x 1,445,303 ops/sec ±0.45% (98 runs sampled)
(decode) @msgpack/msgpack (Float32Array with zero-copy extension) x 3,873,735 ops/sec ±0.38% (99 runs sampled)
(decode) msgpack-lite x 1,198,178 ops/sec ±0.60% (98 runs sampled)
(decode) notepack.io x 25,371 ops/sec ±0.17% (97 runs sampled)

@gfx
Member

gfx commented Feb 4, 2025

Awesome!! 👏

Thank you for the perfect report. The zero-copy extension is significantly effective, and everyone who uses a large Float32Array (and probably Float64Array as well) should know about this pull request!

LGTM. Will release 3.0.0-beta-next today, and will release 3.0.0 in a few weeks.

@gfx gfx merged commit eec47c4 into msgpack:main Feb 4, 2025
10 checks passed
@gfx
Member

gfx commented Feb 4, 2025

cc: @sergeyzenchenko, you'll be interested in this pull request as well.
