
Allow the data alignment to support zero-copy decoding #248

Merged: 2 commits into msgpack:main on Feb 4, 2025

Conversation

EddiG
Contributor

@EddiG EddiG commented Jan 27, 2025

This is an alternative solution to the data alignment approach proposed in that PR.

In this proposal, the encode function of the extension codec can return a function that the Encoder calls when building the final buffer. The function receives the position of the data in the buffer and returns the data, which is then written into the buffer at that position. Users can add the padding bytes directly into the data at the encoding phase, and calculate the data offset at the decoding phase depending on the chosen padding+data format.

This solution has several advantages over the one mentioned above.

  • It doesn't change the message pack format. The padding "lives" in the data part.
  • It allows a user to decide how to align the data.
  • It doesn't break the compatibility with the existing formats.

The drawback is that, to resolve the ambiguity about the final size of the data, the size field always takes 4 bytes, regardless of how large the actual data is. This could be highlighted in the documentation to warn users. It only applies when a function (rather than the actual data) is returned from the encode function, so existing users won't be affected.
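To make the padding+data idea concrete, here is a minimal sketch of one possible layout: a single padding-length byte, followed by the padding, followed by the raw Float32Array bytes. The function and constant names below are illustrative only, not the library's actual API:

```typescript
const ALIGNMENT = 4; // Float32Array requires 4-byte alignment

// Called with the absolute position where the ext payload will start
// in the final buffer; returns the padded payload bytes.
function buildAlignedPayload(data: Float32Array, pos: number): Uint8Array {
  // +1 accounts for the padding-length byte itself
  const padding = (ALIGNMENT - ((pos + 1) % ALIGNMENT)) % ALIGNMENT;
  const out = new Uint8Array(1 + padding + data.byteLength);
  out[0] = padding;
  out.set(new Uint8Array(data.buffer, data.byteOffset, data.byteLength), 1 + padding);
  return out;
}

// Zero-copy decode: read the padding length, then create a typed-array
// view over the underlying buffer without copying the float data.
function viewAlignedPayload(bytes: Uint8Array): Float32Array {
  const padding = bytes[0];
  const start = bytes.byteOffset + 1 + padding;
  return new Float32Array(bytes.buffer, start, (bytes.byteLength - 1 - padding) / 4);
}
```

Because the padding was computed against the payload's absolute position at encode time, the float data always lands on a 4-byte boundary of the message buffer, so the decode-side view construction never throws an alignment error.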

@codecov-commenter

codecov-commenter commented Jan 29, 2025

⚠️ Please install the Codecov GitHub app to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 90.90909% with 1 line in your changes missing coverage. Please review.

Project coverage is 98.00%. Comparing base (3b6ef80) to head (1f9e335).
Report is 30 commits behind head on main.

Files with missing lines Patch % Lines
src/Encoder.ts 90.00% 1 Missing ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #248      +/-   ##
==========================================
- Coverage   98.07%   98.00%   -0.08%     
==========================================
  Files          16       16              
  Lines        1092     1102      +10     
  Branches      249      251       +2     
==========================================
+ Hits         1071     1080       +9     
- Misses         21       22       +1     


@gfx
Member

gfx commented Jan 29, 2025

It sounds perfect to me, for the reasons you mentioned in the description.

I'd like to merge this PR, so can you add a benchmark and docs?

  • Benchmark: it would be nice to confirm that this feature actually boosts performance. If the effect is significant, the authors of other MessagePack implementations should be fascinated by it.
  • Docs: please update README.md for this option; it would be nice to include a usage example.

@EddiG EddiG force-pushed the alignment-fn branch 2 times, most recently from 4976850 to 46af895 on February 3, 2025 02:44
@EddiG
Contributor Author

EddiG commented Feb 3, 2025

Hi @gfx,
I've updated the benchmark script to run this particular case, and it shows that the performance boost is quite significant.

./sample-large.json: (7598 bytes in JSON)

(encode) @msgpack/msgpack x 33,075 ops/sec ±0.23% (99 runs sampled)
(encode) @msgpack/msgpack (zero-copy extension) x 32,725 ops/sec ±0.29% (94 runs sampled)
(encode) msgpack-lite x 29,127 ops/sec ±0.24% (96 runs sampled)
(encode) notepack.io x 31,991 ops/sec ±0.26% (100 runs sampled)
(decode) @msgpack/msgpack x 33,384 ops/sec ±0.20% (102 runs sampled)
(decode) @msgpack/msgpack (zero-copy extension) x 33,093 ops/sec ±0.38% (97 runs sampled)
(decode) msgpack-lite x 15,805 ops/sec ±0.14% (100 runs sampled)
(decode) notepack.io x 23,902 ops/sec ±0.13% (97 runs sampled)

Generated large Float32Array: (24906 bytes in JSON)

Not supported by @msgpack/msgpack

(encode) @msgpack/msgpack (zero-copy extension) x 541,817 ops/sec ±0.66% (95 runs sampled)
(encode) msgpack-lite x 503,609 ops/sec ±0.58% (97 runs sampled)
(encode) notepack.io x 14,119 ops/sec ±1.18% (96 runs sampled)
(decode) @msgpack/msgpack (zero-copy extension) x 4,211,620 ops/sec ±0.15% (101 runs sampled)
(decode) msgpack-lite x 1,170,542 ops/sec ±1.11% (97 runs sampled)
(decode) notepack.io x 25,718 ops/sec ±0.22% (98 runs sampled)

@gfx
Member

gfx commented Feb 4, 2025

Sounds good. Plus, can you compare the following conditions?

  1. MessagePack with no extension, for an array of numbers (not typed arrays)
  2. With an extension that does not align, for Float64Array
  3. With an extension that does align, for Float64Array (this PR)

1 is the baseline. 2 should be much faster than 1. It would be awesome if 3 is faster than 2.

@EddiG
Contributor Author

EddiG commented Feb 4, 2025

I've added an array-of-numbers sample. I've also added an extension implementation that adds support for Float32Array by copying.
Note that the extension implementations barely affect the results in the array-of-numbers case. That's expected, as they do not participate in the process.
In the Float32Array case, the encode function of the zero-copy implementation takes more time, as it requires extra calculation. However, its decode function is much more performant, as it doesn't need to allocate additional memory and copy the data.
I didn't use Float64Array, as Float32Array was the one I needed from the beginning. I believe the results would be similar.
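The decode-path difference described above, allocating and copying versus viewing the bytes in place, can be sketched roughly as follows. These are hypothetical helpers for illustration, not the actual extension code in this PR:

```typescript
// Copying decode, as in the plain Float32Array extension: always
// allocates a fresh buffer and copies the payload into it.
// A fresh buffer starts at offset 0, which is always 4-byte aligned.
function decodeByCopy(bytes: Uint8Array): Float32Array {
  const copy = bytes.slice(); // new allocation + copy of the viewed region
  return new Float32Array(copy.buffer, 0, copy.byteLength / 4);
}

// Zero-copy decode, as enabled by this PR: the payload is already
// aligned inside the message buffer, so a view is enough.
function decodeZeroCopy(bytes: Uint8Array): Float32Array {
  // Precondition: bytes.byteOffset is 4-byte aligned, which the
  // padding written at encode time guarantees.
  return new Float32Array(bytes.buffer, bytes.byteOffset, bytes.byteLength / 4);
}
```

The copying variant pays for an allocation and a memcpy per message; the zero-copy variant only constructs a view object, which is why its decode throughput is several times higher in the benchmark.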

./sample-large.json: (7598 bytes in JSON)

(encode) @msgpack/msgpack x 33,101 ops/sec ±0.28% (96 runs sampled)
(encode) @msgpack/msgpack (Float32Array extension) x 32,655 ops/sec ±0.43% (96 runs sampled)
(encode) @msgpack/msgpack (Float32Array with zero-copy extension) x 32,911 ops/sec ±0.27% (99 runs sampled)
(encode) msgpack-lite x 29,584 ops/sec ±0.50% (95 runs sampled)
(encode) notepack.io x 32,052 ops/sec ±0.15% (101 runs sampled)
(decode) @msgpack/msgpack x 33,135 ops/sec ±0.19% (101 runs sampled)
(decode) @msgpack/msgpack (Float32Array extension) x 32,726 ops/sec ±0.38% (98 runs sampled)
(decode) @msgpack/msgpack (Float32Array with zero-copy extension) x 32,965 ops/sec ±0.21% (101 runs sampled)
(decode) msgpack-lite x 15,844 ops/sec ±0.66% (98 runs sampled)
(decode) notepack.io x 23,744 ops/sec ±0.17% (98 runs sampled)

Large array of numbers: (5016 bytes in JSON)

(encode) @msgpack/msgpack x 64,782 ops/sec ±0.30% (98 runs sampled)
(encode) @msgpack/msgpack (Float32Array extension) x 64,592 ops/sec ±0.21% (101 runs sampled)
(encode) @msgpack/msgpack (Float32Array with zero-copy extension) x 64,321 ops/sec ±0.44% (93 runs sampled)
(encode) msgpack-lite x 46,943 ops/sec ±0.29% (98 runs sampled)
(encode) notepack.io x 49,402 ops/sec ±0.15% (98 runs sampled)
(decode) @msgpack/msgpack x 135,173 ops/sec ±0.22% (98 runs sampled)
(decode) @msgpack/msgpack (Float32Array extension) x 135,001 ops/sec ±0.23% (98 runs sampled)
(decode) @msgpack/msgpack (Float32Array with zero-copy extension) x 134,140 ops/sec ±0.26% (99 runs sampled)
(decode) msgpack-lite x 9,254 ops/sec ±0.12% (99 runs sampled)
(decode) notepack.io x 152,005 ops/sec ±0.26% (98 runs sampled)

Large Float32Array: (24906 bytes in JSON)

Not supported by @msgpack/msgpack
(encode) @msgpack/msgpack (Float32Array extension) x 766,230 ops/sec ±0.85% (96 runs sampled)
(encode) @msgpack/msgpack (Float32Array with zero-copy extension) x 558,411 ops/sec ±1.08% (96 runs sampled)
(encode) msgpack-lite x 525,827 ops/sec ±0.40% (98 runs sampled)
(encode) notepack.io x 14,122 ops/sec ±1.32% (95 runs sampled)
(decode) @msgpack/msgpack (Float32Array extension) x 1,445,303 ops/sec ±0.45% (98 runs sampled)
(decode) @msgpack/msgpack (Float32Array with zero-copy extension) x 3,873,735 ops/sec ±0.38% (99 runs sampled)
(decode) msgpack-lite x 1,198,178 ops/sec ±0.60% (98 runs sampled)
(decode) notepack.io x 25,371 ops/sec ±0.17% (97 runs sampled)

@gfx
Member

gfx commented Feb 4, 2025

Awesome!! 👏

Thank you for the perfect report. The zero-copy extension is significantly effective, and everyone who uses a large Float32Array (and probably Float64Array as well) should know about this pull request!

LGTM. Will release 3.0.0-beta-next today, and will release 3.0.0 in a few weeks.

@gfx gfx merged commit eec47c4 into msgpack:main Feb 4, 2025
10 checks passed
@gfx
Member

gfx commented Feb 4, 2025

cc: @sergeyzenchenko, you'll be interested in this pull request as well.
