Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

binary serialization and zstd compression for Eigen types [OC-293] #451

Merged
merged 19 commits into from
Oct 4, 2023

Conversation

peddie
Copy link
Contributor

@peddie peddie commented Aug 28, 2023

Currently albatross serializes large vectors and matrices by

  • compressing the binary data with gzip
  • base64-encoding the compressed data
  • serializing to JSON

In a head-to-head test on systems internal to Swift, we were able to reduce serialization time to about 0.4 of the current approach by using Cereal's portable binary archive type and compressing with zstd. This patch switches Albatross to use the latter approach.

WORKSPACE.bazel Outdated Show resolved Hide resolved
@peddie peddie changed the title zstd compression for Eigen types zstd compression for Eigen types [OC-293] Aug 30, 2023
compressed.resize(compressed_bound);

const std::size_t compressed_size =
// I solemnly swear that I am up to no good.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe describe the situation you think could cause a failure here and/or why it won't happen, in place of this and the below comments.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lol, I'll remove those.


TEST(Compress, DecompressEmpty) {
std::string inputs;
ASSERT_DEATH({ const auto result = albatross::zstd::decompress(inputs); },
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this something we would want to assert on? One recent issue that has come up in the AQ, the current solution is to return an empty model instead of asserting. Could this cause issues on the receiving end? @jbangelo
https://github.com/swift-nav/orion-engine/pull/6939

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if we have a better way to handle failures here. These routines get used in the cereal serialisation / deserialisation routines, and I'm not aware of a way to propagate deserialisation failures via the cereal API.

The code this is replacing also asserted on any detected decompression failure, so for this change I chose to preserve that behavior.

Copy link
Contributor

@john-michaelburke john-michaelburke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a description to the PR and take a look at my comments. Otherwise LGTM!

Copy link
Collaborator

@akleeman akleeman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Copy link
Collaborator

@akleeman akleeman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, nit: the title mostly says it, but the PR is missing a description (why the switch to zstd?)

@peddie peddie changed the title zstd compression for Eigen types [OC-293] binary serialization and zstd compression for Eigen types [OC-293] Oct 2, 2023
@peddie peddie merged commit b0b7da6 into master Oct 4, 2023
9 checks passed
@peddie peddie deleted the peddie/zstd branch October 4, 2023 04:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants