Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Calculate CRC32C using SIMD acceleration #66

Merged
merged 5 commits into from
Nov 20, 2023
Merged

Conversation

lizhanhui
Copy link
Contributor

This pull request contains two sections: one benchmark and modification to replace crate crc with crc32c.

A typical micro-bench run shows the following metrics
image

@lizhanhui lizhanhui mentioned this pull request Nov 16, 2023
@agourlay
Copy link
Member

agourlay commented Nov 17, 2023

@lizhanhui Thank you very much for putting this together 👍

The crc32c crates does not seem to provide hardware acceleration for ARM.
Have you tried benchmarking the current crc crate vs the software fallback of crc32?

I just want to make sure this change will not degrade performance for environment where hardware acceleration is not leveraged.

Copy link
Member

@timvisee timvisee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for adding a unit test to show consistency and for adding a benchmark!

LGTM.

@timvisee
Copy link
Member

The crc32c crates does not seem to provide hardware acceleration for ARM.

Good catch! 🙌

@lizhanhui
Copy link
Contributor Author

@lizhanhui Thank you very much for putting this together 👍

The crc32c crates does not seem to provide hardware acceleration for ARM. Have you tried benchmarking the current crc crate version the software fallback of crc32?

I just want to make sure this change will not degrade performance for environment where hardware acceleration is not leveraged.

Not yet, let's add one comparison between crc32c baseline and crc.

@lizhanhui
Copy link
Contributor Author

The crc32c crates does not seem to provide hardware acceleration for ARM.

Hardware acceleration for ARM is guarded by nightly

https://github.com/zowens/crc32c/blob/d70d4e253c4570a073f485d62bc30c4fd64a4b1f/src/lib.rs#L51

@lizhanhui
Copy link
Contributor Author

lizhanhui commented Nov 17, 2023

After forking crate crc32c and introducing features to enable/disable hardware acceleration, I got the following result on a 2015-early Intel Macbook Air. (Note the previous run is performed on a M1 Macbook Pro, AArch64, nightly Rust toolchain used)

without hardware acceleration

cargo bench --no-default-features
image

Conclusion: crc32c is 4X faster than crc using pure software algorithm

with hardware acceleration

cargo bench
image

Conclusion: crc32c is about 10X faster with hardware acceleration

@timvisee @agourlay My Mac at home is kind of old. You can try it on your modern one.

@lizhanhui
Copy link
Contributor Author

As discussed in the issue, there is still some optimization space ahead. Some 5X should be possible.

@agourlay
Copy link
Member

@lizhanhui Excellent work, you convinced me that the change is a net positive for all users.

Please remove the last commit to get back to a published version and I am good for ✔️

@lizhanhui
Copy link
Contributor Author

Thanks for the compliment. The last commit has been reverted.

@lizhanhui
Copy link
Contributor Author

lizhanhui commented Nov 18, 2023

@agourlay Hold on. I think I spot an issue of crate crc when it's initialized with an arbitrary seed value.

@lizhanhui
Copy link
Contributor Author

CRC with custom seed/initial value is somewhat unclear in RFCs. Here is a relevant discussion

Crate 'crc' takes refin = true into account while 'crc32c' does not. I created a pull request to make them behave consistently.

In addition, I added an extra test case to ensure we will get exactly same checksum values.

Eventually, this pull request is completely sound and ready to be merged.

Signed-off-by: lizhanhui <[email protected]>
@timvisee
Copy link
Member

timvisee commented Nov 20, 2023

@lizhanhui Very nice work! Thank you so much for your efforts.

Am I correct that we're currently just waiting on zowens/crc32c#50, after which this can be merged?

@lizhanhui
Copy link
Contributor Author

Am I correct that we're currently just waiting on zowens/crc32c#50, after which this can be merged?

No, we can merge directly. This pull request does not use the API that is inconsistent with crate 'crc' anymore.

@timvisee timvisee requested a review from agourlay November 20, 2023 11:36
Copy link
Member

@agourlay agourlay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@agourlay agourlay merged commit fad0e7c into qdrant:master Nov 20, 2023
2 checks passed
@lizhanhui lizhanhui deleted the crc32c branch November 20, 2023 14:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants