Make --no-mmap calls still use parallelism when file sizes are large #361

Open · wants to merge 4 commits into master
Conversation

ultrabear

This change uses double buffers of 1 MiB each: while one buffer is being filled from the OS, the other is hashed with update_rayon. On my machine (Ryzen 2600) this is around twice as fast as just using update_reader for 1 GiB files, and around half as fast as using mmap.

The code also accounts for small files: anything under 1 MiB falls back to update_reader. Because that threshold overshoots the point where update_rayon actually becomes faster, we never see cases where it is slower, so the change is always at least neutral in performance.

Currently the code uses the read_chunks crate, which I wrote to handle EINTR and to fill the read buffer as fully as possible. If this is approved for merging, I would want to inline the function it calls into this project rather than adding an extra dependency.
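To make the approach concrete, here is a minimal self-contained sketch of the scheme described above. This is not the PR's actual code: hash_no_mmap and read_full are hypothetical names, read_full stands in for the read_chunks behavior (retry on EINTR, fill the buffer as fully as possible), and the scoped thread used for the read/hash overlap is an assumption of this sketch. It assumes the blake3 crate with the rayon feature enabled.

use std::fs::File;
use std::io::{self, ErrorKind, Read};
use std::path::Path;

const BUF_SIZE: usize = 1024 * 1024; // 1 MiB per buffer

// Fill `buf` as far as possible, retrying on EINTR; a short count can
// therefore only mean end of file. Stands in for the read_chunks helper.
fn read_full(reader: &mut impl Read, buf: &mut [u8]) -> io::Result<usize> {
    let mut filled = 0;
    while filled < buf.len() {
        match reader.read(&mut buf[filled..]) {
            Ok(0) => break, // EOF
            Ok(n) => filled += n,
            Err(e) if e.kind() == ErrorKind::Interrupted => continue, // EINTR: retry
            Err(e) => return Err(e),
        }
    }
    Ok(filled)
}

fn hash_no_mmap(path: &Path) -> io::Result<blake3::Hash> {
    let mut file = File::open(path)?;
    let mut hasher = blake3::Hasher::new();

    // Files under 1 MiB fall back to single-threaded update_reader, so the
    // change is at worst performance-neutral on small inputs.
    if file.metadata()?.len() < BUF_SIZE as u64 {
        hasher.update_reader(&mut file)?;
        return Ok(hasher.finalize());
    }

    let mut front = vec![0u8; BUF_SIZE];
    let mut back = vec![0u8; BUF_SIZE];
    let mut front_len = read_full(&mut file, &mut front)?;

    while front_len > 0 {
        // Refill the back buffer on a helper thread while this thread
        // hashes the front buffer across all cores with update_rayon.
        let back_len = std::thread::scope(|s| {
            let refill = s.spawn(|| read_full(&mut file, &mut back));
            hasher.update_rayon(&front[..front_len]);
            refill.join().unwrap()
        })?;
        std::mem::swap(&mut front, &mut back);
        front_len = back_len;
    }
    Ok(hasher.finalize())
}

The double buffer bounds memory use at 2 MiB while still overlapping disk reads with hashing, and swapping the two Vecs between iterations avoids any copying.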

Some crude benchmarks below, hashing a gibibyte of random data:
(b3sum 1.5.0 vs 03e0949)

# this PR
[b3sum]$ time ./target/release/b3sum --no-mmap gigafile
303966b0ba3c0766247f911d8f7dd172cffa1952bf1106f801fcf7e1455ce5c0  gigafile

real	0m0.253s
user	0m1.234s
sys	0m0.501s
# unmodified binary
[b3sum]$ time b3sum --no-mmap gigafile
303966b0ba3c0766247f911d8f7dd172cffa1952bf1106f801fcf7e1455ce5c0  gigafile

real	0m0.570s
user	0m0.477s
sys	0m0.091s
# unmodified binary, with mmap enabled
[b3sum]$ time b3sum gigafile
303966b0ba3c0766247f911d8f7dd172cffa1952bf1106f801fcf7e1455ce5c0  gigafile

real	0m0.126s
user	0m1.067s
sys	0m0.103s

This uses double buffers of 1 MiB each, reading into one buffer while
hashing the other in parallel. This is around 2x as fast as hashing
single-threadedly on my machine (Ryzen 2600) with an in-memory benchmark.

This is still around 2x slower than using mmap.