feature: improve read / write performance #482

Open
wants to merge 3 commits into
base: master

Conversation

@aoblet aoblet commented May 9, 2023

Context
On fast storage, reading with a larger, tuned block size can drastically improve fetch performance.
The default value is 256K. On scale-out or flash storage, this value can be set higher than 1MB.

Update
This PR adds two things:

  • max-map-size
    This option acts as the count argument in read(int fd, void *buf, size_t count)
  • write-size
    This option forces the buffer size when writing a file on the receiver.
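
To make the two knobs concrete, here is a minimal sketch of a copy loop in which the read size and the write size are independent parameters. It is not the PR's implementation: `copy_with_sizes` and its parameters are hypothetical names, and rsync's own read and write paths are considerably more involved. It only illustrates how a larger `count` means fewer `read()` calls, and how a separate cap applies to each `write()`.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper (not from the PR): copy src_fd to dst_fd,
 * requesting up to read_size bytes per read() call and passing at most
 * write_size bytes to each write() call.
 * Returns the number of bytes copied, or -1 on error. */
static long long copy_with_sizes(int src_fd, int dst_fd,
                                 size_t read_size, size_t write_size)
{
    assert(read_size > 0 && write_size > 0);

    char *buf = malloc(read_size);
    if (!buf)
        return -1;

    long long total = 0;
    for (;;) {
        /* read_size plays the role of the count argument of read() */
        ssize_t n = read(src_fd, buf, read_size);
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted, retry the read */
            free(buf);
            return -1;
        }
        if (n == 0)
            break;                 /* end of input */

        /* Drain what was read, never more than write_size per call */
        size_t off = 0;
        while (off < (size_t)n) {
            size_t chunk = (size_t)n - off;
            if (chunk > write_size)
                chunk = write_size;
            ssize_t w = write(dst_fd, buf + off, chunk);
            if (w < 0) {
                if (errno == EINTR)
                    continue;      /* interrupted, retry the write */
                free(buf);
                return -1;
            }
            off += (size_t)w;      /* short writes are handled here */
        }
        total += n;
    }

    free(buf);
    return total;
}
```

With the values from the command line in this PR, a caller would pass `4 * 1024 * 1024` as `read_size` and `512 * 1024` as `write_size`.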

Result
In my tests, rsync with --whole-file ran about 5x faster with the following setup:

time /opt/rsync/bin/rsync --progress -avH  --whole-file --max-map-size=$((4*1024*1024)) --write-size=$((512*1024)) *.zip --rsync-path /opt/rsync/bin/rsync -e "/opt/hpnssh/bin/ssh -c [email protected]" dst:/path/to/sync/
  • filesystem: rozofs / zfs
  • number of disks 60
  • max-map-size 4MB
  • network 10Gb link / latency 0.8ms
  • hpnssh / 4800 MTU
DATASET
2x 1,3 GB zip files

Total: 2,6 GB

| CONFIG | TIME | MB/S | DD LOCAL | READ_BLOCK_SIZE | IO_BUFFER_SIZE | MTU | SSH | HPNSSH | DAEMON RSYNC | RATIO PERF |
|---|---|---|---|---|---|---|---|---|---|---|
| dd local | 3,3 | 758 | Yes | 4096 | No | No | No | No | No | 0 |
| rsync native | 24,2 | 106 | No | 256 | 32 | 1500 | Yes | No | No | 1 |
| rsync + hpn | 22,4 | 114 | No | 256 | 32 | 1500 | No | Yes | No | 1,08 |
| rsync + hpn aes128 + mtu 4800 end to end + read block size 1024 | 10,9 | 235 | No | 1024 | 32 | 4800 | No | Yes | No | 2,22 |
| rsync daemon | 7,6 | 336 | No | 1024 | 32 | 4800 | No | No | Yes | 3,17 |
| rsync + hpn aes128 + mtu 4800 end to end + read block size 4096 | 4,9 | 522 | No | 4096 | 1024 | 4800 | No | Yes | No | 4,93 |

@realsimix

Hi,

This looks interesting, just a cosmetic thing: maybe the manpage should be changed like so?

--one-file-system, -x    don't cross filesystem boundaries
--block-size=SIZE, -B    force a fixed checksum block-size
--max-map-size=SIZE      force mmap read block size (expressed in bytes, useful for fast storage, default 256K)
--write-size=SIZE        force write block size (expressed in bytes, default 32K)
--rsh=COMMAND, -e        specify the remote shell to use
--rsync-path=PROGRAM     specify the rsync to run on remote machine

When using this patch but without specifying the new options, will rsync run with the same parameters as when running without this patch?

Thanks,
Simon

@aoblet
Author

aoblet commented Oct 16, 2023

@realsimix sure thing, see 8dd0b8e.

Yes, the options are backward compatible.
rsync's original behavior is preserved if the options are not set.

@aoblet
Author

aoblet commented Oct 25, 2023

@WayneD ?

aoblet and others added 3 commits April 7, 2024 08:02
On fast storage, reading with a specific block size can drastically
improve fetch performance. This option acts as the mmap length argument.

On scale-out or flash disk storage, this value can be set higher than
1MB.

Tests on my side showed rsync with whole-file running 5x faster with the
following specs:
  - max-map-size 4MB
  - network 10Gb link
  - hpnssh / 4800 MTU
This option forces the buffer size when writing a file on the receiver
side.
@@ -2772,7 +2776,19 @@ void server_options(char **args, int *argc_p)
args[ac++] = arg;
}

if (io_timeout) {
if (max_map_size) {
Member


I think we will need a MAX_MAP_SIZE_DEFAULT define, and this check should be if (max_map_size != MAX_MAP_SIZE_DEFAULT).
Otherwise we will always send this option, and the remote rsync may not understand it, which would break a lot of setups.
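
A minimal sketch of the guard suggested above, under stated assumptions: `add_max_map_size_arg` is a hypothetical stand-in for the relevant part of `server_options()`, and the value of `MAX_MAP_SIZE_DEFAULT` is taken from the 256K default mentioned in the PR description. The real function builds its arguments differently; this only shows the option being forwarded to the remote side when it differs from the default.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Assumed value: the 256K default quoted in the PR description. */
#define MAX_MAP_SIZE_DEFAULT ((size_t)256 * 1024)

/* Hypothetical stand-in for part of server_options(): append
 * --max-map-size=N to args only when the value differs from the
 * default, so an older remote rsync never sees an option it does not
 * know. buf must stay valid for as long as args is used.
 * Returns the updated argument count. */
static int add_max_map_size_arg(char **args, int ac, size_t max_map_size,
                                char *buf, size_t buflen)
{
    if (max_map_size != MAX_MAP_SIZE_DEFAULT) {
        int n = snprintf(buf, buflen, "--max-map-size=%zu", max_map_size);
        assert(n > 0 && (size_t)n < buflen);  /* option must fit in buf */
        args[ac++] = buf;
    }
    return ac;
}
```

The same pattern would apply to `write_size` with its own default define.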

args[ac++] = arg;
}

if (write_size) {
Member


same issue as max_map_size

@@ -472,6 +472,8 @@ has its own detailed description later in this manpage.
--checksum-choice=STR choose the checksum algorithm (aka --cc)
--one-file-system, -x don't cross filesystem boundaries
--block-size=SIZE, -B force a fixed checksum block-size
--max-map-size=SIZE force mmap read block size (expressed in bytes, useful for fast storage, default 256K)
Member


might be easier to use in kBytes?


3 participants