Benchmarks: Rust

Jennings Zhang edited this page Jul 17, 2023 · 1 revision

The Situation

px-recount was my first attempt at making the calls to px-repack more efficient. I then realized that the best performance comes from not calling px-repack at all: why not rewrite px-repack itself in Rust?

px-recount

Maximum RSS for a typical run of px-recount is about 3520KB. In a typical run, it reads DICOM tags and does a Redis transaction. The memory usage is less than half of what it takes to do literally nothing in Python 3.11.3: /usr/bin/time -v python -c '' reports max RSS to be 9440KB.
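
The peak-memory comparison above can also be spot-checked without /usr/bin/time. The following is a minimal sketch, assuming Linux (where ru_maxrss is reported in kilobytes); max_rss_kb is a hypothetical helper name and not part of px-recount. The number it prints may differ somewhat from what /usr/bin/time -v reports.

```python
import resource
import subprocess
import sys


def max_rss_kb(argv):
    """Run argv to completion, then return the peak RSS of waited-for children.

    RUSAGE_CHILDREN reports the largest child seen so far by this process,
    so call this from a fresh interpreter for each command being measured.
    On Linux the unit is kilobytes (on macOS it is bytes).
    """
    subprocess.run(argv, check=True)
    return resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss


if __name__ == "__main__":
    # Baseline: an interpreter that does nothing, as in the comparison above.
    print(max_rss_kb([sys.executable, "-c", ""]), "KB")
```

Swapping the argument list for a px-recount invocation would give the other side of the comparison.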

rx-repack

rx-repack is a Rust rewrite of px-repack. The benchmark compares rx-repack invoked once per DICOM instance (one process per file, via fd --exec) against px-repack invoked once for the whole series.

rx-repack hyperfine command

hyperfine --prepare 'rm -rf /tmp/dicom' \
          'px-repack --xcrdir /tmp/anonymized_20230714_184245695 --parseAllFilesWithSubStr , --verbosity 0 --datadir /tmp/dicom' \
"fd --exec rx-repack --xcrdir '{//}' --xcrfile '{/}' --verbosity 0 --datadir /tmp/dicom \; --threads=1 --type f . /tmp/anonymized_20230714_184245695"

rx-repack results

Benchmark 1: px-repack --xcrdir /tmp/anonymized_20230714_184245695 --parseAllFilesWithSubStr , --verbosity 0 --datadir /tmp/dicom
  Time (mean ± σ):      1.708 s ±  0.010 s    [User: 1.808 s, System: 1.242 s]
  Range (min … max):    1.696 s …  1.725 s    10 runs
 
Benchmark 2: fd --exec rx-repack --xcrdir '{//}' --xcrfile '{/}' --verbosity 0 --datadir /tmp/dicom \; --threads=1 --type f . /tmp/anonymized_20230714_184245695
  Time (mean ± σ):     152.3 ms ±   3.1 ms    [User: 115.4 ms, System: 31.8 ms]
  Range (min … max):   148.6 ms … 160.9 ms    19 runs
 
Summary
  fd --exec rx-repack --xcrdir '{//}' --xcrfile '{/}' --verbosity 0 --datadir /tmp/dicom \; --threads=1 --type f . /tmp/anonymized_20230714_184245695 ran
   11.22 ± 0.24 times faster than px-repack --xcrdir /tmp/anonymized_20230714_184245695 --parseAllFilesWithSubStr , --verbosity 0 --datadir /tmp/dicom
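
The summary figure can be sanity-checked from the two means and standard deviations reported above. This is a sketch that assumes the uncertainty comes from first-order error propagation for a ratio of independent measurements; that assumption reproduces the ± 0.24 here, but I have not confirmed it against hyperfine's source. The function name speedup is hypothetical.

```python
import math


def speedup(mean_slow, sd_slow, mean_fast, sd_fast):
    """Return (ratio, uncertainty) for mean_slow / mean_fast.

    The relative uncertainties of the two means are added in quadrature,
    which is the usual approximation for a ratio of independent quantities.
    """
    ratio = mean_slow / mean_fast
    relative = math.sqrt((sd_slow / mean_slow) ** 2 + (sd_fast / mean_fast) ** 2)
    return ratio, ratio * relative


# Values copied from the results above, converted to seconds.
ratio, err = speedup(1.708, 0.010, 0.1523, 0.0031)
print(f"{ratio:.2f} ± {err:.2f} times faster")
```

Because the inputs are the rounded values from the report, the ratio comes out marginally below the 11.22 that hyperfine computes from unrounded timings.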

rx-repack Resource Consumption

Re-packing everything with rx-repack called per-instance vs. px-repack called per-series:

  • 3 times smaller max RSS i.e. memory usage
  • 16 times faster
  • Better CPU efficiency and lower CPU usage

	Command being timed: "fd --exec rx-repack --xcrdir {//} --xcrfile {/} --verbosity 0 --datadir /tmp/dicom ; --threads=1 --type f . /tmp/anonymized_20230714_184245695"
	User time (seconds): 0.11
	System time (seconds): 0.03
	Percent of CPU this job got: 96%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.14
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 27296
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 0
	Minor (reclaiming a frame) page faults: 41424
	Voluntary context switches: 548
	Involuntary context switches: 17
	Swaps: 0
	File system inputs: 0
	File system outputs: 0
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 0
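
To compare several runs, the verbose report above can be turned into a dictionary. This is a rough sketch, assuming each line has the "Name: value" shape shown above; parse_time_v and SAMPLE are hypothetical names, and the "Command being timed" line would be split incorrectly if the command itself contained ": ".

```python
def parse_time_v(text):
    """Parse "Name: value" lines from a /usr/bin/time -v report into a dict.

    Each line is split at the last ": " so that names which contain colons,
    such as "Elapsed (wall clock) time (h:mm:ss or m:ss)", stay intact.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.strip().rpartition(": ")
        if sep:
            fields[key] = value
    return fields


# A few lines copied from the report above.
SAMPLE = (
    "\tUser time (seconds): 0.11\n"
    "\tSystem time (seconds): 0.03\n"
    "\tElapsed (wall clock) time (h:mm:ss or m:ss): 0:00.14\n"
    "\tMaximum resident set size (kbytes): 27296\n"
)

fields = parse_time_v(SAMPLE)
max_rss_mib = int(fields["Maximum resident set size (kbytes)"]) / 1024
print(f"max RSS: {max_rss_mib:.1f} MiB")
```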