
Commit

incaseoftrouble committed May 27, 2024
1 parent 9e89fd8 commit 3833b15
Showing 2 changed files with 69 additions and 48 deletions.
9 changes: 5 additions & 4 deletions README.md
@@ -16,10 +16,6 @@ SPDX-License-Identifier: Apache-2.0
[![PyPI version](https://img.shields.io/pypi/v/BenchExec.svg)](https://pypi.python.org/pypi/BenchExec)
[![DOI](https://zenodo.org/badge/30758422.svg)](https://zenodo.org/badge/latestdoi/30758422)

> [!NOTE]
> To get started with reliably benchmarking right away, follow the
> [quickstart guide](doc/quickstart.md).
**News and Updates**:
- Two projects accepted for BenchExec as part of [Google Summer of Code](https://summerofcode.withgoogle.com/)!
We are happy that [Haoran Yang](https://summerofcode.withgoogle.com/programs/2024/projects/UzhlnEel)
@@ -33,6 +29,11 @@ SPDX-License-Identifier: Apache-2.0
you can read [Reliable Benchmarking: Requirements and Solutions](https://doi.org/10.1007/s10009-017-0469-y) online.
We also provide a set of [overview slides](https://www.sosy-lab.org/research/prs/Latest_ReliableBenchmarking.pdf).

> To help new or inexperienced users get started with reliable benchmarking
> right away, we offer a [quickstart guide](doc/quickstart.md) that contains
> a brief explanation of the issues with "standard" setups as well as the (few)
> steps necessary to set up and use BenchExec instead.

BenchExec is a framework for reliable benchmarking and resource measurement
and provides a standalone solution for benchmarking
that takes care of important low-level details for accurate, precise, and reproducible measurements
108 changes: 64 additions & 44 deletions doc/quickstart.md
@@ -1,40 +1,48 @@
# A Quickstart Guide to Proper Benchmarking with BenchExec
# A Beginner's Guide to Reliable Benchmarking

This guide provides a brief summary of instructions to set up reliable
benchmark measurements using BenchExec and important points to consider. It is
meant for users who either want to set up benchmarking from scratch or already
have a simple setup using, e.g., `time`, `timeout`, `taskset`, `ulimit`, etc.
and will show you how to use `runexec` as a simple but much more reliable
"drop-in" replacement for these tools.

## Guiding Example

> [!IMPORTANT]
> If your current setup looks similar to the below example (or you are thinking
> about such a setup), we strongly recommend following this guide for a much
> more reliable process.
As an example, suppose that you want to measure the performance of your
tool `program` with arguments `--foo` and `--bar` on the input files
`input_1.in` to `input_9.in`. To measure the runtime of the tool, one may run
```
$ /usr/bin/time program --foo input_1.in
$ /usr/bin/time program --bar input_1.in
$ /usr/bin/time program --foo input_2.in
...
```
etc. and note the results. In case resource limitations are desired (e.g.
limiting to 1 CPU and 60 sec of wallclock time), the calls might be
```
$ taskset -c 0 timeout 60s /usr/bin/time program ...
```
or similar.
## Audience

## Benchmarking with BenchExec

The following steps guide you to increase the reliability and quality of
measurements drastically by using BenchExec instead of these small
utilities.
This guide provides a brief summary of instructions to set up reliable
benchmark measurements using BenchExec and important points to consider. It is
meant for users who either want to set up benchmarking for a small number of
runs from scratch or already have a simple setup using, e.g., `time`,
`timeout`, `taskset`, `ulimit`, etc. Concretely, this guide will show you how
to use `runexec` as a simple but much more reliable "drop-in" replacement for
these tools. If you want to benchmark a large number of runs or get the most out
of what BenchExec provides as a benchmarking framework, consider using the tool
`benchexec` instead (further details below).

## Why Should I use BenchExec?

As a simple example, suppose that you want to measure the performance of your
newly implemented tool `program` with arguments `--foo` and `--bar` on the
input files `input_1.in` to `input_9.in`. To measure the runtime of the tool,
you might run `$ /usr/bin/time program --foo input_1.in` etc. and note the
results. In case resource limitations are desired (e.g. limiting to 1 CPU and
60 sec of wallclock time), the calls might be
`$ taskset -c 0 timeout 60s /usr/bin/time program ...` or similar.
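
For instance, a manual setup along these lines (a sketch for illustration,
using the commands and limits from the example above) could look like this:
```
# manual approach: one call per configuration and input file
for input in input_*.in; do
    taskset -c 0 timeout 60s /usr/bin/time program --foo "$input"
    taskset -c 0 timeout 60s /usr/bin/time program --bar "$input"
done
```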

While useful, these utilities (i.e. `time`, `ulimit`, etc.) are unfortunately
not suitable for reliable benchmarking, especially when parallelism or
sub-processes are involved, and may give you *completely wrong* results.
BenchExec takes care of most of these problems for you, which is why we recommend
using it instead. By following this guide, you thus significantly increase the
reliability of your results without much effort.

For further details and insights into peculiarities and pitfalls of reliable
benchmarking (as well as how BenchExec is mitigating them where possible), we
recommend the
[overview slides](https://www.sosy-lab.org/research/prs/Latest_ReliableBenchmarking.pdf)
and [the corresponding paper](https://doi.org/10.1007/s10009-017-0469-y).

## Reliable Benchmarking with BenchExec

The following steps show you how to increase the reliability and quality of
measurements by using BenchExec instead of the standard system utilities.

### Step 1. Install BenchExec

@@ -60,24 +68,36 @@ think about which executions you want to measure, what resource limits should
be placed on the benchmarked tool(s), such as CPU time, CPU count, memory, etc.
Also consider how timeouts should be treated.

For more complicated setups, please also refer to the
[benchmarking setup guide](benchmarking.md). For example, in case you want to
execute multiple benchmarks in parallel, think about how to deal with shared
resources (e.g. the memory bus and CPU cache). For such cases, we recommend
using `benchexec` instead, which takes care of managing parallel invocations.

Independently of using BenchExec, we strongly recommend following the
guidelines of the [benchmarking guide](benchmarking.md).

### Step 3. Gather Measurements using runexec

Using the example from above, suppose that we want to limit the process to 60s
wall time, 1 GB of memory, and cpu core 0. Then, simply run
Using the example from above, suppose that we want to measure `program` on
input `input_1.in`. Then, simply run
```
$ runexec --quiet --walltimelimit 60s --memlimit 1GB --cores 0 --output output_1_foo.log -- \
program --foo input_1.in
$ runexec --output output_1_foo.log -- program --foo input_1.in
```
This executes `program --foo input_1.in` and prints measurements to standard
output, such as walltime, cputime, memory, I/O, etc. in a simple to read and
parse format. The output of program is redirected to `output_1_foo.log`.
This executes `program --foo input_1.in`, redirecting the tool's output to
`output_1_foo.log`. Then `runexec` prints relevant measurements, such as
walltime, cputime, memory, I/O, etc., to standard output in a format that is
simple to read and parse, for example:
```
starttime=2000-01-01T00:01:01.000001+00:00
returnvalue=0
walltime=0.0027799380040960386s
cputime=0.002098s
memory=360448B
pressure-cpu-some=0s
pressure-io-some=0s
pressure-memory-some=0s
```
See the [run result](run-results.md) documentation for further details on the
precise meaning of these values.
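
Since each measurement is printed as a `key=value` line, individual values are
easy to extract with standard tools, for example (an illustrative snippet,
assuming the printed lines were saved to a file `measurements.txt`):
```
$ grep '^cputime=' measurements.txt | cut -d= -f2
0.002098s
```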

In case you want to limit the process to 60s wall time, 1 GB of memory, and one CPU core (by
[pinning](https://en.wikipedia.org/wiki/Processor_affinity) the process to CPU
number 0), simply run `$ runexec --walltimelimit 60s --memlimit 1GB --cores 0 ...` instead.
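
For example, combining these limits with the output redirection from the call
above gives:
```
$ runexec --walltimelimit 60s --memlimit 1GB --cores 0 --output output_1_foo.log -- \
    program --foo input_1.in
```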

The tool `runexec` offers several other features; run `runexec --help` or
refer to the [documentation](runexec.md) for further information.
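
Finally, since `runexec` acts as a drop-in replacement for the manual calls
shown above, gathering measurements for the whole guiding example (both
configurations on all inputs) can also be scripted; a possible sketch (the
loop and file naming are illustrative, not part of the guide):
```
# one runexec invocation per run; each run gets its own log and measurement file
for input in input_*.in; do
    for flag in foo bar; do
        runexec --walltimelimit 60s --memlimit 1GB --cores 0 \
            --output "output_${input%.in}_${flag}.log" -- \
            program "--${flag}" "$input" > "measurements_${input%.in}_${flag}.txt"
    done
done
```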
