Using `main.py` and creating benchmarks
This section explains how to run the benchmark tests, plot the results, and plot histograms computed with different approaches on the GPU and the CPU. If you would like to use `gpu_hist.py` with your own software, please follow this link.
In order to run a benchmark test you may use
python main.py --tests --outdir dir
`dir` is the folder in which the benchmark results are saved. Several plots will be created.
Each plot consists of four diagrams: the left column uses single precision and the right column uses double precision; the top row uses given edges, while in the bottom row the edges have to be calculated beforehand.
The plots have the following names:
n_dims_d_n_bins_b_with-device-samples_speedup_test
and
n_dims_d_n_bins_b_speedup_test
`d` is the number of dimensions of the elements used and `b` is the number of bins per dimension (please note that the number of flat bins is `b**d`). If `with-device-samples` is in the name, the samples are allocated on the device before the timings are taken. This is useful if your data is already on the GPU, because it avoids the data transfer.
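As a rough illustration of the naming scheme and the flat bin count, here is a minimal Python sketch. The helpers `flat_bin_count` and `plot_name` are hypothetical and not part of `main.py`; the sketch assumes that `d` and `b` in the names above are placeholders that are replaced by the actual numbers.

```python
def flat_bin_count(n_dims: int, n_bins: int) -> int:
    """Number of flat bins when every dimension has n_bins bins (b**d)."""
    return n_bins ** n_dims


def plot_name(n_dims: int, n_bins: int, device_samples: bool) -> str:
    """Hypothetical helper: build a plot name following the pattern above."""
    suffix = "_with-device-samples" if device_samples else ""
    return f"n_dims_{n_dims}_n_bins_{n_bins}{suffix}_speedup_test"


# 3 dimensions with 5 bins each give 5**3 flat bins.
print(flat_bin_count(3, 5))
print(plot_name(3, 5, device_samples=True))
print(plot_name(3, 5, device_samples=False))
```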
Example with an i5 2400k and a GTX 960 (from an old version).
You have the following options:
- `--full`: Full test comparing numpy's histogramdd with the GPU code, using single and double precision and the GPU code with shared and global memory.
- `--GPU_shared`: Use the GPU code with shared memory. If `--GPU_both` is set, then `--GPU_shared` will be ignored.
- `--GPU_global`: Use the GPU code with global memory. If `--GPU_both` is set, then `--GPU_global` will be ignored.
- `--GPU_both`: Use the GPU code with shared memory and with global memory and compare both.
- `--CPU`: Use numpy's histogramdd.
- `--all_precisions`: Run all specified tests with double and single precision.
- `-s`, `--single_precision`: Use single precision. If it is not set, double precision is used. If `--all_precisions` is used, then `-s` will be ignored.
- `-d`, `--data`: Define the number of elements in each dimension for the input data.
- `--device_data`: Use device arrays as input data.
- `--dimension`: Define the number of dimensions for the input data and the histogram.
- `-b`, `--bins`: Choose the number of bins for each dimension.
- `-w`, `--weights`: (Randomized) weights will be used on the histogram.
- `--use_given_edges`: Use precalculated edges instead of calculating the edges during histogramming.
- `--use_irregular_edges`: The number of edges varies with the number of bins/2 for each dimension. The mean should be at least 6 bins for each dimension.
- `--outdir`: Store all output plots in this directory. If it does not exist, the script will create it, including all subdirectories. If none is supplied, no plots will be saved.
- `--test`: Make a test with all versions and create plots in the directory given with `--outdir`.
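
To make the difference between calculated edges, given edges, weights and the two precisions more concrete, the following sketch shows the CPU baseline only, using `numpy.histogramdd` directly. It is not taken from `main.py`; the variable names and the value range [0, 1) of the random samples are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

n_events, n_dims, n_bins = 10_000, 3, 5
samples = rng.random((n_events, n_dims))   # float64 (double precision)
weights = rng.random(n_events)             # randomized weights, one per event

# Edges calculated during histogramming: only the number of bins is passed.
hist_auto, edges_auto = np.histogramdd(samples, bins=n_bins, weights=weights)

# Given edges: one array of n_bins + 1 edges per dimension is passed.
edges = [np.linspace(0.0, 1.0, n_bins + 1) for _ in range(n_dims)]
hist_given, _ = np.histogramdd(samples, bins=edges, weights=weights)

# Single precision input, unweighted.
samples32 = samples.astype(np.float32)
hist32, _ = np.histogramdd(samples32, bins=edges)

print(hist_auto.shape, hist_given.shape, hist32.shape)
```

With weights, the sum over all bins equals the sum of the weights; without weights, it equals the number of events that fall inside the edges.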
For example you may use
python main.py -d 100000 -b 5 --dimension 3 --use_irregular_edges --outdir plots --GPU_shared
This command is going to create something like the following histogram:
This histogram would have 5 bins in each dimension, but with `--use_irregular_edges` we get a randomized number of bins with a mean of 5 (e.g. 7 bins in the z-dimension and 4 bins each in the x- and y-dimensions). In total there are 4 x 4 x 7 = 112 bins. We used 1e5 events where each event has 3 dimensions (therefore 3 x 1e5 doubles are sent to the GPU).
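
The numbers in this example can be checked on the CPU with `numpy.histogramdd`. This is only a sketch: it uses uniformly random samples and evenly spaced edges with a different bin count per dimension, which is an assumption and not necessarily how `main.py` generates its data or its irregular edges.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

n_events, n_dims = 100_000, 3
samples = rng.random((n_events, n_dims))   # float64, i.e. doubles

# Bin counts per dimension taken from the example above (x, y, z).
bins_per_dim = (4, 4, 7)
hist, edges = np.histogramdd(samples, bins=bins_per_dim)

total_bins = hist.size       # 4 * 4 * 7
n_doubles = samples.size     # 3 * 1e5 values
n_bytes = samples.nbytes     # 8 bytes per double

print(total_bins, n_doubles, n_bytes)
```

Under these assumptions the input amounts to 2.4 MB of doubles that would have to be transferred to the GPU.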