CuDNN Convolution Benchmark

Prerequisites

Building

mkdir build
cd build
cmake ..
make

Note: the build obtains the most recent cuDNN distribution automatically by installing PyTorch into the build directory.

Usage

Run all tests at once:

cd build
ctest

Run an individual test by providing the following arguments:

$ ./bin/benchmark file_name output_file_name data_type all_formats operation_mode num_repeats [input_tensor_format output_tensor_format kernel_tensor_format]
  • file_name: path to the file with convolution cases (e.g. conv_example.txt);
  • output_file_name: path to the output file with benchmark results;
  • data_type: data type used (accepted values are fp16, fp32, fp64, int8, uint8, int32, int8x4, uint8x4, uint8x32);
  • all_formats: 1 if all input/output/kernel tensor formats should be tested, 0 to run with specific data formats only;
  • num_repeats: number of repetitions for each convolution algorithm.

If all_formats is set to 0, the following additional arguments must be specified:

  • input_tensor_format: input tensor data format (accepted values are NCHW, NHWC, NCHW_VECT_C);
  • output_tensor_format: output tensor data format (accepted values are NCHW, NHWC, NCHW_VECT_C);
  • kernel_tensor_format: kernel tensor data format (accepted values are NCHW, NHWC, NCHW_VECT_C).
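
For context, these three format names map directly onto cuDNN's cudnnTensorFormat_t values, and the data_type argument onto cudnnDataType_t. The benchmark's own descriptor setup is not reproduced here; the snippet below is only a minimal sketch of how such formats are applied to input and kernel descriptors through the cuDNN API, using made-up placeholder dimensions:

#include <cudnn.h>
#include <cstdio>

int main() {
    cudnnHandle_t handle;
    cudnnCreate(&handle);

    // Input/output tensors take a cudnnTensorFormat_t: CUDNN_TENSOR_NCHW,
    // CUDNN_TENSOR_NHWC or CUDNN_TENSOR_NCHW_VECT_C, matching the benchmark's
    // NCHW / NHWC / NCHW_VECT_C arguments.
    cudnnTensorDescriptor_t input_desc;
    cudnnCreateTensorDescriptor(&input_desc);
    cudnnSetTensor4dDescriptor(input_desc,
                               CUDNN_TENSOR_NHWC,  // input_tensor_format
                               CUDNN_DATA_FLOAT,   // data_type fp32
                               32, 256, 3, 3);     // N, C, H, W (placeholders)

    // The kernel (filter) descriptor uses the same format enum.
    cudnnFilterDescriptor_t kernel_desc;
    cudnnCreateFilterDescriptor(&kernel_desc);
    cudnnSetFilter4dDescriptor(kernel_desc,
                               CUDNN_DATA_FLOAT,   // data_type fp32
                               CUDNN_TENSOR_NHWC,  // kernel_tensor_format
                               324, 256, 3, 3);    // K, C, R, S (placeholders)

    // Other data_type values correspond to e.g. CUDNN_DATA_HALF (fp16),
    // CUDNN_DATA_DOUBLE (fp64) or CUDNN_DATA_INT8x4 (int8x4); the vectorized
    // int8 types are only valid together with CUDNN_TENSOR_NCHW_VECT_C.
    std::printf("descriptors configured\n");

    cudnnDestroyFilterDescriptor(kernel_desc);
    cudnnDestroyTensorDescriptor(input_desc);
    cudnnDestroy(handle);
    return 0;
}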

Examples:

  • Test specific data formats:
    $ ./bin/benchmark conv_example.txt out_example.csv fp32 0 100 NHWC NHWC NHWC
  • Test all data formats:
    $ ./bin/benchmark conv_example.txt out_example.csv fp32 1 1000

Obtaining results

Running the benchmark produces a results file, output_file_name, in your working directory.

Example contents for ./bin/benchmark conv_example.txt out_example.txt fp32 0 1 10 NCHW NCHW NCHW:

A value of n/a means that the combination of input tensor, filter tensor, and output tensor dimensions is not supported by the specified algorithm on your GPU.

A value of - means that this convolution is not supported by the specified algorithm on your GPU.

input_format	output_format	filter_format	W	H	C	N	K	S	R	pad_w	pad_h	stride_w	stride_h	out_w	out_h	input_stride_w	input_stride_h	filter_stride_w	filter_stride_h	FWD_GEMM	FWD_GEMM WORKSPACE	FWD_IMPLICIT_GEMM	FWD_IMPLICIT_GEMM WORKSPACE	FWD_PRECOMP_GEMM	FWD_PRECOMP_GEMM WORKSPACE	FWD_DIRECT	FWD_DIRECT WROKSPACE	FWD_FFT	FWD_FFT WORKSPACE	FWD_FFT_TILING	FWD_FFT_TILING WORKSPACE	FWD_WINOGRAD	FWD_WINOGRAD WORKSPACE	FWD_WINOGRAD_NONFUSED	FWD_WINOGRAD_NONFUSED WORKSPACE	BWD_FILTER_ALGO_0	BWD_FILTER_ALGO_0 WORKPACE	BWD_FILTER_ALGO_1	BWD_FILTER_ALGO_1 WORKSPACE	BWD_FILTER_ALGO_3	BWD_FILTER_ALGO_3 WORKSPACE	BWD_FILTER_FFT	BWD_FILTER_FFT WORKSPACE	BWD FILTER FFT_TILING	BWD FILTER FFT_TILING WORKSPACE	BWD_DATA_ALGO_0	BWD_DATA_ALGO_0 WORKSPACE	BWD_DATA_ALGO_1	BWD_DATA_ALGO_1 WORKSPACE	BWD_DATA_FFT	BWD_DATA_FFT WORKSPACE	BWD_DATA_FFT_TILING	BWD_DATA_FFT_TILING WORKSPACE	BWD_DATA_WINOGRAD	BWD_DATA_WINOGRAD WORKSPACE	BWD_DATA_WINOGRAD_NONFUSED	BWD_DATA_WINOGRAD_NONFUSED WORKSPACE
NCHW	NCHW	NCHW	1	1	256	32	324	3	3	1	1	1	1	1	1	1	1	1	1	370.914	294912	471.803	0	587.052	9216	n/a		15303.9	212963328	42105.4	441769984	7298.59	8754448	2435.43	14616576	8761.88	0	333.403	6336	8901.35	0	n/a		n/a		3661.55	0	1157.1	0	11922.6	217976832	38886.3	441769984	6682.76	8360960	2169.12	14616576	
NCHW	NCHW	NCHW	1	1	256	32	16	3	3	1	1	1	1	1	1	1	1	1	1	231.894	294912	457.18	0	452.277	9216	n/a		2915.1	28459008	9094.39	55730176	562.581	671808	417.447	1843200	369.539	0	43.1993	576	366.571	0	n/a		n/a		494.008	0	291.386	0	1501.37	19021824	4851.48	55730176	649.183	410624	398.059	1843200	
NCHW	NCHW	NCHW	3	3	256	32	324	3	3	1	1	1	1	3	3	1	1	1	1	1474.53	2654208	2193.58	0	1353.02	60	n/a		15177.2	212963328	43215.7	441769984	3892.56	8754448	2453.19	14616576	9273.82	0	1711.97	2572	1770.54	2356	13231.9	191476224	n/a		4373.88	0	1019.39	2236	11924.6	217976832	39019.9	441769984	3586.94	8360960	2173.33	14616576	
NCHW	NCHW	NCHW	3	3	256	32	16	3	3	1	1	1	1	3	3	1	1	1	1	348.08	2654208	652.083	0	301.795	60	n/a		2989.4	28459008	9137.38	55730176	421.972	671808	420.687	1843200	423.354	0	139.508	2428	126.237	2356	1967.64	20072448	n/a		965.804	0	123.734	2236	1342.91	19021824	4955.75	55730176	352.066	410624	411.616	1843200	
.....
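
The results file is plain tab-separated text, so it is easy to post-process. The sketch below is not part of the benchmark; it is a hypothetical helper that reads such a file (out_example.txt by default) and prints the fastest forward algorithm for each case, assuming the per-algorithm columns hold execution times (lower is better) and that n/a, - or empty cells mark skipped algorithms:

#include <cstddef>
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

// Split one tab-separated line into cells.
static std::vector<std::string> split_tabs(const std::string &line) {
    std::vector<std::string> cells;
    std::stringstream ss(line);
    std::string cell;
    while (std::getline(ss, cell, '\t')) cells.push_back(cell);
    return cells;
}

int main(int argc, char **argv) {
    std::ifstream in(argc > 1 ? argv[1] : "out_example.txt");
    std::string line;
    if (!std::getline(in, line)) return 1;  // missing or empty file
    const std::vector<std::string> header = split_tabs(line);

    // Forward-algorithm time columns start with "FWD_" and contain no space;
    // the matching workspace-size columns contain one (e.g. "FWD_GEMM WORKSPACE").
    std::vector<std::size_t> fwd_cols;
    for (std::size_t i = 0; i < header.size(); ++i)
        if (header[i].rfind("FWD_", 0) == 0 &&
            header[i].find(' ') == std::string::npos)
            fwd_cols.push_back(i);

    while (std::getline(in, line)) {
        const std::vector<std::string> row = split_tabs(line);
        if (row.size() <= 7) continue;  // skip malformed or truncated lines
        std::string best_name = "none";
        double best_time = 0.0;
        bool found = false;
        for (std::size_t i : fwd_cols) {
            if (i >= row.size()) continue;
            const std::string &v = row[i];
            if (v.empty() || v == "n/a" || v == "-") continue;  // algorithm skipped
            const double t = std::stod(v);
            if (!found || t < best_time) { found = true; best_time = t; best_name = header[i]; }
        }
        // Columns 3..7 of each row hold W, H, C, N, K (see the header above).
        std::cout << "W=" << row[3] << " H=" << row[4] << " C=" << row[5]
                  << " N=" << row[6] << " K=" << row[7]
                  << " fastest forward: " << best_name;
        if (found) std::cout << " (" << best_time << ")";
        std::cout << "\n";
    }
    return 0;
}

Keying on the header names rather than on fixed column positions keeps the sketch tolerant of the exact set of algorithm columns your cuDNN version reports.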