
On the generation of microbenchmarks


Our current work consists of non-semantics-preserving transformations. These transformations can be useful when the data output admits some degree of approximation. In order to test whether our transformations actually run faster, we automatically generate benchmarks for the code being transformed. We use the environment provided by the project's unit tests as the base for these generated benchmarks.
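To make the idea concrete, here is a hypothetical transformation of this kind (the code is illustrative and not taken from our tool): a loop computing the mean of an array is perforated to sample only every other element, trading accuracy for speed.

	// Original snippet: averages all elements.
	double mean(double[] data) {
		double sum = 0;
		for (int i = 0; i < data.length; i++)
			sum += data[i];
		return sum / data.length;
	}

	// Non-semantics-preserving variant: samples every other element.
	// The result is an approximation of the original output.
	double approximateMean(double[] data) {
		double sum = 0;
		int n = 0;
		for (int i = 0; i < data.length; i += 2) {
			sum += data[i];
			n++;
		}
		return n == 0 ? 0 : sum / n;
	}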

The generation process goes as follows:

  1. We detect all variables used by the snippet.

  2. All variables declared outside the snippet are marked as the snippet's inputs.

  3. We instrument the original code to record the values of all input variables just before and after the execution of the snippet, and store these values in a file (a sketch of the instrumented code follows this list).

  4. We generate a benchmark containing a setup method annotated with @Setup(Level.Trial) that loads all the input values from the file.

  5. Inputs modified during the benchmark run are restored at the beginning of the benchmark.

  6. A unit test is created to ensure that the benchmarked code produces the same results as the original code.
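As a sketch of step 3, the instrumented version of the original code could look as follows. The Recorder class and its write methods are naming assumptions of ours, mirroring the Loader read methods used in the generated benchmark below:

	// Hypothetical instrumentation around the original call site.
	DataOutputStream s = Recorder.getStream(INPUT_ROOT_FOLDER, INPUT_DATA_FILE);

	// Record every input value just before the snippet executes.
	Recorder.writeint(s, length);
	Recorder.writeArraydouble(s, data);
	Recorder.writedouble(s, alpha);
	Recorder.writedouble(s, beta);
	Recorder.writedouble(s, scaler);

	// The snippet itself runs unmodified.
	for (int i = 0; i < length; i++) {
		data[i] = alpha - (beta * (java.lang.Math.cos((i * scaler))));
	}

	// Record the inputs again just after the snippet, so values it
	// modified (here, data) can be checked and later restored.
	Recorder.writeArraydouble(s, data);
	s.close();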

Example of a generated benchmark:

import java.io.DataInputStream;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Level;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;

// Loader is a helper class of the generation tool; its import is
// project-specific and omitted here.
@State(Scope.Thread)
public class com_jsyn_data_HammingWindow_26_ORIGINAL {

	static final String INPUT_ROOT_FOLDER = "/DATA/DIVERSE/logs/input-data";
	static final String INPUT_DATA_FILE = "com-jsyn-data-HammingWindow-26";

	public int i; // used by the snippet, but declared inside it, so not loaded as an input
	public int length;
	public double[] data;
	public double alpha;
	public double beta;
	public double scaler;

	// Loads the recorded input values from file once per JMH trial.
	@Setup(Level.Trial)
	public void setup() {
		try {
			DataInputStream s = Loader.getStream(INPUT_ROOT_FOLDER, INPUT_DATA_FILE);

			length = Loader.readint(s);
			data = Loader.readArraydouble(s);
			alpha = Loader.readdouble(s);
			beta = Loader.readdouble(s);
			scaler = Loader.readdouble(s);

			s.close();
		} catch (Exception e) {
			throw new RuntimeException(e);
		}
	}

	// The extracted snippet under measurement.
	@Benchmark
	public void doBenchmark() {
		for (int i = 0; i < length; i++) {
			data[i] = alpha - (beta * (java.lang.Math.cos((i * scaler))));
		}
	}
}
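For reference, a generated class like this one can be launched with the standard JMH runner API; the option values below are only illustrative:

import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

public class RunGeneratedBenchmark {
	public static void main(String[] args) throws RunnerException {
		Options opt = new OptionsBuilder()
				.include(com_jsyn_data_HammingWindow_26_ORIGINAL.class.getSimpleName())
				.warmupIterations(10)      // illustrative values
				.measurementIterations(10)
				.forks(1)
				.build();
		new Runner(opt).run();
	}
}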

Example of the generated unit test for this benchmark:

import static org.junit.Assert.assertEquals;

import java.io.DataInputStream;
import java.io.IOException;

import org.junit.Test;

public class com_jsyn_data_HammingWindow_26Test {

    static final String INPUT_ROOT_FOLDER = "C:/MarcelStuff/DATA/DIVERSE/logs/input-data";
    static final String INPUT_DATA_FILE = "com-jsyn-data-HammingWindow-26";

    @Test
    public void testOriginal() throws IOException {
        com_jsyn_data_HammingWindow_26_ORIGINAL benchmark =
            new com_jsyn_data_HammingWindow_26_ORIGINAL();
        benchmark.setup();
        benchmark.doBenchmark();

        DataInputStream s = Loader.getStream(INPUT_ROOT_FOLDER, INPUT_DATA_FILE);
        int length = Loader.readint(s);
        double[] data = Loader.readArraydouble(s);
        double alpha = Loader.readdouble(s);
        double beta = Loader.readdouble(s);
        double scaler = Loader.readdouble(s);
        
        s.close();
        
        assertEquals(benchmark.length, length);
        BenchAsserts.assertDoubleArrayEquals(benchmark.data, data);
        // JUnit 4 requires an explicit tolerance when comparing doubles;
        // the delta used here is illustrative.
        assertEquals(benchmark.alpha, alpha, 1e-10);
        assertEquals(benchmark.beta, beta, 1e-10);
        assertEquals(benchmark.scaler, scaler, 1e-10);

    }

}
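BenchAsserts is a helper of the generation tool whose implementation is not shown in this page. A minimal sketch, assuming an element-wise comparison with a fixed tolerance, could be:

import static org.junit.Assert.assertEquals;

public class BenchAsserts {

    // Tolerance chosen for illustration only.
    static final double DELTA = 1e-10;

    public static void assertDoubleArrayEquals(double[] actual, double[] expected) {
        assertEquals("array lengths differ", expected.length, actual.length);
        for (int i = 0; i < expected.length; i++)
            assertEquals("element " + i + " differs", expected[i], actual[i], DELTA);
    }
}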

Limitations:

  1. Snippets using variables of types other than primitives, their wrapper classes, and collections are NOT benchmarked.
  2. Snippets containing a dynamic method call to anything other than a java.util collection class are NOT benchmarked (a hypothetical example follows this list).
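As a hypothetical example of the second limitation (all names are invented for illustration), a snippet like the following would NOT be benchmarked:

	// 'gen' is a domain object, not a primitive, a wrapper, or a
	// collection, and nextValue() is a dynamic call on a
	// non-collection class.
	SignalGenerator gen = new SignalGenerator();
	for (int i = 0; i < length; i++) {
		data[i] = gen.nextValue();
	}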

Discussion

  1. We must trust that the unit tests of the original code properly describe its correct functioning. We cannot ensure that our optimizations will work (or not) in a scenario not covered by these tests.

  2. Java VMs are complex, smart systems that optimize code in very clever ways. Since we want to test the code in isolation, we must be very careful to allow the optimizations that would apply in production code, as well as to prevent those that would not occur. For instance, in the example given, loop unrolling is considered an acceptable optimization, since it may occur in production code, while dead code elimination may not (see the sketch after this list). We hope to detect such weird cases using the generated unit tests.

  3. The generation is limited to certain cases (see Limitations above).
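As an illustration of the dead code elimination concern in point 2: JMH's standard countermeasure is to return the computed value or hand it to a Blackhole. In the generated benchmark above, the snippet already writes into the data field, which largely prevents elimination, but a defensive variant of doBenchmark() could look like this (a sketch, not part of the generated code):

	// Variant of doBenchmark(), inside the same class, that consumes the
	// result via org.openjdk.jmh.infra.Blackhole so the JIT cannot prove
	// the loop's work is unused and eliminate it.
	@Benchmark
	public void doBenchmarkConsuming(Blackhole bh) {
		for (int i = 0; i < length; i++) {
			data[i] = alpha - (beta * (java.lang.Math.cos((i * scaler))));
		}
		bh.consume(data);
	}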