
Estimate the search space #175

Open
symbiont-stevan-andjelkovic opened this issue Mar 19, 2021 · 3 comments
@symbiont-stevan-andjelkovic

In order to display progress, and to gauge how well our lineage-driven optimisations are working, we should try to estimate the search space.

For a first draft, which also includes notes about the tricky parts, see #174.

@symbiont-stevan-andjelkovic symbiont-stevan-andjelkovic added this to the v0.1 milestone Mar 19, 2021
@symbiont-stevan-andjelkovic commented Mar 26, 2021

In PR #184 we added an explanation of how to estimate the fault space of random
search, given a failure specification and one concrete test run.

There are several ways to improve on this.

For starters we need to figure out how many interesting concrete programs there are
to begin with. Property-based testing typically focuses on this aspect, i.e.
generating random test cases, but because in a distributed system even a very small
test case, e.g. a write followed by a read, involves so much non-determinism, this
aspect is less important than exploring the non-determinism.
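
As a back-of-the-envelope illustration of this part of the space, here is a minimal
sketch that counts syntactically distinct test cases; the operation/key/value counts
and the `programSpace` name are made-up parameters for the example, not anything
from our test suite.

```haskell
-- A minimal sketch, assuming a test case is a fixed-length sequence of
-- operations drawn independently from a small alphabet; all counts below are
-- hypothetical.
programSpace :: Integer -> Integer -> Integer -> Integer -> Integer
programSpace opKinds keys values depth = (opKinds * keys * values) ^ depth

-- Example: 2 operations (read/write), 2 keys, 2 values, depth 3 gives
-- 8 ^ 3 = 512 syntactically distinct programs (before any equivalences).
example :: Integer
example = programSpace 2 2 2 3
```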

It would also be interesting to estimate what the fault space is if the random
search is augmented with information about the network trace (i.e. which events
actually happened in the concrete run) and memory of which faults it has already
tried in previous runs. This would make it closer to what lineage-driven fault
injection (LDFI) does, but with very simple constraints (don't try what you've
already tried, as opposed to trying to eliminate possibilities based on what we
know). The tricky part here is that as we introduce faults, they might trigger
retries which spawn new messages that were not in the original network trace. We
could perhaps make an initial estimate assuming that no new messages are spawned,
and then in addition calculate how many new messages are spawned per run on
average and how much they enlarge the fault space as we run the tests.
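
A very rough sketch of what such an estimate could look like, assuming the only
faults are omissions of delivery events that appear in the recorded network trace
plus node crashes (crash times are ignored); `faultSpace` and its parameter names
are hypothetical and do not correspond to the failure specification fields in the
code base.

```haskell
-- n choose k.
choose :: Integer -> Integer -> Integer
choose n k
  | k < 0 || k > n = 0
  | otherwise      = product [n - k + 1 .. n] `div` product [1 .. k]

-- Pick up to maxOmissions of the delivery events in the trace to drop, and up
-- to maxCrashes of the nodes to crash (ignoring when the crashes happen).
faultSpace :: Integer -> Integer -> Integer -> Integer -> Integer
faultSpace traceEvents maxOmissions nodes maxCrashes =
  sum [ choose traceEvents i | i <- [0 .. maxOmissions] ]
    * sum [ choose nodes j | j <- [0 .. maxCrashes] ]

-- With memory of previously tried fault sets the search samples without
-- replacement, so it exhausts a space of size s in exactly s runs, whereas
-- plain random search needs roughly s * harmonic s runs to have seen every
-- fault set at least once (coupon-collector bound).
expectedRunsWithReplacement :: Integer -> Double
expectedRunsWithReplacement s =
  fromInteger s * sum [ 1 / fromIntegral k | k <- [1 .. s] ]
```

The without-replacement variant is the one closest to the "don't try what you've
already tried" constraint described above.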

To make the estimate closer to what actually happens during LDFI we would need
to take all the constraints it produces into account, which seems hard. A
possible alternative is to get some statistics from the SAT solver itself about
its estimated search space. This would enable us to compare how efficient LDFI
is versus random search, or even one version of LDFI versus another with slightly
tweaked constraints.

It would also be interesting, but hard, to take symmetries or equivalence
classes of states into account, e.g. does it really matter which of the 4
follower nodes we choose to crash?
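
As a toy illustration of how much a single symmetry could shrink the space,
assuming the four followers really are interchangeable for the property under
test (the numbers are made up for the example):

```haskell
-- Toy example: crashing any one of 4 interchangeable followers.
crashChoices, crashClasses :: Integer
crashChoices = 4  -- raw choices in the fault space
crashClasses = 1  -- equivalence classes if the followers are interchangeable

-- The crash dimension of the fault space shrinks by this factor.
symmetryFactor :: Integer
symmetryFactor = crashChoices `div` crashClasses  -- = 4
```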

A simpler problem would be to run a sweep search where we start with a small
failure specification and iteratively make it bigger until we reach some
timeout, and basically see how far we get as a measure of how efficient the
search is for a bug-free program. By analogy, this would be closer to a
benchmark, while properly estimating the search space is closer to calculating
the complexity of the algorithm.
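
A sketch of the sweep, assuming a hypothetical `runTests` action that runs the
whole suite under a given failure specification, and a failure specification that
merely bounds the number of omissions and crashes; none of these names come from
the code base.

```haskell
import Data.Time.Clock (NominalDiffTime, diffUTCTime, getCurrentTime)

data FailureSpec = FailureSpec { maxOmissions :: Int, maxCrashes :: Int }
  deriving Show

-- Placeholder: run the whole test suite under the given failure specification.
runTests :: FailureSpec -> IO ()
runTests _spec = pure ()

-- Grow the failure specification until the time budget runs out and report
-- the last specification whose run completed.
sweep :: NominalDiffTime -> IO FailureSpec
sweep budget = do
  start <- getCurrentTime
  let go spec = do
        runTests spec
        now <- getCurrentTime
        if diffUTCTime now start > budget
          then pure spec
          else go spec { maxOmissions = maxOmissions spec + 1
                       , maxCrashes   = maxCrashes   spec + 1 }
  go (FailureSpec 0 0)
```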

@symbiont-stevan-andjelkovic

For an explanation of how LDFI cuts down the fault space and how to estimate the number of runs needed see #196.

@symbiont-stevan-andjelkovic

For starters we need to figure out how many interesting concrete programs there are to begin with.

One possible way to compute this would be to identify test case equivalence classes up to a certain depth, e.g. write(k1, v1) || write(k2, v2); ... is equivalent to write(k2, v2) || write(k1, v1); .... This could perhaps be done by generating a new random test case, running it to produce a history and a model, hashing the response part of the history together with the model, and repeating; all test cases with the same hash are in the same equivalence class.
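
A sketch of how the hashing could be wired up, assuming hypothetical `gen` and
`run` actions (generate a random test case; execute it and return the response
part of the history together with the resulting model) and `Hashable` instances
for both; none of these names come from the code base.

```haskell
import           Data.Hashable   (Hashable, hash)
import qualified Data.Map.Strict as Map

-- Group n randomly generated test cases into equivalence classes keyed by the
-- hash of the (responses, model) pair produced by running them.
classify :: (Hashable response, Hashable model)
         => Int                                  -- how many test cases to try
         -> IO testCase                          -- generate a random test case
         -> (testCase -> IO (response, model))   -- run it
         -> IO (Map.Map Int [testCase])
classify n gen run = go n Map.empty
  where
    go 0 classes = pure classes
    go k classes = do
      tc                 <- gen
      (responses, model) <- run tc
      let key = hash (hash responses, hash model)
      go (k - 1) (Map.insertWith (++) key [tc] classes)
```

All test cases that end up under the same key are then candidates for the same
equivalence class (modulo hash collisions).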
