Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure tests are deterministic and faster #77

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

gonzalo-bulnes
Copy link
Contributor

@gonzalo-bulnes gonzalo-bulnes commented Oct 2, 2023

Overview

This PR does two related things in three steps:

  • Fix a small oversight in the comparison of elapsed time to sampling time.
  • Introduce the use of simulated time in the test suite, which enables:
    • deterministic tests
    • faster tests (full run under a second)
  • Take a temporary step to optimize waiting time in tests. That becomes irrelevant after the move to simulated time, but I thought it was interesting to put the speed-up numbers in perspective.

It also suggests the removal of some code, because the time_fn option makes the dt option redundant (as demonstrated by the use of simulated time in the test suite). Context in #10. (:warning: That's very much YMMV, admittedly, and I can't judge if removing it breaks backwards-compatibility, it's your definition that matters, not mine. See commit message for more details.)

🔮 There are extensive comments in the commit messages.

Fixes #70

Implementation

I happen to have written a drop-in replacement for time.time() and time.monotonic() some time ago, that provides simulated time for testing purposes. (stime on GitHub, and PyPi)

The introduction of the time_fn option to inject the time function enables the best case scenario to introduce stime.

This is the first time I actually use this little package, and I'm pretty happy with how it fits this project. (Fun fact: I wrote as a simple pretext to learn how to write a Python package, at a time I thought I'd need it eventually when implementing a PID, before I found this package. So the projects were somehow related in my mind from the start. Needless to say I never ended up writing that PID package.)

The general idea is:

  1. import stime
  2. (Re-)set the clock to whatever time (0 is as good a choice as any other Unix epoch) with stime.reset(0).
  3. Whenever useful, advance time by one second stime.tick() or any arbitrary value stime.tick(0.1), or reset it at will.

That's it.

Outcome

Unless I'm misunderstanding something about how the PID works, the current test suite should now be deterministic.

The overall speed of the test suite has increased significantly. For reference, an entire test suite run on my machine went from 2m14, to 26s, to 200ms. That's a roughly 600x speed-up, of which I'd consider the last 100x factor meaningful.

I also think that tests with simulated time are easier to reason about, because the test code itself doesn't affect the timings. (That's another effect of the same property that enables writing deterministic tests.)

Checklist

  • Tests
  • Linting and formatting
  • Docs / explanations

The inaccuracy is small, but it becomes visible when operating
within tight margins of the `sample_time`, like for example, when
working with a simulated clock in a test environment.

Specifically, `dt` cannot be exactly zero because it is used as
a divider. When it would be zero, it is set to a very small,
conventional value instead, that I've renamed MIN_DT for clarity.

When `dt == MIN_DT` it is really representing the case when it
would be zero. Not taking that value into consideration when
comparing the elapsed time since last computation does effectively
increase `sample_time` by `MIN_DT`. It is not a lot, but it is not
accurate either.

By correcting the comparison, very precise use of `sample_time`
becomes possible. One concrete use case is using simulated time
for testing.
Unless I'm misunderstanding something, the time necessary
for the system to converge should be stable, for any given
configuration of the PID and a given runtime environment.

I notice that the values 120s and 12s are close enough that
120 might have been a typo? Either way, my testing suggests
that 10s is sufficient for the system to converge on my machine.

However, the reliance on wall time for these tests means that
how fast the code is being executed does affect that value.
Before we fix that, I think that the 20% margin provided by
the existing value of 12s is reasonnable.

On my machine, this change reduces the overall time to run
the tests suite by a factor of five (from ~2m14 to ~26s).
@gonzalo-bulnes
Copy link
Contributor Author

gonzalo-bulnes commented Oct 2, 2023

This is ready for review on my end, let me know if you'd like any additions / removals / editions @m-lundberg, or generally if that's what you had in mind with #70. 🙂

Edit: I read my own docs, and used less stime.reset() in favor of stime.tick(). 👇

This has a few advantages:

- Since time is frozen unless explicit resets or ticks happen,
  how long the test code takes to execute does not affect the
  test results. Specifically, this means that the `sample_time`
  can be matched precisely, and that makes for tests that are
  easier to reason about.

- Since time is controlled, it can flow at any speed we want.
  Two consecutive calls to `stime.tick()` will make two seconds
  pass as fast as your machine can execute the two statements.

The first advantage makes this test suite deterministic (because
the PID configuration is stable in convergence tests).

The second speeds up the overall execution of the test suite
by a factor of a hundred on my machine (from ~26s to ~200ms).
The `dt` option was added when `time_fn` was not available to
initialize PID instances.

Now that `time_fn` allows the time function to be injected as a
dependency, the `dt` option is redundant. It could be kept around
for backward compatibility, but I'd personally remove it because
IMHO the simpler the code the better.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve tests
1 participant