mention less Python
jelly committed Sep 13, 2023
1 parent 93ebcee commit f0976b3
Showing 1 changed file with 18 additions and 19 deletions.
37 changes: 18 additions & 19 deletions test/ARCHITECTURE.md
@@ -143,33 +143,32 @@ existing global machine.
### Test runner

Cockpit uses a custom test runner to run the tests, spread the load over jobs
and handle test failures specially. The test runner is implemented in
Python in [test/common/run-tests](./common/run-tests) and expects a list of tests to be provided.
and handle test failures specially. The test runner is implemented in Python
in [test/common/run-tests](./common/run-tests) and expects a list of tests to
be provided.

The provided tests are collected and split up in "destructive" and
"non-destructive" tests and initialized as `Test` objects. If there are any
changes compared to the `main` branch, the test runner checks if any of the
tests changed: if so they are added to the affected test list unless more than
three tests are changed. If `pkg/apps` is changed, `test/verify/check-apps`
will also be added to the affected test list.

After having collected the "destructive", "non-destructive" and affected tests a
"non-destructive" tests. Every test is given a "cost", used for scheduling
priority, depending on the number of virtual machines required to run the test.
A default timeout is added to every test so hanging tests eventually get killed;
tests which take longer can set a custom timeout using a test decorator.

To make sure our tests aren't flaky we retry tests three times when they are
affected. An affected test is for example `test/verify/check-apps` if
`pkg/apps` is changed, or if the changeset changes fewer than four tests they
will be marked as affected as well. The changed tests are limited so big
refactors won't retry a lot of tests.
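
The selection rule above can be sketched in a few lines of Python. This is only an illustration of the described behaviour, not the code in `test/common/run-tests`: the function name `affected_tests`, the constant `MAX_CHANGED_TESTS`, and the assumption that test files are recognised by the `test/verify/check-` prefix are all hypothetical.

```python
# Hypothetical sketch of the "affected tests" rule; names are illustrative only.
MAX_CHANGED_TESTS = 3  # "fewer than four" changed tests are marked as affected


def affected_tests(changed_files):
    """Return the set of tests to treat as affected for a list of changed paths."""
    affected = set()

    # Assumption: a changed test is any changed path under test/verify/check-*.
    changed_tests = [f for f in changed_files if f.startswith("test/verify/check-")]

    # Big refactors touch many tests; in that case none of them is retried.
    if len(changed_tests) <= MAX_CHANGED_TESTS:
        affected.update(changed_tests)

    # A change in pkg/apps always affects the apps test.
    if any(f.startswith("pkg/apps/") for f in changed_files):
        affected.add("test/verify/check-apps")

    return affected
```

Calling `affected_tests(["pkg/apps/index.js"])` yields only `test/verify/check-apps`, while a changeset touching four or more test files contributes no changed tests to the result.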

Having collected the "destructive", "non-destructive" and affected tests a
scheduling loop is started. If a machine was provided it is used for the
"non-destructive" tests; "destructive" tests will always spawn a new machine. If no
machine is provided, a pool of global machines is created based on the provided
`--jobs` and "non-destructive" tests. The test runner will first try to assign all
"non-destructive" tests to the available global machines and start the tests.

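As a rough illustration of the planning step, the sketch below splits tests into the two groups and spreads the "non-destructive" ones over a pool of global machines. Several details are assumptions rather than documented behaviour: the pool size is taken as the smaller of `--jobs` and the number of "non-destructive" tests, assignment is round-robin, and more expensive "destructive" tests are ordered first. `PlannedTest` and `plan` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class PlannedTest:
    name: str
    nondestructive: bool
    cost: int = 1  # number of virtual machines the test needs


def plan(tests, jobs):
    """Return (global machine index -> test names, ordered destructive test names)."""
    nondestructive = [t for t in tests if t.nondestructive]
    destructive = [t for t in tests if not t.nondestructive]

    # Assumption: never create more global machines than there are tests to share them.
    pool_size = max(1, min(jobs, len(nondestructive))) if nondestructive else 0

    # Assumption: round-robin assignment of non-destructive tests to global machines.
    assignment = {index: [] for index in range(pool_size)}
    for position, test in enumerate(nondestructive):
        assignment[position % pool_size].append(test.name)

    # Assumption: a higher cost means the test is scheduled earlier.
    ordered = sorted(destructive, key=lambda t: t.cost, reverse=True)
    return assignment, [t.name for t in ordered]
```

With three "non-destructive" tests and `jobs=2`, this sketch creates two global machines and places the first and third test on machine 0 and the second on machine 1; every "destructive" test still gets a machine of its own when it is started.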
A test is started by the `Test` class by calling the `start()` method, which executes
the provided `command` (e.g. `./test/verify/check-apps --machine 127.0.0.1:2201
--browser 127.0.0.1:9091`) with a `timeout` to cancel hanging tests
automatically and creates a temporary file to store the output of
the test `command` in. Finally the test is added to the `running_tests` list.
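
A minimal sketch of starting a test this way, and of the polling the runner then performs, might look as follows. It is not the actual `Test` class: `RunningTest` and `run_all` are hypothetical names, and the timeout is enforced here with a deadline and `kill()` instead of wrapping the command in `timeout`.

```python
import subprocess
import sys
import tempfile
import time


class RunningTest:
    def __init__(self, command, timeout):
        self.command = command
        self.timeout = timeout  # seconds before a hanging test is killed

    def start(self):
        # Output goes to a temporary file so pipes cannot fill up and block the test.
        self.outfile = tempfile.TemporaryFile()
        self.deadline = time.monotonic() + self.timeout
        self.process = subprocess.Popen(
            self.command, stdout=self.outfile, stderr=subprocess.STDOUT)

    def poll(self):
        """Return (returncode, output) once the test has ended, otherwise None."""
        returncode = self.process.poll()
        if returncode is None:
            if time.monotonic() < self.deadline:
                return None
            self.process.kill()  # the test exceeded its timeout
            returncode = self.process.wait()
        self.outfile.seek(0)
        output = self.outfile.read().decode(errors="replace")
        self.outfile.close()
        return returncode, output


def run_all(tests, interval=0.05):
    """Start every test, then poll until all have ended."""
    running_tests = []
    for test in tests:
        test.start()
        running_tests.append(test)

    results = []
    while running_tests:
        for test in list(running_tests):
            finished = test.poll()
            if finished is not None:
                running_tests.remove(test)
                results.append((test.command, *finished))
        time.sleep(interval)
    return results


if __name__ == "__main__":
    demo = RunningTest([sys.executable, "-c", "print('hello')"], timeout=30)
    for command, returncode, output in run_all([demo]):
        print(returncode, output.strip())
```

Sleeping briefly between polling rounds keeps the loop from spinning at full CPU while still noticing finished tests quickly.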

The test runner inspects all `running_tests` in a tight loop and polls if the
test is still running. If the process stopped or exited the test output and
`returncode` is saved. Depending on the `returncode` the test runner makes a
decision on what to do next as described in the diagram below.
The scheduling loop periodically inspects all running tests and polls if the
test has ended. Depending on the exit code of the test process the test runner
makes a decision on what to do next as described in the diagram below.
```mermaid
graph TD;
finished[Test finished]