From a2f262fadb80ad93f00b8e2500022db5aef958f2 Mon Sep 17 00:00:00 2001
From: Jelle van der Waa
Date: Fri, 4 Aug 2023 15:12:38 +0200
Subject: [PATCH] test: add a document describing Cockpit's test Architecture

---
 test/ARCHITECTURE.md | 267 +++++++++++++++++++++++++++++++++++++++++++
 test/README.md       |   5 +-
 2 files changed, 270 insertions(+), 2 deletions(-)
 create mode 100644 test/ARCHITECTURE.md

diff --git a/test/ARCHITECTURE.md b/test/ARCHITECTURE.md
new file mode 100644
index 000000000000..66d71c3522c7
--- /dev/null
+++ b/test/ARCHITECTURE.md
@@ -0,0 +1,267 @@
+# Architecture
+
+This document describes the architecture of Cockpit's browser integration
+tests. The tests should replicate how a normal user interacts with Cockpit.
+This requires a test machine which can easily have multiple disks or network
+interfaces, reboot, interact with multiple machines on the same network, and
+run potentially destructive test scenarios (e.g. installing/updating packages,
+formatting disks).
+
+For these reasons, Cockpit tests run inside a virtual machine (VM). The
+virtual machine uses Cockpit-specific virtual machine images created and
+maintained in the [bots](https://github.com/cockpit-project/bots) repository.
+The images are usually based on a distribution's cloud image, customized with:
+
+* A well-known password for the admin/root user
+* Test SSH keys for access
+* Test packages required to test Cockpit
+* A build chroot with Cockpit's build dependencies, so that the to-be-tested
+  Cockpit source can be built offline inside the virtual machine. This allows
+  a developer on Fedora to easily prepare a Debian test image without having
+  to install Debian build tools.
+* Disabled system services which interfere with testing
+
+To replicate a user, Cockpit is tested in a browser controlled via the
+[Chrome DevTools Protocol](https://chromedevtools.github.io/devtools-protocol/)
+(CDP), which is supported by Firefox and Chromium-based browsers.
+
+The Python test framework in `test/common` is responsible for setting up the
+test environment, running tests, and reporting the test output.
+
+Diagram of the interaction between browser, virtual machine, CDP and the test
+framework:
+```mermaid
+graph TD;
+    id[Test Framework] <-->|CDP| Browser;
+    id[Test Framework] <-->|SSH| id1[Virtual Machine];
+```
+
+## Integration Tests
+
+Cockpit's tests can be run via three different entry points:
+
+* `test/verify/check-$page` - run one or more tests directly
+* `test/common/run-tests` - run tests through our test scheduler (retries, tracks naughties)
+* `test/run` - run tests in Continuous Integration (CI)
+
+We will start with how a single integration test is run and then explore the
+test scheduler and CI setup.
+
+### Test runtime
+
+A basic Cockpit integration test looks as follows:
+
+```python
+import testlib
+
+class TestApps(testlib.MachineCase):
+    def testBasic(self):
+        self.machine.execute("rm -r /usr/share/cockpit/apps")
+        self.browser.login_and_go("/apps")
+        self.browser.wait_not_visible("#apps")
+
+if __name__ == '__main__':
+    testlib.test_main()
+```
+
+In Cockpit there are two types of tests: `destructive` and `nondestructive`
+tests. A destructive test does something to the test virtual machine which
+makes it unable to run another test afterwards, or requires additional virtual
+machines for testing. The test above is a `destructive` test, which is the
+default; a `nondestructive` test makes sure that any destructive action is
+restored after the test has run, as can be seen below. `nondestructive` tests
+were introduced to speed up testing, as rebooting and shutting down a machine
+for every test incurs a significant penalty of roughly 10-30 seconds per test.
+
+```python
+import testlib
+
+@testlib.nondestructive
+class TestApps(testlib.MachineCase):
+    def testBasic(self):
+        self.restore_dir("/usr/share/cockpit/apps")
+        self.machine.execute("rm -r /usr/share/cockpit/apps")
+        self.browser.login_and_go("/apps")
+        self.browser.wait_not_visible("#apps")
+
+if __name__ == '__main__':
+    testlib.test_main()
+```
+
+The test above would be invoked via `./test/verify/check-apps TestApps.testBasic`
+and would execute as shown in the diagram below:
+```mermaid
+sequenceDiagram
+    participant test
+    participant machine
+    participant browser
+
+    test->>test: test_main()
+    test->>test: setUp()
+    test->>machine: start()
+    test->>machine: wait_boot()
+    test->>browser: __init__()
+    test->>test: non-destructive setup
+    test->>test: run test
+    test->>browser: start()
+    test->>test: tearDown()
+    test->>browser: kill()
+    test->>machine: kill()
+```
+
+A test starts by calling `test_main`, which provides common command line
+arguments for debugging and for optionally running a test on a different
+machine/browser. These arguments are available in the `MachineCase` class as
+`opts`. `test_main` also takes care of instantiating a `TapRunner`, which runs
+all the specified tests sequentially.
+
+Once a test is started it runs `MachineCase.setUp`, which is responsible for
+starting the virtual machine(s), depending on whether it is a `nondestructive`
+or `destructive` test. For a `nondestructive` test a global machine is created
+and re-used by any other `nondestructive` tests which might run. For
+`destructive` tests a machine is created on demand, possibly multiple machines
+depending on the test class's `provision` variable, as sketched below.
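+For illustration, a `destructive` test spanning multiple machines might look
+roughly like this. This is a minimal sketch: the `provision` option names
+shown here (`memory_mb`, `address`) are assumptions made for illustration;
+see `test/common/testlib.py` for the options that actually exist.
+
+```python
+import testlib
+
+# Hypothetical sketch of a multi-machine destructive test; the provision
+# option names are illustrative assumptions, not a verified reference.
+class TestMultiMachine(testlib.MachineCase):
+    provision = {
+        "machine1": {"memory_mb": 1024},
+        "machine2": {"memory_mb": 1024, "address": "10.111.113.2/20"},
+    }
+
+    def testBasic(self):
+        # each provisioned machine is accessible by name
+        m2 = self.machines["machine2"]
+        m2.execute("hostname")
+        self.browser.login_and_go("/system")
+
+if __name__ == '__main__':
+    testlib.test_main()
+```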
+For `nondestructive` tests, cleanup handlers are installed to restore files
+in `/etc`, clean up home directories, and so on.
+
+Lastly, a `Browser` class is instantiated. This does not start the browser
+directly, but builds the required command for the `TEST_BROWSER` to start
+either Chromium or Firefox. The browser is only started once a test calls a
+method on the browser object, so tests which require no browser don't start
+one needlessly.
+
+The `CDP` class is responsible for spawning the browser and a CDP driver;
+the driver uses the `chrome-remote-interface` npm module and receives its
+commands from the test framework via standard input (stdin).
+
+On `tearDown` the test status is inspected: if the test failed, test logs are
+collected, and if the user has passed `--sit`, the test pauses execution until
+the user presses Enter, so that the machine/browser state can be inspected.
+The test browser is killed after the `tearDown` function has completed.
+
+Virtual machines are killed by the `TapRunner` once all tests have finished,
+or in `setUp` when running a `destructive` test, as `nondestructive` tests
+re-use the existing global machine.
+
+### Test runner
+
+Cockpit uses a custom test runner to run the tests, spread the load over
+jobs, and apply special handling to test failures. The test runner is
+implemented in Python in `test/common/run-tests` and expects a list of tests
+to be provided.
+
+The provided tests are collected, split up into `serial` and `parallel`
+tests, and initialized as `Test` objects. `serial` tests are what our test
+library calls `nondestructive` tests; `parallel` tests are `destructive`
+tests (the default). If there are any changes compared to the `main` branch,
+the test runner checks whether any of the tests changed; if so, they are
+added to the list of affected tests, unless more than three tests changed.
+If `pkg/apps` is changed, `test/verify/check-apps` will also be added to the
+list of affected tests.
+
+After having collected the parallel, serial and affected tests, a scheduling
+loop is started. If a machine was provided, it is used for the serial tests;
+parallel tests always spawn a new machine. If no machine is provided, a pool
+of global machines is created for the serial tests, based on the provided
+`--jobs`. The test runner will first try to assign all serial tests to the
+available global machines and start the tests.
+
+A test is started by calling the `Test` class's `start()` method, which
+executes the provided `command` (e.g. `./test/verify/check-apps --machine
+127.0.0.1:2201 --browser 127.0.0.1:9091`) with a `timeout` to automatically
+cancel hanging tests, and creates a temporary file to store the results of
+the test `command` in. Finally, the test is added to the `running_tests`
+list.
+
+The test runner inspects all `running_tests` in a tight loop and polls
+whether each test is still running; once a process has exited, its output and
+`returncode` are saved. A sketch of this loop follows.
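+The following is a simplified, illustrative sketch of such a scheduling and
+polling loop; the names `pending_tests`, `jobs` and `handle_result` are
+assumptions made for this sketch, not the actual implementation in
+`test/common/run-tests`:
+
+```python
+import time
+
+# Illustrative sketch only: start queued tests while job slots are free and
+# poll the running ones. handle_result() stands in for the returncode
+# handling described in the diagram below.
+def schedule(pending_tests, jobs, handle_result):
+    running_tests = []
+    while pending_tests or running_tests:
+        # fill free job slots with queued tests
+        while pending_tests and len(running_tests) < jobs:
+            test = pending_tests.pop()
+            test.start()  # spawns e.g. ./test/verify/check-apps ...
+            running_tests.append(test)
+
+        for test in list(running_tests):
+            if test.poll() is None:  # still running
+                continue
+            running_tests.remove(test)
+            handle_result(test)  # decide: pass, fail, retry, known issue
+
+        time.sleep(0.5)
+```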
+Depending on the `returncode`, the test runner decides what to do next, as
+described in the diagram below.
+```mermaid
+graph TD;
+    finished[Test finished]
+    succeeded{Test succeeded?}
+    affected{"Test affected?"}
+    affecteddone["Retry three times"]
+    skipped{Test skipped?}
+    todosuccess{Test todo?}
+    todonosuccess{Test todo?}
+    todosucceeded["Unexpected success
+    show as failure"]
+    todofail[Expected failure]
+    pixel_journal_failure{"Pixel test or
+    unexpected journal
+    message?"}
+    retry["Retry three times
+    to be robust against
+    test failures"]
+    testfailed["Test failed"]
+    failure_policy{"Known issue?"}
+
+    finished --> succeeded
+    succeeded --> |Yes| skipped
+    succeeded --> |No| todonosuccess
+
+    skipped --> |Yes| skiptest[Show as skipped test]
+    skipped --> |No| affected
+
+    affected --> |Yes| affecteddone
+    affected --> |No| todosuccess
+
+    todosuccess --> |Yes| todosucceeded
+    todosuccess --> |No| done[Test succeeded]
+
+    todonosuccess --> |Yes| todofail
+    todonosuccess --> |No| failure_policy
+
+    failure_policy --> |Yes| known_issue[Show as known issue]
+    failure_policy --> |No| pixel_journal_failure
+
+    pixel_journal_failure --> |Yes| testfailed
+    pixel_journal_failure --> |No| retry
+```
+
+* `Skipped` - tests can be skipped because they cannot run on the given
+  `TEST_OS`.
+* `Affected` - affected tests are retried to make sure the changes affecting
+  them do not lead to flaky tests.
+* `Todo` - tests which are incomplete and expected to fail.
+* `Known issue` - naughties are expected test failures due to known issues in
+  the software we test. We still run the test, but if the test error output
+  matches a known naughty, it is shown as a known issue. The
+  [bots](https://github.com/cockpit-project/bots) repository keeps track of
+  all our known naughties per distribution. (The bots repository has
+  automation set up to check whether a naughty still applies and, if not, to
+  open a pull request to drop it.)
+* `Failed` - tests can fail due to our shared test infrastructure; instead of
+  letting the whole test run fail, we retry them unless `--no-retry-fail` is
+  passed.
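+The decision flow above can be condensed into a small function. This is a
+simplified sketch of the diagrammed logic only; the helpers
+`matches_naughty` and `is_pixel_or_journal_failure` are hypothetical names
+passed in as parameters, not functions from `test/common/run-tests`:
+
+```python
+# Simplified sketch of the retry decision flow shown in the diagram above.
+# matches_naughty and is_pixel_or_journal_failure are hypothetical helpers.
+def decide(test, returncode, output, attempt,
+           matches_naughty, is_pixel_or_journal_failure):
+    if returncode == 0:
+        if test.skipped:
+            return "skipped"
+        if test.affected and attempt < 3:
+            return "retry"  # retry affected tests to catch flakiness
+        if test.todo:
+            return "failed"  # unexpected success of a todo test
+        return "passed"
+    if test.todo:
+        return "expected-failure"  # todo tests are expected to fail
+    if matches_naughty(output):
+        return "known-issue"
+    if is_pixel_or_journal_failure(output) or attempt >= 3:
+        return "failed"
+    return "retry"  # retry to be robust against infrastructure flakes
+```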
+ +### Continuous Integration (CI) + +In CI we have two entry points, one for our tests which runs on our own +managed infrastructure by [cockpituous](https://github.com/cockpit-project/cockpituous/) +and one for tests which run on the [testing farm (TF)](https://docs.testing-farm.io/). + +For our own managed infrastructure the entry point of the Cockpit tests is +`test/run` this bash script expects a `TEST_OS` environment variable to be set +to determine what distribution to run the tests under and a `TEST_SCENARIO` +environment variable to determine the type of test. Currently we support these +different scenario's: + +* `devel` - runs tests with coverage enabled and generates a html file with + coverage information +* `pybridge` - runs tests with the Python bridge (soon to be deprecated after all + images are moved over to the Python bridge) +* `firefox` - runs tests using the Firefox browser instead of Chrome +* `networking` - runs all networking related tests +* `storage` - runs all storage related tests +* `expensive` - runs all expensive tests (usually tests which reboot/generate a new initramfs) +* `other` - runs all non-networking/storage/expensive tests. + +Cockpit's tests are split up in scenario's to heavily parallelize our testing and +allow for faster retrying. + +The `test/run` prepares an Virtual machine image for the given `TEST_OS` and then +runs the tests by calling `test/common/run-tests` with the provided tests. + +For the Testing Farm (TF) scenario, Packit is responsible for setting up the +test infrastructure and the entry point is `test/browser/browser.sh`. On TF we +get a single virtual machine without a hypervisor so tests run on the virtual +machine directly this also implies that only `destructive` tests can be run. +The `test/browser/browser.sh` script sets up the virtual machine and calls +`test/browser/run-tests.sh` which selects a subset of all the `nondestructive` +tests to run using `test/common/run-tests`. + +`TF` scenario's are also split up into scenario's (basic, networking, optional) +to run faster in parallel. diff --git a/test/README.md b/test/README.md index cc06aff3e413..f7015e503c02 100644 --- a/test/README.md +++ b/test/README.md @@ -1,7 +1,8 @@ # Integration Tests of Cockpit -This directory contains automated integration tests for Cockpit, and the support -files for them. +This directory contains automated integration tests for Cockpit, and the +support files for them. The architecture of the automated integration tests is +described in [ARCHITECTURE](./ARCHITECTURE.md) To run the tests on Fedora, refer to the [HACKING](../HACKING.md) guide for installation of all of the necessary build and test dependencies. There's