Support multihost tests #726

Closed
lukaszachy opened this issue Apr 22, 2021 · 17 comments
Labels: area | multihost (Issues related to the multihost testing support), status | discuss (Needs more discussion before closing)

Comments

@lukaszachy
Collaborator

There is currently no way to run tests as multihost, nor to mark them as such.
This ticket is for gathering requirements, discussing solutions and finally reaching consensus on the tmt way of handling multihost.

@lukaszachy lukaszachy added enhancement status | discuss Needs more discussion before closing labels Apr 22, 2021
@lukaszachy
Collaborator Author

One known solution: https://beaker-project.org/docs/user-guide/multihost.html

@thrix
Collaborator

thrix commented Apr 22, 2021

I would love it if we could also support multihost tests in another fashion. This builds on the idea in #694 (comment)

# the name of a provisioned resource also specifies the hostname added to /etc/hosts on the provisioned machines
# only some provision drivers support multihost (with container it would fail early as unsupported)
provision:
  - name: server
    how: virtual
  - name: client
    how: virtual
prepare:
  - name: setup server
    how: ansible
    on: server
    playbooks: setup_server.yml
  - name: setup client
    how: ansible
    on: client
    playbooks: setup_server.yml
execute:
  - name: test on server
    on: server
    how: script
    script: ping -c1 client
  - name: test on client
    on: client
    how: script
    script: ping -c1 server

@jscotka
Collaborator

jscotka commented Apr 22, 2021

My proposed solution is simply to extend provision in the plan as shown here, and then run the tests on all machines (as with the current multihost behaviour in beaker):

/plans:
  /provision:
    - name: machine1
      environment:
        MULTIHOST: CLIENT
    - name: machine
      environment:
        MULTIHOST: SERVER

/test:
  test: runtest.sh

It would then provision two machines. As today, when no provisioner is specified it would use a VM with the latest Fedora.
From the command line, the syntax could look like: tmt run -a provision -h minute -i rhel-8 provision -h connect -g 192.168.1.2
With a plain tmt run, it would provision two VMs with the latest Fedora.

The magic is in the environment: a special env var such as MULTIHOST would be set to the given value inside each machine, and the IP addresses of all machines would be exported in other variables, so that the test execution would see env values like:

MULTIHOST=SERVER
SERVER=$IP1
CLIENT=$IP2

I can also imagine using a special key instead of an env var, but that is a more or less equivalent solution.
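
As a minimal sketch of how a test could consume these variables, assuming the proposed MULTIHOST, SERVER and CLIENT names are exported exactly as described in this comment (this is the proposal under discussion, not a confirmed tmt interface; describe_role and the sample addresses are hypothetical):

```python
import os


def describe_role(env):
    """Return (role, peer address) from the proposed MULTIHOST-style variables."""
    role = env["MULTIHOST"]  # "SERVER" or "CLIENT", set per machine
    peer_role = "CLIENT" if role == "SERVER" else "SERVER"
    return role, env[peer_role]  # address of the other machine


if __name__ == "__main__":
    # Sample values (assumption) so the sketch also runs outside a real plan.
    sample = {"MULTIHOST": "SERVER", "SERVER": "192.0.2.1", "CLIENT": "192.0.2.2"}
    env = {key: os.environ.get(key, value) for key, value in sample.items()}
    role, peer = describe_role(env)
    print(f"running as {role}, peer is {peer}")
```

The same script can then run unchanged on every machine and branch on its role, which is the point of the proposal.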

Also, in CI the tmt provision step is replaced by -a provision -h connect to some machine; it would just create two instances. The open question is extensibility for arch/distro-specific testing, but that is more or less a matter of the CI implementation: which attributes it uses, and what the given provisioner is able to support.
For example:

/provision:
   - name: machine1
     distro: rhel7
     arch: s390x
   - name: machine2

which could test the current distro against the specified one on the specified arch.

EDIT:
prepare would also have these env vars set, so that the script itself could change its behaviour based on them.

I imagine this basic solution would be relatively easy and fast to implement: just improve prepare to be able to provision more machines, then extend the environment variables used when executing commands.

@thrix
Collaborator

thrix commented Apr 27, 2021

@jscotka yeah, this is closer to what restraint does now. I believe we can support it as well; it may not even collide with my proposal. If you do not provide on:, it will run on both ...

@psss psss added this to the 2.0 milestone Apr 27, 2021
@psss
Collaborator

psss commented Jun 22, 2021

Summary from today's hacking session brainstorm:

  • The syntax outlined by @thrix sounds like a good start
    • The on keyword would be optional
    • If not provided, all guests would be considered, e.g. common prepare
  • Two approaches discussed
    • Backward compatible using rstrnt-sync-set & rstrnt-sync-block
    • Orchestrate testing from a single script on the test runner
  • The multihost config to be stored in plans (L2 metadata)
    • Support for storing multihost config in tests (L1) possibly in the future
  • There is a beakerlib library with enhanced syncing support
    • Does not need manual intervention during debugging

Follow-up meeting next week to gather use cases and consult with multihost users.

@The-Mule
Contributor

The-Mule commented Sep 7, 2021

PR#878 should hopefully implement spec changes according to suggestions from both thrix and jscotka (specifying 'role' provision option and 'on' prepare and execute options).

@psss
Collaborator

psss commented Sep 16, 2021

The spec draft in #878 has been updated based on last week's discussion. Please have a look and provide feedback.

@psss
Collaborator

psss commented Oct 27, 2021

The specification #878 has been merged some time ago.
The multiple provision and prepare support #896 is in review.
We need to migrate fmf to ruamel.yaml to prevent ugly hacks for the on keyword.

@pcahyna
Collaborator

pcahyna commented Jan 4, 2022

removing on keyword: #992

@kkaarreell
Collaborator

In RedHat-SP-Security/keylime-tests#35 I have implemented a simple sync mechanism similar to the beaker/restraint one. Currently each test system exposes the states it has reached using ncat on a TCP port so that others can read them.
My plan was also to implement a dedicated "provider" that would gather and expose states on behalf of test systems. So far I do not have a need for it, but I guess tmt may need something similar in order to synchronize the beginning of a multihost test on individual test systems. I think that for tmt one may want to implement this functionality in Python directly, but the existing library can at least serve as an inspiration.
Finally, with this library community contributors can run multihost keylime tests manually (since no beaker/restraint sync orchestration is needed), with tmt doing the required setup. This is described in
https://github.com/RedHat-SP-Security/keylime-tests/blob/main/TESTING.md#running-multi-host-tests
In this case it is recommended to refer to hosts by IP address, since virtual systems provided by tmt have the same hostname.
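
To illustrate the general idea in Python, here is a rough sketch of a "publish reached states on a TCP port, let peers poll for them" mechanism. It is not the keylime-tests library and not tmt code; StateServer, read_states and block_on are hypothetical names, and the line-per-state wire format is an assumption made for this example only.

```python
import socket
import socketserver
import threading
import time


class StateServer(socketserver.ThreadingTCPServer):
    """Serves the states this host has reached, one per line, to any client."""
    allow_reuse_address = True
    daemon_threads = True

    def __init__(self, address):
        super().__init__(address, StateHandler)
        self.states = []
        self.lock = threading.Lock()

    def set_state(self, state):
        with self.lock:
            self.states.append(state)


class StateHandler(socketserver.BaseRequestHandler):
    def handle(self):
        # Send a snapshot of all states; the connection closes afterwards.
        with self.server.lock:
            payload = "\n".join(self.server.states) + "\n"
        self.request.sendall(payload.encode())


def read_states(host, port):
    """Connect to a peer and return the list of states it has published."""
    with socket.create_connection((host, port), timeout=2) as conn:
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode().split()


def block_on(host, port, state, timeout=10.0, interval=0.2):
    """Poll a peer until it reports `state`; return False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            if state in read_states(host, port):
                return True
        except OSError:
            pass  # peer not listening yet; retry
        time.sleep(interval)
    return False
```

In this sketch a server-side test would call set_state("SERVER_READY") once its setup is done, while the client-side test calls block_on with the server's IP address before it starts, in line with the advice above to address peers by IP rather than hostname.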

@kkaarreell
Collaborator

Hi @psss @happz
would it be possible to summarize here briefly the current status of multi-host test support and possibly even outline the next steps? This ticket seems quite outdated so it would be nice to have an update here (or close it).

@happz
Collaborator

happz commented Jun 6, 2023

I can put something together tomorrow, a wrap-up before the release sounds like a good point.

@happz
Collaborator

happz commented Jun 7, 2023

The current state of multi-host test support in tmt

What's been done

Tl;dr: support for basic server/client scenarios has been implemented and tested.

Now more details:

  • prepare, execute, and finish steps can now run a given task (test, preparation script, ansible playbook, etc.) on several guests at once, which is the very basic requirement for tests dealing with server vs client interaction

  • tasks are assigned to provisioned guests by matching the where key from discover, prepare and finish phases with corresponding guests by their name or role keys. Basically, the user tells tmt on which guest(s) a test or script should run by listing guest name(s) or guest role(s).

  • the granularity of the multi-host scenario is on the step phase level. The user may define multiple discover or prepare phases, and everything in them will start on given guests at the same time when the previous phase completes. The practical effect is, tmt does not manage synchronization on the test level:

    - name: server-setup
      how: fmf
      test:
        - /tests/A
      where:
        - server
    
    - name: tests
      how: fmf
      test:
        - /tests/B
        - /tests/C
      where:
        - server
        - client

    In the example above, first, everything from the server-setup phase would run on guests called server, while guests with the name or role client would remain idle. When this phase completes, tmt would move to the next one, and run everything in tests on server and client guests. The phase would be started at the same time, more or less, but tmt will not even try to synchronize the execution of each test from this phase. /tests/B may still be running on server when /tests/C is already completed on client. If there is a need for synchronization between tests on various guests, it's the responsibility of those tests, and various synchronization libraries are available.

  • tmt exposes information about guests and roles to all three steps in the form of two files, see https://tmt.readthedocs.io/en/latest/spec/plans.html#guest-topology-format. These files can be ingested by tests and used for contacting their peers, synchronization, etc.

  • tmt fully supports one test being executed multiple times. This is especially visible in the upgraded format of results, see https://tmt.readthedocs.io/en/latest/spec/plans.html#results-format. Every test is assigned a "serial number", if it appears in multiple discover phases, each instance would be given a different serial number, plus tmt records from which guest the result originates.
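
The matching of where against guest names and roles described above can be sketched as follows. This is an illustration only, not tmt's implementation: guests_for_phase and the guest list are hypothetical, and treating a missing where as "all guests" is an assumption based on the earlier discussion in this thread.

```python
def guests_for_phase(phase, guests):
    """Return names of guests a phase runs on, matching `where` by name or role.

    A phase without `where` is assumed to run on every guest.
    """
    where = phase.get("where")
    if not where:
        return [guest["name"] for guest in guests]
    return [
        guest["name"]
        for guest in guests
        if guest["name"] in where or guest.get("role") in where
    ]


# Hypothetical guests: one server and two guests sharing the client role.
guests = [
    {"name": "server", "role": "server"},
    {"name": "client-1", "role": "client"},
    {"name": "client-2", "role": "client"},
]
# The two phases from the example above.
phases = [
    {"name": "server-setup", "test": ["/tests/A"], "where": ["server"]},
    {"name": "tests", "test": ["/tests/B", "/tests/C"], "where": ["server", "client"]},
]

# Phases run one after another; within a phase, all matched guests start together.
schedule = [(phase["name"], guests_for_phase(phase, guests)) for phase in phases]
```

Here schedule pairs each phase with the guests it would start on; nothing in it orders /tests/B and /tests/C across guests, which mirrors the phase-level granularity described above.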

What's missing

An up-to-date picture can be obtained by listing issues and PRs with the https://github.com/teemtee/tmt/labels/multihost label, but in summary:

  • requirements of all tests are installed on all guests. See Make collected requires/recommends guest-aware #2010.
  • test-level synchronization - as described above, tmt handles phase-level synchronization, and this is probably not going to change any time soon. For test-level synchronization, please use dedicated libraries.
  • interaction between guests provisioned by different plugins. Think "a server in container vs client from virtual". There are several issues, this is not yet supported.
  • provision step is still run in sequence, guests are provisioned one by one. This is not technically necessary, and with tools we now have for handling parallelization of other steps, provision deserves the same treatment, resulting in, hopefully, a noticeable speed up (especially with plugins like beaker or artemis).
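
For the last point, the difference between provisioning one by one and in parallel can be sketched generically. This is not tmt code; provision is a hypothetical stand-in that only sleeps, and a thread pool is just one possible way to overlap the waiting.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def provision(name, delay=0.2):
    """Hypothetical stand-in for a slow provisioning call."""
    time.sleep(delay)
    return f"{name}: ready"


def provision_sequentially(names):
    # Total time grows with the number of guests.
    return [provision(name) for name in names]


def provision_in_parallel(names):
    # map() preserves input order even though the calls overlap in time.
    with ThreadPoolExecutor(max_workers=max(1, len(names))) as pool:
        return list(pool.map(provision, names))
```

Both functions return the same results in the same order; the parallel variant simply waits for the slowest guest instead of the sum of all of them, which is where the hoped-for speed-up with slow provisioners would come from.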

@kkaarreell
Collaborator

Hi @psss @lukaszachy @happz
There has been quite a lot of progress in the last year, and even the latest update from Milos is now outdated as some things have been resolved. Would it make sense either to update the status (probably including the initial description, as this is what people see first) or even to close this and track unfinished tasks as separate tickets?

@lukaszachy
Collaborator Author

+1 to close this one

@psss
Collaborator

psss commented Apr 9, 2024

I believe the What's missing section is now almost covered, except for test level sync and interaction between guests. Those can be tracked separately. Documentation is ready as well:

Ack for closing this, @happz, do you agree?

@psss psss added the area | multihost Issues related to the multihost testing support label Apr 9, 2024
@psss
Collaborator

psss commented Apr 10, 2024

Ok, closing. Updated milestone to 1.31, as both #2468 and #2567, the last two significant changes related to multihost support, were addressed there.

@psss psss closed this as completed Apr 10, 2024
@psss psss modified the milestones: 2.0, 1.31 Apr 10, 2024