Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Loadgroup: Don't modify test nodeids any longer. #1119

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

criemen
Copy link
Contributor

@criemen criemen commented Aug 21, 2024

The loadgroup scheduler used to modify the nodeids of tests that were placed manually in a specific scope. This was used as a side-channel from test collection (on the pytest-running process) to the worker process. Instead, make that side-channel explicit by passing a parameter around.
That's pretty ugly, but no less ugly than modifying and parsing (!) node IDs later on.

I'm not so keen on the "protocol" between the test hook and the schedulers, as we're currently just writing into a field of the currently active scheduler with this implementation.
I previously considered adding this as an extra argument to add_node_collection, but that then requires touching all schedulers for something that's only relevant to a single one. Other ideas are welcome here in code review!

This is motivated by #1118.

Thanks for submitting a PR, your contribution is really appreciated!

Here's a quick checklist that should be present in PRs:

  • Make sure to include reasonable tests for your change if necessary

  • We use towncrier for changelog management, so please add a news file into the changelog folder following these guidelines:

    • Name it $issue_id.$type for example 588.bugfix;

    • If you don't have an issue_id change it to the PR id after creating it

    • Ensure type is one of removal, feature, bugfix, vendor, doc or trivial

    • Make sure to use full sentences with correct case and punctuation, for example:

      Fix issue with non-ascii contents in doctest text files.
      

The loadgroup scheduler used to modify the nodeids of
tests that were placed manually in a specific scope.
This was used as a side-channel from test collection (on the pytest-running process)
to the worker process. Instead, make that side-channel explicit by passing
a parameter around.
That's pretty ugly, but no less ugly than modifying and parsing (!)
node IDs later on.
@criemen
Copy link
Contributor Author

criemen commented Aug 21, 2024

I'll fix the mypy complaints once we've decided on a viable approach for passing the info into the scheduler. Changelog is coming then, too.

@criemen criemen marked this pull request as ready for review August 21, 2024 21:06
@criemen
Copy link
Contributor Author

criemen commented Aug 21, 2024

It's maybe worth saying that I first wanted to put the scopes from the test collection into config.stash, but that's not possible, as there's the execnet indirection between the process where the pytest test collection runs on (presumably a "worker"), and the actual process that runs the test loop. That's been a bit frustrating, and as someone who's just using -nauto, that's a feature that's complicating the implementation of xdist needlessly.

@RonnyPfannschmidt
Copy link
Member

A communication channel for those details has to be created, it's time consuming and coordination heavy

Hence the lack of a implementation of that so far

Copy link
Member

@RonnyPfannschmidt RonnyPfannschmidt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this, instead of a full communication channel that includes markers and other details
We pass a dedicated value to enable this usecase

Let's iterate a bit on the names passed to ensure intent is clear without the reader knowing All the details

In future we want to pass richer details on the collected tests so scheduled can use fixture scopes, parameters and active fixtures as well for decision, but that needs the previously mentioned Design proces

Your solution nicely contains the hack

if len(mark.args) > 0
else mark.kwargs.get("name", "default")
)
loadgroup_scopes[item.nodeid] = gname
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An important Note here is that unfortunately duplicate node ids are only discouraged,not enforced

Which is the reason for using the node index instead of the node id when communicating using the scheduler

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's pretty bad. What's your opinion, does this need to be addressed, or can we live with duplicate node ids being scheduled on the same node, per xdist_group, even only one test had the annotation?

@criemen
Copy link
Contributor Author

criemen commented Aug 22, 2024

I just pushed a change that renames the parameter. I thought that we can make this information available to all schedulers, and we could even change the whole load* family to use the grouping information by default, and deprecate theloadgroup strategy.
I'd also love to respect the grouping in worksteal, but that looks a little bit trickier to implement.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants