release-24.3: roachtest: mark polled VM preemptions as non reportable #141136

Merged 2 commits on Feb 11, 2025

DarrylWong
Contributor

Backport 2/2 commits from #139075.

/cc @cockroachdb/release


Previously, polled VM preemptions would simply cancel the test, since post-test processing rechecks for preemptions. However, we've seen some cases in AWS where the post-test check returns no preemptions even though the polling did.

This may just be the AWS check being eventually consistent, so we want to avoid posting an issue if either check finds preemptions.
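
The "either check" rule can be sketched as a simple OR over the two signals. This is only an illustration of the decision described above, not the roachtest implementation: `preemptionChecks`, `preempted`, and `shouldPostIssue` are hypothetical names introduced for this example.

```go
package main

import "fmt"

// preemptionChecks is a hypothetical holder for the two signals described
// above. It is not a type from the roachtest code base.
type preemptionChecks struct {
	polled   bool // the periodic poller saw a preemption while the test ran
	postTest bool // the post-test recheck saw a preemption
}

// preempted reports whether either check observed a preemption. Using OR
// means a stale (eventually consistent) post-test answer cannot hide a
// preemption that the poller already saw.
func (c preemptionChecks) preempted() bool {
	return c.polled || c.postTest
}

// shouldPostIssue is an assumed helper: a failed test is only reportable
// when neither check found a preemption.
func shouldPostIssue(testFailed bool, c preemptionChecks) bool {
	return testFailed && !c.preempted()
}

func main() {
	// The AWS case from the description: polling saw the preemption,
	// but the post-test check returned nothing.
	awsCase := preemptionChecks{polled: true, postTest: false}
	fmt.Println("post issue:", shouldPostIssue(true, awsCase))
}
```

With this shape, the outcome no longer depends on which of the two checks happens to observe the preemption.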


The second change resets failures in the case of a VM preemption, in case a timeout occurred, which normally takes precedence over all other failures. While a timeout suggests that something should be fixed in the test (usually respecting the test context cancellation), in practice engineers tend to close the issue without investigating as soon as they see the preemption.

This also removes the potential duplicate vm_preemption failure that may have been added by the preemption polling.

Fixes: #139004
Fixes: #139931
Release note: none
Epic: none

Release Justification: Test infra change

@DarrylWong DarrylWong requested a review from a team as a code owner February 11, 2025 15:12
@DarrylWong DarrylWong requested review from herkolategan and srosenberg and removed request for a team February 11, 2025 15:12
blathers-crl bot commented Feb 11, 2025

Thanks for opening a backport.

Please check the backport criteria before merging:

  • Backports should only be created for serious issues or test-only changes.
  • Backports should not break backwards-compatibility.
  • Backports should change as little code as possible.
  • Backports should not change on-disk formats or node communication protocols.
  • Backports should not add new functionality (except as defined here).
  • Backports must not add, edit, or otherwise modify cluster versions; or add version gates.
  • All backports must be reviewed by the owning area's TL. For more information as to how that review should be conducted, please consult the backport policy.
If your backport adds new functionality, please ensure that the following additional criteria are satisfied:
  • There is a high priority need for the functionality that cannot wait until the next release and is difficult to address in another way.
  • The new functionality is additive-only and only runs for clusters which have specifically “opted in” to it (e.g. by a cluster setting).
  • New code is protected by a conditional check that is trivial to verify and ensures that it only runs for opt-in clusters. State changes must be further protected such that nodes running old binaries will not be negatively impacted by the new state (with a mixed version test added).
  • The PM and TL on the team that owns the changed code have signed off that the change obeys the above rules.
  • Your backport must be accompanied by a post to the appropriate Slack channel (#db-backports-point-releases or #db-backports-XX-X-release) for awareness and discussion.

Also, please add a brief release justification to the body of your PR to justify this backport.

@blathers-crl blathers-crl bot added the backport label (PRs that are backports to older release branches) Feb 11, 2025
@cockroach-teamcity
Member

This change is Reviewable

@DarrylWong DarrylWong merged commit d65b9f2 into cockroachdb:release-24.3 Feb 11, 2025
15 of 16 checks passed
Labels
backport Label PR's that are backports to older release branches