Implement check-result: test check failure affects test result by default #3239
base: main
Conversation
345eced to 6ace91c
Rebased to @happz's result-store-original-result #3147 and tried to use beakerlib for the first time. Feels weird to do all the assert-greps, but I've been merely following the existing examples. Still a proof of concept, but at least I finally figured out that the "after-test" checks were not available without the change in execute/internal.py.
@martinhoyer if you change the base branch to the one from #3147, the diff should reduce a bit.
tmt/result.py
Outdated
checks_failed = any(check.result == ResultOutcome.FAIL for check in self.check)

if interpret == ResultInterpret.RESPECT:
    if self.respect_checks and checks_failed:
respect_check feels more like an input; IIUIC it's basically interpret, but in respect to checks. So, they should enter the method through the same door, so to speak.
I see, so you want me to create
class CheckResults(enum.Enum):
RESPECT = 'respect'
IGNORE = 'ignore'
...
?
I'm so lost in all the schemas, specs, et al.
I suppose we could re-use ResultInterpret, it has the same purpose, and skip can serve as ignore.
Would this help?
return _result.interpret_result(
ResultInterpret(invocation.test.result) if invocation.test.result else ResultInterpret.RESPECT,
ResultInterpret(invocation.test.check_result) if invocation.test.check_result else ResultInterpret.RESPECT
)
I'm so lost in all the schemas, specs, et al.
It shouldn't be that messy. If I can help somehow, let me know, maybe the whole area needs more documentation.
Yes, something like this, with one exception: I don't understand why Result.check_result exists. How checks should be interpreted is a test-level metadata key, i.e. it should come from the test, not the result. Isn't it the same story as Test.result?
Thanks. Honestly that's a leftover from the previous attempts to make things work that I missed and kept.
In the light of #3185 (comment), no check-result on the test level, I suppose...
This means the new result check-level field would reside in the tmt.checks.Check class (and _RawCheck), and the interpretation of check results needs to match a result with the corresponding check in the test.check list...
Yep
Reworked.
perplexing
d7e7066 to ea54015
ef6b6d6 to 01a3246
tmt/result.py
Outdated
# Handle individual check results
for check in self.check:
    check_result = ResultInterpret(check.result)
The field should already be ResultInterpret, as mentioned in a comment above WRT its definition. And I just noticed that Test.result isn't an enum, which is definitely incorrect, and a bad example :/
tmt/checks/__init__.py
Outdated
default='respect',
serialize=lambda result: result.value if isinstance(result, enum.Enum) else str(result),
unserialize=lambda spec: spec,
help='How to interpret the check result.')
I think we can switch from Any to ResultInterpret, if we add the from_spec() method to it; unserialize could use it. It might also need a normalize callback, but I'd pay that price for getting rid of Any. With Any, no user of this field can be sure what the value is.
tmt/result.py
Outdated
if interpret == ResultInterpret.XFAIL:
    if self.result == ResultOutcome.PASS and checks_failed:
        self.result = ResultOutcome.FAIL
@psss I got a little entangled in the xfail logic and the "# Handle individual check results" part should probably be done before this.
I wonder if it will be digestible enough for the users, I mean:
xfail test fails, check with default 'respect' fails = FAIL
i.e. users would have to add xfail also to the check, if expected. Could be a breaking change for some?
Thanks for doing this. I would also like to petition for the following scenario: if the before-test portion is the only part of the check that fails, the test result should be self.result = ResultOutcome.WARN.
spec/tests/check.fmf
Outdated
The `result` key can be used to specify how the check result should be
interpreted. It accepts three options:

- 'respect' (default): The check result affects the overall test result.
- 'skip': The check result does not affect the overall test result.
So here we're introducing a new value, skip, which is currently not present in the Test.result key specification. What about keeping the consistency here as well and supporting pass, info, warn, error and fail in the very same way?
Hmmm, skip is definitely one of the values of https://github.com/teemtee/tmt/blob/main/tmt/result.py#L43, and it's also allowed by the schema, https://github.com/teemtee/tmt/blob/main/tmt/schemas/common.yaml#L481. If it's not supposed to be supported as a result value, we should remove it from these places as well.
Although, I'd argue the specification should be updated instead, because skip is a perfectly valid test outcome, like pass or fail, and it makes sense that one might want to enforce it as a test outcome via result, just like one might want to enforce pass or xfail.
And +1 to accepting all keys allowed by ResultInterpret, i.e. why allow skip and not xfail or pass. Except custom, that one does not make much sense for a test check.
And restraint: restraint is also useless at this point of the code :)
To be honest, I got pretty frustrated with this code. I guess I'm too dumb for mixing custom mixins, uninheritable classes, specifications, schemas, et al. I'm sure there are good reasons for all of this, but idk, somehow I think doing a relatively small change should be easier.
That's sad to hear. I believe that while it might be a relatively small change, it's not a small change given the interaction of test and test check results. Schema, specification, the fact that enum inheritance is limited, well, that's a fact we can't bypass.
Anyway, should I create some BaseResultInterpretMixin class w/ SerializableContainer?
I for one don't think so. Where do you think it would be needed?
I tried to go back to using 'ignore', but now we're mixing two things together, results and how to interpret them. :(
We can get on a call and try to get through the parts that cause you trouble, if you want to. Mixing the result and its interpretation is definitely a sign of a wrong turn somewhere.
Yeah, check_result = CheckResultInterpret(check.result.value) is definitely not the right move: check.result.value is the outcome, while the interpretation must come from the check, not from its result. That's also true for the test, and you can see how the test result interpretation gets in, via invocation.test.result in the _result.interpret_result(invocation.test.result) call - but we do not have the same way for test checks, and as I was saying, their interpretation needs to be given to interpret_result(); as soon as you try to extract it from a result, you're on the wrong track.
I don't want to start rewriting the patch for you, but you obviously do not enjoy the coding at this moment, if it would help, we can sit on it.
I for one don't think so. Where do you think it would be needed?
It's just that ResultInterpret is not exactly DRY. That was the only thing that came to mind, apart from using str literals instead of enums.
We can get on a call and try to get through the parts that cause you trouble, if you want to.
Thanks, much appreciated.
It seems like we strayed a bit too far from the original goal; while we now have check result keys, they are used the same way as the test result key, and I'm not sure it is the best way to let users define that the check result should be "ignored".
I don't want to start rewriting the patch for you, but you obviously do not enjoy the coding at this moment, if it would help, we can sit on it.
Thanks @happz, I'll get to it with a pair of fresh eyes later. Sorry for letting my depression leak to the comments here.
Ok, as you wish. Just a couple of notes:
- something like this could work for propagating how check results should be interpreted - check_result = CheckResultInterpret(check.result.value) is definitely not the right way to get this info:

    - return _result.interpret_result(invocation.test.result)
    + return _result.interpret_result(
    +     invocation.test.result,
    +     {
    +         check.how: check.result
    +         for check in invocation.test.check
    +     }
    + )

      def interpret_result(
              self,
    -         interpret: ResultInterpret) -> 'Result':
    +         interpret_test: ResultInterpret,
    +         interpret_checks: dict[str, CheckResultInterpret]) -> 'Result':

- use a helper function for setting a note: the repetition of if/else consumes 4 lines for every single note, while a helper function would take care of the ', ...' separator and the condition, greatly improving the readability of the code.
- this looks like a mistake, a check result propagated into the test result - is that correct? We probably don't want a check result to be promoted to the main result; instead, failed checks would turn the main result into a failed one - a skipped check does not mean the main result should turn to SKIP:

    elif interpret_check in [CheckResultInterpret.PASS, CheckResultInterpret.FAIL,
                             CheckResultInterpret.INFO, CheckResultInterpret.WARN,
                             CheckResultInterpret.ERROR, CheckResultInterpret.SKIP]:
        self.result = ResultOutcome(check.value)

- the order should probably be reversed: first we need to sort out check results, because they might be PASS, but their result key might be set to fail, turning them into failed checks - any(check.result == ResultOutcome.FAIL for check in self.check) will miss checks that would turn into failed checks because of their interpretation.
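The note helper suggested above might be as small as the following sketch. The class here is a bare stand-in for tmt's Result, used only to show the append-or-create behaviour:

```python
class Result:
    """Bare stand-in for tmt's Result, carrying only the note."""

    def __init__(self) -> None:
        self.note: str = ''

    def _set_note(self, text: str) -> None:
        # Replaces the repeated if/else: append with ', ' when a note
        # already exists, otherwise start a new one.
        self.note = f'{self.note}, {text}' if self.note else text
```

With this in place, every `self.note += ', ...'` / `self.note = '...'` pair collapses into a single `self._set_note(...)` call.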
a476e34 to f91ea05
tmt/result.py
Outdated
self.note += ', check failed'
else:
    self.note = 'check failed'
self._set_note('check failed')
This branch does look superfluous to me: check.result should be changed, not self.result - at this point, we're deciding the fate of a single check result, and the test result is beyond the horizon. The effect of this check result on the test result will be evaluated later. Which means the outcome of the test did not change in any way, therefore no need for a note -> no need to check for RESPECT + FAIL.
Makes sense.
tmt/result.py
Outdated
else:
    self.note = f'check result: {check.result.value}'
# We don't promote check result to main result, but failed checks turn the
# main result to FAIL
We don't know that yet: if the test has result: info, it does not matter what the outcome of the checks is, they should never override the main test result, because the user clearly said "skip" is the desired and final value. We still need to resolve them, to record their final outcomes though, therefore we can't skip check results when result: skip is set.
I have a feeling you are getting too entangled in the problem. My advice would be to ignore the other part of the problem: focus on the interpretation of one single check result, and completely ignore what it might do to the test result. Go over all check results, and interpret every single one of them as needed. Ignore whatever their effect might be on the main result. Only after that, take self.result and interpret_test, collect the final check results, and resolve their interaction. If test.result is SKIP, PASS, etc., set self.result and exit. If it's RESPECT, compute the effect of check results, switch self.result to FAIL or WARN, exit. And so on. We are not in the business of saving CPU cycles, make it correct first, and then we can make it fast by introducing shortcuts. We can afford crude and readable code :)
Maybe even one more: how about ignoring this method for now, and instead constructing a simple method that would focus on the interpretation only. Feed it with ResultOutcome and ResultInterpret for the test, a map of check name/ResultOutcome and a map of check name/CheckResultInterpret, and let it decide the final test result, the final check results, and optionally a note. Add a parametrized unit test for it for various combinations. Do not feed it with Result or CheckResult or any structures, just enums and strings. Get that right, get it working, then we can discuss just how this process works, whether e.g. SKIP has the desired effect. No worrying about other structures. Once we are happy, translate it to this method, i.e. let it write decisions into structures.
Thanks again. I'll get to it tomorrow.
Once we arr
🏴☠️
I'll get to it tomorrow.
hmm, looks like a busy weekend. Monday it is.
8a0c169 to c419c5c
Trying to implement the check results as discussed yesterday. Started from scratch. Still don't understand specs, schemas and docs generation. Why can't we just have one place to define things at?
@@ -0,0 +1,28 @@
import enum |
Why keep them in a standalone file? __init__.py has about 250+ lines, isn't that small enough?
I just needed to deal with that circular import somehow. Would gladly do a less bad solution.
@@ -189,6 +189,27 @@ class CheckResult(BaseResult):
    default=CheckEvent.BEFORE_TEST,
    serialize=lambda event: event.value,
    unserialize=CheckEvent.from_spec)
check_result: CheckResultInterpret = field( |
The result class cannot contain a field that governs its own interpretation. That is owned by a different entity: Test.result defines how Result should be interpreted; similarly, Check.result defines how CheckResult should be interpreted. This field should not exist: a check result simply cannot decide itself, it must be told how to interpret its value.
def interpret_check_result(self, interpret: CheckResultInterpret) -> 'CheckResult':
"""Interpret check result according to the check_result interpretation."""
if interpret == CheckResultInterpret.RESPECT:
return self
if interpret == CheckResultInterpret.INFO:
self.result = ResultOutcome.INFO
elif interpret == CheckResultInterpret.XFAIL:
self.result = {
ResultOutcome.FAIL: ResultOutcome.PASS,
ResultOutcome.PASS: ResultOutcome.FAIL
}.get(self.result, self.result)
return self
Take a CheckResult instance, find the corresponding Check, most likely from the invocation.test.check field, and pass its result to CheckResult.interpret_check_result(). Recall the mapping I proposed last week to be put into Result.interpret_result():
{
check.how: check.result
for check in invocation.test.check
}
It's the same solution, the idea is there: the interpretation does not lie in the result itself, it's owned by the test (for test results) or by the test check (for test check results), and must be extracted from them (which is exactly what Result.from_test_invocation() does: return _result.interpret_result(invocation.test.result)).
@@ -347,6 +368,21 @@ def interpret_result(self, interpret: ResultInterpret) -> 'Result':
    raise tmt.utils.SpecificationError(
        f"Test result note '{self.note}' must be a string.")

# Interpret check results
failed_checks: list[CheckResult] = []
for check in self.check:
Please, use for check_result in self.check - check might be very misleading, especially when you do need to access the actual checks and their result keys.
interpreted_check.result == ResultOutcome.FAIL and
interpreted_check.check_result != CheckResultInterpret.INFO
):
failed_checks.append(interpreted_check)
This will need a bit more work, because self.check is left untouched: interpretation happens, but interpreted_check is then thrown away, it's not stored anywhere. It's basically swapping one list for another, a simple list comprehension. Together with the necessity of accessing checks to get their result key, it might be something like the following:
# This is constructed in `from_test_invocation()` and fed to `interpret_result()` - just like the result interpretation:
{
check.how: check.result
for check in invocation.test.check
}
def interpret_result(self, interpret: ResultInterpret, interpret_checks: dict[str, CheckResultInterpret]) -> 'Result':
...
self.check = [
check_result.interpret_check_result(interpret_checks[check_result.name])
for check_result in self.check
]
failed_checks = [
check_result
for check_result in self.check
if check_result.outcome == ResultOutcome.FAIL]
Thanks for the review. Needs work, yes. Though I might need someone to go through the code with me, because most of the time I'm just guessing what is supposed to do what and how, and then resorting to blind trial and error. Not enjoyable, and I'm back to feeling like a useless idiot.
Well, I did offer to have an online session, and the offer still stands. Add something to my calendar and we can try it out.
awesome, much appreciated.
We can. It somehow started separately, in an organic way, as projects grow, and the amount of work necessary to turn the tide and introduce a single source of truth is not trivial. The work is ongoing, and one possible answer might be that I, who also dislike the currently fractured state of things, am unable to change it all in one night.
Right, but isn't that a sunk cost fallacy? I'm not saying you should do it alone in one night, but perhaps have a defined "state" to strive for and think about whether it wouldn't be worth working towards in the long run.
Yeah, that was just an example. There is a long-term goal, it just takes so many small steps to get there. Just recently plugin docs began to be generated from plugin sources. It's useful on its own, but it also served as an experiment on what we would need to render plugin schemas from plugin sources. And so on, we're moving toward a much simpler state of things, with fewer sources of truth, it's just very slow.
Trying to address #3185
Pull Request Checklist