-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix XUnit Log Fixer Irregularities #85183
Conversation
…as well for Helix to process it.
Tagging subscribers to this area: @hoyosjs Issue DetailsFixes #84940 and #85056. This PR addresses some irregularities with the XUnit Log Fixer that were causing some CI runs to not give the appropriate result.
|
/azp run runtime-coreclr outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run jit-cfg |
Azure Pipelines successfully started running 1 pipeline(s). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks Ivan for fixing this! I'm not sure about the HW intrinsics failures on arm, they don't seem related to your change but they might be worth retrying or checking other recent PR runs to see whether they're ambient or somehow indirectly caused by the slightly changed test execution logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One comment/question
@@ -97,7 +104,7 @@ static int Main(string[] args) | |||
{ | |||
Console.WriteLine("[XUnitLogChecker]: An error occurred. No stats csv" | |||
+ $" was found. The expected name would be '{statsCsvPath}'."); | |||
return FAILURE; | |||
return TestExitCode; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible for the test to return success, but the XUnitLogChecker to fail (i.e., places where it previously returned FAILURE
before this PR)? In that case, should the CI consider that a failure? Or do you want various XUnitLogChecker failures to be ignored by the CI? If you want XUnitLogChecker failures to be shown as a CI failure, then the return FAILURE
cases here should be preserved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Bruce, that's a reasonable suggestion; in practice I think most of these special cases only occur when the test crashes halfway through and then it also returns an error exit code but I believe your suggestion is a general goodness for robustness of the infra.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sounds like a reasonable suggestion to me, combined with Juan's in the next comment. I'll implement it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion that may decouple the fixer.
@@ -405,8 +405,8 @@ | |||
<XUnitLogCheckerHelixPath>$(XUnitLogCheckerHelixPath)XUnitLogChecker/</XUnitLogCheckerHelixPath> | |||
|
|||
<XUnitLogCheckerArgs>$(_MergedWrapperRunScriptDirectoryRelative) $(_MergedWrapperName)</XUnitLogCheckerArgs> | |||
<XUnitLogCheckerArgs Condition="'$(TestWrapperTargetsWindows)' != 'true'">$(XUnitLogCheckerArgs) %24HELIX_DUMP_FOLDER</XUnitLogCheckerArgs> | |||
<XUnitLogCheckerArgs Condition="'$(TestWrapperTargetsWindows)' == 'true'">$(XUnitLogCheckerArgs) %25HELIX_DUMP_FOLDER%25</XUnitLogCheckerArgs> | |||
<XUnitLogCheckerArgs Condition="'$(TestWrapperTargetsWindows)' != 'true'">$(XUnitLogCheckerArgs) %24%3F %24HELIX_DUMP_FOLDER</XUnitLogCheckerArgs> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this better be in the helix command? As in save the result, not pass it in, and then exit with that if non-zero or with the fixer's exit otherwise?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with this. Something like:
// run test
set _saved_errorlevel=%errorlevel%
// run XunitLogChecker
if %errorlevel% NEQ 0 exit 1
if %_saved_errorlevel% NEQ 0 exit 1
exit 0
?
It's unfortunate that building a script in this helixpublishwitharcade.proj is comically complex.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this idea. I just have one question:
Wouldn't it be better to make it like this:
# Run test
test_exit_code=$?
# Run XUnitLogChecker
if [ $? -ne 0 ]; then exit 1; fi
if [ $test_exit_code -ne 0 ]; then exit $test_exit_code; fi
exit 0
I ask because as far as I understood the test infra, we would rather return the test's exit code on failure, rather than always '1'. Well, this assuming the log fixer finishes successfully of course.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense to me.
Another question is whether we should preferentially return the test failing code, which would require saving the XUnitLogChecker code:
# Run test
test_exit_code=$?
# Run XUnitLogChecker
xunitlogchecker_exit_code=$?
# If the test failed, return that exit code
if [ $test_exit_code -ne 0 ]; then exit $test_exit_code; fi
if [ $xunitlogchecker_exit_code -ne 0 ]; then exit $xunitlogchecker_exit_code; fi
exit 0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is the best way to proceed. Thanks for your observations Bruce.
There's some failures due to missing logs from test assemblies that are skipped for a given platform. There's also some x86/x64 hwintrinsic failures that should all be resolved in #85281. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like how it looks now.
One more question:
In #85056, I see:
1259/2363 tests run.
* 1259 tests passed.
* 0 tests failed.
* 0 tests skipped.
If I see those numbers, passed + failed + skipped doesn't add up to tests run. Should that be considered a failure, and if so, should the XUnitLogChecker return an error code in that case? (Perhaps the test already did or should have returned an error code in that case?)
Oh this might need some addressing, considering you're bringing up the question. The log checker here is just reporting the status until the crash. When the test process was terminated, 1259 tests had passed, and no others had been skipped or failed gracefully. Maybe we could add in a subsequent PR a fourth field called "tests missing"? We could have the log checker subtract passed+failed+skipped from the total expected for a more complete report. |
If you pull in latest main, then the x64 hwIntrinsic tests should now pass. You may still have to handle the other case I called out above of. That is, the |
Thanks Tanner. I will pull the latest main changes and be on the lookout for the other potential issue. |
/azp run runtime-coreclr outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
@tannergooding The logs are now seeing, e.g.:
So the XUnitLogChecker changes in this PR appear to be working correctly. However, presumably these failures were masked/hidden before, and now they are exposed. It doesn't appear to be the job of the XUnitLogChecker to fix this problem. How should this get fixed? Someone is trying to execute a .cmd file that doesn't exist, and is actually expected to not exist (in this case). |
This might be due to a corner case where we don't know in advance that all tests in a particular merged wrapped will end up getting skipped e.g. due to their incompatiblity with GC stress or a given targeting architecture. Overall I would expect these cases to be rare and not particularly sensitive w.r.t. Helix performance. At the very least we should fix them so that they don't cause such spurious errors, ideally we should avoid sending them to Helix altogether (albeit I doubt it's generally doable in the presence of tests requiring process isolation as the merged wrapper logic simply has no way to know what the execution script contains). |
On second thought, the corner case I described shouldn't manifest as a missing merged test execution script (cmd / sh) so I guess there must be something else at work here. |
[edited to hopefully improve readability] Conjecture - we produce the marker files (.MergedTestAssembly / .MergedTestAsembly.x.y.MergedTestAssemblyForStress) because the build (for managed components) is AnyCPU/etc, but then we produce all .cmd files in separate builds on the actual test targets. Since HardwareIntrinsics_Arm_r is architecture-specific (we have conditional GCStressIncompatible and CLRTestTargetUnsupported on the merged group itself) it does not generate a .cmd but at this point we're already in the helix job that will have nothing more to do. |
Thanks Mark, that sounds very reasonable. If that's indeed the case, we should be able to easily filter these out somewhere around
by detecting that we don't have the equivalent execution scripts. |
Am I right in assuming this gets executed at some point after the sh/cmd scripts have been generated? If yes, then it might not be too difficult to address in a follow-up PR. |
I think the failure of missing .cmd files needs to be fixed before this PR can merge, or else there will be lots of Pri-0/outerloop test failures. |
On second thought, it might not be that good if merging this would break all those legs while we fix that other problem. What are your thoughts everyone? |
Sounds reasonable to me. Let me know if you or someone in your team will do that Bruce. If not, then I can take on that work item. |
My initial impression is that this should be a one-line change in the script |
I wrote "but then we produce all .cmd files in separate builds on the actual test targets", but I can't find when this actually happens. In run-test-job, there are 3 (Please don't let me distract you from an actual fix here, which I think you'll be able to do much faster than me, as I'm not completely following this part of the discussion. Thanks for your help!) |
I believe that from the Helix viewpoint the merged test wrappers are just ordinary runtime tests so that the logic for generating their own cmd / sh execution scripts should mostly follow the normal logic around |
Thanks a lot for your help Tomas! Let me know if there's anything you need my help with. |
/azp run runtime-coreclr outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
Test run looks (basically) clean |
Fixes #84940 and #85056. This PR addresses some irregularities with the XUnit Log Fixer that were causing some CI runs to not give the appropriate result.
<stack-trace>
tag. It did not know tags could also have hyphens as part of their label. The regex in charge of this was adjusted accordingly, and now it recognizes all of them seamlessly.