Verify base signature before new #1893

peregrineshahin · 2024-02-16T15:05:07Z

so that we know if master passed

ppigazzini · 2024-03-03T15:30:38Z

I don't see what improvement this change introduces:

- the worker already checks the signature of both "base" and "test"
- "base" and "test" are not bound to the "master" branch, even if that is the usual use case on fishtest

peregrineshahin · 2024-03-03T15:33:42Z

Are you sure the worker checks both?
Suppose the first verify raises an error: does the worker continue to verify master?
If so, you can close the PR, but at that time, looking at the log with the exit failure, I didn't see a verification happening for master.

The goal was to verify that it's not master that is bugged with unstable bench or exit, since the new patch is based on master most of the time.

ppigazzini · 2024-03-03T15:41:41Z

With either master or PR code, the worker raises an exception if the bench is wrong. The "base" is usually master, with the bench set automatically by the server, while the bench of "test" is written by the user and has some probability of being wrong. So failing fast by verifying the "test" bench first is better practice IMO.

peregrineshahin · 2024-03-03T15:44:28Z

https://tests.stockfishchess.org/actions?action=&user=&text=EXIT_FAILURE
Do these include any master commit?
See, there is no EXIT_FAILURE for master here, which left us clueless. You need a real case? This is the real-world scenario; the goal wasn't cosmetic.

peregrineshahin · 2024-03-03T15:48:47Z

If, as you say, these messages do include master commits (which I'm too lazy to check, since it requires looking up the SHAs of past master commits), then I don't see the point of my PR,
i.e. if we really verify both regardless.

ppigazzini · 2024-03-03T15:49:14Z

I suppose that was due to workers without wget/curl, unable to download the net(s) required by master through the makefile.
And in any case, the worker verified the bench of "base:master" and raised that exception.

peregrineshahin · 2024-03-03T15:54:00Z

But still, teach me why it continues to verify master.

Is it because the code after RunException and WorkerException continues normally?

peregrineshahin · 2024-03-03T15:54:54Z

    if p.returncode != 0:
        if p.returncode == 1:  # EXIT_FAILURE
            raise RunException(
                "Bench of {} exited with EXIT_FAILURE".format(os.path.basename(engine))
            )
        else:  # Signal? It could be user generated so be careful.
            raise WorkerException(
                "Bench of {} exited with error code {}".format(
                    os.path.basename(engine), format_return_code(p.returncode)
                )
            )

ppigazzini · 2024-03-03T15:55:33Z

The code verifies the benches of the "test" branch and the "base" branch; on fishtest we can schedule a test of "sf_11" vs "sf_12". No code explicitly verifies the bench of "master", and neither does your proposed change.

peregrineshahin · 2024-03-03T15:56:26Z

I never said that it verifies master, it's just the most probable case

Disservin · 2024-03-03T15:56:33Z

> I suppose that was due to workers without wget/curl, unable to download the net(s) required by master through the makefile.

I don't think it was. I remember checking a worker's log and there was no error from make saying that the download failed; it (iirc) even said it downloaded it successfully. My guess at that time was that the net was deleted after the download from the makefile. I've attached the log from techno at that time, if you want to take a look.

fishtest-client.zip

peregrineshahin · 2024-03-03T15:57:34Z

Explain how you expect the worker to verify both when RunException is triggered.

ppigazzini · 2024-03-03T15:58:19Z

> Explain how you expect the worker to verify both when RunException is triggered

Please explain how your code verifies "master".

peregrineshahin · 2024-03-03T16:02:08Z

> Please explain how your code verifies "master".

Suppose base is verified first. Base is master most of the time, and even when it isn't, it is most probably an older version of what the new branch is based on.
If base verifies without error and the error is raised on new afterwards, then we know as Stockfish developers that master or base is clean and the problem is in new.

Why did you take the word "master" so religiously?

peregrineshahin · 2024-03-03T16:03:31Z

And what's so hard to imagine about that?
Just say that it verifies both.
If it doesn't verify both, then you are the clueless one, religiously nitpicking the word "master".

ppigazzini · 2024-03-03T16:05:06Z

Are you hinting that devs always set a wrong bench for "new", skipping the verification of "base"?
This log is the proof that the bench of "base" is verified.
https://tests.stockfishchess.org/actions?action=&user=&text=EXIT_FAILURE

peregrineshahin · 2024-03-03T16:06:13Z

It's not about devs putting in the wrong bench;
it's that the binary itself could have an unstable bench because of undefined behavior.

peregrineshahin · 2024-03-03T16:10:14Z

Once in a while we have this problem of undefined behavior. If base does not get verified before new, then we are clueless,
that is, if it doesn't verify both.

peregrineshahin · 2024-03-03T16:12:53Z

At any rate arguing with you is useless at this point.

ppigazzini · 2024-03-03T16:17:57Z

Show me the code in this PR that verifies the benches of both "base" and "test" before raising the exception.
Check here how I collected all the missing nets before raising the exception.
#1902

peregrineshahin · 2024-03-03T16:20:05Z

That wasn't the goal. The goal is to tell SF devs that the fishtest code verifies base before new. With that knowledge, when people ask whether their undefined behavior comes from the patch or from base, we can tell them: if base were bugged, its verification would fail first, so you would know it's base and not the patch that is bugged.

ppigazzini · 2024-03-03T16:20:10Z

> > I suppose that was due to workers without wget/curl, unable to download the net(s) required by master through the makefile.
>
> I don't think it was. I remember checking a worker's log and there was no error from make saying that the download failed; it (iirc) even said it downloaded it successfully. My guess at that time was that the net was deleted after the download from the makefile. I've attached the log from techno at that time, if you want to take a look.
>
> fishtest-client.zip

If an instability of the bench of master is suspected, IMO it is better to write a little script that loops over building master and verifying the bench.
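A rough sketch of such a loop, under stated assumptions: the build and bench commands are supplied by the caller, the bench output is assumed to contain a line of the form `Nodes searched : <number>`, and `parse_bench` and `check_bench_stability` are hypothetical names, not part of fishtest.

```python
import re
import subprocess


def parse_bench(output):
    """Return the node count from bench output, or None if it is missing."""
    m = re.search(r"Nodes searched\s*:\s*(\d+)", output)
    return int(m.group(1)) if m else None


def check_bench_stability(build_cmd, bench_cmd, expected, runs, run=subprocess.run):
    """Rebuild and bench `runs` times; return a list of (iteration, problem)."""
    problems = []
    for i in range(runs):
        build = run(build_cmd, capture_output=True, text=True)
        if build.returncode != 0:
            problems.append((i, "build failed with code {}".format(build.returncode)))
            continue
        bench = run(bench_cmd, capture_output=True, text=True)
        # Search both streams, since the summary may not be on stdout.
        nodes = parse_bench(bench.stdout + bench.stderr)
        if bench.returncode != 0:
            problems.append((i, "bench exited with code {}".format(bench.returncode)))
        elif nodes != expected:
            problems.append((i, "bench {} != expected {}".format(nodes, expected)))
    return problems
```

An empty result after many iterations would suggest the bench of that build is stable on that machine; the `run` parameter exists only so the loop can be exercised with a stub instead of real commands.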

Disservin · 2024-03-03T16:22:09Z

Dunno how you came to that conclusion; in essence fishtest is already kinda doing this, because multiple workers run a patch. Back then I suspected that the fishtest worker deleted the small net during cleanup after it was downloaded, but that was just a guess.

peregrineshahin · 2024-03-03T16:22:49Z

Do you get the idea now?
This PR provides the necessary knowledge to tell the dev that their patch is bugged and not master.
I've lost it at this point, not gonna argue further.

peregrineshahin · 2024-03-03T16:24:15Z

And I emphasize on "master" ;)

ppigazzini · 2024-03-03T16:24:31Z

> Dunno how you came to that conclusion; in essence fishtest is already kinda doing this, because multiple workers run a patch. Back then I suspected that the fishtest worker deleted the small net during cleanup after it was downloaded, but that was just a guess.

Ask him: #1893 (comment)

ppigazzini · 2024-03-03T16:28:06Z

So master became master by passing STC and LTC, never failing the bench either on fishtest or in the CI, and then it starts failing the bench.

Disservin · 2024-03-03T16:36:20Z

> So master became master by passing STC and LTC, never failing the bench either on fishtest or in the CI, and then it starts failing the bench.

Well the case was that suddenly tests were failing and new people might think that the test failed because of a change from them. Not everything is tested and sometimes stuff like compiler issues also pop up which are unrelated to the patch, while our CI and any previous test passed successfully.

So pere tried to make it obvious that master has failed first instead of the patch, to signal that there is perhaps a larger problem present. If the code actually does that I have no idea, I didn't check.

vdbergh · 2024-03-03T18:13:11Z

> > So master became master by passing STC and LTC, never failing the bench either on fishtest or in the CI, and then it starts failing the bench.
>
> Well the case was that suddenly tests were failing and new people might think that the test failed because of a change from them. Not everything is tested and sometimes stuff like compiler issues also pop up which are unrelated to the patch, while our CI and any previous test passed successfully.
>
> So pere tried to make it obvious that master has failed first instead of the patch, to signal that there is perhaps a larger problem present. If the code actually does that I have no idea, I didn't check.

I guess that makes sense.

ppigazzini · 2024-03-03T19:11:52Z

> I guess that makes sense.

see here:
#1897

A short recap:

- the first dual-net test was submitted on fishtest without asking for the proper fishtest changes to support it, crashing for days all the workers missing curl/wget
- the EXIT_FAILURE events started with the dual-net tests, and stopped after proper support for multiple nets was added to the worker and to the server

With this PR in place, instead of the false flag of an SF dev being wrong, we would have had people panicking about a big issue in SF master :)
We are already too good at chasing ghosts instead of searching for the simple root cause: our fragile code or our bad habits.

As I already wrote a few messages ago (#1893 (comment)), to have a clear view of the issue it's necessary to run both bench verifications, not only one, and only then raise the exception with all the information. I added a link to a recent PR to be taken as an example.
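The "verify both, then raise once" idea can be sketched as follows. This is not the code from the linked PR: `RunException` only mirrors the name used in the worker snippet quoted earlier, while `verify_both`, `fake_verify` and the binary names are hypothetical.

```python
class RunException(Exception):
    """Stand-in for the worker exception of the same name."""


def verify_both(verify, engines):
    """Run `verify` on every engine, then raise once with all failures."""
    failures = []
    for label, engine in engines:
        try:
            verify(engine)
        except RunException as e:
            # Remember the failure and keep going, so every engine is checked.
            failures.append("{}: {}".format(label, e))
    if failures:
        raise RunException("; ".join(failures))


def fake_verify(engine):
    # Hypothetical stub: pretend only the "new" binary fails its bench.
    if engine == "stockfish_new":
        raise RunException("Bench of {} exited with EXIT_FAILURE".format(engine))


try:
    verify_both(fake_verify, [("base", "stockfish_base"), ("new", "stockfish_new")])
except RunException as e:
    print(e)
```

With this shape the order of the two checks no longer matters for diagnosis, since the single exception reports the outcome for each binary that failed.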

peregrineshahin · 2024-03-03T19:19:47Z

I agree that my PR was lazy and ad hoc, but that doesn't negate the fact that you were not listening to the purpose of the PR and not suggesting code changes.

Yes, verifying both and then raising the error is better, but that also addresses the goal.

ppigazzini · 2024-03-03T19:27:54Z

> I agree that my PR was lazy and ad hoc, but that doesn't negate the fact that you were not listening to the purpose of the PR and not suggesting code changes.
>
> Yes, verifying both and then raising the error is better, but that also addresses the goal.

No, it simply starts the chase of the big ghost "master is bugged!" when the root cause is nearly 100% in fishtest code/process/worker.

I already linked code that raises the exception after all the checks, feel free to take inspiration.

vdbergh · 2024-03-03T20:10:34Z

If the build of master suddenly starts failing then this is likely a problem in Fishtest (the idea behind this PR).

But: if all builds suddenly start failing (the case we are discussing) then that also indicates a likely problem in Fishtest....

So from this point of view this PR does not yield new information.

peregrineshahin · 2024-03-03T20:18:22Z

Well, that's not the idea of the PR. The idea of the PR is to verify whether base has undefined behavior; it's not about fishtest.
I wanted to verify that before we exit. But nobody has answered my question yet: when verifying the new patch results in an error, does that stop the execution of the code that follows? If yes, then we don't know whether the problem also exists in master; if no, then the PR is useless, since we verify both regardless.

vdbergh · 2024-03-03T20:22:18Z

Not sure if I understand your question. But with the current code (i.e. not this PR), if verifying the bench of New fails then a RunException is raised and no further verification is done (so base is not checked).

EDIT: assuming the failure is with exit code 1.
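A minimal sketch of that control flow, not the actual worker code: `RunException` mirrors the name in the snippet quoted earlier, while `verify_signature`, `verify_new_then_base` and the exit-code table are hypothetical stand-ins. When the first call raises, the second is never reached.

```python
class RunException(Exception):
    """Stand-in for the worker exception of the same name."""


def verify_signature(engine, returncodes, checked):
    """Record that `engine` was checked, then fail on exit code 1."""
    checked.append(engine)
    if returncodes[engine] == 1:  # EXIT_FAILURE
        raise RunException("Bench of {} exited with EXIT_FAILURE".format(engine))


def verify_new_then_base(returncodes):
    """Verify "new" first and "base" second; return the engines actually checked."""
    checked = []
    try:
        verify_signature("new", returncodes, checked)
        verify_signature("base", returncodes, checked)  # skipped if "new" raised
    except RunException as e:
        print("aborted:", e)
    return checked


print(verify_new_then_base({"new": 1, "base": 0}))
print(verify_new_then_base({"new": 0, "base": 0}))
```

In the first call only "new" ends up in the returned list; in the second both engines are checked.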

peregrineshahin · 2024-03-03T20:24:28Z

Yes. What my PR does, then, is give the dev the benefit of the doubt that their code is not the bugged one: we verify master first, and if it verifies and their new patch afterwards does not, then we can tell them that their code is bugged and introduces undefined behavior.
but now without this PR I cannnnnnnot telll anyone that their code is bugged.
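The inference argued for here can be sketched like this, assuming a boolean `verify` helper; `diagnose_base_first` and the binary names are hypothetical, and the sketch says nothing about why a given bench fails.

```python
def diagnose_base_first(verify, base, new):
    """Verify base, then new; stop at the first failure and name it."""
    if not verify(base):
        return "base failed"  # new is not checked in this case
    if not verify(new):
        return "base passed, new failed"
    return "both passed"


# Hypothetical stub: the set holds the binaries whose bench verifies cleanly.
passing = {"stockfish_base"}
print(diagnose_base_first(lambda engine: engine in passing, "stockfish_base", "stockfish_new"))
```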

vdbergh · 2024-03-03T20:34:11Z

But before master becomes master it has been built many times already. So a problem in master is unlikely and would be at most very intermittent. What we were seeing here is that every DEV build failed. If that were due to a problem in master, then it would mean that suddenly every build of master would also be failing. And as I explained this is unlikely.

Disservin · 2024-03-03T20:42:34Z

@vdbergh not sure which case you are discussing right now, but not all builds failed for the aforementioned patches which resulted in EXIT_FAILURE. Some got to play the games and others didn't.

I see some benefit in easing this up, since it's just a minor change and can potentially spot issues. E.g. a miscompilation on very new compilers, which only a small % of users have installed, will show us that it first failed for master, thus we have a bug in master. Compare that with the user seeing their test code fail for some workers: they think they did something wrong, change X, resubmit, and the issue is "solved", never to be looked at again.

peregrineshahin · 2024-03-03T20:44:48Z

btw I don't care whether this gets merged or not, I was only pissed off by the attitude in dealing with it. The only thing I'm surprised about is how being an approver + sf contributor + fishtest contributor helps in being wiser.

Verify base signature before new

dc82f19

so that we know if master passed

peregrineshahin closed this Mar 3, 2024

peregrineshahin reopened this Mar 3, 2024

ppigazzini closed this Mar 3, 2024

peregrineshahin deleted the verify branch March 3, 2024 18:19
