Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplifying the lazy formula in evaluate #4868

Closed
wants to merge 1 commit into from

Conversation

FauziAkram
Copy link
Contributor

@FauziAkram FauziAkram commented Nov 9, 2023

Simplifying the lazy formula in evaluate, by removing one multiplication operation.

Passed STC:
LLR: 2.93 (-2.94,2.94) <-1.75,0.25>
Total: 44320 W: 11342 L: 11132 D: 21846
Ptnml(0-2): 163, 4947, 11731, 5155, 164
https://tests.stockfishchess.org/tests/view/654ab7f0136acbc57352aba7

Passed LTC:
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 53268 W: 13292 L: 13102 D: 26874
Ptnml(0-2): 35, 5908, 14556, 6102, 33
https://tests.stockfishchess.org/tests/view/654b6863136acbc57352b9c3

Passed also the test with adj=OFF:
LLR: 2.93 (-2.94,2.94) <-1.75,0.25>
Total: 91776 W: 23436 L: 23276 D: 45064
Ptnml(0-2): 344, 10619, 23818, 10747, 360
https://tests.stockfishchess.org/tests/view/654e31c5136acbc57352f2e8

bench: 1484225

bench: 1484225
@FauziAkram
Copy link
Contributor Author

@snicolet Can you please give us your thoughts about this simplification?

@locutus2
Copy link
Member

Also i have started a test against this PR which completly remove the shuffling part.
STC passed here https://tests.stockfishchess.org/tests/view/654cfec8136acbc57352d9c1
LTC runs here https://tests.stockfishchess.org/tests/view/654db061136acbc57352e7d5

@vdbergh
Copy link
Contributor

vdbergh commented Nov 10, 2023

I wonder if this should not be tested with adjudication off.

@XInTheDark
Copy link
Contributor

Yes testing with adjudication off would be quite useful in this instance.

@locutus2
Copy link
Member

@vdbergh
That seems a good idea. I stopped my LTC of the complete removal.

@snicolet
Copy link
Member

snicolet commented Nov 11, 2023

@snicolet Can you please give us your thoughts about this simplification?

We introduced this term to be 100% sure that Stockfish would never enter a shuffling sub-branch without progress during search because of SimpleEval, but do whatever you want if you want to remove that :-)

@FauziAkram
Copy link
Contributor Author

The test with adj=OFF is underway:
https://tests.stockfishchess.org/tests/view/654e31c5136acbc57352f2e8

@vondele in case also this test passes, is an LTC test with adj=OFF needed?

@FauziAkram
Copy link
Contributor Author

Passed also the test with adj=OFF:
LLR: 2.93 (-2.94,2.94) <-1.75,0.25>
Total: 91776 W: 23436 L: 23276 D: 45064
Ptnml(0-2): 344, 10619, 23818, 10747, 360
https://tests.stockfishchess.org/tests/view/654e31c5136acbc57352f2e8

@Vizvezdenec
Copy link
Contributor

I don't really like this stuff because indeed this part of code was never to gain elo but rather to not evaluate the same stuck position with lazy eval into oblivion.
I would prefer such things to be tested with elo gain bounds like verification search removal and stuff.

@vdbergh
Copy link
Contributor

vdbergh commented Nov 13, 2023

@Vizvezdenec I agree in principle. But there should be some evidence that this extra term really works.

It is trivial to show that removal of verification search is not good since it makes SF blind to zugzwang. One would like to see similar evidence for the shuffling term, before including it.

@peregrineshahin
Copy link
Contributor

peregrineshahin commented Nov 13, 2023

It is trivial to show that removal of verification search is not good since it makes SF blind to zugzwang

I don't think that's true at all.

@peregrineshahin
Copy link
Contributor

peregrineshahin commented Nov 13, 2023

I don't think that's true at all.

In the sense that it is not trivial. We don't have a script to test. and we don't have the data. so basically there is no actual procedure, to verify that one would go to the most obscure commits for pb0, and still the data and conclusions are very shaky.

@peregrineshahin
Copy link
Contributor

The positions were tested using Arena GUI not even a Python script, Who on the earth uses Arena to test positions for development?

@vdbergh
Copy link
Contributor

vdbergh commented Nov 13, 2023

I don't think that's true at all.

In the sense that it is not trivial. We don't have a script to test. and we don't have the data. so basically there is no actual procedure, to verify that one would go to the most obscure commits for pb0, and still the data and conclusions are very shaky.

Well take any position whose solution depends on zugzwang and SF will not be able to solve it without verification search. So yes it is trivial. This is the reason why in the past attempts to remove verification search were quickly reverted after users started complaining.

But I agree that it would be good to have an explicit list of zugzwang positions which can be used to validate changes in nulmove pruning.

@Vizvezdenec
Copy link
Contributor

Vizvezdenec commented Nov 13, 2023

I don't think that's true at all.

In the sense that it is not trivial. We don't have a script to test. and we don't have the data. so basically there is no actual procedure, to verify that one would go to the most obscure commits for pb0, and still the data and conclusions are very shaky.

Well take any position whose solution depends on zugzwang and SF will not be able to solve it without verification search. So yes it is trivial. This is the reason why in the past attempts to remove verification search were quickly reverted after users started complaining.

But I agree that it would be good to have an explicit list of zugzwang positions which can be used to validate changes in nulmove pruning.

This is not what actually happens. Stockfish is able to search quite a lot of positions that depend on zugzwang even without any verification search whatsoever.
Same goes for other engines that actually don't have verification search naturally.

@vdbergh
Copy link
Contributor

vdbergh commented Nov 13, 2023

I guess it depends on what you call "quite a lot". It's been a long time since I looked at this but I recall that SF was unable to do any endgame problem (even trivial ones) whose solution depends on zugzwang with verification search disabled. Perhaps this has changed but I would be surprised.

@vondele
Copy link
Member

vondele commented Nov 13, 2023

so, let's leave null moves out of the picture. The simpleEval threshold appears to have right now 3 kinds of safeguards, ensuring it is not used in winning positions (the addition of bestValue), it is not used when the root position is imbalanced (the additional of simpleEval for the root), as well as the shuffling term.

Do we have (or can construct) any position right now where complete removal (not this scaling down) of the shuffling term leads to a wrong evaluation? Probably a balanced draw position where an Q-capture leads to a draw fortress could do ?

@FauziAkram
Copy link
Contributor Author

Ok, to proceed with this PR, launched the simplification removing the whole shuffling term altogether, and testing with adj OFF
https://tests.stockfishchess.org/tests/view/655b7544136acbc573540dfd

@Disservin Disservin added the bench-change Changes the bench label Nov 20, 2023
@FauziAkram
Copy link
Contributor Author

Also the complete removal of shuffling, with adj OFF passed the test:
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 120928 W: 30658 L: 30530 D: 59740
Ptnml(0-2): 423, 14058, 31375, 14184, 424
https://tests.stockfishchess.org/tests/view/655b7544136acbc573540dfd

And since no one presented any position where this change leads to a wrong evaluation, maybe we can proceed in merging.

@Disservin
Copy link
Member

Outdated, code no longer exists after dual net merge, so closing.

@Disservin Disservin closed this Jan 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bench-change Changes the bench
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants