Pr700 pr750 #788

Closed
wants to merge 55 commits into from

@MelleKoning MelleKoning commented Mar 8, 2019

Note: not for merge.
This is a merge of pr700 (certainty propagation) and pr750, which ensures that all root moves are always tried.
It is just a test pull request to see whether this combination would 'always' find mate-in-one moves.
The motivation was an example CP game posted on lichess:
https://lichess.org/i0Cxcq3y

For example, after Black's move Kh6, White has a mate in one available.

However, Leela does not play it; instead it promotes to a queen with e7-e8.

The idea of this PR, which includes certainty propagation (#700) and early root widening (#750), is to investigate whether that version would find and play the mate in one.
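
As a rough illustration of why the combination might help, here is a toy model in Python. It is not lc0 code. It assumes that early root widening means every root move receives at least one visit, and that certainty propagation means a root move with a proven win is played regardless of its visit count. All names (`RootMove`, `next_root_move_to_visit`, `pick_best`) are hypothetical, and `d8h8` is my own reading of the mating move in the position above, not something stated in this thread.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RootMove:
    move: str
    visits: int = 0
    # Proven result from the root player's point of view:
    # +1 win, 0 draw, -1 loss, None while still unknown.
    certain: Optional[int] = None


def next_root_move_to_visit(moves: List[RootMove]) -> RootMove:
    """Toy root widening: any unvisited root move is tried first."""
    for m in moves:
        if m.visits == 0:
            return m
    # Stand-in for the normal selection rule once all moves were tried.
    return min(moves, key=lambda m: m.visits)


def pick_best(moves: List[RootMove]) -> RootMove:
    """Toy certainty rule: a proven win beats any visit count,
    and proven losses are avoided when an alternative exists."""
    wins = [m for m in moves if m.certain == 1]
    if wins:
        return wins[0]
    candidates = [m for m in moves if m.certain != -1] or moves
    return max(candidates, key=lambda m: m.visits)


# The promotion has most of the visits, but the one-visit move is a proven win.
moves = [RootMove("e7e8q", visits=90), RootMove("d8h8", visits=1, certain=1)]
print(pick_best(moves).move)
```

In this model the mating move only needs a single visit to be recognised, which is why trying every root move at least once matters.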

Testing can be done by taking the FEN of the position from lichess and feeding it into lc0.

Make sure to start lc0 with these options turned on:
lc0 --early-root-widening=true --certainty-propagation --verbose-move-stats

or alternatively start it with the minibatch size set to only 1:
lc0.exe --certainty-propagation=true --early-root-widening=true --verbose-move-stats -t 1 --minibatch-size=1 --weights=YOURTESTNET

Then enter the following commands within lc0

position fen 3R4/4PK2/7k/5P1p/p6P/P6P/8/8 w - - 2 77
go nodes 100

to let Leela think.

To try this out, use the AppVeyor build from one of the links below.
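
The manual steps can also be scripted. The following is a minimal sketch, assuming the test build is reachable as an executable and accepts the flags quoted in this PR; the helper names `uci_commands` and `run_search` are hypothetical and not part of lc0.

```python
import subprocess
from typing import List

FEN = "3R4/4PK2/7k/5P1p/p6P/P6P/8/8 w - - 2 77"


def uci_commands(fen: str, nodes: int) -> List[str]:
    """Return the two UCI lines used in this PR for a fixed-node search."""
    return [f"position fen {fen}", f"go nodes {nodes}"]


def run_search(engine_args: List[str], fen: str, nodes: int) -> List[str]:
    """Start the engine, run one search, and return stdout up to 'bestmove'."""
    proc = subprocess.Popen(
        engine_args, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True
    )
    for line in uci_commands(fen, nodes):
        proc.stdin.write(line + "\n")
    proc.stdin.flush()

    output = []
    for line in proc.stdout:
        output.append(line.rstrip("\n"))
        if line.startswith("bestmove"):
            break

    proc.stdin.write("quit\n")
    proc.stdin.flush()
    proc.wait()
    return output


print(uci_commands(FEN, 100))
```

A call could then look like `run_search(["lc0", "--early-root-widening=true", "--certainty-propagation", "--verbose-move-stats"], FEN, 100)`, with the last returned line being the `bestmove` line.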

Videodr0me and others added 30 commits August 12, 2018 20:24
efficient propagation of certainty, two-fold draw scoring, mate display and more.
=1 suitable for training
=2 for play
Currently negabound search depth is one.
Improves play in positions with many certain positions (near endgame TBs, mates).
Sees repetitions faster and scores positions more accurately.
…ersion. Increasing threads (e.g. 4 or 6) will get to masters speed now. Further speed fixes (move generator) possible....
…with lto, this yields a speed up by 30-50% in backend=random. In order to fully use CP please use 4 threads+. Changed default temporarily to 4 threads with this commit, to collect more scaling data.
…ds instant play of certain winning moves and avoidance of losing moves regardless of visits. CP=3 now adds advanced pruning.
- exposed depth parameter (0 is no-look-ahead)
- only two modes CP=1 for training and CP=2 for play
Todo:
- change option from int to choiceoption
- use info.mate to communicate mate scores
- Certainty Propagation is a bool option now, just on or off (default = off).
- Cleanup code and comments
- Threads default = 2, but if certainty propagation is turned on please use 4 threads.
@MelleKoning

Now that pr750 is updated and merged into master, this pr will be removed soon.

@MelleKoning

Another interesting position to test with CP comes from the promotion bug of some T40 nets that was mentioned in the Discord channel.

Example FEN position:
https://lichess.org/analysis/standard/4n3/p4P2/8/5k2/1Q6/4q3/1KP5/8_w_-_-


Playing this position with default Leela results in a capture-promotion and a drawn position, like this:

position fen 4n3/p4P2/8/5k2/1Q6/4q3/1KP5/8 w - - 0 1
go nodes 10
Loading weights file from: .\nets\41431
Creating backend [cudnn]...
GPU: Quadro M1000M
GPU memory: 2 Gb
GPU clock frequency: 1071.5 MHz
GPU compute capability: 5.0
CUDA Runtime version: 10.0.0
Cudnn version: 7.3.1
WARNING: CUDNN Runtime version mismatch, was compiled with version 7.4.1
Latest version of CUDA supported by the driver: 10.0.0
info depth 1 seldepth 2 time 123 nodes 2 score cp 1641 hashfull 0 nps 16 tbhits 0 pv f7e8q e3e8
info depth 2 seldepth 3 time 193 nodes 3 score cp 249 hashfull 0 nps 15 tbhits 0 pv f7e8q e3e8 b4c5
info depth 2 seldepth 4 time 263 nodes 5 score cp 109 hashfull 0 nps 19 tbhits 0 pv f7e8q e3e8 b4c5 e8e5
info depth 4 seldepth 5 time 331 nodes 13 score cp 47 hashfull 0 nps 39 tbhits 0 pv f7e8q e3e8 b4c5 e8e5 c5c3
info string b2a1  (218 ) N:       0 (+ 0) (P:  0.01%) (Q: -1.01849) (D:  0.000) (U: 0.00103) (Q+U: -1.01746) (V:  -.----)
info string b2b1  (219 ) N:       0 (+ 0) (P:  0.02%) (Q: -1.01849) (D:  0.000) (U: 0.00242) (Q+U: -1.01607) (V:  -.----)
info string b2a2  (222 ) N:       0 (+ 0) (P:  0.03%) (Q: -1.01849) (D:  0.000) (U: 0.00316) (Q+U: -1.01534) (V:  -.----)
info string b4g4  (692 ) N:       0 (+ 0) (P:  0.04%) (Q: -1.01849) (D:  0.000) (U: 0.00383) (Q+U: -1.01466) (V:  -.----)
info string b4h4  (693 ) N:       0 (+ 0) (P:  0.04%) (Q: -1.01849) (D:  0.000) (U: 0.00433) (Q+U: -1.01417) (V:  -.----)
info string b4f4  (691 ) N:       0 (+ 0) (P:  0.05%) (Q: -1.01849) (D:  0.000) (U: 0.00528) (Q+U: -1.01321) (V:  -.----)
info string b4e4  (690 ) N:       0 (+ 0) (P:  0.08%) (Q: -1.01849) (D:  0.000) (U: 0.00784) (Q+U: -1.01066) (V:  -.----)
info string b4e1  (678 ) N:       0 (+ 0) (P:  0.09%) (Q: -1.01849) (D:  0.000) (U: 0.00983) (Q+U: -1.00866) (V:  -.----)
info string b4d2  (682 ) N:       0 (+ 0) (P:  0.14%) (Q: -1.01849) (D:  0.000) (U: 0.01431) (Q+U: -1.00419) (V:  -.----)
info string b4e7  (703 ) N:       0 (+ 0) (P:  0.16%) (Q: -1.01849) (D:  0.000) (U: 0.01623) (Q+U: -1.00226) (V:  -.----)
info string b4f8  (705 ) N:       0 (+ 0) (P:  0.31%) (Q: -1.01849) (D:  0.000) (U: 0.03237) (Q+U: -0.98612) (V:  -.----)
info string f7e8n (1543) N:       0 (+ 0) (P:  0.36%) (Q: -1.01849) (D:  0.000) (U: 0.03738) (Q+U: -0.98112) (V:  -.----)
info string b4a5  (694 ) N:       0 (+ 0) (P:  0.44%) (Q: -1.01849) (D:  0.000) (U: 0.04531) (Q+U: -0.97318) (V:  -.----)
info string b4a4  (687 ) N:       0 (+ 0) (P:  0.45%) (Q: -1.01849) (D:  0.000) (U: 0.04682) (Q+U: -0.97168) (V:  -.----)
info string b4d6  (701 ) N:       0 (+ 0) (P:  0.47%) (Q: -1.01849) (D:  0.000) (U: 0.04904) (Q+U: -0.96946) (V:  -.----)
info string f7f8q (1837) N:       0 (+ 0) (P:  0.48%) (Q: -1.01849) (D:  0.000) (U: 0.05019) (Q+U: -0.96830) (V:  -.----)
info string b4b8  (704 ) N:       0 (+ 0) (P:  0.49%) (Q: -1.01849) (D:  0.000) (U: 0.05043) (Q+U: -0.96807) (V:  -.----)
info string f7f8b (1839) N:       0 (+ 0) (P:  0.50%) (Q: -1.01849) (D:  0.000) (U: 0.05221) (Q+U: -0.96628) (V:  -.----)
info string b4d4  (689 ) N:       0 (+ 0) (P:  0.51%) (Q: -1.01849) (D:  0.000) (U: 0.05324) (Q+U: -0.96525) (V:  -.----)
info string b4c3  (685 ) N:       0 (+ 0) (P:  0.51%) (Q: -1.01849) (D:  0.000) (U: 0.05344) (Q+U: -0.96505) (V:  -.----)
info string c2c3  (259 ) N:       0 (+ 0) (P:  0.52%) (Q: -1.01849) (D:  0.000) (U: 0.05414) (Q+U: -0.96436) (V:  -.----)
info string b4b6  (699 ) N:       0 (+ 0) (P:  0.54%) (Q: -1.01849) (D:  0.000) (U: 0.05564) (Q+U: -0.96285) (V:  -.----)
info string b4c5  (696 ) N:       0 (+ 0) (P:  0.55%) (Q: -1.01849) (D:  0.000) (U: 0.05695) (Q+U: -0.96154) (V:  -.----)
info string b4b7  (702 ) N:       0 (+ 0) (P:  0.55%) (Q: -1.01849) (D:  0.000) (U: 0.05725) (Q+U: -0.96125) (V:  -.----)
info string b4a3  (683 ) N:       0 (+ 0) (P:  0.57%) (Q: -1.01849) (D:  0.000) (U: 0.05915) (Q+U: -0.95934) (V:  -.----)
info string b4c4  (688 ) N:       0 (+ 0) (P:  0.62%) (Q: -1.01849) (D:  0.000) (U: 0.06441) (Q+U: -0.95409) (V:  -.----)
info string b4b5  (695 ) N:       0 (+ 0) (P:  0.64%) (Q: -1.01849) (D:  0.000) (U: 0.06651) (Q+U: -0.95198) (V:  -.----)
info string f7f8r (1838) N:       0 (+ 0) (P:  0.72%) (Q: -1.01849) (D:  0.000) (U: 0.07446) (Q+U: -0.94403) (V:  -.----)
info string f7f8n (1544) N:       0 (+ 0) (P:  0.75%) (Q: -1.01849) (D:  0.000) (U: 0.07777) (Q+U: -0.94072) (V:  -.----)
info string b4b3  (684 ) N:       0 (+ 0) (P:  1.07%) (Q: -1.01849) (D:  0.000) (U: 0.11136) (Q+U: -0.90713) (V:  -.----)
info string c2c4  (264 ) N:       0 (+ 0) (P:  1.15%) (Q: -1.01849) (D:  0.000) (U: 0.11977) (Q+U: -0.89872) (V:  -.----)
info string f7e8b (1836) N:       0 (+ 0) (P:  5.66%) (Q: -1.01849) (D:  0.000) (U: 0.58871) (Q+U: -0.42978) (V:  -.----)
info string f7e8r (1835) N:       0 (+ 0) (P:  6.38%) (Q: -1.01849) (D:  0.000) (U: 0.66280) (Q+U: -0.35570) (V:  -.----)
info string f7e8q (1834) N:      12 (+ 0) (P: 75.10%) (Q:  0.10543) (D:  0.000) (U: 0.60058) (Q+U:  0.70602) (V:  0.9015)
bestmove f7e8q ponder e3e8

This is of course not the best move: Black will capture the queen on e8, and the result is a draw.
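
The verbose move stats in the log above show that f7e8q is the only root move with any visits. A small parser makes this easy to check for other positions. This is only a sketch: the regular expression assumes the `info string` layout printed in this thread, and the names `STATS_RE` and `parse_move_stats` are hypothetical.

```python
import re
from typing import Dict, List

# Matches lines such as:
# info string f7e8q (1834) N:      12 (+ 0) (P: 75.10%) (Q:  0.10543) ... (U: 0.60058) ...
STATS_RE = re.compile(
    r"info string (?P<move>\S+)\s+\(\s*\d+\s*\)\s+N:\s*(?P<n>\d+)"
    r".*?\(P:\s*(?P<p>[\d.]+)%\)\s+\(Q:\s*(?P<q>-?[\d.]+)\)"
    r".*?\(U:\s*(?P<u>[\d.]+)\)"
)


def parse_move_stats(text: str) -> List[Dict]:
    """Extract move, visits (N), policy (P, in percent), Q and U per root move."""
    rows = []
    for line in text.splitlines():
        m = STATS_RE.search(line)
        if m:
            rows.append({
                "move": m.group("move"),
                "n": int(m.group("n")),
                "p": float(m.group("p")),
                "q": float(m.group("q")),
                "u": float(m.group("u")),
            })
    return rows


# Three lines copied from the default-settings log above.
LOG = """\
info string f7f8q (1837) N:       0 (+ 0) (P:  0.48%) (Q: -1.01849) (D:  0.000) (U: 0.05019) (Q+U: -0.96830) (V:  -.----)
info string f7e8r (1835) N:       0 (+ 0) (P:  6.38%) (Q: -1.01849) (D:  0.000) (U: 0.66280) (Q+U: -0.35570) (V:  -.----)
info string f7e8q (1834) N:      12 (+ 0) (P: 75.10%) (Q:  0.10543) (D:  0.000) (U: 0.60058) (Q+U:  0.70602) (V:  0.9015)
"""

rows = parse_move_stats(LOG)
visited = [r["move"] for r in rows if r["n"] > 0]
print(visited)
```

Filtering on `n > 0` lists the root moves that were actually searched, which for this excerpt is f7e8q alone.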

However, playing with CP and root-move widening, the same net already sees the right move after only 250 nodes:

lc0pr700 --certainty-propagation=true --two-fold-draw-scoring --fpu-strategy-at-root=absolute --fpu-strategy=reduction --fpu-value=-50 --verbose-move-stats -t 1 --minibatch-size=1 --weights=.\nets\41431

go nodes 250
info depth 4 seldepth 4 time 1 nodes 112 score cp 260 hashfull 0 nps 1000 tbhits 0 pv f7e8q e3e8 b2b3 e8e6
info depth 3 seldepth 4 time 73 nodes 135 score cp 249 hashfull 0 nps 328 tbhits 0 pv f7e8q e3e8 b2b3 e8e6
info depth 3 seldepth 5 time 421 nodes 189 score cp 206 hashfull 1 nps 185 tbhits 0 pv f7e8q e3e8 b4c5 e8e5 c5c3
info depth 3 seldepth 5 time 614 nodes 216 score cp 7304 hashfull 1 nps 171 tbhits 0 pv f7f8q f5g6 b4g4 e3g5
info depth 3 seldepth 5 time 669 nodes 237 score cp 7027 hashfull 1 nps 188 tbhits 0 pv f7f8q f5g6 b4g4 e3g5
info string b4b6  (699 ) N:       1 (+ 0) (P:  0.06%) (Q: -0.06936) (D:  0.000) (U: 0.01370) (Q+U: -0.05565) (V: -0.0694)  C:00000000
info string b4f8  (705 ) N:       1 (+ 0) (P:  0.13%) (Q:  0.00461) (D:  0.000) (U: 0.02927) (Q+U:  0.03388) (V:  0.0046)  C:00000000
info string f7f8n (1544) N:       1 (+ 0) (P:  0.24%) (Q: -0.00552) (D:  0.000) (U: 0.05590) (Q+U:  0.05038) (V: -0.0055)  C:00000000
info string f7e8n (1543) N:       1 (+ 0) (P:  0.10%) (Q:  0.03047) (D:  0.000) (U: 0.02223) (Q+U:  0.05270) (V:  0.0305)  C:00000000
info string f7f8b (1839) N:       1 (+ 0) (P:  0.19%) (Q:  0.01338) (D:  0.000) (U: 0.04493) (Q+U:  0.05831) (V:  0.0134)  C:00000000
info string b4e1  (678 ) N:       1 (+ 0) (P:  0.07%) (Q:  0.10526) (D:  0.000) (U: 0.01689) (Q+U:  0.12214) (V:  0.1053)  C:00000000
info string b4e7  (703 ) N:       1 (+ 0) (P:  0.17%) (Q:  0.11117) (D:  0.000) (U: 0.04008) (Q+U:  0.15125) (V:  0.1112)  C:00000000
info string b4b7  (702 ) N:       1 (+ 0) (P:  0.14%) (Q:  0.23167) (D:  0.000) (U: 0.03173) (Q+U:  0.26340) (V:  0.2317)  C:00000000
info string b4e4  (690 ) N:       1 (+ 0) (P:  0.07%) (Q:  0.27672) (D:  0.000) (U: 0.01613) (Q+U:  0.29285) (V:  0.2767)  C:00000000
info string b4a3  (683 ) N:       1 (+ 0) (P:  0.13%) (Q:  0.26639) (D:  0.000) (U: 0.03080) (Q+U:  0.29719) (V:  0.2664)  C:00000000
info string b2a1  (218 ) N:       1 (+ 0) (P:  0.05%) (Q:  0.28719) (D:  0.000) (U: 0.01070) (Q+U:  0.29789) (V:  0.2872)  C:00000000
info string b4b8  (704 ) N:       1 (+ 0) (P:  0.10%) (Q:  0.27792) (D:  0.000) (U: 0.02287) (Q+U:  0.30079) (V:  0.2779)  C:00000000
info string b4d2  (682 ) N:       1 (+ 0) (P:  0.08%) (Q:  0.28344) (D:  0.000) (U: 0.01885) (Q+U:  0.30229) (V:  0.2834)  C:00000000
info string b4h4  (693 ) N:       1 (+ 0) (P:  0.12%) (Q:  0.27828) (D:  0.000) (U: 0.02728) (Q+U:  0.30556) (V:  0.2783)  C:00000000
info string b4d4  (689 ) N:       1 (+ 0) (P:  0.10%) (Q:  0.30348) (D:  0.000) (U: 0.02215) (Q+U:  0.32562) (V:  0.3035)  C:00000000
info string b2a2  (222 ) N:       1 (+ 0) (P:  0.08%) (Q:  0.34627) (D:  0.000) (U: 0.01885) (Q+U:  0.36513) (V:  0.3463)  C:00000000
info string b4a4  (687 ) N:       1 (+ 0) (P:  0.08%) (Q:  0.34842) (D:  0.000) (U: 0.01813) (Q+U:  0.36655) (V:  0.3484)  C:00000000
info string b4c4  (688 ) N:       1 (+ 0) (P:  0.15%) (Q:  0.33344) (D:  0.000) (U: 0.03586) (Q+U:  0.36930) (V:  0.3334)  C:00000000
info string c2c4  (264 ) N:       1 (+ 0) (P:  0.20%) (Q:  0.35766) (D:  0.000) (U: 0.04632) (Q+U:  0.40397) (V:  0.3577)  C:00000000
info string b4b3  (684 ) N:       1 (+ 0) (P:  0.23%) (Q:  0.36690) (D:  0.000) (U: 0.05438) (Q+U:  0.42128) (V:  0.3669)  C:00000000
info string c2c3  (259 ) N:       1 (+ 0) (P:  0.08%) (Q:  0.40817) (D:  0.000) (U: 0.01939) (Q+U:  0.42756) (V:  0.4082)  C:00000000
info string b4c3  (685 ) N:       1 (+ 0) (P:  0.12%) (Q:  0.42718) (D:  0.000) (U: 0.02775) (Q+U:  0.45493) (V:  0.4272)  C:00000000
info string b4f4  (691 ) N:       1 (+ 0) (P:  0.07%) (Q:  0.44051) (D:  0.000) (U: 0.01705) (Q+U:  0.45757) (V:  0.4405)  C:00000000
info string b2b1  (219 ) N:       1 (+ 0) (P:  0.10%) (Q:  0.46209) (D:  0.000) (U: 0.02261) (Q+U:  0.48470) (V:  0.4621)  C:00000000
info string b4a5  (694 ) N:       1 (+ 0) (P:  0.09%) (Q:  0.48107) (D:  0.000) (U: 0.02176) (Q+U:  0.50282) (V:  0.4811)  C:00000000
info string b4g4  (692 ) N:       1 (+ 0) (P:  0.08%) (Q:  0.50672) (D:  0.000) (U: 0.01852) (Q+U:  0.52524) (V:  0.5067)  C:00000000
info string b4d6  (701 ) N:       1 (+ 0) (P:  0.08%) (Q:  0.54416) (D:  0.000) (U: 0.01957) (Q+U:  0.56373) (V:  0.5442)  C:00000000
info string b4c5  (696 ) N:       1 (+ 0) (P:  0.08%) (Q:  0.71128) (D:  0.000) (U: 0.01932) (Q+U:  0.73059) (V:  0.7113)  C:00000000
info string b4b5  (695 ) N:       1 (+ 0) (P:  0.24%) (Q:  0.94032) (D:  0.000) (U: 0.05630) (Q+U:  0.99662) (V:  0.9403)  C:00000000
info string f7e8b (1836) N:       2 (+ 0) (P:  4.93%) (Q:  0.05197) (D:  0.000) (U: 0.76327) (Q+U:  0.81524) (V:  0.0596)  C:00000000
info string f7f8r (1838) N:      28 (+ 0) (P:  0.40%) (Q:  0.98419) (D:  0.000) (U: 0.00634) (Q+U:  0.99053) (V:  0.9865)  C:00000000
info string f7e8r (1835) N:      32 (+ 0) (P:  5.60%) (Q:  0.89438) (D:  0.000) (U: 0.07877) (Q+U:  0.97315) (V:  0.6522)  C:00000000
info string f7e8q (1834) N:      64 (+ 0) (P: 85.40%) (Q:  0.37724) (D:  0.000) (U: 0.61003) (Q+U:  0.98727) (V:  0.6730)  C:00000000
info string f7f8q (1837) N:      81 (+ 0) (P:  0.23%) (Q:  0.98796) (D:  0.000) (U: 0.00132) (Q+U:  0.98928) (V:  0.9960)  C:00000000
bestmove f7f8q ponder f5g6

The FPU adjustments ensure that the search is not based only on the policy of the net, but explores other nodes more extensively.
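
To make the effect concrete, here is a simplified sketch of selection by Q+U in which an unvisited move has no Q of its own, so a first-play-urgency (FPU) value stands in for it. The visit counts, Q and U values are copied from the default-settings log earlier in this thread. The optimistic FPU of 1.0 is a made-up number for illustration; how the `--fpu-*` flags translate into an actual FPU value is not shown here, and the function names are hypothetical.

```python
from typing import List, Optional, Tuple

Child = Tuple[str, int, Optional[float], float]  # (move, visits, Q, U)


def selection_score(q: Optional[float], u: float, visits: int, fpu: float) -> float:
    """Q+U, with the FPU value replacing Q for a move that has no visits yet."""
    return (q if visits > 0 else fpu) + u


def pick(children: List[Child], fpu: float) -> str:
    """Return the move with the highest selection score."""
    return max(children, key=lambda c: selection_score(c[2], c[3], c[1], fpu))[0]


children: List[Child] = [
    ("f7e8q", 12, 0.10543, 0.60058),
    ("f7e8r", 0, None, 0.66280),
    ("f7f8q", 0, None, 0.05019),
]

print(pick(children, fpu=-1.01849))  # Q shown for unvisited moves in the log -> f7e8q
print(pick(children, fpu=1.0))       # hypothetical optimistic FPU -> f7e8r
```

With the pessimistic value, no unvisited move can outscore f7e8q, so the search keeps following the policy favourite. With a higher value, unvisited moves are tried before f7e8q gets further visits.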

@MelleKoning MelleKoning closed this Mar 9, 2019
@MelleKoning MelleKoning deleted the PR700PR750 branch March 9, 2019 11:37
@MelleKoning

Now that pr750 is in master, the follow-up is here:
Videodr0me#3

@MelleKoning MelleKoning restored the PR700PR750 branch March 9, 2019 19:43
This was referenced Mar 9, 2019
3 participants