Introduce dual NNUE evaluation #4910

Closed

XInTheDark
Contributor

Alongside the currently used NNUE net, introduce a much smaller L1-256 net that is used when the material imbalance in the position is high. This is an improvement over the previously used simple eval. The change is combined with search tuning (SPSA, 44k games at 120+1.2).
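
To illustrate the selection idea described above, here is a minimal Python sketch (the patch itself is C++). The piece values, the threshold, and the function names are hypothetical and chosen only for illustration; they are not taken from this PR.

```python
# Hypothetical piece values and threshold, for illustration only.
PIECE_VALUE = {"P": 208, "N": 781, "B": 825, "R": 1276, "Q": 2538}
SMALL_NET_THRESHOLD = 1000  # assumed cutoff, not the value used in the patch


def material_imbalance(fen_board: str) -> int:
    """Material balance (White minus Black) from the board field of a FEN string."""
    total = 0
    for ch in fen_board:
        value = PIECE_VALUE.get(ch.upper())  # kings, digits and '/' are skipped
        if value is not None:
            total += value if ch.isupper() else -value
    return total


def choose_net(fen_board: str) -> str:
    """Pick the small net when one side is far ahead in material, else the big net."""
    if abs(material_imbalance(fen_board)) > SMALL_NET_THRESHOLD:
        return "small"
    return "big"
```

With these assumed numbers, a balanced position is sent to the big net, while a position where one side is a queen up is sent to the small net.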

For more details, see the original PR: #4898.

Acknowledgements: Many thanks to @mstembera & @linrock for writing the original implementation code, and @linrock for training the net used in this patch!

VLTC: https://tests.stockfishchess.org/tests/view/657075336980e15f69c7998f
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 81342 W: 20137 L: 19788 D: 41417
Ptnml(0-2): 13, 8678, 22933, 9041, 6

VLTC SMP: https://tests.stockfishchess.org/tests/view/657288776980e15f69c7c7e7
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 19796 W: 5038 L: 4772 D: 9986
Ptnml(0-2): 3, 1829, 5965, 2101, 0
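
The bounds (-2.94, 2.94) printed next to each LLR are consistent with Wald's SPRT stopping thresholds for error rates alpha = beta = 0.05. That the tests use exactly these error rates is an assumption here, not something stated in this thread. A small sketch of that calculation:

```python
import math


def sprt_bounds(alpha: float = 0.05, beta: float = 0.05):
    """Wald SPRT thresholds on the log-likelihood ratio.

    The test stops and accepts H1 once LLR >= upper,
    and stops and accepts H0 once LLR <= lower.
    """
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    return lower, upper
```

Under that assumption, an LLR of 2.95 or 2.94 means the test stopped at the upper threshold, in favour of the patch.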

Bench: 1216398


github-actions bot commented Dec 8, 2023

clang-format 17 needs to be run on this PR.
If you do not have clang-format installed, the maintainer will run it when merging.
For the exact version please see https://packages.ubuntu.com/mantic/clang-format-17.

(execution 7141862302 / attempt 1)

@vdbergh
Contributor

vdbergh commented Dec 8, 2023

Nice result. But there are some issues with this VVLTC testing (currently equivalent to 480/4.8). What if next week someone "simplifies away" the extra net using the standard STC+LTC procedure?

Also: can the same gain be achieved just by using a bigger base net?

@mstembera
Contributor

Congrats! I think we should also test just the parameter changes in search by themselves at the same VLTC to prove the gain isn't coming just from those alone.

@Disservin Disservin marked this pull request as draft December 8, 2023 13:56
@Disservin
Member

Btw, the makefile for net2 could be simplified a bit to something like this, to avoid the duplicated rules:

define evaluate_network
	@echo "Default net: $(nnuenet)"
	@if [ "x$(curl_or_wget)" = "x" ]; then \
		echo "Neither curl nor wget is installed. Install one of these tools unless the net has been downloaded manually"; \
	fi
	@if [ "x$(shasum_command)" = "x" ]; then \
		echo "shasum / sha256sum not found, skipping net validation"; \
	elif test -f "$(nnuenet)"; then \
		if [ "$(nnuenet)" != "nn-"`$(shasum_command) $(nnuenet) | cut -c1-12`".nnue" ]; then \
			echo "Removing invalid network"; rm -f $(nnuenet); \
		fi; \
	fi;
	@for nnuedownloadurl in "$(nnuedownloadurl1)" "$(nnuedownloadurl2)"; do \
		if test -f "$(nnuenet)"; then \
			echo "$(nnuenet) available : OK"; break; \
		else \
			if [ "x$(curl_or_wget)" != "x" ]; then \
				echo "Downloading $${nnuedownloadurl}"; $(curl_or_wget) $${nnuedownloadurl} > $(nnuenet);\
			else \
				echo "No net found and download not possible"; exit 1;\
			fi; \
		fi; \
		if [ "x$(shasum_command)" != "x" ]; then \
			if [ "$(nnuenet)" != "nn-"`$(shasum_command) $(nnuenet) | cut -c1-12`".nnue" ]; then \
				echo "Removing failed download"; rm -f $(nnuenet); \
			fi; \
		fi; \
	done
	@if ! test -f "$(nnuenet)"; then \
		echo "Failed to download $(nnuenet)."; \
	fi;
	@if [ "x$(shasum_command)" != "x" ]; then \
		if [ "$(nnuenet)" = "nn-"`$(shasum_command) $(nnuenet) | cut -c1-12`".nnue" ]; then \
			echo "Network validated"; \
		fi; \
	fi;
endef

define netvariables =
$(eval nnuenet := $(shell grep $(1) evaluate.h | grep define | sed 's/.*\(nn-[a-z0-9]\{12\}.nnue\).*/\1/'))
$(eval nnuedownloadurl1 := https://tests.stockfishchess.org/api/nn/$(nnuenet))
$(eval nnuedownloadurl2 := https://github.com/official-stockfish/networks/raw/master/$(nnuenet))
$(eval curl_or_wget := $(shell if hash curl 2>/dev/null; then echo "curl -skL"; elif hash wget 2>/dev/null; then echo "wget -qO-"; fi))
$(eval shasum_command := $(shell if hash shasum 2>/dev/null; then echo "shasum -a 256 "; elif hash sha256sum 2>/dev/null; then echo "sha256sum "; fi))
endef

# evaluation network (nnue)
net:
	$(call netvariables, EvalFileDefaultNameBig)
	$(call evaluate_network)
	$(call netvariables, EvalFileDefaultNameSmall)
	$(call evaluate_network)
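
The validation step in the Makefile above compares the file name against `nn-` plus the first 12 hex characters of the file's SHA-256 digest plus `.nnue`. A rough Python equivalent of that check, with hypothetical function names, could look like this:

```python
import hashlib
from pathlib import Path


def expected_net_name(data: bytes) -> str:
    """File name implied by the content: nn-<first 12 hex chars of SHA-256>.nnue."""
    return f"nn-{hashlib.sha256(data).hexdigest()[:12]}.nnue"


def is_valid_net(path) -> bool:
    """True if the file exists and its name matches the hash of its content."""
    p = Path(path)
    return p.is_file() and p.name == expected_net_name(p.read_bytes())
```

A truncated or corrupted download changes the digest, so the name no longer matches. That is the case in which the Makefile removes the file and tries the next download URL.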

@Disservin
Member

I've marked it as a draft for now since there are a couple of things to resolve first:

  • Wait for the results of the VLTC test of the parameter changes alone, to verify that they indeed only gain in combination with the small net.
  • Fix the sanitizer error.
  • Refactor the code a bit so that it is more maintainable in the future.
  • vdbergh's comment is valid: if it turns out that this only gains with a parameter tweak at VLTC, the dual net will be very fragile, since any future tuning could reverse the effects.

There are probably some more points which I can't think of right now.

@Disservin
Member

Disservin pushed a commit that referenced this pull request Dec 10, 2023
The SPSA tuning was done for 44k games at 120+1.2.
https://tests.stockfishchess.org/tests/view/656ee2a76980e15f69c7767f.

Note that the tune was originally done in combination with the recent dual NNUE
idea (see #4910).

VLTC:
https://tests.stockfishchess.org/tests/view/65731ccbf09ce1261f12246e
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 52806 W: 13069 L: 12760 D: 26977
Ptnml(0-2): 19, 5498, 15056, 5815, 15

VLTC SMP:
https://tests.stockfishchess.org/tests/view/65740ffaf09ce1261f1239ba
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 27630 W: 6934 L: 6651 D: 14045
Ptnml(0-2): 1, 2643, 8243, 2928, 0

Estimated close to neutral at LTC:
https://tests.stockfishchess.org/tests/view/6575485a8ec68176cf7d9423
Elo: -0.59 ± 1.8 (95%) LOS: 26.6%
Total: 32060 W: 7859 L: 7913 D: 16288
Ptnml(0-2): 20, 3679, 8676, 3645, 10
nElo: -1.21 ± 3.8 (95%) PairsRatio: 0.99
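
As a rough cross-check of the LTC summary above, the Elo estimate, its 95% half-width, and the normalized Elo can be approximated from the pentanomial counts alone. This sketch uses the standard logistic Elo conversion and a normal approximation; the testing framework's exact computation may differ in detail, and the function name is hypothetical.

```python
import math


def pentanomial_stats(ptnml):
    """Return (elo, 95% half-width in Elo, normalized Elo) from game-pair counts.

    ptnml lists how many game pairs scored 0, 0.5, 1, 1.5 and 2 points.
    """
    pairs = sum(ptnml)
    scores = [0.0, 0.25, 0.5, 0.75, 1.0]  # per-game score of each pair outcome
    mean = sum(s * n for s, n in zip(scores, ptnml)) / pairs
    var = sum(n * (s - mean) ** 2 for s, n in zip(scores, ptnml)) / pairs
    stderr = math.sqrt(var / pairs)

    def elo(score):
        # Logistic Elo difference corresponding to an expected score.
        return -400.0 * math.log10(1.0 / score - 1.0)

    half_width = (elo(mean + 1.96 * stderr) - elo(mean - 1.96 * stderr)) / 2.0
    nelo = (mean - 0.5) / math.sqrt(2.0 * var) * 800.0 / math.log(10)
    return elo(mean), half_width, nelo
```

Called with `[20, 3679, 8676, 3645, 10]`, this should land close to the reported Elo, error bar, and nElo, up to rounding.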

closes #4912

Bench: 1283323
@Disservin
Member

I think we can close this in the meantime, since the gain likely came from #4912.

@Disservin Disservin closed this Dec 10, 2023
linrock added a commit to linrock/Stockfish that referenced this pull request Dec 11, 2023
linrock added a commit to linrock/Stockfish that referenced this pull request Dec 14, 2023
rn5f107s2 pushed a commit to rn5f107s2/Stockfish that referenced this pull request Jan 14, 2024
windfishballad pushed a commit to windfishballad/Stockfish that referenced this pull request Jan 23, 2024