Introduce dual NNUE evaluation #4910

XInTheDark · 2023-12-08T13:19:47Z

Alongside the currently used NNUE net, introduce a much smaller L1-256 net that is used when the material imbalance in the position is high; an improvement over the previously used simple eval. This is then combined with search tuning (SPSA, 44k games at 120+1.2).
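
A minimal sketch of the selection rule described above, not the patch's actual code. The threshold of 2200 is borrowed from one of this PR's commit titles ("big below 2200") and may differ from the final value, the stochastic term mentioned in that commit is left out, and `PIECE_VALUES`, `simple_eval`, `evaluate_big` and `evaluate_small` are hypothetical stand-ins.

```python
# Illustrative threshold, taken from the commit title "big below 2200";
# the value used in the final patch is not stated in this thread.
BIG_NET_THRESHOLD = 2200

# Hypothetical piece values in internal units, for illustration only.
PIECE_VALUES = {"P": 208, "N": 781, "B": 825, "R": 1276, "Q": 2538}


def simple_eval(white, black):
    """Material balance from White's point of view, given piece counts per side."""
    return sum(
        value * (white.get(piece, 0) - black.get(piece, 0))
        for piece, value in PIECE_VALUES.items()
    )


def evaluate(white, black, evaluate_big, evaluate_small):
    """Use the big net for balanced positions and the small net otherwise."""
    imbalance = abs(simple_eval(white, black))
    if imbalance < BIG_NET_THRESHOLD:
        return evaluate_big(white, black)
    return evaluate_small(white, black)
```

With this rule, a position where one side is a queen up would be routed to the small net, while a materially level position would still go through the big net.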

For more details, see the original PR: #4898.

Acknowledgements: Many thanks to @mstembera & @linrock for writing the original implementation code, and @linrock for training the net used in this patch!

VLTC: https://tests.stockfishchess.org/tests/view/657075336980e15f69c7998f
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 81342 W: 20137 L: 19788 D: 41417
Ptnml(0-2): 13, 8678, 22933, 9041, 6

VLTC SMP: https://tests.stockfishchess.org/tests/view/657288776980e15f69c7c7e7
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 19796 W: 5038 L: 4772 D: 9986
Ptnml(0-2): 3, 1829, 5965, 2101, 0
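
For readers unfamiliar with the SPRT lines above, here is a rough sketch of where the numbers come from. The bounds (-2.94, 2.94) are Wald's stopping bounds for 5% error rates. The LLR function is a normal approximation that reads the `<elo0,elo1>` bounds as normalized Elo; it lands close to the reported LLR values for both tests, but whether fishtest computes exactly this is an assumption, and the function names are made up for illustration.

```python
import math


def sprt_bounds(alpha=0.05, beta=0.05):
    """Wald's SPRT stopping bounds for error rates alpha and beta."""
    return math.log(beta / (1 - alpha)), math.log((1 - beta) / alpha)


def approx_llr(ptnml, nelo0, nelo1):
    """Normal-approximation LLR of H1 (nelo1) against H0 (nelo0).

    ptnml holds the five game-pair counts from the Ptnml(0-2) line.
    """
    n = sum(ptnml)  # number of game pairs
    scores = [0.0, 0.25, 0.5, 0.75, 1.0]  # per-game score of each pair outcome
    mean = sum(s * c for s, c in zip(scores, ptnml)) / n
    var = sum(c * (s - mean) ** 2 for s, c in zip(scores, ptnml)) / n
    # Score difference that corresponds to one normalized Elo point,
    # using the measured spread of the pair results.
    unit = math.sqrt(2 * var) * math.log(10) / 800
    s0 = 0.5 + nelo0 * unit
    s1 = 0.5 + nelo1 * unit
    return n * (s1 - s0) * (2 * mean - s0 - s1) / (2 * var)


if __name__ == "__main__":
    print(sprt_bounds())
    print(approx_llr([13, 8678, 22933, 9041, 6], 0.0, 2.0))  # VLTC test
    print(approx_llr([3, 1829, 5965, 2101, 0], 0.5, 2.5))    # VLTC SMP test
```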

Bench: 1216398

bench: 1449578

bench: 1380121

bench 1380121

bench 1440404

Bench: 1440404

Bench: 1216398

github-actions · 2023-12-08T13:20:22Z

clang-format 17 needs to be run on this PR.
If you do not have clang-format installed, the maintainer will run it when merging.
For the exact version please see https://packages.ubuntu.com/mantic/clang-format-17.

(execution 7141862302 / attempt 1)

vdbergh · 2023-12-08T13:30:25Z

Nice result. But there are some issues with this VVLTC testing (currently equivalent to 480/4.8). What if next week someone "simplifies away" the extra net using the standard STC+LTC procedure?

Also: can the same gain be achieved just by using a bigger base net?

mstembera · 2023-12-08T13:38:56Z

Congrats! I think we should also test just the parameter changes in search by themselves at the same VLTC to prove the gain isn't coming just from those alone.

Disservin · 2023-12-08T13:58:30Z

btw the makefile for net2 could be simplified a bit to something like this, to avoid the duplicated rules.

define evaluate_network
	@echo "Default net: $(nnuenet)"
	@if [ "x$(curl_or_wget)" = "x" ]; then \
		echo "Neither curl nor wget is installed. Install one of these tools unless the net has been downloaded manually"; \
	fi
	@if [ "x$(shasum_command)" = "x" ]; then \
		echo "shasum / sha256sum not found, skipping net validation"; \
	elif test -f "$(nnuenet)"; then \
		if [ "$(nnuenet)" != "nn-"`$(shasum_command) $(nnuenet) | cut -c1-12`".nnue" ]; then \
			echo "Removing invalid network"; rm -f $(nnuenet); \
		fi; \
	fi;
	@for nnuedownloadurl in "$(nnuedownloadurl1)" "$(nnuedownloadurl2)"; do \
		if test -f "$(nnuenet)"; then \
			echo "$(nnuenet) available : OK"; break; \
		else \
			if [ "x$(curl_or_wget)" != "x" ]; then \
				echo "Downloading $${nnuedownloadurl}"; $(curl_or_wget) $${nnuedownloadurl} > $(nnuenet);\
			else \
				echo "No net found and download not possible"; exit 1;\
			fi; \
		fi; \
		if [ "x$(shasum_command)" != "x" ]; then \
			if [ "$(nnuenet)" != "nn-"`$(shasum_command) $(nnuenet) | cut -c1-12`".nnue" ]; then \
				echo "Removing failed download"; rm -f $(nnuenet); \
			fi; \
		fi; \
	done
	@if ! test -f "$(nnuenet)"; then \
		echo "Failed to download $(nnuenet)."; \
	fi;
	@if [ "x$(shasum_command)" != "x" ]; then \
		if [ "$(nnuenet)" = "nn-"`$(shasum_command) $(nnuenet) | cut -c1-12`".nnue" ]; then \
			echo "Network validated"; \
		fi; \
	fi;
endef

define netvariables =
$(eval nnuenet := $(shell grep $(1) evaluate.h | grep define | sed 's/.*\(nn-[a-z0-9]\{12\}.nnue\).*/\1/'))
$(eval nnuedownloadurl1 := https://tests.stockfishchess.org/api/nn/$(nnuenet))
$(eval nnuedownloadurl2 := https://github.com/official-stockfish/networks/raw/master/$(nnuenet))
$(eval curl_or_wget := $(shell if hash curl 2>/dev/null; then echo "curl -skL"; elif hash wget 2>/dev/null; then echo "wget -qO-"; fi))
$(eval shasum_command := $(shell if hash shasum 2>/dev/null; then echo "shasum -a 256 "; elif hash sha256sum 2>/dev/null; then echo "sha256sum "; fi))
endef

# evaluation network (nnue)
net:
	$(call netvariables, EvalFileDefaultNameBig)
	$(call evaluate_network)
	$(call netvariables, EvalFileDefaultNameSmall)
	$(call evaluate_network)
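
The validation step in the recipe above relies on the naming convention visible in the shell checks: a net file is called `nn-` plus the first 12 hex digits of its SHA-256 digest plus `.nnue`. A small sketch of the same check outside of make, with hypothetical function names:

```python
import hashlib
from pathlib import Path


def expected_net_name(data):
    """File name implied by the content: nn-<first 12 hex digits of SHA-256>.nnue."""
    return "nn-" + hashlib.sha256(data).hexdigest()[:12] + ".nnue"


def is_valid_net(path):
    """True if the file exists and its name matches the hash of its content."""
    path = Path(path)
    return path.is_file() and path.name == expected_net_name(path.read_bytes())
```

A truncated or corrupted download changes the digest, so the name no longer matches and the file can be discarded, which is what the "Removing failed download" branch does.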

Disservin · 2023-12-08T14:03:53Z

I've marked it as a draft for now since there are a couple of things to resolve first:

Wait for the results of the VLTC patch alone, to verify that the parameters indeed only gain in combination with the small net.
Fix the sanitizer error.
Refactor the code a bit so that it is more maintainable in the future.
vdbergh's comment is valid: if it turns out that this only gains with a parameter tweak at VLTC, the dual net will be very fragile, since any future tuning could reverse the effect.

And probably some more points which I can't think of right now.

Disservin · 2023-12-08T15:52:32Z

btw test is running here https://tests.stockfishchess.org/tests/view/65731ccbf09ce1261f12246e

The SPSA tuning was done for 44k games at 120+1.2: https://tests.stockfishchess.org/tests/view/656ee2a76980e15f69c7767f
Note that the tune was originally done in combination with the recent dual NNUE idea (see #4910).

VLTC: https://tests.stockfishchess.org/tests/view/65731ccbf09ce1261f12246e
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 52806 W: 13069 L: 12760 D: 26977
Ptnml(0-2): 19, 5498, 15056, 5815, 15

VLTC SMP: https://tests.stockfishchess.org/tests/view/65740ffaf09ce1261f1239ba
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 27630 W: 6934 L: 6651 D: 14045
Ptnml(0-2): 1, 2643, 8243, 2928, 0

Estimated close to neutral at LTC: https://tests.stockfishchess.org/tests/view/6575485a8ec68176cf7d9423
Elo: -0.59 ± 1.8 (95%) LOS: 26.6%
Total: 32060 W: 7859 L: 7913 D: 16288
Ptnml(0-2): 20, 3679, 8676, 3645, 10
nElo: -1.21 ± 3.8 (95%) PairsRatio: 0.99

closes #4912

Bench: 1283323
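
The Elo and nElo figures in the LTC result quoted above can be approximately reproduced from its Ptnml counts. The sketch below assumes the usual logistic Elo model, a delta-method confidence interval, and normalized Elo defined from the per-pair standard deviation; the function name is hypothetical and this is not fishtest's own code.

```python
import math


def pentanomial_stats(ptnml):
    """Return (elo, elo_ci95, nelo, nelo_ci95) from the five game-pair counts."""
    n = sum(ptnml)  # number of game pairs
    scores = [0.0, 0.25, 0.5, 0.75, 1.0]  # per-game score of each pair outcome
    mean = sum(s * c for s, c in zip(scores, ptnml)) / n
    var = sum(c * (s - mean) ** 2 for s, c in zip(scores, ptnml)) / n
    stderr = math.sqrt(var / n)  # standard error of the mean score

    # Logistic Elo of the mean score, with a delta-method 95% interval.
    elo = -400 * math.log10(1 / mean - 1)
    elo_per_score = 400 / (math.log(10) * mean * (1 - mean))
    elo_ci = 1.96 * stderr * elo_per_score

    # Normalized Elo: score offset measured in per-game standard deviations.
    sigma_game = math.sqrt(2 * var)
    scale = 800 / math.log(10)
    nelo = (mean - 0.5) / sigma_game * scale
    nelo_ci = 1.96 * stderr / sigma_game * scale
    return elo, elo_ci, nelo, nelo_ci


if __name__ == "__main__":
    # Ptnml counts of the LTC test quoted above.
    print(pentanomial_stats([20, 3679, 8676, 3645, 10]))
```

Both intervals straddle zero, which is consistent with the "close to neutral at LTC" reading of that test.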

Disservin · 2023-12-10T22:25:42Z

I think we can close this in the meantime, since the gain likely came from #4912.


mstembera and others added 7 commits December 4, 2023 19:50

Dual net NNUE

bdf6b44

bench: 1449578

Smaller 256 net and fixed Makefile by @linrock

f4b75a5

bench: 1380121

Change EvalFileBig back to EvalFile to make fishtest happy

eac74d6

bench: 1380121

hint big nnue below 2000 simple eval

0455d0c

bench 1380121

big below 2200, add stochastic term

6423759

bench 1440404

Try a search tune for dual NNUE.

c234237

Bench: 1440404

v1: Tuned 44k games, use latest linrock smallnet.

e15eec7

Bench: 1216398

Disservin marked this pull request as draft December 8, 2023 13:56

XInTheDark mentioned this pull request Dec 10, 2023

Search parameters tune at VLTC #4912

Closed

Disservin closed this Dec 10, 2023

Disservin mentioned this pull request Dec 11, 2023

Dual NNUE with L1-128 smallnet #4915

Closed

linrock added a commit to linrock/Stockfish that referenced this pull request Dec 11, 2023

Simplify make command with Disservin's patch

c51a371

official-stockfish#4910 (comment)

linrock added a commit to linrock/Stockfish that referenced this pull request Dec 14, 2023

Simplify make command with Disservin's patch

7243ef8

official-stockfish#4910 (comment)
