
Poor use of memory / too many allocations in HashLeaf, leading to GC to work in overdrive #216

Open
musalbas opened this issue Jun 28, 2023 · 7 comments

musalbas commented Jun 28, 2023

More time is spent in HashLeaf allocating, than hashing:

https://github.com/celestiaorg/nmt/blob/master/hasher.go#L190
https://flamegraph.com/share/a54e891a-1114-11ee-b13f-de9431916b05


Possibly related:

In celestia-node, min and max namespace take up 200MB (20%) of RAM

[Screenshot: heap profile showing MinNamespace/MaxNamespace allocations taking ~200MB]

@musalbas musalbas changed the title Poor use of memory / too many allocations for nmt.MinNamespace and nmt.MaxNamespace, leading to GC to work in overdrive Poor use of memory / too many allocations, leading to GC to work in overdrive Jun 28, 2023
musalbas commented:

If the same allocation is reused in HashLeaf, would that make it not thread-safe? If so, this should be considered; I believe trees are often computed in parallel.

@musalbas musalbas changed the title Poor use of memory / too many allocations, leading to GC to work in overdrive Poor use of memory / too many allocations in HashLeaf, leading to GC to work in overdrive Jun 28, 2023
liamsi commented Jun 28, 2023

I would try the following, FWIW. Change these appends

nmt/hasher.go, lines 190 to 192 in 564300a:

```go
minMaxNIDs := make([]byte, 0, resLen)
minMaxNIDs = append(minMaxNIDs, nID...) // nID
minMaxNIDs = append(minMaxNIDs, nID...) // nID || nID
```

to

```go
minMaxNIDs := append(append(make([]byte, 0, resLen), nID...), nID...)
```

and see if that removes the need for additional allocs. Similar for the other appends:

nmt/hasher.go, lines 195 to 197 in 564300a:

```go
leafPrefixedNData := make([]byte, 0, len(ndata)+1)
leafPrefixedNData = append(leafPrefixedNData, LeafPrefix)
leafPrefixedNData = append(leafPrefixedNData, ndata...)
```

liamsi commented Jun 28, 2023

Also, it is important to use benchstat for any related changes (since we disabled go-bencher recently): https://pkg.go.dev/golang.org/x/perf/cmd/benchstat

Should include both mem and CPU.

liamsi commented Jun 28, 2023

This could be somewhat related: #212

staheri14 commented:

Unassigning myself to allow someone else to take over.

@staheri14 staheri14 removed their assignment Dec 2, 2024
rootulp pushed a commit that referenced this issue Jan 17, 2025
In issue #216, two problems were raised:
1. More time was spent allocating than hashing in the HashLeaf function.
2. The min and max namespace functions were taking up a large amount of RAM.

---

## More time was spent allocating than hashing in the HashLeaf function

The benchmark test described in the issue listed above no longer exists,
and when I run the `BenchmarkComputeRoot` in the nmt package
[HERE](https://github.com/bruscar010/nmt/blob/main/nmt_test.go#L696) I
can't replicate memory allocation taking longer than the hashing,
although this may be related to my test parameters.

Below are flamegraphs of CPU/memory usage (I've updated the benchmark to use 20k leaf nodes with a 512-byte transaction size).

[Base Mem
Flamegraph](https://flamegraph.com/share/6a04ffab-c866-11ef-9832-26c3e5347170)
[Base CPU
Flamegraph](https://flamegraph.com/share/efce1d16-c868-11ef-b027-3a58f36d25ce)

```
BenchmarkComputeRoot/64-leaves-12         	     100	    141785 ns/op	   65166 B/op	    1535 allocs/op
BenchmarkComputeRoot/128-leaves-12        	     100	    280652 ns/op	  123894 B/op	    3074 allocs/op
BenchmarkComputeRoot/256-leaves-12        	     100	    501392 ns/op	  254611 B/op	    6154 allocs/op
```

I've instead focused on the second issue

---

## The min and max namespace functions were taking up a large amount of RAM

### Solution 1: Reduce the number of calls to MinNamespace & MaxNamespace

The HashNode function made 4 duplicate function calls to
MaxNamespace/MinNamespace for the same node.
```
1. HashNode -> ValidateNodes -> ValidateNodeFormat.Min/MaxNamespace
2. HashNode -> ValidateNodes -> validateSiblingsNamespaceOrder -> ValidateNodeFormat(left|right).Min/MaxNamespace
3. HashNode -> ValidateNodes -> validateSiblingsNamespaceOrder.Min/MaxNamespace
4. HashNode.Min/MaxNamespace
```
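One way to collapse these four call sites is to extract the min/max IDs once per node and thread the result through validation. A minimal sketch, assuming a raw `ns_min||ns_max||hash` node layout; `rangeOf` and `validOrder` are hypothetical helper names, not the nmt API:

```go
package main

import (
	"bytes"
	"fmt"
)

// nsIDRange holds a node's min/max namespace IDs so they are extracted
// once per HashNode call instead of at every validation step.
type nsIDRange struct{ min, max []byte }

// rangeOf slices the min and max IDs out of a ns_min||ns_max||hash node.
func rangeOf(node []byte, idSize int) nsIDRange {
	return nsIDRange{min: node[:idSize], max: node[idSize : 2*idSize]}
}

// validOrder checks left.max <= right.min using the cached ranges,
// without re-slicing the nodes.
func validOrder(left, right nsIDRange) bool {
	return bytes.Compare(left.max, right.min) <= 0
}

func main() {
	left := rangeOf([]byte{1, 2, 0xFF}, 1)  // min=1, max=2, hash=0xFF
	right := rangeOf([]byte{3, 4, 0xEE}, 1) // min=3, max=4, hash=0xEE
	fmt.Println(validOrder(left, right))    // true
}
```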

After reducing to a single call, here is the new benchmark output
(running `go test -run="none" -bench=. -benchtime=100x -benchmem`)
```
BenchmarkComputeRoot/64-leaves-12         	     100	    137292 ns/op	   60100 B/op	     905 allocs/op
BenchmarkComputeRoot/128-leaves-12        	     100	    237664 ns/op	  113758 B/op	    1804 allocs/op
BenchmarkComputeRoot/256-leaves-12        	     100	    462802 ns/op	  234241 B/op	    3604 allocs/op
```

### Solution 2: Update MinNamespace/MaxNamespace to return a slice of the existing array rather than creating a new underlying array

Copying the slice prevents [array-capacity memory leaks](https://itnext.io/mastering-memory-management-in-go-avoiding-slice-related-leaks-7d5e97ee48d4) in case the namespace ID variable's lifetime is longer than the lifetime of the original slice, but I don't think that is the case here. [Slicing with a max capacity also prevents appends to the namespace ID slice from editing the underlying](https://build-your-own.org/blog/20230316_go_full_slice/) `ns|ns|hash` array.
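In isolation, the full (three-index) slice expression behaves like this; `minNamespace` here is a simplified stand-in assuming a `ns_min||ns_max||hash` node layout with `idSize`-byte IDs:

```go
package main

import "fmt"

// minNamespace returns the min namespace ID as a view into the node.
// The full slice expression caps the result's capacity at idSize, so a
// later append on it must reallocate instead of writing into ns_max.
func minNamespace(node []byte, idSize int) []byte {
	return node[0:idSize:idSize]
}

func main() {
	node := []byte{1, 2, 0xFF} // idSize=1: ns_min=1, ns_max=2, hash=0xFF
	id := minNamespace(node, 1)
	grown := append(id, 9) // forces a copy; node[1] stays untouched
	fmt.Println(node[1], grown[1]) // 2 9
}
```

Without the third index (`node[0:idSize]`), the result would have capacity `len(node)`, and that same append would silently overwrite `ns_max` in place.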

Benchmark output after this change
```
BenchmarkComputeRoot/64-leaves-12         	     100	    136436 ns/op	   58180 B/op	     653 allocs/op
BenchmarkComputeRoot/128-leaves-12        	     100	    241836 ns/op	  109747 B/op	    1297 allocs/op
BenchmarkComputeRoot/256-leaves-12        	     100	    470231 ns/op	  225924 B/op	    2583 allocs/op
```

[New Mem
Flamegraph](https://flamegraph.com/share/71b8ef18-c86a-11ef-b027-3a58f36d25ce)
[New CPU
Flamegraph](https://flamegraph.com/share/7f46d1b4-c86a-11ef-9832-26c3e5347170)

---

The improvements made by the first solution are made redundant by the
second, but I was initially reluctant to replace the slice copy in case
it would lead to a memory leak. As I got more familiar with the package,
I became more confident that slicing the array may be more
appropriate.

**Note**
- Given that validation is done in the `Push` method, adding an
`UnsafeHashNode` that skips validation was also a method I considered,
but I felt that this could lead to more complications & would not make
any notable differences compared to either of the other solutions.
- After reworking the `ValidateSiblings` function, I also removed the
associated tests, but it looks like the same test cases also exist in
`TestValidateNodes`
[HERE](https://github.com/bruscar010/nmt/blob/main/hasher_test.go#L684)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Enhanced namespace validation methods with improved error reporting
and context.
- Introduced `nsIDRange` struct for better namespace ID range
management.

- **Refactor**
- Streamlined namespace-related functions in `MinNamespace` and
`MaxNamespace`.
- Improved code readability and maintainability in namespace merkle tree
hasher.

- **Tests**
	- Added a benchmark for `ComputeRoot` with 20,000 leaves.
	- Removed existing sibling node validation test.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: francis <[email protected]>
cristaloleg commented:

Should we close this after #285 & #287?

Wondertan commented:

I would also bump the node version, run it with this change, and check the diff in Pyroscope.
