fix(share/availability): add missing protection for autobatch (#3988)
Initially reported as a panic by @Wondertan:
```
fatal error: concurrent map iteration and map write

goroutine 424607 [running]:
github.com/ipfs/go-datastore/autobatch.(*Datastore).Flush(0xc0015ff300, {0x3aacc20, 0xc00bb3b340})
	/go/pkg/mod/github.com/ipfs/[email protected]/autobatch/autobatch.go:108 +0xa5
github.com/ipfs/go-datastore/autobatch.(*Datastore).Put(...)
	/go/pkg/mod/github.com/ipfs/[email protected]/autobatch/autobatch.go:67
github.com/celestiaorg/celestia-node/share/availability/light.(*ShareAvailability).SharesAvailable(0xc00050ba40, {0x3aacc20, 0xc00bb3b340}, 0xc00694c180)
	/src/share/availability/light/availability.go:154 +0xab9
github.com/celestiaorg/celestia-node/das.(*DASer).sample(0xc0014da1a0, {0x3aacc20, 0xc00bb3b340}, 0xc00694c180)
	/src/das/daser.go:171 +0xc9
github.com/celestiaorg/celestia-node/das.(*worker).sample(0xc00bdb6480, {0x3aacbb0, 0xc002286a00}, 0x2540be4000, 0xc00fa7feff?)
	/src/das/worker.go:121 +0xe2
github.com/celestiaorg/celestia-node/das.(*worker).run(0xc00bdb6480, {0x3aacbb0, 0xc002286a00}, 0x2540be4000, 0xc0018fa3f0)
	/src/das/worker.go:81 +0x254
github.com/celestiaorg/celestia-node/das.(*samplingCoordinator).runWorker.func1()
	/src/das/coordinator.go:112 +0x68
created by github.com/celestiaorg/celestia-node/das.(*samplingCoordinator).runWorker in goroutine 560
	/src/das/coordinator.go:110 +0x317
```

The panic was caused by concurrent access to the autobatch datastore: the Flush call was not protected by a lock. This PR adds the missing mutex protection to the Flush call and to Delete, which was also unprotected.
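The failure mode and the fix can be illustrated with a minimal, self-contained sketch. It does not use the real go-datastore API: `unsafeBuffer`, `guarded`, and `Persisted` are hypothetical names invented for this example, with a plain map standing in for the buffered datastore and `mu` playing the role that `dsLk` plays for `ds` in the commit.

```go
package main

import (
	"fmt"
	"sync"
)

// unsafeBuffer is a hypothetical stand-in for a write-buffering datastore.
// Like any plain Go map, it is not safe for concurrent use.
type unsafeBuffer struct {
	pending map[string][]byte // writes queued in memory
	flushed map[string][]byte // writes already "persisted"
}

func newUnsafeBuffer() *unsafeBuffer {
	return &unsafeBuffer{
		pending: make(map[string][]byte),
		flushed: make(map[string][]byte),
	}
}

func (b *unsafeBuffer) Put(key string, val []byte) { b.pending[key] = val }

func (b *unsafeBuffer) Delete(key string) {
	delete(b.pending, key)
	delete(b.flushed, key)
}

// Flush iterates over the pending map. An unsynchronized Put or Delete
// running during this loop can abort the program with
// "concurrent map iteration and map write".
func (b *unsafeBuffer) Flush() {
	for k, v := range b.pending {
		b.flushed[k] = v
		delete(b.pending, k)
	}
}

// guarded pairs the buffer with a mutex held around every operation,
// including Delete and Flush (the two calls the commit protects).
type guarded struct {
	mu sync.Mutex
	ds *unsafeBuffer
}

func (g *guarded) Put(key string, val []byte) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.ds.Put(key, val)
}

func (g *guarded) Delete(key string) {
	g.mu.Lock()
	g.ds.Delete(key)
	g.mu.Unlock()
}

// Close flushes all queued writes while holding the lock.
func (g *guarded) Close() {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.ds.Flush()
}

// Persisted reports how many entries have been flushed.
func (g *guarded) Persisted() int {
	g.mu.Lock()
	defer g.mu.Unlock()
	return len(g.ds.flushed)
}

func main() {
	g := &guarded{ds: newUnsafeBuffer()}

	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			key := fmt.Sprintf("root-%d", i)
			g.Put(key, []byte("ok"))
			if i%2 == 0 {
				g.Delete(key) // even keys are removed again
			}
			g.Close()
		}(i)
	}
	wg.Wait()

	fmt.Println("persisted entries:", g.Persisted())
}
```

Removing the locking from `Delete` or `Close` in this sketch reintroduces the unsynchronized map access; running it with `go run -race` is one way to surface such a gap even when the fatal error does not happen to fire.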
walldiss authored Dec 9, 2024
1 parent c5008b6 commit 250c129
Showing 1 changed file with 4 additions and 0 deletions.
4 changes: 4 additions & 0 deletions share/availability/light/availability.go
@@ -222,7 +222,9 @@ func (la *ShareAvailability) Prune(ctx context.Context, h *header.ExtendedHeader
 	}

 	// delete the sampling result
+	la.dsLk.Lock()
 	err = la.ds.Delete(ctx, key)
+	la.dsLk.Unlock()
 	if err != nil {
 		return fmt.Errorf("delete sampling result: %w", err)
 	}
@@ -235,5 +237,7 @@ func datastoreKeyForRoot(root *share.AxisRoots) datastore.Key {

 // Close flushes all queued writes to disk.
 func (la *ShareAvailability) Close(ctx context.Context) error {
+	la.dsLk.Lock()
+	defer la.dsLk.Unlock()
 	return la.ds.Flush(ctx)
 }
