Fix connection reaper deadlock.
Added an explicitly run test to detect the deadlock.
Use ```go test -tags=TestConnReaperDeadlockLoop -run ^TestConnReaperDeadlockLoop$``` to run it.
Be aware that the test generates a lot of sockets in TIME_WAIT.
Without the patch to socket.go, it breaks quickly on both fast and slow machines.
With the patch, it runs much longer but slows down as sockets pile up in TIME_WAIT.
Sergey Egorov committed Jan 10, 2024
1 parent 16ca7c0 commit b4f365b
Showing 2 changed files with 36 additions and 2 deletions.
11 changes: 9 additions & 2 deletions socket.go
@@ -11,6 +11,7 @@ import (
 	"log"
 	"net"
 	"os"
+	"slices"

Check failure on line 14 in socket.go, GitHub Actions / main (ubuntu-latest, 1.20.x):
package slices is not in GOROOT (/opt/hostedtoolcache/go/1.20.12/x64/src/slices)

 	"sort"
 	"strings"
 	"sync"
@@ -384,10 +385,16 @@ func (sck *socket) connReaper() {
 			return
 		}
 
-		for _, c := range sck.closedConns {
+		// Clone the known closed connections to avoid a data race,
+		// then remove them with the reaper lock released.
+		// This resolves the deadlock from #149 in a simpler way.
+		cc := slices.Clone(sck.closedConns)
+		sck.closedConns = nil
+		sck.reaperCond.L.Unlock()
+		for _, c := range cc {
 			sck.rmConn(c)
 		}
-		sck.closedConns = nil
+		sck.reaperCond.L.Lock()
 	}
 }
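
The change above follows a common pattern: snapshot the shared slice while holding the lock, clear it, release the lock, and only then call code that may need locks of its own. The sketch below illustrates that pattern in isolation. The `reaper` type, its fields, and the `remove` and `drain` methods are hypothetical stand-ins, not the library's types, and having `remove` take the same mutex is an assumption made only to show why the lock must be released first. The copy uses `append` rather than `slices.Clone`, because the standard `slices` package was only added in Go 1.21, which is consistent with the check failure reported on Go 1.20.x.

```go
package main

import (
	"fmt"
	"sync"
)

// reaper is a hypothetical stand-in for a socket: it guards a list of
// closed connection IDs with a mutex that remove() also needs.
type reaper struct {
	mu      sync.Mutex
	closed  []int
	removed []int
}

// remove acquires the lock itself. Calling it while mu is already held
// by the same goroutine would block forever.
func (r *reaper) remove(id int) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.removed = append(r.removed, id)
}

// drain snapshots the pending IDs under the lock, clears the shared
// slice, and releases the lock before calling remove on each ID.
func (r *reaper) drain() int {
	r.mu.Lock()
	// append-based copy; does not require the standard "slices" package.
	pending := append([]int(nil), r.closed...)
	r.closed = nil
	r.mu.Unlock()

	for _, id := range pending {
		r.remove(id)
	}
	return len(pending)
}

func main() {
	r := &reaper{closed: []int{1, 2, 3}}
	n := r.drain()
	fmt.Println(n, r.removed, len(r.closed)) // prints "3 [1 2 3] 0"
}
```

Because the shared slice is cleared before the lock is released, entries appended by other goroutines during the removal loop are kept for the next pass instead of being lost.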

27 changes: 27 additions & 0 deletions socket_loop_test.go
@@ -0,0 +1,27 @@
+//go:build TestConnReaperDeadlockLoop
+
+package zmq4_test
+
+import (
+	"log"
+	"testing"
+	"time"
+)
+
+// Use ```go test -tags=TestConnReaperDeadlockLoop -run ^TestConnReaperDeadlockLoop$```
+func TestConnReaperDeadlockLoop(t *testing.T) {
+	n := uint64(0)
+	for {
+		n++
+		if n%100 == 0 {
+			log.Printf("%d ...", n)
+		}
+		timer := time.AfterFunc(30*time.Second, func() {
+			log.Fatalf("failed at %d!!!", n)
+		})
+
+		TestConnReaperDeadlock(t)
+
+		timer.Stop()
+	}
+}
