Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix IDP deadlock due to recursive locking #553

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

agis
Copy link

@agis agis commented Feb 14, 2024

While using the IDP server for integration tests, I've experienced a deadlock after issuing many concurrent requests to the following endpoints:

  • PUT /services/:id
  • GET /sso

Apparently, the issue is that the IDP code attempts to acquires locks on the Server.idpConfigMu recursively:

  1. The /sso HTTP handler calls Server.idpConfigMu.RLock()
  2. Before releasing the above lock, this same handler eventually calls s.GetServiceProvider(), which in turn attempts to acquire the same lock by calling Server.idpConfigMu.RLock()
  3. Concurrent requests to PUT /services/:id acquire a write lock by calling Server.idpConfigMu.Lock()

Step (2) is a recursive lock, which won't work, as stated by the documentation of sync.RWLock[1]:

If a goroutine holds a RWMutex for reading and another goroutine
might call Lock, no goroutine should expect to be able to acquire a
read lock until the initial read lock is released. In particular,
this prohibits recursive read locking. This is to ensure that the
lock eventually becomes available; a blocked Lock call excludes new
readers from acquiring the lock.

This patch fixes the issue by removing the unnecessary lock acquired in step (1). This was redundant since GetServiceProvider() already takes care of serializing access to the s.serviceProviders map, which is the resource in question.

[1] https://golang.org/pkg/sync/#RWMutex

While using the IDP server for integration tests, I've experienced a
deadlock after issuing many concurrent requests to the following
endpoints:

- PUT /services/:id
- GET /sso

Apparently, the issue is that the IDP code attempts to acquires locks on
the Server.idpConfigMu recursively:

1. The /sso HTTP handler calls Server.idpConfigMu.RLock()
2. Before releasing the above lock, this same handler eventually calls
   s.GetServiceProvider, which in turn attempts to acquire the same lock
   by calling Server.idpConfigMu.RLock()
3. Concurrent requests to `PUT /services/:id` acquire a write lock by
   calling Server.idpConfigMu.Lock()

Step (2) is a recursive lock, which won't work, as stated by the
documentation of sync.RWLock[1]:

    If a goroutine holds a RWMutex for reading and another goroutine
    might call Lock, no goroutine should expect to be able to acquire a
    read lock until the initial read lock is released. In particular,
    this prohibits recursive read locking. This is to ensure that the
    lock eventually becomes available; a blocked Lock call excludes new
    readers from acquiring the lock.

This patch fixes the issue by removing the unnecessary lock acquired in
step (1). This was redundant since GetServiceProvider() already takes
care of serializing access to the s.serviceProviders map, which is the
resource in question.

[1] https://golang.org/pkg/sync/#RWMutex
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant