Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed sqlserver not actually getting a lock if lock is already set #1186

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

urbim
Copy link

@urbim urbim commented Oct 11, 2024

This PR fixes the SQL Server not actually getting a lock if lock is already set (e.g. by another instance).

As mentioned in this comment by @dhui, the SQL statement for sp_getapplock is not correct.

I tested the locking with this script and added some debugging logging as seen in this commit (debug log is not actually part of this PR, I only used it for testing).

The result before applying the fixes was:

=== Lock 0xc000080400
=== Unlock 0xc000080400
=== Lock 0xc000080400
Locked
=== Lock 0xc0001440a0
=== Unlock 0xc0001440a0
panic: mssql: Cannot release the application lock (Database Principal: 'public', Resource: '983082799') because it is not currently held. in line 0: EXEC sp_releaseapplock @Resource = @p1, @LockOwner = 'Session'

goroutine 1 [running]:
main.main()
        /path/to/golang-migrate-test/golang-migrate-test.go:49 +0x254

The first two lines are from d1, err := p.Open(connectionString). Then d1 acquires the lock (line 3). Then _, err = p.Open(connectionString) // d2 should fail when trying to acquire the second lock, but it does not - the code succeeds but the lock is actually not acquired (line 5). The code actually fails when d2 tries to unlock - it can't because it hasn't acquired the lock in the first place (line 6+).

I'd expect a SELECT statement after the EXEC sp_getapplock statement to be used with sql.Conn.QueryRowContext instead of with sql.Conn.ExecContext().

This is exactly right, the SELECT statement is missing and this PR fixes this, along with some other improvements to the locking:

It's unclear to me what @LockMode (update vs exclusive)

From the documentation the Update should be used when updating resources, which is not the case here. The documentation is not entirely clear to me, but from what I can tell, this is the same as Exclusive with added mechanism, that prevents deadlocks from read-lock-update pattern. I think the Exclusive is the right choice here, so the PR also changes that.

and @LockOwner (transaction vs session) we should be using.

From the documentation:

Locks associated with the current transaction are released when the transaction commits or rolls back. Locks associated with the session are released when the session is logged out.

Since we are acquiring lock outside the transaction, the Session is the correct option to use. I actually tried using the Transaction value as the LockOwner and the result was -999, which "Indicates a parameter validation or other call error".

I also set the timeout of the sp_getapplock to infinite, similar to what is used in the postgres implementation. Since migrations can take a long time (e.g. updating an index on a large database), I think that the only good options for timeout are either infinite or somehow configurable by the user. Since the second option is probably not feasible in the current architecture of this library, I opted for the first one. One PR is already open addressing this timeout, but it would still incorrectly obtain the lock if another instance was holding the lock for more than 10 seconds.

This PR should close #1123 and #253.

I didn't manage to get the tests working locally, but I'm hoping the tests will pass in CI.

Lastly, the result of the above script with fixes applied (and non-infinite timeout for testing):

=== Lock 0xc00043a0a0
=== Unlock 0xc00043a0a0
=== Lock 0xc00043a0a0
Locked
=== Lock 0xc0000800c0
panic: try lock failed with error -1: The lock request timed out. in line 0: DECLARE @lockResult int; EXEC @lockResult = sp_getapplock @Resource = @p1, @LockMode = 'Exclusive', @LockOwner = 'Session', @LockTimeout = 1000; SELECT @lockResult; (details: <nil>)

goroutine 1 [running]:
main.main()
        /path/to/golang-migrate-test.go:42 +0x13c

When d2 tries to acquire the lock, it fails with -1: The lock request timed out after timeout as expected.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant