Foreign key constraint violation when storing trigger to non-durable job on MariaDB/MySQL #1085

Open
t-beckmann opened this issue Dec 8, 2023 · 0 comments · May be fixed by #1086

When scheduling a trigger for a non-durable job, the foreign key from trigger to job can be violated on MariaDB/MySQL. This happens when another trigger for the same job completes concurrently while the job is being stored with the replace option set.

MariaDB/MySQL implement consistent nonlocking reads and multi-versioned concurrency control, see https://dev.mysql.com/doc/refman/8.0/en/innodb-consistent-read.html. The JobStoreSupport.storeJob(Connection, JobDetail, boolean) method uses an if-exists-update-else-insert approach, which in the case above exhibits a race specific to the way InnoDB is implemented. Here is what happens:

  1. The SELECT of jobExists reads the existing job row from the database. This is a nonlocking read.
  2. Now the job execution completes in a concurrent transaction. The trigger of that execution is deleted and, since it is the last one referencing the non-durable job, the job is deleted as well. This concurrent transaction commits.
  3. The operation of step 1 continues and updateJobDetail issues an UPDATE statement. Because of multi-versioned concurrency control, the delete of step 2 is now visible, so the update does not hit any row.
  4. Next, a trigger for the job is inserted, which results in the foreign key constraint violation because the job no longer exists.
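
The interleaving in steps 1 to 4 can be replayed with a small in-memory model. This is only a sketch of the ordering problem, not Quartz code: the class StoreJobRaceSketch, the map and set standing in for the job and trigger tables, and all method bodies are hypothetical. Only the names jobExists and the roles of the steps come from the description above.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class StoreJobRaceSketch {
    // Hypothetical stand-ins for the job and trigger tables.
    static final Map<String, String> jobs = new HashMap<>();
    static final Set<String> triggers = new HashSet<>(); // entries look like "trigger->job"

    static boolean jobExists(String job) {
        return jobs.containsKey(job);
    }

    // Mimics an UPDATE statement: returns the number of rows changed.
    static int updateJob(String job, String data) {
        return jobs.replace(job, data) != null ? 1 : 0;
    }

    // Mimics the foreign key from trigger to job.
    static void insertTrigger(String trigger, String job) {
        if (!jobs.containsKey(job)) {
            throw new IllegalStateException("foreign key violation: job " + job + " does not exist");
        }
        triggers.add(trigger + "->" + job);
    }

    // Replays steps 1 to 4 in order; returns true if the foreign key check fails.
    static boolean replayRace() {
        jobs.clear();
        triggers.clear();
        jobs.put("job1", "v1");
        triggers.add("t1->job1");

        // Step 1: the existence check sees the job.
        boolean exists = jobExists("job1");

        // Step 2: the concurrent transaction removes the last trigger and the job.
        triggers.remove("t1->job1");
        jobs.remove("job1");

        // Step 3: update-or-insert based on the stale check; the row count is ignored.
        if (exists) {
            updateJob("job1", "v2"); // changes 0 rows
        } else {
            jobs.put("job1", "v2");
        }

        // Step 4: inserting the new trigger fails because the job is gone.
        try {
            insertTrigger("t2", "job1");
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("foreign key violated: " + replayRace());
    }
}
```

In this model the update in step 3 silently changes zero rows, which is the point where the stale result of the existence check goes unnoticed.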

I was able to reproduce this more or less reliably in an integration test of a larger software system we develop. The race condition hits us in roughly 1 out of 100 scheduleJob(jobDetail, trigger, true) operations. I'll shortly add a PR that fixes this by checking the row count returned by the update statement.
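
A minimal sketch of what a row-count check could look like, again against a hypothetical in-memory table rather than the real JobStoreSupport. The issue only states that the returned row count will be checked. Falling back to an insert when zero rows were updated is an assumption made for this illustration and may differ from what the PR does. The class RowCountCheckSketch and its methods are invented names.

```java
import java.util.HashMap;
import java.util.Map;

public class RowCountCheckSketch {
    // Hypothetical stand-in for the job table.
    static final Map<String, String> jobs = new HashMap<>();

    // Mimics an UPDATE statement: returns the number of rows changed.
    static int updateJob(String job, String data) {
        return jobs.replace(job, data) != null ? 1 : 0;
    }

    // Mimics an INSERT statement with a primary key on the job name.
    static void insertJob(String job, String data) {
        if (jobs.putIfAbsent(job, data) != null) {
            throw new IllegalStateException("duplicate key: " + job);
        }
    }

    // existedAtRead is the possibly stale result of the earlier existence check.
    static String storeJob(String job, String data, boolean existedAtRead) {
        if (existedAtRead) {
            if (updateJob(job, data) > 0) {
                return "updated";
            }
            // Zero rows updated: the job was deleted after the check.
            // Assumption for this sketch: fall through and insert it again.
        }
        insertJob(job, data);
        return "inserted";
    }

    public static void main(String[] args) {
        // The job was seen by the check but deleted before the update.
        System.out.println(storeJob("job1", "v2", true));
        System.out.println(jobs.containsKey("job1"));
    }
}
```

With the job row present again after storeJob returns, a subsequent trigger insert would find the row its foreign key refers to.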
