Duplicate Spawning of Jobs #51

Open
thevaibhav-dixit opened this issue Mar 18, 2024 · 2 comments · May be fixed by #52

Comments

@thevaibhav-dixit

Concurrent polling caused duplicate jobs to be spawned. There is no row-locking mechanism in place, so when multiple instances run concurrently they can select the same messages, which leads to duplicates.
This is the query I have identified as causing the issue.
Potential fix:
Use FOR UPDATE to lock the selected rows and prevent concurrent access.
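
To make the suggestion concrete, here is a rough sketch of what a claim query with the locking clause on the inner SELECT could look like. The `jobs` table, its `id` and `status` columns, and the `build_claim_query` helper are all hypothetical names used for illustration, not the project's actual schema or query. `SKIP LOCKED` is an optional extra that goes beyond the plain `FOR UPDATE` proposed above: it lets a second worker pass over rows that are already locked instead of waiting on them.

```python
# Hypothetical schema: a `jobs` table with `id` and `status` columns.
# These names are illustrative and are not taken from the project.


def build_claim_query(skip_locked: bool = True) -> str:
    """Return a PostgreSQL statement that claims a batch of pending jobs.

    The locking clause sits on the inner SELECT, so rows are locked at
    the moment they are chosen rather than when the outer UPDATE
    reaches them. `$1` is the batch size parameter.
    """
    lock_clause = "FOR UPDATE SKIP LOCKED" if skip_locked else "FOR UPDATE"
    return (
        "UPDATE jobs\n"
        "SET status = 'running'\n"
        "WHERE id IN (\n"
        "    SELECT id FROM jobs\n"
        "    WHERE status = 'pending'\n"
        "    ORDER BY id\n"
        "    LIMIT $1\n"
        f"    {lock_clause}\n"
        ")\n"
        "RETURNING id"
    )
```

Calling `build_claim_query(skip_locked=False)` gives the plain `FOR UPDATE` variant, in which a concurrent worker blocks until the first transaction finishes instead of skipping the locked rows.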

@Diggsey
Owner

Diggsey commented Mar 18, 2024

UPDATE queries automatically take row-level locks, but perhaps the locks from the outer query happen too late.

Do you have a test-case that can reproduce the issue?

@bodymindarts

The issue is not in the UPDATE query; it is in the inner SELECT query. The SELECT runs in its own context, and the fact that the outer UPDATE locks the rows does not prevent the inner SELECT from running concurrently in another worker.

We observed duplicate job_ids running after spawning ~200 jobs in quick succession and having them worked off by 2 workers in parallel.

The PR fixed the behaviour.
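
For anyone who wants to see the interleaving spelled out, below is a toy, in-memory model of the race. It is a simplified sketch and does not reproduce PostgreSQL's actual locking or re-check behaviour; `JobTable`, `select_pending`, `mark_running` and `run_interleaved` are made-up names. The point is only the ordering: both workers run their inner SELECT before either runs its UPDATE.

```python
class JobTable:
    """In-memory stand-in for a jobs table (hypothetical, for illustration)."""

    def __init__(self, job_ids):
        self.status = {job_id: "pending" for job_id in job_ids}
        self.locked = set()

    def select_pending(self, limit, lock=False):
        """Inner SELECT: pick up to `limit` pending rows in id order.

        With lock=True, rows locked by another worker are skipped and
        the picked rows are locked, loosely mimicking FOR UPDATE SKIP LOCKED.
        """
        picked = []
        for job_id in sorted(self.status):
            if len(picked) == limit:
                break
            if self.status[job_id] != "pending":
                continue
            if lock:
                if job_id in self.locked:
                    continue  # another worker already holds this row
                self.locked.add(job_id)
            picked.append(job_id)
        return picked

    def mark_running(self, job_ids):
        """Outer UPDATE: mark the chosen rows as running and release locks."""
        for job_id in job_ids:
            self.status[job_id] = "running"
            self.locked.discard(job_id)
        return list(job_ids)


def run_interleaved(lock):
    """Two workers both run their SELECT before either runs its UPDATE."""
    table = JobTable(range(1, 5))
    picked_a = table.select_pending(limit=2, lock=lock)
    picked_b = table.select_pending(limit=2, lock=lock)
    claimed_a = table.mark_running(picked_a)
    claimed_b = table.mark_running(picked_b)
    return claimed_a, claimed_b
```

In this model `run_interleaved(lock=False)` returns `([1, 2], [1, 2])`, so both workers claim the same jobs, while `run_interleaved(lock=True)` returns `([1, 2], [3, 4])`.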
