Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IVS-235 - Bulk insert ModelInstances #326

Merged
merged 4 commits into from
Nov 27, 2024

Conversation

rw-bsi
Copy link
Contributor

@rw-bsi rw-bsi commented Nov 24, 2024

Requires DB constraint to be in place to prevent duplicates / FK constraint errors
(see PR 27)

@rw-bsi rw-bsi changed the title IVS-235 - Bulk insert ModelInstances (WIP) IVS-235 - Bulk insert ModelInstances Nov 24, 2024
@rw-bsi rw-bsi requested review from Ghesselink, civilx64 and aothms and removed request for civilx64 November 24, 2024 19:42
@rw-bsi rw-bsi marked this pull request as ready for review November 24, 2024 19:42
Copy link
Collaborator

@aothms aothms left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the combination of ModelInstance and ValidationOutcome bulk_creation better be wrapped in a transaction.atomic? Just so that the chance of orphaned instances is reduced even more.

Also, I still don't really know how this now actually works with transactions across multiple tasks in parallel and the isolation level of the transactions. It would be good to at least conceptually understand when multiple tasks in parallel try to create the same instances simultaneously. Should we set the isolation level to Serializable?

ValidationOutcome.objects.bulk_create(outcomes_to_save)
outcomes_instances_to_save = list()

if outcomes_to_save:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if outcomes_to_save:
if outcomes_to_save and context.validation_task_id is not None:

Any reason why you removed the ... and context.validation_task_id is not None? The code is a bit spaghetti with all the operating modes, I guess it doesn't hurt to keep it in?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's at best redundant, worst it silently fails because of a failed assertion (has outcomes, but somehow task_id is lost?) and I prefer to bomb out loudly :-)

features/environment.py Outdated Show resolved Hide resolved
features/environment.py Outdated Show resolved Hide resolved
features/environment.py Outdated Show resolved Hide resolved
rw-bsi and others added 2 commits November 26, 2024 21:38
Co-authored-by: Thomas Krijnen <[email protected]>
Co-authored-by: Thomas Krijnen <[email protected]>
@rw-bsi
Copy link
Contributor Author

rw-bsi commented Nov 26, 2024

I think the combination of ModelInstance and ValidationOutcome bulk_creation better be wrapped in a transaction.atomic? Just so that the chance of orphaned instances is reduced even more.

Also, I still don't really know how this now actually works with transactions across multiple tasks in parallel and the isolation level of the transactions. It would be good to at least conceptually understand when multiple tasks in parallel try to create the same instances simultaneously. Should we set the isolation level to Serializable?

There is a transaction.atomic block at line 109, no harm to push it down, but has same effect?

serializable is not required imo - each instance is linked to one or more outcomes, writing instances with ignore_conflicts everywhere prevents duplicates (by definition only from something running in parallel), and reading them all again makes sure we get current and previously created instances. Key is we do it everwhere the same: schema and gherkin_rules

ModelInstance.objects.bulk_create(outcomes_instances_to_save, ignore_conflicts=True) # ignore conflicts with existing
model_instances = dict(ModelInstance.objects.filter(model_id=model_id).values_list('stepfile_id', 'id')) # retrieve all

@rw-bsi rw-bsi requested a review from aothms November 26, 2024 22:02
@rw-bsi rw-bsi self-assigned this Nov 26, 2024
@aothms aothms merged commit 6f4c7ae into development Nov 27, 2024
1 of 2 checks passed
@aothms aothms deleted the feature/IVS-235_bulk_insert_modelinstances branch November 27, 2024 15:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants