WIP: 🐛 Clean removal of output files #6986

Draft: giancarloromeo wants to merge 2 commits into master

giancarloromeo
Contributor

@giancarloromeo giancarloromeo commented Dec 20, 2024

What do these changes do?

TL;DR: Removing (output) files that are referenced by Project Nodes leaves the system in an inconsistent state.

Currently, the WebServer is the first responder: it forwards deletion requests to the Storage service, which physically deletes the file(s).
However, the Project Nodes are not updated accordingly, and Director v2 is not informed about the changes.
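
The inconsistency can be illustrated with a minimal sketch. It is not the actual service code: `storage_files`, `project_nodes`, the file path and both helper functions are hypothetical stand-ins, assuming a node output simply stores the identifier of a file held by the storage side.

```python
# Hypothetical, simplified stand-ins for the storage and project data
# (all names and the file path are assumptions made for illustration).
storage_files = {"project-1/node-a/out.csv"}

# A project node whose output points at a stored file.
project_nodes = {
    "node-a": {"outputs": {"out_1": "project-1/node-a/out.csv"}},
}


def delete_file_current_flow(file_id: str) -> None:
    """Mimic the current flow: only the storage side is touched."""
    storage_files.discard(file_id)


def dangling_references() -> list[tuple[str, str]]:
    """Return (node_id, output_key) pairs pointing at files no longer stored."""
    return [
        (node_id, key)
        for node_id, node in project_nodes.items()
        for key, file_id in node["outputs"].items()
        if file_id not in storage_files
    ]


delete_file_current_flow("project-1/node-a/out.csv")
print(dangling_references())  # the node output still references the deleted file
```

After the deletion, the node output still carries a reference to a file that no longer exists, which is the state the next service would trip over.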


Proposal

To maintain decoupling while ensuring consistency between the storage and project services, I recommend using an Event-Driven Architecture:

  • Event Broker (e.g., RabbitMQ):
    • Introduce an event broker as a central communication channel.
    • When the storage service deletes a file, it publishes a FileDeleted event with relevant metadata.
  • Storage Service:
    • Handles the physical deletion of files in S3.
    • Keeps a local (soft) reference counter for each file, so that basic deletion decisions can be made quickly without querying other services.
    • Publishes the FileDeleted event once the deletion is successful.
    • Subscribes to FileReferenced / FileDereferenced events from the event broker.
    • When a FileReferenced / FileDereferenced event is received, it updates the soft reference count associated with that file.
  • Project Service:
    • Publishes FileReferenced / FileDereferenced events once a file is referenced/dereferenced.
    • Subscribes to FileDeleted events from the event broker.
    • When a FileDeleted event is received, it checks whether the file is referenced in any project and updates the project data accordingly (e.g., marks the file as deleted or removes the reference from the Nodes).
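
The proposed flow could look roughly like the sketch below. It is only an illustration under stated assumptions: `InMemoryBroker` is a synchronous, in-process stand-in for a real broker such as RabbitMQ, and the class names, method names and the `Event` shape are all hypothetical. Having the project side publish FileDereferenced while it handles FileDeleted is one possible reading of the proposal, not something the description prescribes.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Event:
    name: str  # "FileDeleted", "FileReferenced" or "FileDereferenced"
    file_id: str


class InMemoryBroker:
    """Stand-in for a real broker: synchronous fan-out to subscribers."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Callable[[Event], None]]] = defaultdict(list)

    def subscribe(self, name: str, handler: Callable[[Event], None]) -> None:
        self._handlers[name].append(handler)

    def publish(self, event: Event) -> None:
        for handler in list(self._handlers[event.name]):
            handler(event)


class StorageService:
    def __init__(self, broker: InMemoryBroker) -> None:
        self._broker = broker
        self.files: set[str] = set()
        self.ref_counts: dict[str, int] = defaultdict(int)  # soft reference counts
        broker.subscribe("FileReferenced", self._on_referenced)
        broker.subscribe("FileDereferenced", self._on_dereferenced)

    def _on_referenced(self, event: Event) -> None:
        self.ref_counts[event.file_id] += 1

    def _on_dereferenced(self, event: Event) -> None:
        self.ref_counts[event.file_id] = max(0, self.ref_counts[event.file_id] - 1)

    def delete_file(self, file_id: str) -> None:
        self.files.discard(file_id)  # the physical deletion in S3 would happen here
        self._broker.publish(Event("FileDeleted", file_id))


class ProjectService:
    def __init__(self, broker: InMemoryBroker) -> None:
        self._broker = broker
        self.node_outputs: dict[str, str] = {}  # node_id -> file_id
        broker.subscribe("FileDeleted", self._on_file_deleted)

    def reference(self, node_id: str, file_id: str) -> None:
        self.node_outputs[node_id] = file_id
        self._broker.publish(Event("FileReferenced", file_id))

    def _on_file_deleted(self, event: Event) -> None:
        # Remove every node reference to the deleted file and announce it.
        stale = [n for n, f in self.node_outputs.items() if f == event.file_id]
        for node_id in stale:
            del self.node_outputs[node_id]
            self._broker.publish(Event("FileDereferenced", event.file_id))


broker = InMemoryBroker()
storage = StorageService(broker)
projects = ProjectService(broker)

storage.files.add("out.csv")
projects.reference("node-a", "out.csv")
storage.delete_file("out.csv")
print(projects.node_outputs, storage.ref_counts["out.csv"])
```

Neither service calls the other directly; both depend only on the broker and the event names. With a real broker, delivery would be asynchronous, so consumers would also need to tolerate duplicated or delayed events.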

Event Types Overview

| Event Type | Producer(s) | Consumer(s) |
|---|---|---|
| FileDeleted | Storage service | Projects service, ... |
| FileReferenced | Projects service, ... | Storage service |
| FileDereferenced | Projects service, ... | Storage service |
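
As a sketch of what the event payloads might carry on the wire, the three event types can be modeled as small dataclasses serialized to JSON. The field names (`file_id`, `project_id`, `node_id`), the envelope layout and the `encode`/`decode` helpers are assumptions for illustration; the proposal only says that events carry "relevant metadata".

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class FileDeleted:
    file_id: str  # assumed field


@dataclass(frozen=True)
class FileReferenced:
    file_id: str  # assumed fields
    project_id: str
    node_id: str


@dataclass(frozen=True)
class FileDereferenced:
    file_id: str  # assumed fields
    project_id: str
    node_id: str


# Lookup used by consumers to rebuild the right event class from a message.
EVENT_TYPES = {
    cls.__name__: cls for cls in (FileDeleted, FileReferenced, FileDereferenced)
}


def encode(event) -> bytes:
    """Wrap the event in a {"type", "data"} envelope and serialize it to JSON."""
    envelope = {"type": type(event).__name__, "data": asdict(event)}
    return json.dumps(envelope).encode("utf-8")


def decode(body: bytes):
    """Rebuild the event object from a message body produced by encode()."""
    envelope = json.loads(body)
    return EVENT_TYPES[envelope["type"]](**envelope["data"])


original = FileReferenced(file_id="out.csv", project_id="project-1", node_id="node-a")
assert decode(encode(original)) == original
```

Carrying the event type in the envelope lets a single consumer queue handle several event types and lets new producers be added without changing the consumers.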

Related issue/s

How to test

Dev-ops checklist

@giancarloromeo giancarloromeo self-assigned this Dec 20, 2024
@giancarloromeo giancarloromeo added the a:webserver (issue related to the webserver service) and a:storage (issue related to storage service) labels Dec 20, 2024
@giancarloromeo giancarloromeo added this to the Event Horizon milestone Dec 20, 2024
@giancarloromeo giancarloromeo changed the title from "🐛 Clean output files removal" to "WIP: 🐛 Clean output files removal" Dec 20, 2024

codecov bot commented Dec 20, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 77.49%. Comparing base (493488c) to head (abc4ce0).

❗ There is a different number of reports uploaded between BASE (493488c) and HEAD (abc4ce0).

HEAD has 26 fewer uploads than BASE

| Flag | BASE (493488c) | HEAD (abc4ce0) |
|---|---|---|
| unittests | 29 | 3 |
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6986      +/-   ##
==========================================
- Coverage   85.61%   77.49%   -8.13%     
==========================================
  Files        1621      649     -972     
  Lines       63993    31679   -32314     
  Branches     2035      262    -1773     
==========================================
- Hits        54789    24550   -30239     
+ Misses       8871     7069    -1802     
+ Partials      333       60     -273     
| Flag | Coverage Δ |
|---|---|
| integrationtests | 64.86% <ø> (+0.27%) ⬆️ |
| unittests | 79.91% <ø> (-4.49%) ⬇️ |
| Components | Coverage Δ |
|---|---|
| api | ∅ <ø> (∅) |
| pkg_aws_library | ∅ <ø> (∅) |
| pkg_dask_task_models_library | ∅ <ø> (∅) |
| pkg_models_library | ∅ <ø> (∅) |
| pkg_notifications_library | ∅ <ø> (∅) |
| pkg_postgres_database | ∅ <ø> (∅) |
| pkg_service_integration | ∅ <ø> (∅) |
| pkg_service_library | ∅ <ø> (∅) |
| pkg_settings_library | ∅ <ø> (∅) |
| pkg_simcore_sdk | 77.37% <ø> (-8.02%) ⬇️ |
| agent | ∅ <ø> (∅) |
| api_server | ∅ <ø> (∅) |
| autoscaling | ∅ <ø> (∅) |
| catalog | ∅ <ø> (∅) |
| clusters_keeper | ∅ <ø> (∅) |
| dask_sidecar | ∅ <ø> (∅) |
| datcore_adapter | ∅ <ø> (∅) |
| director | ∅ <ø> (∅) |
| director_v2 | 78.75% <ø> (-12.67%) ⬇️ |
| dynamic_scheduler | ∅ <ø> (∅) |
| dynamic_sidecar | 59.86% <ø> (-29.89%) ⬇️ |
| efs_guardian | ∅ <ø> (∅) |
| invitations | ∅ <ø> (∅) |
| osparc_gateway_server | ∅ <ø> (∅) |
| payments | ∅ <ø> (∅) |
| resource_usage_tracker | ∅ <ø> (∅) |
| storage | ∅ <ø> (∅) |
| webclient | ∅ <ø> (∅) |
| webserver | 79.88% <ø> (-0.02%) ⬇️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 493488c...abc4ce0. Read the comment docs.

@giancarloromeo giancarloromeo changed the title from "WIP: 🐛 Clean output files removal" to "WIP: 🐛 Clean removal of output files" Dec 30, 2024
Labels
a:storage (issue related to storage service), a:webserver (issue related to the webserver service)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Removing a linked output file in a service, makes the next service unrunnable without clear reason
1 participant