Implement fast/shallow node copy #136

Open
MrCreosote opened this issue Jan 16, 2022 · 0 comments

MrCreosote commented Jan 16, 2022

Currently copying a node entails copying the S3 file and then copying the Mongo node. This deep copy occurs because if two nodes point to the same S3 file and one of them is deleted, the delete operation will delete the S3 file and leave the remaining node with a dangling reference.

To allow for shallow copies, which should take milliseconds as opposed to potentially hours:

On a delete request:

  • Place a delete event for the S3 file, with a timestamp, in a new Mongo collection uniquely indexed by the S3 ID.
    • Any new deletion events for that file should update the timestamp atomically, e.g. with an update that has upsert enabled.
  • Delete the node, but not the S3 file.
  • Have a thread that checks for delete events that are older than some reasonable time such that we can be sure any copy requests have completed.
  • For each delete event that is old enough to process, check whether any nodes still point to the S3 file.
  • After the check, ensure the event timestamp hasn't changed. If it has, abort.
    • This prevents the case where, after the check starts, a node is deleted while a copy of it is still in progress. If that happens, the file could be deleted even though the copy will produce a pointer to the file.
  • If pointers to the file exist, delete the delete event if the timestamp hasn't changed, otherwise do nothing.
    • Needs an index on the file pointer
  • If not, delete the S3 file and then delete the delete event.
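
The delete-side steps above can be sketched as follows. This is only a minimal in-memory model, not the service's code: `Store`, `delete_node`, `reap`, and `min_age` are hypothetical names, plain dicts and sets stand in for the Mongo collections and the S3 bucket, and the "reasonable time" is passed in as a parameter.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Store:
    """In-memory stand-in for the Mongo collections and the S3 bucket."""
    nodes: dict = field(default_factory=dict)          # node ID -> file ID
    files: set = field(default_factory=set)            # file IDs present in "S3"
    delete_events: dict = field(default_factory=dict)  # file ID -> timestamp


def delete_node(store, node_id, now=None):
    """Record a delete event for the node's file, then remove only the node."""
    file_id = store.nodes[node_id]
    # In MongoDB this would be one update with upsert enabled, so a repeated
    # event for the same file just refreshes the timestamp.
    store.delete_events[file_id] = time.time() if now is None else now
    del store.nodes[node_id]


def reap(store, min_age, now=None):
    """One pass of the background thread. Returns the file IDs it deleted."""
    now = time.time() if now is None else now
    deleted = []
    for file_id, seen_ts in list(store.delete_events.items()):
        if now - seen_ts < min_age:
            continue  # too recent; a copy request may still be in flight
        # A real node collection needs an index on the file pointer here.
        referenced = any(f == file_id for f in store.nodes.values())
        # Re-read the timestamp. If it moved, another delete arrived during
        # the check, so abort and let a later pass handle the event.
        if store.delete_events.get(file_id) != seen_ts:
            continue
        if not referenced:
            store.files.discard(file_id)  # delete the S3 file first
            deleted.append(file_id)
        del store.delete_events[file_id]  # then delete the delete event
    return deleted


store = Store(nodes={"n1": "f1", "n2": "f1"}, files={"f1"})
delete_node(store, "n1", now=0)
print(reap(store, min_age=60, now=100))  # [] because n2 still points at f1
delete_node(store, "n2", now=200)
print(reap(store, min_age=60, now=300))  # ['f1']
```

Because this sketch is single-threaded, the timestamp re-check never fires here. Against a real database the final removal of the event would presumably be a delete conditioned on the timestamp seen earlier, so that the comparison and the delete happen atomically.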

On a node copy, create a new node pointing to the same S3 file rather than copying the file.
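
A shallow copy along these lines might look like the sketch below. The `Node` fields and the `shallow_copy` helper are hypothetical and are not taken from the actual node schema. The point is only that the copy gets a fresh node ID while reusing the source's file pointer.

```python
import uuid
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Node:
    node_id: str
    file_id: str  # pointer to the S3 object (hypothetical field name)
    owner: str


def shallow_copy(node, new_owner):
    """Return a new node that shares the source node's S3 file."""
    # Only metadata is duplicated; no S3 data is read or written.
    return replace(node, node_id=str(uuid.uuid4()), owner=new_owner)


original = Node(node_id="n1", file_id="f1", owner="alice")
copy = shallow_copy(original, new_owner="bob")
print(copy.file_id == original.file_id)  # True
```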
