Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cli: add command to update users disk quota usage #92

Merged

Conversation

mvidalgarcia
Copy link
Member

@mvidalgarcia mvidalgarcia commented Oct 7, 2020

@mvidalgarcia mvidalgarcia force-pushed the rwc/333-nightly-disk-update branch from ad8ee37 to 973d442 Compare October 7, 2020 13:45
@@ -523,6 +511,7 @@ def workflow_status_change_listener(workflow, new_status, old_status, initiator)
user_resource_quota.quota_used += cpu_milliseconds
Session.add(workflow_resource)
Session.commit()
update_users_disk_quota(user=workflow.owner)
Copy link
Member Author

@mvidalgarcia mvidalgarcia Oct 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This update probably needs to be done in some other places:

  • Workflow deletion
  • File upload (reana-client upload -w roofit myscript.py)
  • Some others?

WDYT?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another thing, here we're calculating and storing WorkflowResource for the CPU but we're no calculating that anywhere else for disk. How should we address this? I think for disk is trickier as it's not necessarily accumulative as a user can add and remove files from the workflow workspace. Probably in this case it needs to be calculated every time? I think the value corresponds to get_workspace_disk_usage function directly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think:

  1. Workflow deletion (reana-client delete -w w1) -> rerun the full calculation

  2. File upload/deletion (reana-client rm/reana-client upload) -> we can run the full calculation for now but, this can become dangerous:

    • Doing this calculation on every upload could be a very expensive server-side operation, as we upload each file on a different HTTP request.
    • We know how big those files are when uploaded, so in theory we can just increment/decrement the disk usage by the number of bytes the files occupy at workflow and user level.

    We could deal with this outside this PR group, as a standalone issue.

  3. Interactive session closed (reana-client close -w w1) -> run full calculation, no way around it

  4. Workflow finishes/fails/stops (addressing the second question)-> run a full calculation, and overwrite the value in WorkflowResource using get_workspace_disk_usage even at the risk of not being fully correct in the overall as a function of time for now (it could be correct if we implement the fix for 2.). This way we make the workflow list request (mostly coming from the UI) more light (see here). get_workspace_disk_usage could be checking the DB (WorkflowResource) and if there is not calculated disk usage run du and update WorkflowResource.

Fix for 2. and going beyond full calculation in 4. are probably outside of the scope of this PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • 1. Addressed
  • RE 3., what do you think about creating a status change listener in the new InteractiveSession model to run update_users_disk_quota.
  • Maybe we can ticketize the other two based on your comment?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RE 3., what do you think about creating a status change listener in the new InteractiveSession model to run update_users_disk_quota.

Tracked through here #90 (comment)

Maybe we can ticketize the other two based on your comment?

Let's create the tickets once we merge this? That would be a ticket for 2. and another for the improvement of 4., right? I can do it if you want.

@mvidalgarcia mvidalgarcia marked this pull request as ready for review October 7, 2020 16:03
@mvidalgarcia mvidalgarcia force-pushed the rwc/333-nightly-disk-update branch 2 times, most recently from 1991179 to 3bbf227 Compare October 8, 2020 14:18
@diegodelemos diegodelemos merged commit e69538a into reanahub:master Oct 9, 2020
@mvidalgarcia mvidalgarcia deleted the rwc/333-nightly-disk-update branch October 9, 2020 14:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

models: disk usage to be stored not calculated everytime quotas: update HDD usage quotas nightly
2 participants