workflow model status_listener, get_disk_usage_info_paths fails, du command cannot find files #303

Open
VMois opened this issue Oct 7, 2021 · 0 comments
Labels
type/bug Something isn't working

VMois commented Oct 7, 2021

def get_disk_usage_info_paths(absolute_path, command, name_filter):

This issue is most probably responsible for reanahub/reana-workflow-engine-yadage#202. It also relates to the reana-db and reana-workflow-controller repositories. The error originates in job-status-consumer when job-status messages are processed under high load on a local cluster installation.

All logs below are from job-status-consumer. The `du` failure concerned files of workflow 3580ae6d-f381-479b-babc-3a5033a70605 and happened while the status of workflow 91d19f41-fd8b-4f2b-af79-e6a2ab18e491 was changing to finished.

Logs before error:

...
2021-10-06 14:54:48,128 | root | MainThread | INFO |  [x] Received workflow_uuid: 3580ae6d-f381-479b-babc-3a5033a70605 status: RunStatus.running
...

Error:

...
2021-10-06 14:56:35,497 | root | MainThread | INFO |  [x] Received workflow_uuid: 91d19f41-fd8b-4f2b-af79-e6a2ab18e491 status: RunStatus.finished
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_23.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_36.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_19.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_30.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_08.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_25.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_34.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_12.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_01.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_37.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_40.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_27.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_07.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_00.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_09.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_11.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_10.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_17.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_20.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_06.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_24.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_26.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_38.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_33.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_35.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_22.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_15.png': No such file or directory
2021-10-06 14:56:35,607 | root | MainThread | ERROR | Unexpected error while processing workflow: Command '['du', '-s', '-b', '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows']' returned non-zero exit status 1.
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/reana_workflow_controller/consumer.py", line 100, in on_message
    _update_workflow_status(workflow, next_status, logs)
  File "/usr/local/lib/python3.8/site-packages/reana_workflow_controller/consumer.py", line 138, in _update_workflow_status
    Workflow.update_workflow_status(Session, workflow.id_, status, logs, None)
  File "/code/modules/reana-db/reana_db/models.py", line 658, in update_workflow_status
    raise e
  File "/code/modules/reana-db/reana_db/models.py", line 653, in update_workflow_status
    workflow.status = status
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/attributes.py", line 279, in __set__
    self.impl.set(
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/attributes.py", line 872, in set
    value = self.fire_replace_event(
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/attributes.py", line 880, in fire_replace_event
    value = fn(
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/events.py", line 2174, in wrap
    fn(target, *arg)
  File "/code/modules/reana-db/reana_db/models.py", line 715, in workflow_status_change_listener
    _update_disk_quota(workflow)
  File "/code/modules/reana-db/reana_db/models.py", line 687, in _update_disk_quota
    update_users_disk_quota(user=workflow.owner)
  File "/code/modules/reana-db/reana_db/utils.py", line 242, in update_users_disk_quota
    disk_usage_bytes = get_disk_usage_or_zero(workspace_path)
  File "/code/modules/reana-db/reana_db/utils.py", line 253, in get_disk_usage_or_zero
    disk_bytes = get_disk_usage(workspace_path, summarize=True)
  File "/code/modules/reana-commons/reana_commons/utils.py", line 306, in get_disk_usage
    disk_usage_info = get_disk_usage_info_paths(directory, command, name_filter)
  File "/code/modules/reana-commons/reana_commons/utils.py", line 271, in get_disk_usage_info_paths
    disk_usage_info = subprocess.check_output(command).decode().split()
  File "/usr/local/lib/python3.8/subprocess.py", line 415, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/usr/local/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['du', '-s', '-b', '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows']' returned non-zero exit status 1.
2021-10-06 14:56:35,684 | root | MainThread | INFO |  [x] Received workflow_uuid: e643fb12-a75e-46ad-9135-b2b0362b2695 status: RunStatus.finished
...

After error:

...
2021-10-06 14:56:40,413 | root | MainThread | INFO |  [x] Received workflow_uuid: 3580ae6d-f381-479b-babc-3a5033a70605 status: RunStatus.running
2021-10-06 14:56:40,559 | root | MainThread | INFO |  [x] Received workflow_uuid: 3580ae6d-f381-479b-babc-3a5033a70605 status: RunStatus.finished
...
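
The traceback above shows `subprocess.check_output` raising because `du` exited with status 1 after some files disappeared during the scan. One possible mitigation, sketched here under assumptions, is to run `du` without `check=True` and still parse whatever total it printed. The names `parse_du_summary` and `get_disk_usage_tolerant` are hypothetical and not part of reana-commons, and the sketch assumes GNU `du`, which to my knowledge still prints the summary line for the parts it could read:

```python
import subprocess


def parse_du_summary(stdout: str) -> int:
    """Return the byte count from `du -s -b` output, or 0 if none was printed."""
    fields = stdout.split()
    if fields and fields[0].isdigit():
        return int(fields[0])
    return 0


def get_disk_usage_tolerant(path: str) -> int:
    """Hypothetical helper: same `du` call as in the traceback, without raising."""
    # check=False: a non-zero exit status (e.g. a file vanished mid-scan)
    # no longer raises CalledProcessError.
    result = subprocess.run(
        ["du", "-s", "-b", path],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=False,
    )
    # Assumption: the total on stdout is still usable, even if slightly stale.
    return parse_du_summary(result.stdout.decode())
```

The trade-off is that the reported usage may miss or include files that changed during the scan, which seems acceptable for a quota estimate but is a design decision for the maintainers.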

Side note: the error is caught in job-status-consumer, but because the try/except block has a huge scope, the _delete_pod... function is never reached and the message is lost. Hence the run-batch stays in the NotReady state.
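
To illustrate the scoping problem, here is a minimal sketch of a handler in which a failed status update does not prevent cleanup from running. It is not the actual `consumer.py` code: `handle_message`, `update_status`, and `cleanup` are hypothetical names standing in for the real status update and pod deletion steps.

```python
import logging


def handle_message(update_status, cleanup):
    """Hypothetical handler; both arguments are zero-argument callables."""
    try:
        # May fail, for example with the CalledProcessError from `du`.
        update_status()
    except Exception:
        logging.exception("Unexpected error while updating workflow status")

    # Kept outside the first try block so it runs even if the update failed.
    try:
        cleanup()
    except Exception:
        logging.exception("Unexpected error during cleanup")
```

With one try block per step, an exception in the status update is logged but the cleanup step is still attempted.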
