Finished but no scores. #1677

Open
johanneskruse opened this issue Nov 20, 2024 · 5 comments
Labels
Competition-specific: Problem specific to a given competition or benchmark
Setup: Anything related to the deployment of CodaLab

Comments

@johanneskruse

Hi,

I’ve encountered the issue of my submissions finishing but not showing the scores:
[Screenshot 2024-11-19 at 18 32 57]

When looking at the error logs in Docker, I get the following:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/celery/app/trace.py", line 385, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/celery/app/trace.py", line 650, in __protected_call__
    return self.run(*args, **kwargs)
  File "/compute_worker.py", line 113, in run_wrapper
    run.start()
  File "/compute_worker.py", line 879, in start
    self._update_status(STATUS_FINISHED)
  File "/compute_worker.py", line 356, in _update_status
    self._update_submission(data)
  File "/compute_worker.py", line 334, in _update_submission
    resp = self.requests_session.patch(url, data, timeout=150)
  File "/usr/local/lib/python3.9/site-packages/requests/sessions.py", line 596, in patch
    return self.request('PATCH', url, data=data, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/requests/sessions.py", line 524, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.9/site-packages/requests/sessions.py", line 637, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/requests/adapters.py", line 529, in send
    raise ReadTimeout(e, request=request)
requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='www.codabench.org', port=443): Read timed out. (read timeout=150)

The worker still dumps the detailed report, which is how I can see this. The failure is somewhat random, and my workaround has been to resubmit until I don't get a timeout. As I read it, the error indicates that the compute worker's PATCH request to www.codabench.org is not receiving a response within the specified read timeout of 150 seconds. Is that correct?

Best,
Johannes

@Didayolo
Member

Hi @johanneskruse,

Are you using the default queue or a custom queue of compute workers?
Can you share the URL of the competition?

@Didayolo added the Competition-specific label Nov 20, 2024
@johanneskruse
Author

Hi @Didayolo,

I'm using a custom queue. Here's the link to the competition:

@Didayolo added the Setup label Nov 22, 2024
@Didayolo
Member

Looks like a problem with the configuration of the compute workers.

Is your machine behind a firewall or inside a private network? Do you have permission to open or close ports? Do you think the issue could come from that?

@johanneskruse
Author

johanneskruse commented Nov 25, 2024

Thank you for the suggestions. We are running on AWS, so there shouldn't be any firewall issues, and we should have full access to open or close ports. Some submissions return within the timeout period, but others don't, so the response times seem to vary a bit on our end. Apart from that, everything seems to be working.

I looked into _update_submission(), which is where the error seems to originate, and found the following:

resp = self.requests_session.patch(url, data, timeout=150)

If I understand this correctly, timeout=150 means that the client will wait up to 150 seconds for the server to respond before raising a timeout error.

Could this be the issue? And if so, could it be increased?
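
One possible client-side mitigation, independent of raising the timeout, would be to retry the status update when it times out instead of letting the task fail. The following is a minimal sketch, not the actual compute_worker.py code: `call_with_retry` and its parameters are hypothetical names introduced here, and it assumes the PATCH is safe to repeat (sending the same status update twice has no ill effect).

```python
import time


def call_with_retry(send, retry_on, attempts=4, base_delay=2.0, sleep=time.sleep):
    """Call send() until it succeeds or `attempts` tries are used up.

    send     -- zero-argument callable that performs the request
    retry_on -- exception class (or tuple of classes) that triggers a retry
    sleep    -- injectable so the backoff can be tested without waiting
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return send()
        except retry_on:
            if attempt == attempts:
                raise  # out of tries: surface the last timeout to the caller
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))


# Hypothetical use inside _update_submission(), assuming `requests` is imported.
# requests accepts a (connect, read) tuple for `timeout`, and
# requests.exceptions.Timeout is the base class of ReadTimeout:
#
#   resp = call_with_retry(
#       lambda: self.requests_session.patch(url, data, timeout=(10, 150)),
#       retry_on=requests.exceptions.Timeout,
#   )
```

Retrying only helps if the slow responses are transient. If the server is consistently overloaded, a longer timeout or more retries would just delay the same failure.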

@Didayolo
Member

Didayolo commented Dec 10, 2024

Apparently, this problem can happen when the queue is congested, meaning that many submissions are piling up in the "Submitted" status. When that happens, the platform is not able to update the status of submissions.

  • Do you have your own instance of the platform?
  • If so, did you try to restart everything before retrying?

I hope this can help.
