
[Resolver] Unhandled RateLimitError when calling litellm.completion in issue_definitions.py #5030

Closed
neubig opened this issue Nov 15, 2024 · 6 comments
Labels
bug (Something isn't working), resolver (Related to OpenHands Resolver), Stale (Inactive for 30 days)

Comments


neubig commented Nov 15, 2024

Unhandled RateLimitError when calling litellm.completion in issue_definitions.py

Description

When running resolve_issue.py, the script throws multiple errors due to an unhandled RateLimitError from the Anthropic API. This occurs during the call to litellm.completion in the guess_success method of issue_definitions.py. The error indicates that the number of tokens has exceeded the per-minute rate limit imposed by the Anthropic API.

Context

  • Anthropic API Tier: The issue occurs on Tier 1 of the Anthropic API, which has a rate limit of 50 requests per minute.
  • Documentation Reference: See the Anthropic API rate limits documentation (https://docs.anthropic.com/en/api/rate-limits) for more details.

Steps to Reproduce

  1. Use the Anthropic API with a Tier 1 account (50 requests per minute limit).
  2. Run the OpenHands-resolver GitHub Actions workflow from the examples directory to resolve an issue.
  3. Observe that the script throws multiple 429 RateLimitError errors from the LLM.

Expected Behavior

The app should handle the RateLimitError gracefully by:

  • Catching the exception and implementing a retry mechanism with exponential backoff or appropriate delay.
  • Providing a clear and user-friendly error message.
  • Adjusting the rate of API requests to comply with the Anthropic API rate limits.

Actual Behavior

The app crashes and outputs the following error stack trace:

Error Logs

09:27:25 - openhands:INFO: resolve_issue.py:446 - Finished.
ERROR:root:  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/opt/hostedtoolcache/Python/3.12.7/x64/lib/python3.12/site-packages/openhands_resolver/resolve_issue.py", line 609, in <module>
    main()
  File "/opt/hostedtoolcache/Python/3.12.7/x64/lib/python3.12/site-packages/openhands_resolver/resolve_issue.py", line 589, in main
    asyncio.run(
  File "/opt/hostedtoolcache/Python/3.12.7/x64/lib/python3.12/asyncio/runners.py", line 194, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/opt/hostedtoolcache/Python/3.12.7/x64/lib/python3.12/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/hostedtoolcache/Python/3.12.7/x64/lib/python3.12/asyncio/base_events.py", line 687, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/opt/hostedtoolcache/Python/3.12.7/x64/lib/python3.12/site-packages/openhands_resolver/resolve_issue.py", line 429, in resolve_issue
    output = await process_issue(
             ^^^^^^^^^^^^^^^^^^^^
  File "/opt/hostedtoolcache/Python/3.12.7/x64/lib/python3.12/site-packages/openhands_resolver/resolve_issue.py", line 255, in process_issue
    success, comment_success, success_explanation = issue_handler.guess_success(issue, state.history, llm_config)
                                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/hostedtoolcache/Python/3.12.7/x64/lib/python3.12/site-packages/openhands_resolver/issue_definitions.py", line 178, in guess_success
    response = litellm.completion(
               ^^^^^^^^^^^^^^^^^^^
  File "/opt/hostedtoolcache/Python/3.12.7/x64/lib/python3.12/site-packages/litellm/utils.py", line 960, in wrapper
    raise e
  File "/opt/hostedtoolcache/Python/3.12.7/x64/lib/python3.12/site-packages/litellm/utils.py", line 849, in wrapper
    result = original_function(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/hostedtoolcache/Python/3.12.7/x64/lib/python3.12/site-packages/litellm/main.py", line 3034, in completion
    raise exception_type(
          ^^^^^^^^^^^^^^^
  File "/opt/hostedtoolcache/Python/3.12.7/x64/lib/python3.12/site-packages/litellm/litellm_core_utils/exception_mapping_utils.py", line 2125, in exception_type
    raise e
  File "/opt/hostedtoolcache/Python/3.12.7/x64/lib/python3.12/site-packages/litellm/litellm_core_utils/exception_mapping_utils.py", line 490, in exception_type
    raise RateLimitError(

ERROR:root:<class 'litellm.exceptions.RateLimitError'>: litellm.RateLimitError: AnthropicException - {"type":"error","error":{"type":"rate_limit_error","message":"Number of tokens has exceeded your per-minute rate limit (https://docs.anthropic.com/en/api/rate-limits); see the response headers for current usage. Please reduce the prompt length or the maximum tokens requested, or try again later. You may also contact sales at https://www.anthropic.com/contact-sales to discuss your options for a rate limit increase."}}

Possible Solutions

We can handle this issue by implementing one or more of the following solutions:

a) Set up an environment variable for Maximum Requests Per Minute

  • Description: Introduce an environment variable that specifies the maximum number of requests per minute allowed for the LLM provider.
  • Implementation:
    • Add a new environment variable, e.g., LLM_MAX_REQUESTS_PER_MINUTE.
    • Modify the app to read this variable and throttle the requests accordingly.
    • Use a rate limiter to ensure the number of requests does not exceed this value (see the sketch below).
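
A minimal sketch of solution (a), assuming a new LLM_MAX_REQUESTS_PER_MINUTE variable (proposed here, not an existing resolver setting):

    import os
    import threading
    import time
    from collections import deque

    # LLM_MAX_REQUESTS_PER_MINUTE is a proposed variable, not currently read anywhere.
    MAX_RPM = int(os.environ.get("LLM_MAX_REQUESTS_PER_MINUTE", "50"))

    class RequestRateLimiter:
        """Sliding-window limiter: blocks until a request slot frees up."""

        def __init__(self, max_per_minute: int):
            self.max_per_minute = max_per_minute
            self.timestamps = deque()  # monotonic times of recent requests
            self.lock = threading.Lock()

        def acquire(self) -> None:
            with self.lock:
                now = time.monotonic()
                # Drop request timestamps older than the 60-second window.
                while self.timestamps and now - self.timestamps[0] > 60:
                    self.timestamps.popleft()
                if len(self.timestamps) >= self.max_per_minute:
                    # Sleep until the oldest request falls out of the window.
                    time.sleep(max(60 - (now - self.timestamps[0]), 0))
                self.timestamps.append(time.monotonic())

    limiter = RequestRateLimiter(MAX_RPM)
    # Call limiter.acquire() immediately before each litellm.completion(...) call.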

b) Configure an Environment Variable for Anthropic API Tier

  • Description: Set up an environment variable that represents the current Anthropic API tier (since different tiers have different rate limits).
  • Implementation:
    • Add a new environment variable, e.g., ANTHROPIC_API_TIER.
    • Map the tier to its corresponding rate limit within the app.
    • Adjust the request rate based on the tier's rate limit (see the sketch below).
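
A sketch of solution (b); ANTHROPIC_API_TIER is a hypothetical variable, and only the Tier 1 limit (50 requests/minute) comes from this report, so the other tiers are placeholders to be filled in from Anthropic's documentation:

    import os

    # Hypothetical mapping; only the Tier 1 value is taken from this issue.
    TIER_TO_REQUESTS_PER_MINUTE = {
        "1": 50,    # Tier 1: 50 requests/minute (reported above)
        "2": None,  # fill in from https://docs.anthropic.com/en/api/rate-limits
        "3": None,
        "4": None,
    }

    def max_requests_per_minute(default: int = 50) -> int:
        """Resolve the per-minute request budget from the configured tier."""
        tier = os.environ.get("ANTHROPIC_API_TIER", "1")
        limit = TIER_TO_REQUESTS_PER_MINUTE.get(tier)
        return limit if limit is not None else default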

c) Auto-detect Rate Limit Exceeded and Implement Retry Logic

  • Description: Modify the app to detect when a RateLimitError occurs and handle it gracefully.
  • Implementation:
    • Catch the RateLimitError exception in the guess_success method.
    • Implement a sleep() function to wait before retrying the request.
    • Optionally use exponential backoff to increase the wait time after each retry.
    • Limit the number of retries to prevent infinite loops (see the sketch below).
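
A rough sketch of solution (c), wrapping the litellm.completion call; the retry count and wait times here are illustrative, not the resolver's actual defaults:

    import time

    import litellm
    from litellm.exceptions import RateLimitError

    def completion_with_retry(max_retries: int = 5, base_wait: float = 15.0, **kwargs):
        """Call litellm.completion, backing off exponentially on RateLimitError."""
        for attempt in range(max_retries):
            try:
                return litellm.completion(**kwargs)
            except RateLimitError:
                if attempt == max_retries - 1:
                    raise  # out of retries; surface the error
                # Exponential backoff: 15s, 30s, 60s, ... between attempts.
                time.sleep(base_wait * (2 ** attempt))

In guess_success, the existing litellm.completion(...) call would then be routed through completion_with_retry(...) with the same keyword arguments.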

Additional Context

  • Anthropic API Rate Limits: Refer to the Anthropic API rate limits documentation for more details.
  • Best Practices: Implementing these solutions will help the script comply with the API's terms of service and improve its robustness.

Please let me know if any additional information is required to resolve this issue.


Moved from All-Hands-AI/openhands-resolver#348

mamoodi added the bug and resolver labels Nov 15, 2024

malhotra5 commented Nov 21, 2024

I could take a shot at this! I'm thinking of implementing Solution C, but with the same schema as OpenHands proper. It includes:

LLM_NUM_RETRIES (Default of 8)
LLM_RETRY_MIN_WAIT (Default of 15 seconds)
LLM_RETRY_MAX_WAIT (Default of 120 seconds)
LLM_RETRY_MULTIPLIER (Default of 2)

This would help with consistency in retry methods.
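
As a rough illustration only (using tenacity as a stand-in, not the actual OpenHands retry decorator), those defaults would translate to a backoff policy along these lines:

    import litellm
    from litellm.exceptions import RateLimitError
    from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_exponential

    # Stand-in sketch; the values mirror the proposed defaults above.
    @retry(
        retry=retry_if_exception_type(RateLimitError),
        stop=stop_after_attempt(8),                            # roughly LLM_NUM_RETRIES
        wait=wait_exponential(multiplier=2, min=15, max=120),  # LLM_RETRY_MULTIPLIER / MIN_WAIT / MAX_WAIT
        reraise=True,
    )
    def completion_with_backoff(**kwargs):
        return litellm.completion(**kwargs)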


enyst commented Nov 21, 2024

Please see also: #5087


enyst commented Nov 29, 2024

Just curious here, sorry if I'm missing something obvious: why are we using litellm.completion()? Could we use our llm.completion() instead? It was intended to be compatible with litellm.completion(), and it has a retry mechanism.

        response = litellm.completion(
            model=llm_config.model,
            messages=[{'role': 'user', 'content': prompt}],
            api_key=llm_config.api_key,
            base_url=llm_config.base_url,
        )

e.g.

        @self.retry_decorator(
            num_retries=self.config.num_retries,
            retry_exceptions=LLM_RETRY_EXCEPTIONS,
            retry_min_wait=self.config.retry_min_wait,
            retry_max_wait=self.config.retry_max_wait,
            retry_multiplier=self.config.retry_multiplier,
        )
        def wrapper(*args, **kwargs):
            """Wrapper for the litellm completion function. Logs the input and output of the completion function."""
...
    @property
    def completion(self):
        """Decorator for the litellm completion function.

        Check the complete documentation at https://litellm.vercel.app/docs/completion
        """
        return self._completion
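
For concreteness, the swap in guess_success might look roughly like this; the import path and constructor for the LLM class are assumptions here, not verified against the current codebase:

    # Assumed import path/constructor for the OpenHands LLM wrapper.
    from openhands.llm.llm import LLM

    llm = LLM(config=llm_config)
    # llm.completion is litellm-compatible and already wrapped with the retry decorator,
    # so model, api_key, and base_url come from llm_config rather than being passed explicitly.
    response = llm.completion(
        messages=[{'role': 'user', 'content': prompt}],
    )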

@malhotra5

Ah yeah I noticed this too @enyst! I've implemented your suggestions in #5187; it was approved but I didn't have push access to merge it at the time 😅

I'll try to get it in soon


github-actions bot commented Jan 2, 2025

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions bot added the Stale (Inactive for 30 days) label Jan 2, 2025

enyst commented Jan 2, 2025

I think this has been addressed by reusing the llm.completion method, which obeys the user configuration (like min_retry).

The rest of the problem here is tracked in other issues, for example implementing an automated routing mechanism or other features that would improve the behavior with rate limits. (example)

I'll close this, but please feel free to reopen if you see fit.

enyst closed this as completed Jan 2, 2025