429 Rate Limiting #187

Open
doougal opened this issue Oct 23, 2023 · 5 comments
Labels
Usage Support Helping a user with usage

Comments


doougal commented Oct 23, 2023

Describe the issue
A clear and concise description of what the bug is.

Version details
AutoPilot version tested: f1b706c
Node version: v18.12.1

Task input
Task: Add a function declaration to the controller class (cinema), to create a Notification for a Customer

Logs
What did you get on screen:
The following was repeated many times (I assume one for each file)
Error processing file: model/screening.py AxiosError: Request failed with status code 429
at createError (C:\Users\d0oug\Documents\WORKSPACE_L\autopilot\node_modules\langchain\dist\util\axios-fetch-adapter.cjs:317:16)
at settle (C:\Users\d0oug\Documents\WORKSPACE_L\autopilot\node_modules\langchain\dist\util\axios-fetch-adapter.cjs:31:16)
at C:\Users\d0oug\Documents\WORKSPACE_L\autopilot\node_modules\langchain\dist\util\axios-fetch-adapter.cjs:181:19
at new Promise (<anonymous>)
at fetchAdapter (C:\Users\d0oug\Documents\WORKSPACE_L\autopilot\node_modules\langchain\dist\util\axios-fetch-adapter.cjs:173:12)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5) {
config: {
transitional: {
silentJSONParsing: true,
forcedJSONParsing: true,
clarifyTimeoutError: false
},
adapter: [AsyncFunction: fetchAdapter],
transformRequest: [ [Function: transformRequest] ],
transformResponse: [ [Function: transformResponse] ],
timeout: 0,
xsrfCookieName: 'XSRF-TOKEN',
xsrfHeaderName: 'X-XSRF-TOKEN',
maxContentLength: -1,
maxBodyLength: -1,
validateStatus: [Function: validateStatus],
headers: {
Accept: 'application/json, text/plain, */*',
'Content-Type': 'application/json',
'User-Agent': 'OpenAI/NodeJS/3.2.1',
Authorization: 'Bearer sk-[REDACTED]'
},
method: 'post',
data: '{"model":"gpt-3.5-turbo","temperature":0.01,"top_p":1,"frequency_penalty":0,"presence_penalty":0,"n":1,"max_tokens":null,"stream":false,"messages":[{"role":"user","content":"\nTASK: Create a summary of the file below. Use as few words as possible while keeping the details. Use bullet points.\n\nThe output should be a markdown code snippet formatted in the following schema:\n\njson\\n{\\n\\t\\"thoughts\\": {\\n\\t\\t\\"text\\": string // your thoughts\\n\\t\\t\\"reasoning\\": string // your reasoning\\n\\t\\t\\"criticism\\": string // constructive self-criticism\\n\\t}\\n\\t\\"output\\": {\\n\\t\\t\\"summary\\": string // summary of the file content\\n\\t\\t\\"functions\\": string[] // functions in the file\\n\\t\\t\\"keywords\\": {\\n\\t\\t\\t\\"term\\": string // the term\\n\\t\\t\\t\\"definition\\": string // explanation of the term in the code\'s context\\n\\t\\t}[] // What are business-level terminologies and keywords we can learn from the code?\\n\\t\\t\\"dependenciesLibs\\": string[] // what libraries and/or files does this file depend on?\\n\\t}\\n}\\n\n\n\nmodel/screening.py\n```\nfrom datetime import datetime\r\nfrom model.movie import Movie\r\nfrom model.screening_hall import ScreeningHall\r\nfrom model.seat import Seat\r\n\r\n##\r\n# @file screening.py\r\n# @brief This file contains the Screening model class.\r\n#\r\n\r\nclass Screening:\r\n \"\"\"\r\n @brief Represents a screening of a movie in a cinema hall.\r\n\r\n This class defines attributes and methods related to movie screenings.\r\n\r\n @param movie (Movie): The movie being screened.\r\n @param date_time (datetime): The date and time of the screening.\r\n @param hall (ScreeningHall): The cinema hall where the screening takes place.\r\n \"\"\"\r\n\r\n def __init__(self, movie: Movie, date_time: datetime, hall: ScreeningHall):\r\n \"\"\"\r\n @brief Initializes a new Screening instance with the provided details.\r\n\r\n @param movie (Movie): The movie being screened.\r\n @param date_time (datetime): The date and time of the screening.\r\n @param hall (ScreeningHall): The cinema hall where the screening takes place.\r\n \"\"\"\r\n ## The movie being screened\r\n self.movie = movie\r\n ## The date and time of the screening\r\n self.date_time = date_time\r\n ## The Cinema hall where the screening takes place\r\n self.hall = hall\r\n ## A list of available seats for the screening\r\n self.seats = hall.get_seats()[:] # Use slicing to create a copy of the list of seats\r\n\r\n def get_movie(self) -> Movie:\r\n \"\"\"\r\n @brief Returns the movie being screened.\r\n\r\n @return The movie being screened.\r\n \"\"\"\r\n return self.movie\r\n \r\n def set_movie(self, movie: Movie) -> None:\r\n \"\"\"\r\n @brief Sets the movie being screened.\r\n\r\n @param movie (Movie): The new movie being screened.\r\n \"\"\"\r\n self.movie = movie\r\n\r\n def get_date_time(self) -> datetime:\r\n \"\"\"\r\n @brief Returns the date and time of the screening.\r\n\r\n @return The date and time of the screening.\r\n \"\"\"\r\n return self.date_time\r\n \r\n def set_date_time(self, date_time: datetime) -> None:\r\n \"\"\"\r\n @brief Sets the date and time of the screening.\r\n\r\n @param date_time (datetime): The new date and time of the screening.\r\n \"\"\"\r\n self.date_time = date_time\r\n\r\n def get_hall(self) -> ScreeningHall:\r\n \"\"\"\r\n @brief Returns the cinema hall where the screening takes place.\r\n\r\n @return The cinema hall where the screening takes place.\r\n \"\"\"\r\n return self.hall\r\n \r\n def set_hall(self, hall: ScreeningHall) -> None:\r\n \"\"\"\r\n @brief Sets the cinema hall where the screening takes place.\r\n\r\n @param hall (ScreeningHall): The new cinema hall where the screening takes place.\r\n \"\"\"\r\n self.hall = hall\r\n\r\n def get_seats(self) -> list[Seat]:\r\n \"\"\"\r\n @brief Returns the list of seats for the screening.\r\n\r\n @return The list of seats for the screening.\r\n \"\"\"\r\n return self.seats\r\n \r\n def add_seat(self, seat: Seat) -> None:\r\n \"\"\"\r\n @brief Adds a new seat to the screening.\r\n\r\n @param seat (Seat): The new seat to be added to the screening.\r\n \"\"\"\r\n self.seats.append(seat)\r\n\r\n def remove_seat(self, seat: Seat) -> None:\r\n \"\"\"\r\n @brief Removes a seat from the screening.\r\n\r\n @param seat (Seat): The seat to be removed from the screening.\r\n \"\"\"\r\n self.seats.remove(seat)\n```\n"}]}',
url: 'https://api.openai.com/v1/chat/completions'
},
request: Request {
[Symbol(realm)]: { settingsObject: [Object] },
[Symbol(state)]: {
method: 'POST',
localURLsOnly: false,
unsafeRequest: false,
body: [Object],
client: [Object],
reservedClient: null,
replacesClientId: '',
window: 'client',
keepalive: false,
serviceWorkers: 'all',
initiator: '',
destination: '',
priority: null,
origin: 'client',
policyContainer: 'client',
referrer: 'client',
referrerPolicy: '',
mode: 'cors',
useCORSPreflightFlag: false,
credentials: 'same-origin',
useCredentials: false,
cache: 'default',
redirect: 'follow',
integrity: '',
cryptoGraphicsNonceMetadata: '',
parserMetadata: '',
reloadNavigation: false,
historyNavigation: false,
userActivation: false,
taintedOrigin: false,
redirectCount: 0,
responseTainting: 'basic',
preventNoCacheCacheControlHeaderModification: false,
done: false,
timingAllowFailed: false,
headersList: [HeadersList],
urlList: [Array],
url: [URL]
},
[Symbol(signal)]: AbortSignal { aborted: false },
[Symbol(headers)]: HeadersList {
[Symbol(headers map)]: [Map],
[Symbol(headers map sorted)]: null
}
},
response: {
ok: false,
status: 429,
statusText: 'Too Many Requests',
headers: HeadersList {
[Symbol(headers map)]: [Map],
[Symbol(headers map sorted)]: null
},
config: {
transitional: [Object],
adapter: [AsyncFunction: fetchAdapter],
transformRequest: [Array],
transformResponse: [Array],
timeout: 0,
xsrfCookieName: 'XSRF-TOKEN',
xsrfHeaderName: 'X-XSRF-TOKEN',
maxContentLength: -1,
maxBodyLength: -1,
validateStatus: [Function: validateStatus],
headers: [Object],
method: 'post',
data: '[request body identical to config.data above, omitted]',
url: 'https://api.openai.com/v1/chat/completions'
},
request: Request {
[Symbol(realm)]: [Object],
[Symbol(state)]: [Object],
[Symbol(signal)]: [AbortSignal],
[Symbol(headers)]: [HeadersList]
},
data: { error: [Object] }
}
}
Task: Add a function declaration to the controller class (cinema), to create a Notification for a Customer
No matching files found in the database. Indexing is required.
C:\Users\d0oug\Documents\WORKSPACE_L\autopilot\modules\summaries.js:79
throw new Error("Cannot run without summaries. Indexing is required.");
^

Error: Cannot run without summaries. Indexing is required.
at readAllSummaries (C:\Users\d0oug\Documents\WORKSPACE_L\autopilot\modules\summaries.js:79:11)
at async getSummaries (C:\Users\d0oug\Documents\WORKSPACE_L\autopilot\modules\summaries.js:100:21)
at async main (C:\Users\d0oug\Documents\WORKSPACE_L\autopilot\ui.js:65:21)

The content of the last file under the logs directory:
No files in the logs directory except for the default DO_NOT_REMOVE
This might be a bug in itself, where log files aren't created when indexing?

Expected behavior
I have yet to use AutoPilot successfully, so I'm unsure what the output should look like.
Presumably it should complete the task, though.

Additional context
I have never queried the OpenAI API before.
This happened on the first run. Maybe too many requests per second? Or an issue on OpenAI's side?

@doougal doougal added the Usage Support Helping a user with usage label Oct 23, 2023

codeautopilot bot commented Oct 23, 2023

Implementation plan

The error message indicates that the OpenAI API is returning a 429 status code, which means that too many requests are being sent in a short amount of time. This is known as rate limiting.

To solve this issue, we can implement a rate limiting strategy in the callGPT function in modules/gpt.js, the suggestChanges function in agents/coder.js, and the main function in ui.js. This can be done by introducing a delay between each API call.
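How long the delay should be depends on the limit that applies to the account. As a rough sketch, assuming the limit is expressed in requests per minute, the minimum spacing between requests can be derived like this. The helper names and the example limits are hypothetical and are not taken from OpenAI's documentation or from the AutoPilot code.

```javascript
// Hypothetical helper: convert a requests-per-minute limit into the
// minimum spacing between request start times, in milliseconds.
function minIntervalMs(requestsPerMinute) {
  if (!Number.isFinite(requestsPerMinute) || requestsPerMinute <= 0) {
    throw new RangeError('requestsPerMinute must be a positive number');
  }
  return Math.ceil(60000 / requestsPerMinute);
}

// Start-time offsets for `count` requests spaced to respect the limit.
function staggerOffsets(count, requestsPerMinute) {
  const interval = minIntervalMs(requestsPerMinute);
  return Array.from({ length: count }, (_, i) => i * interval);
}

// Illustrative limits only
console.log(minIntervalMs(60)); // 1000
console.log(staggerOffsets(3, 3)); // [ 0, 20000, 40000 ]
```

With a limit of 60 requests per minute the 1-second delay used in this plan is just enough; with a much lower limit it would need to be correspondingly longer.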

Changes on modules/gpt.js

To solve this issue, you can implement a rate limiting strategy in the callGPT function. This can be done by introducing a delay between each API call.

Here is a simple way to introduce a delay:

function sleep(ms) {
  return new Promise(resolve => setTimeout(resolve, ms));
}

const callGPT = async (prompt, model) => {
  // existing code...

  try {
    // introduce delay before API call
    await sleep(1000); // delay of 1 second

    const completion = await openai.createChatCompletion({
      // existing code...
    });

    // existing code...
  } catch (error) {
    // Log the API's error body when present: it states why the request was rejected
    console.error(error.response?.data ?? error);
  }
};

In the above code, the sleep function pauses execution for the given number of milliseconds, so each call waits 1 second before contacting the API. This lowers the request rate but does not guarantee staying under the limit: calls that start concurrently all finish sleeping at about the same time and still reach the API together.

Please note that the actual delay needed may vary depending on the rate limit set by the OpenAI API. You may need to adjust the delay to fit within the rate limit.
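A fixed delay does not react when a 429 still comes back. One common alternative, which is not part of the plan above, is to retry the failed call with exponential backoff. The sketch below assumes the error is Axios-style and carries `error.response.status`, as in the log in this issue; `withBackoff` and its options are hypothetical names.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Retry `fn` when it fails with HTTP 429, waiting longer after each failure.
// `fn` is any async function that performs one API call.
async function withBackoff(fn, { retries = 5, baseMs = 1000, maxMs = 30000 } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (error) {
      const status = error.response?.status;
      // Give up on non-429 errors or once the retry budget is spent
      if (status !== 429 || attempt >= retries) throw error;
      // Exponential backoff with jitter: between half and all of
      // baseMs * 2^attempt, capped at maxMs
      const ceiling = Math.min(maxMs, baseMs * 2 ** attempt);
      await sleep(ceiling / 2 + Math.random() * (ceiling / 2));
    }
  }
}

// Usage sketch (the wrapped call is a placeholder for the existing API call):
// const completion = await withBackoff(() => openai.createChatCompletion(request));
```

Backoff only helps when the 429 is caused by request rate. If the account has no remaining quota, every retry fails the same way.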

Changes on agents/coder.js

The error message indicates that the rate limit for the OpenAI API has been exceeded. The failing request in the log is a file-summary (indexing) call, but the 'suggestChanges' function in the 'agents/coder.js' file also sends a request per file and could hit the same limit.

To solve this issue, we can implement a rate limiting strategy in the 'suggestChanges' function. This can be done by introducing a delay between consecutive API requests. Here's a simple way to do it using async/await and a Promise that resolves after a set timeout:

function sleep(ms) {
  return new Promise(resolve => setTimeout(resolve, ms));
}

async function suggestChanges(task, file) {
  const code = formatCode(file)

  const model = getModel(process.env.INDEXER_MODEL);

  const input = await prompt.format({ task, code });
  
  // Introduce delay before making the API request
  await sleep(1000); // sleep for 1000 milliseconds (1 second)

  const response = await model.call(input);

  // rest of the code...
}

This adds a 1-second pause before each API request. When 'suggestChanges' is called sequentially, consecutive requests are therefore at least 1 second apart, which should reduce the chance of exceeding the rate limit.

Please note that the actual delay may need to be adjusted based on the specific rate limits of the OpenAI API. It's also worth noting that this is a simple rate limiting strategy and may not be suitable if the 'suggestChanges' function is being called concurrently from multiple places. In such cases, a more sophisticated rate limiting strategy may be required.
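For the concurrent case, one possible approach is a shared limiter that every caller goes through, so that requests are serialized and spaced no matter where they originate. This is a minimal sketch under that assumption; `createLimiter` and `schedule` are hypothetical names, not part of AutoPilot or LangChain.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Returns a `schedule(fn)` function. All tasks scheduled through the same
// limiter run one at a time, with at least `minIntervalMs` between starts.
function createLimiter(minIntervalMs) {
  let tail = Promise.resolve(); // settles when the last queued task has finished
  let lastStart = -Infinity;

  return function schedule(fn) {
    const run = tail.then(async () => {
      const wait = lastStart + minIntervalMs - Date.now();
      if (wait > 0) await sleep(wait);
      lastStart = Date.now();
      return fn();
    });
    // Keep the queue usable even if this task rejects
    tail = run.catch(() => {});
    return run;
  };
}

// Usage sketch: create one limiter at module level and share it.
// const schedule = createLimiter(1000);
// const response = await schedule(() => model.call(input));
```

Because the limiter is created once and shared, concurrent calls queue behind each other instead of all sleeping for the same second and then firing together.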

Changes on ui.js

The error message indicates that the OpenAI API's rate limit has been exceeded. This is likely due to too many requests being made in a short period of time.

To solve this issue, you could implement a rate limiting strategy in the 'main' function. This could involve adding a delay between each request to the OpenAI API, ensuring that the number of requests does not exceed the API's rate limit within a certain time period.

Here's a simple example of how you could implement this:

const delay = ms => new Promise(res => setTimeout(res, ms));

async function main(task, test=false, suggestionMode) {
  // ...
  let solutions = [];
  await Promise.all(files.map(async (file, index) => {
    // Add a delay before each request
    await delay(index * 1000);  // Adjust the delay as needed

    // For each file, ask the coder agent for a solution
    // ...
  }));
  // ...
}

This staggers the requests: the request for the file at position `index` starts `index` seconds after the first one, so the requests begin roughly 1 second apart. Adjust the delay as needed to stay within the API's rate limits.

Please note that this is a simple rate limiting strategy and may not be suitable for all use cases. For a more robust solution, you might want to consider using a library or service that provides more advanced rate limiting features.
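Short of pulling in a library, another option is to cap how many requests are in flight at once rather than delaying each one by a fixed amount. The sketch below illustrates that idea with a hypothetical `mapWithConcurrency` helper; it is not taken from the AutoPilot code.

```javascript
// Run `worker` over `items` with at most `limit` calls in flight at once.
// Results are returned in the same order as `items`.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0; // index of the next item to hand out

  async function runner() {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  // Start up to `limit` runners; each pulls items until none are left
  const size = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: size }, () => runner()));
  return results;
}

// Usage sketch in place of Promise.all(files.map(...)); `askCoder` is a placeholder:
// const solutions = await mapWithConcurrency(files, 2, (file) => askCoder(task, file));
```

A limit of 1 processes the files strictly one after another, which is the most conservative setting for an account with a low rate limit.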


Current plan usage: 81.74%


Have feedback or need help?

Discord
Documentation
[email protected]


codeautopilot bot commented Oct 23, 2023

🤖 I'm working on a solution for this issue. Please don't create new issues or edit this one until I reply back.


Current plan usage: 50.50%




doougal commented Oct 24, 2023

I've fixed this by adding credit to my OpenAI account.
The app must send requests too quickly for a free-tier account to handle.
I think it would be a good feature enhancement to reduce the request size or rate so that free-tier users could use this too.
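Since adding credit resolved it, the 429 here may have been a quota problem rather than a pure request-rate problem. A sketch of how the app could tell the two apart from the error it already receives is below. It assumes the response body has the shape `{ error: { code, type, message } }` (the log above shows `data: { error: [Object] }`) and that quota errors are marked `insufficient_quota`. Both are assumptions to verify against an actual response, and `classify429` is a hypothetical name.

```javascript
// Classify an Axios-style error from the OpenAI API.
// Assumption: quota exhaustion is reported with code or type "insufficient_quota".
function classify429(error) {
  const status = error?.response?.status;
  if (status !== 429) return 'other';
  const apiError = error.response.data?.error ?? {};
  const marker = `${apiError.code ?? ''} ${apiError.type ?? ''}`;
  if (marker.includes('insufficient_quota')) return 'quota'; // waiting will not help
  return 'rate_limit'; // retrying after a pause may help
}

// Example with a hand-written error object (not a captured response)
const example = { response: { status: 429, data: { error: { code: 'insufficient_quota' } } } };
console.log(classify429(example)); // quota
```

Printing a clear message for the quota case would have pointed to the account rather than to request speed.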

@ishaan-jaff

@doougal @jhgorse

I'm the maintainer of LiteLLM. We provide an open-source proxy for load balancing across Azure, OpenAI, and any other LiteLLM-supported LLM.
It can process 500+ requests per second.

From this thread, it looks like you're trying to manage rate limits and load balance between Azure OpenAI instances. I hope our solution makes it easier for you (I'd love feedback if you're trying to do this).

Here's the quick start:

Doc: https://docs.litellm.ai/docs/simple_proxy#load-balancing---multiple-instances-of-1-model

Step 1: Create a config.yaml

model_list:
  - model_name: gpt-4
    litellm_params:
      model: azure/chatgpt-v-2
      api_base: https://openai-gpt-4-test-v-1.openai.azure.com/
      api_version: "2023-05-15"
      api_key: 
  - model_name: gpt-4
    litellm_params:
      model: azure/gpt-4
      api_key: 
      api_base: https://openai-gpt-4-test-v-2.openai.azure.com/
  - model_name: gpt-4
    litellm_params:
      model: azure/gpt-4
      api_key: 
      api_base: https://openai-gpt-4-test-v-2.openai.azure.com/

Step 2: Start the litellm proxy:

litellm --config /path/to/config.yaml

Step 3: Make a request to the LiteLLM proxy:

curl --location 'http://0.0.0.0:8000/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
      "model": "gpt-4",
      "messages": [
        {
          "role": "user",
          "content": "what llm are you"
        }
      ]
    }'
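Since AutoPilot runs on Node, the same request can be sketched in JavaScript. This assumes a runtime with a global `fetch` (Node 18 or later) and the proxy address from the curl command above; `buildChatRequest` and `askProxy` are hypothetical helper names, not part of LiteLLM.

```javascript
// Build the same chat request as the curl command, as fetch() arguments.
function buildChatRequest(baseUrl, model, content) {
  return {
    url: `${baseUrl}/chat/completions`,
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ model, messages: [{ role: 'user', content }] }),
    },
  };
}

// Send the request and return the parsed JSON response.
async function askProxy(baseUrl, model, content) {
  const { url, options } = buildChatRequest(baseUrl, model, content);
  const response = await fetch(url, options);
  if (!response.ok) throw new Error(`Proxy returned HTTP ${response.status}`);
  return response.json();
}

// Usage, assuming the proxy from Step 2 is listening locally:
// askProxy('http://0.0.0.0:8000', 'gpt-4', 'what llm are you').then(console.log);
```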
