fix: workflow index error (#173)
dongyuanjushi authored Jul 16, 2024
1 parent 16152da commit f1d9685
Showing 2 changed files with 135 additions and 3 deletions.
132 changes: 132 additions & 0 deletions README.md
@@ -34,6 +34,138 @@ Please see our ongoing [documentation](https://aios.readthedocs.io/en/latest/) f
- [Installation](https://aios.readthedocs.io/en/latest/get_started/installation.html)
- [Quickstart](https://aios.readthedocs.io/en/latest/get_started/quickstart.html)

### Installation

Clone AIOS
```bash
git clone https://github.com/agiresearch/AIOS.git
```
Create a conda environment and enter the repository
```bash
conda create -n AIOS python=3.11
source activate AIOS
cd AIOS
```
If you have a GPU environment, you can install the dependencies using
```bash
pip install -r requirements-cuda.txt
```
Otherwise, you can install the dependencies using
```bash
pip install -r requirements.txt
```

### Quickstart

#### Use with OpenAI API
Get your OpenAI API key from https://platform.openai.com/api-keys, then set it as an environment variable

```bash
export OPENAI_API_KEY=<YOUR_OPENAI_API_KEY>
```
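
Optionally, you can confirm the variable is visible in your current shell, for example:

```bash
python -c "import os; print('OPENAI_API_KEY set:', bool(os.getenv('OPENAI_API_KEY')))"
```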

Then run main.py with a model provided by the OpenAI API

```bash
python main.py --llm_name gpt-3.5-turbo # use gpt-3.5-turbo for example
```

#### Use with Gemini API
Get your Gemini API key from https://ai.google.dev/gemini-api and set it as an environment variable

```bash
export GEMINI_API_KEY=<YOUR_GEMINI_API_KEY>
```

Then run main.py with a model provided by the Gemini API

```bash
python main.py --llm_name gemini-1.5-flash # use gemini-1.5-flash for example
```

If you want to use **open-source** models provided by huggingface, here we provide three options:
* Use with ollama
* Use with native huggingface models
* Use with vllm

#### Use with ollama
You need to download ollama from https://ollama.com/.

Then start the ollama server, either from the ollama app or with the following command in the terminal

```bash
ollama serve
```
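
The ollama server listens on http://localhost:11434 by default; a quick way to check that it is running:

```bash
curl http://localhost:11434
```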

To use models provided by ollama, you need to pull them from https://ollama.com/library

```bash
ollama pull llama3:8b # use llama3:8b for example
```
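
You can list the models that have been pulled locally with:

```bash
ollama list
```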

ollama supports CPU-only environments, so even if you do not have a CUDA environment, you can still run AIOS with ollama models

```bash
python main.py --llm_name ollama/llama3:8b --use_backend ollama # use ollama/llama3:8b for example
```

However, if you have a GPU environment, you can also pass GPU-related parameters to speed up inference
using the following command

```bash
python main.py --llm_name ollama/llama3:8b --use_backend ollama --max_gpu_memory '{"0": "24GB"}' --eval_device "cuda:0" --max_new_tokens 256
```
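
For reference, the `--max_gpu_memory` value is a JSON string mapping GPU ids to memory budgets. Below is a minimal, hypothetical sketch (not the repository's actual argument parsing) of how such a flag can be turned into a Python dict:

```python
# Hypothetical sketch: parse a JSON-valued --max_gpu_memory flag into a dict.
import argparse
import json

parser = argparse.ArgumentParser()
parser.add_argument(
    "--max_gpu_memory",
    type=json.loads,  # e.g. '{"0": "24GB"}' -> {"0": "24GB"}
    default=None,
    help="JSON mapping of GPU id to the maximum memory to use on that GPU",
)

args = parser.parse_args(['--max_gpu_memory', '{"0": "24GB"}'])
print(args.max_gpu_memory)  # {'0': '24GB'}
```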

#### Use with native huggingface models
Some huggingface models require authentication. If you want to use all of the models, create an access token at https://huggingface.co/settings/tokens
and set it as an environment variable using the following command

```bash
export HF_AUTH_TOKENS=<YOUR_TOKEN_ID>
```

You can then run main.py with the following command

```bash
python main.py --llm_name meta-llama/Meta-Llama-3-8B-Instruct --max_gpu_memory '{"0": "24GB"}' --eval_device "cuda:0" --max_new_tokens 256
```

By default, huggingface downloads models to the `~/.cache` directory.
If you want to designate a different download directory, set it using the following command

```bash
export HF_HOME=<YOUR_HF_HOME>
```
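
Assuming the standard Hugging Face cache layout (models cached under a `hub` subfolder of that directory), you can inspect what has been downloaded with, for example:

```bash
ls "${HF_HOME:-$HOME/.cache/huggingface}/hub"
```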

#### Use with vllm
If you want to speed up the inference of huggingface models, you can use vllm as the backend.

💡Note: vllm currently only supports Linux and GPU-enabled environments. If you do not have such an environment, you need to choose one of the other options.

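Assuming PyTorch is installed via the requirements above, one quick way to check whether your environment is GPU-enabled:

```bash
python -c "import torch; print(torch.cuda.is_available())"
```
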
Because vllm itself does not support passing designated GPU ids, you need to either set the environment variable

```bash
export CUDA_VISIBLE_DEVICES="0" # replace with your designated gpu ids
```

and then run the command

```bash
python main.py --llm_name meta-llama/Meta-Llama-3-8B-Instruct --use_backend vllm --max_gpu_memory '{"0": "24GB"}' --eval_device "cuda:0" --max_new_tokens 256
```

or pass `CUDA_VISIBLE_DEVICES` as a prefix to the command

```bash
CUDA_VISIBLE_DEVICES=0 python main.py --llm_name meta-llama/Meta-Llama-3-8B-Instruct --use_backend vllm --max_gpu_memory '{"0": "24GB"}' --eval_device "cuda:0" --max_new_tokens 256
```

### Supported LLM Endpoints
- [OpenAI API](https://platform.openai.com/api-keys)
- [Gemini API](https://ai.google.dev/gemini-api)
6 changes: 3 additions & 3 deletions pyopenagi/agents/react_agent.py
```diff
@@ -148,7 +148,7 @@ def run(self):
                 message = step["message"]
                 tool_use = step["tool_use"]
 
-                prompt = f"At step {self.rounds + 1}, you need to {message}. "
+                prompt = f"At step {i + 1}, you need to {message}. "
                 self.messages.append({
                     "role": "user",
                     "content": prompt
```
```diff
@@ -174,7 +174,7 @@ def run(self):
                 self.request_turnaround_times.extend(turnaround_times)
 
                 if tool_calls:
-                    for i in range(self.plan_max_fail_times):
+                    for _ in range(self.plan_max_fail_times):
                         actions, observations, success = self.call_tools(tool_calls=tool_calls)
 
                         action_messages = "[Action]: " + ";".join(actions)
```
```diff
@@ -198,7 +198,7 @@ def run(self):
                 if i == len(workflow) - 1:
                     final_result = self.messages[-1]
 
-                self.logger.log(f"At step {self.rounds + 1}, {self.messages[-1]}\n", level="info")
+                self.logger.log(f"At step {i + 1}, {self.messages[-1]}\n", level="info")
 
                 self.rounds += 1
```
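
For context, the fix labels each prompt with the enumerate index `i` of the current workflow step instead of the ever-growing `self.rounds` counter, and renames the inner retry loop variable so it no longer shadows `i`. A minimal standalone sketch (not the repository's code, using a made-up workflow) of why the index gives the intended step numbers:

```python
# Standalone illustration of the index fix; the workflow and counter here are made up.
workflow = [
    {"message": "search for relevant papers", "tool_use": True},
    {"message": "summarize the findings", "tool_use": False},
]

rounds = 3  # a persistent counter that earlier rounds have already advanced

for i, step in enumerate(workflow):
    buggy = f"At step {rounds + 1}, you need to {step['message']}. "  # drifts with the stale counter
    fixed = f"At step {i + 1}, you need to {step['message']}. "       # matches the position in the workflow
    print(buggy)
    print(fixed)
    rounds += 1
```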
