Commit

keep other fixes, undo credits
enyst committed Nov 23, 2024
1 parent 1eb6d07 commit b28439d
Showing 2 changed files with 3 additions and 3 deletions.
2 changes: 1 addition & 1 deletion evaluation/benchmarks/gaia/scripts/run_infer.sh
@@ -35,7 +35,7 @@ echo "AGENT_VERSION: $AGENT_VERSION"
echo "MODEL_CONFIG: $MODEL_CONFIG"
echo "LEVELS: $LEVELS"

-COMMAND="poetry run python ./evaluation/gaia/run_infer.py \
+COMMAND="poetry run python ./evaluation/benchmarks/gaia/run_infer.py \
--agent-cls $AGENT \
--llm-config $MODEL_CONFIG \
--max-iterations 30 \
4 changes: 2 additions & 2 deletions evaluation/benchmarks/mint/README.md
@@ -6,7 +6,7 @@ We support evaluation of the [Eurus subset focus on math and code reasoning](htt

## Setup Environment and LLM Configuration

-Please follow instruction [here](../README.md#setup) to setup your local development environment and LLM.
+Please follow instruction [here](../../README.md#setup) to setup your local development environment and LLM.

## Start the evaluation

@@ -34,7 +34,7 @@ Note: in order to use `eval_limit`, you must also set `subset`.
For example,

```bash
-./evaluation/swe_bench/scripts/run_infer.sh eval_gpt4_1106_preview 0.6.2 gsm8k 3
+./evaluation/benchmarks/mint/scripts/run_infer.sh eval_gpt4_1106_preview 0.6.2 gsm8k 3
```

## Reference
