This folder contains inference scripts for the MMMU-Pro dataset.
- `infer_xxx.py`: for model inference
- `evaluate.py`: for evaluating inference results

Make sure to configure the necessary model and data files before use.
This script loads a specified model and performs inference. To run the script, use the following steps:
```bash
cd mmmu-pro
python infer/infer_xxx.py [MODEL_NAME] [MODE] [SETTING]
```
- `[MODEL_NAME]`: Specify the model's name (e.g., `gpt-4o`). Ensure the corresponding model files are available in the required directory.
- `[MODE]`: Choose the prompt mode:
  - `cot` (chain of thought): The model works through the problem step by step.
  - `direct`: The model directly provides the answer.
- `[SETTING]`: Select the inference task setting:
  - `standard(10 options)`: Uses the standard format of augmented MMMU with ten options.
  - `standard(4 options)`: Uses the standard format of augmented MMMU with four options.
  - `vision`: Uses a screenshot or photo form of augmented MMMU.
Example:
```bash
python infer/infer_gpt.py gpt-4o cot vision
```

This example runs the `gpt-4o` model in chain-of-thought (`cot`) mode on the `vision` setting of augmented MMMU. The inference results will be saved to the `./output` directory.
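The three positional arguments above can be validated before any model is loaded. The sketch below is a hypothetical illustration of that check; the actual `infer_xxx.py` scripts may parse their arguments differently.

```python
# Hypothetical argument validation for an infer_xxx.py script.
# The valid mode/setting strings are taken from the list above.
VALID_MODES = ("cot", "direct")
VALID_SETTINGS = ("standard(10 options)", "standard(4 options)", "vision")

def parse_args(argv):
    """Return (model_name, mode, setting) from a sys.argv-style list."""
    if len(argv) != 4:
        raise SystemExit("usage: python infer/infer_xxx.py [MODEL_NAME] [MODE] [SETTING]")
    model_name, mode, setting = argv[1:4]
    if mode not in VALID_MODES:
        raise SystemExit(f"unknown mode: {mode!r} (expected one of {VALID_MODES})")
    if setting not in VALID_SETTINGS:
        raise SystemExit(f"unknown setting: {setting!r} (expected one of {VALID_SETTINGS})")
    return model_name, mode, setting

# Mirrors the example command: python infer/infer_gpt.py gpt-4o cot vision
print(parse_args(["infer_gpt.py", "gpt-4o", "cot", "vision"]))
# → ('gpt-4o', 'cot', 'vision')
```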
This script evaluates the results generated from the inference step. To run the evaluation, use the following command:
```bash
cd mmmu-pro
python evaluate.py
```
Once executed, the script will:
- Load the inference results from the `./output` directory.
- Generate and display the evaluation report in the console.
- Save the evaluation report to the `./output` directory.
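The evaluation flow above can be sketched as follows. Note that the record schema (`prediction`/`answer` keys) and the per-file JSON layout are assumptions made for illustration only; see `evaluate.py` for the real logic.

```python
import json
from pathlib import Path

def evaluate(output_dir="./output"):
    """Illustrative sketch of result evaluation (assumed schema, not evaluate.py).

    Reads every JSON result file in output_dir, counts records whose
    'prediction' matches the ground-truth 'answer', and returns a
    per-file accuracy report.
    """
    report = {}
    for path in sorted(Path(output_dir).glob("*.json")):
        records = json.loads(path.read_text())
        total = len(records)
        correct = sum(1 for r in records if r.get("prediction") == r.get("answer"))
        report[path.stem] = {"total": total, "accuracy": correct / total if total else 0.0}
    return report
```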
- Make sure the model and data files are properly configured before running the scripts.
- To adjust parameters, edit the relevant sections in the script files as needed.