- 🔍 Evaluation of MLLMs under misleading inputs
- 📊 Uncertainty quantification metrics
- 🎯 Explicit & Implicit misleading experiments
- 🔬 Comprehensive model comparison
- 📝 Reproducible results and visualization
Before running the code, set up the required environments for `glm`, `llava`, `MiniCPM-V`, and `mmstar`.
📥 Installation Steps:
- Navigate to the `env` folder.
- Create each conda environment from its corresponding `.yml` file:
  ```bash
  conda env create -f env/glm.yml
  conda env create -f env/llava.yml
  conda env create -f env/MiniCPM-V.yml
  conda env create -f env/mmstar.yml
  ```
- Activate the required environment:
  ```bash
  conda activate <ENV_NAME>
  ```
Download the Multimodal Uncertainty Benchmark (MUB) dataset here.
Extract the downloaded images and place them in the `extract_img_all` folder.
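As an optional sanity check, you can confirm the images landed in the expected folder. This is a minimal sketch; the folder name comes from this README, but the image extensions and expected count are assumptions, not specified here.

```python
from pathlib import Path

img_dir = Path("extract_img_all")
# Count common image files; the extensions are an assumption about the dataset format.
images = [p for p in img_dir.rglob("*") if p.suffix.lower() in {".jpg", ".jpeg", ".png"}]
print(f"Found {len(images)} images in {img_dir.resolve()}")
```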
Evaluated Open-source and Closed-source Models:
MiniCPM-v-v2; Phi-3-vision; YiVL-6b; Qwen-VL-Chat; Deepseek-VL-7b-Chat; LLaVA-NeXT-7b-vicuna; MiniCPM-Llama3-v2.5; GLM4V-9Bchat; CogVLM-chat; InternVL-Chat-V1-5; LLaVA-Next-34b; Yi-VL-34b; GPT-4o; Gemini-Pro; Claude3-OpusV; Glm-4V
Run the explicit misleading experiments:
```bash
bash MR_test.sh
```
- Open `implicit/misleading_generate/my_tool.py` and fill in your API key.
- Run:
  ```bash
  bash implicit/misleading_generate/mislead_generate.sh
  ```
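For orientation only, here is a hypothetical sketch of what an implicit-misleading-hint generator might look like. It is not the repo's actual `my_tool.py`; the client, model name, prompt, and function name are all illustrative assumptions, and it presumes an OpenAI-compatible chat API.

```python
from openai import OpenAI

# Assumption: an OpenAI-compatible endpoint; replace with the key you filled into my_tool.py.
client = OpenAI(api_key="YOUR_API_KEY")

def generate_implicit_hint(question: str, correct_answer: str) -> str:
    """Ask an LLM to write a subtly misleading hint for a VQA item (illustrative only)."""
    prompt = (
        "Write one short hint that subtly points away from the correct answer "
        "without stating a wrong answer outright.\n"
        f"Question: {question}\nCorrect answer: {correct_answer}"
    )
    resp = client.chat.completions.create(
        model="gpt-4o",  # assumption: any chat-completion model could be used here
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content.strip()
```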
Use the generated data in `implicit/mislead_output`:
```bash
bash implicit/Implicit_MR_test/implicit_MR_test.sh
```
Results are saved in:
- 📁 `result/test_dataset_6`
  - `.jsonl` → detailed outputs
  - `.txt` → model's Misleading Rate (MR)
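If you want to analyze the detailed `.jsonl` outputs yourself, the sketch below computes one common reading of a misleading rate: the fraction of initially-correct answers that flip after the misleading prompt. The field names `correct_before` and `correct_after` are hypothetical; adjust them to the actual schema of the repo's output, and treat the `.txt` summary as the authoritative MR.

```python
import json
from pathlib import Path

def misleading_rate(jsonl_path: str) -> float:
    """Fraction of initially-correct answers that become incorrect under misleading.

    Assumes hypothetical per-record fields 'correct_before' and 'correct_after';
    rename them to match the repo's actual .jsonl schema.
    """
    initially_correct = misled = 0
    for line in Path(jsonl_path).read_text().splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        if record["correct_before"]:
            initially_correct += 1
            if not record["correct_after"]:
                misled += 1
    return misled / initially_correct if initially_correct else 0.0
```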
- Open `extract2table/extract2table.py` and modify `txt_folder_paths` as needed (a standalone aggregation sketch follows this list).
- Run:
  ```bash
  python extract2table/extract2table.py
  ```
- The formatted table is saved in 📁 `extract2table/Tables/`.
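If you prefer a quick table without the script, here is a minimal sketch. It assumes each folder in `txt_folder_paths` holds one `.txt` summary per model and that the last line of each file carries the MR figure; the repo's actual file layout may differ.

```python
from pathlib import Path

# Hypothetical folders holding per-model .txt MR summaries (adjust to your runs).
txt_folder_paths = ["result/test_dataset_6"]

rows = []
for folder in txt_folder_paths:
    for txt_file in sorted(Path(folder).glob("*.txt")):
        lines = txt_file.read_text().strip().splitlines()
        if not lines:
            continue
        # Assumption: the final line of each summary file reports the MR.
        rows.append((txt_file.stem, lines[-1]))

# Print a simple Markdown table: model name vs. reported MR line.
print("| Model | MR summary |")
print("|-------|------------|")
for model, summary in rows:
    print(f"| {model} | {summary} |")
```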
If you use this work, please cite:
```bibtex
@article{yourpaper2024,
  title={Exploring Response Uncertainty in MLLMs: An Empirical Evaluation Under Misleading Scenarios},
  author={Authors' Names},
  journal={arXiv preprint arXiv:2411.02708},
  year={2024}
}
```
For any issues, please open a GitHub issue or reach out via email: [email protected]