
Easy-to-use MIRAGE code for faithful answer attribution in RAG applications. Paper: https://aclanthology.org/2024.emnlp-main.347/


Betswish/MIRAGE


Toward faithful answer attribution with model internals 🌴


Latest update: Our paper has been accepted to the EMNLP 2024 Main Conference! 🎉 Also check our demo here!

Authors (* Equal contribution): Jirui Qi*, Gabriele Sarti*, Raquel Fernández, Arianna Bisazza

Note

This repository provides an easy-to-use MIRAGE framework for analyzing the groundedness of RAG generation to the retrieved documents. To reproduce the paper results, please take a look at this repo.

Abstract: Ensuring the verifiability of model answers is a fundamental challenge for retrieval-augmented generation (RAG) in the question answering (QA) domain. Recently, self-citation prompting was proposed to make large language models (LLMs) generate citations to supporting documents along with their answers. However, self-citing LLMs often struggle to match the required format, refer to non-existent sources, and fail to faithfully reflect LLMs' context usage throughout the generation. In this work, we present MIRAGE -- Model Internals-based RAG Explanations -- a plug-and-play approach using model internals for faithful answer attribution in RAG applications. MIRAGE detects context-sensitive answer tokens and pairs them with retrieved documents contributing to their prediction via saliency methods. We evaluate our proposed approach on a multilingual extractive QA dataset, finding high agreement with human answer attribution. On open-ended QA, MIRAGE achieves citation quality and efficiency comparable to self-citation while also allowing for a finer-grained control of attribution parameters. Our qualitative evaluation highlights the faithfulness of MIRAGE's attributions and underscores the promising application of model internals for RAG answer attribution.

If you find the paper helpful and use its content, please cite it as:

@inproceedings{qi-etal-2024-model,
    title = "Model Internals-based Answer Attribution for Trustworthy Retrieval-Augmented Generation",
    author = "Qi, Jirui  and
      Sarti, Gabriele  and
      Fern{\'a}ndez, Raquel  and
      Bisazza, Arianna",
    editor = "Al-Onaizan, Yaser  and
      Bansal, Mohit  and
      Chen, Yun-Nung",
    booktitle = "Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2024",
    address = "Miami, Florida, USA",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.emnlp-main.347",
    doi = "10.18653/v1/2024.emnlp-main.347",
    pages = "6037--6053",
    abstract = "Ensuring the verifiability of model answers is a fundamental challenge for retrieval-augmented generation (RAG) in the question answering (QA) domain. Recently, self-citation prompting was proposed to make large language models (LLMs) generate citations to supporting documents along with their answers. However, self-citing LLMs often struggle to match the required format, refer to non-existent sources, and fail to faithfully reflect LLMs{'} context usage throughout the generation. In this work, we present MIRAGE {--} Model Internals-based RAG Explanations {--} a plug-and-play approach using model internals for faithful answer attribution in RAG applications. MIRAGE detects context-sensitive answer tokens and pairs them with retrieved documents contributing to their prediction via saliency methods. We evaluate our proposed approach on a multilingual extractive QA dataset, finding high agreement with human answer attribution. On open-ended QA, MIRAGE achieves citation quality and efficiency comparable to self-citation while also allowing for a finer-grained control of attribution parameters. Our qualitative evaluation highlights the faithfulness of MIRAGE{'}s attributions and underscores the promising application of model internals for RAG answer attribution. Code and data released at https://github.com/Betswish/MIRAGE.",
}

Environment:

For a quick start, you may load our environment easily with Conda:

conda env create -f MIRAGE.yaml

Alternatively, you can install the packages yourself:

Python: 3.9.19

Packages: pip install -r requirements.txt

Quick Start

For a quick start, put your RAG data file in the data_input/ folder and run the following command to get the LLM outputs with answer attribution (e.g. LLaMA2 with the standard prompt):

python mirage.py --f data_input/EXAMPLE.json --config configs/llama2_standard_prompt.yaml

The data file should be in JSON format. For example, suppose you have two questions, each provided with two retrieved docs:

[
  {
    "question": "YOUR QUESTION",
    "docs": [
      {
        "title": "TITLE OF RETRIEVED DOC",
        "text": "TEXT OF RETRIEVED DOC"
      },
      {
        "title": "TITLE OF RETRIEVED DOC",
        "text": "TEXT OF RETRIEVED DOC"
      }
    ]
  },
  {
    "question": "YOUR QUESTION",
    "docs": [
      {
        "title": "TITLE OF RETRIEVED DOC",
        "text": "TEXT OF RETRIEVED DOC"
      },
      {
        "title": "TITLE OF RETRIEVED DOC",
        "text": "TEXT OF RETRIEVED DOC"
      }
    ]
  }
]
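If your retrieval results live in Python objects rather than in a file, a small helper can produce this layout. The following is a minimal sketch, not part of the repository: the function names `build_records` and `write_input_file` are hypothetical, and only the `question`, `docs`, `title`, and `text` keys come from the format shown above.

```python
import json
from pathlib import Path


def build_records(questions_with_docs):
    """Turn (question, [(title, text), ...]) pairs into the input layout shown above."""
    records = []
    for question, docs in questions_with_docs:
        records.append(
            {
                "question": question,
                "docs": [{"title": title, "text": text} for title, text in docs],
            }
        )
    return records


def write_input_file(records, path):
    """Write the records as a JSON list, creating the parent folder if it is missing."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(records, ensure_ascii=False, indent=2), encoding="utf-8")
    return path
```

For example, `write_input_file(build_records(pairs), "data_input/EXAMPLE.json")` would create a file that can be passed to `--f`.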

The data with the LLM's raw outputs will be saved in the data_input_with_ans/ folder, and the attributed answers will be saved in the res_AA/ folder.

You may also check internal_res/ for the model internals obtained by MIRAGE. The internals of each instance are saved as an individual JSON file, which can be used for the advanced features below.

Full functions

The parameters for mirage.py are listed below:

  • f: Path to the input file.
  • config: Path to the configuration file, containing the generation parameters and prompts.
  • CTI: CTI threshold, expressed as the number of standard deviations above the average. Default: 1.
  • CCI: CCI threshold k. Uses a top-k strategy if k > 0, or a top-(-k)% strategy if k < 0. Default: -5.
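To make the two threshold conventions concrete, here is a minimal sketch of how such selections could be computed over a list of scores. It illustrates the parameter semantics described above and is not the code used in mirage.py: the function names are hypothetical, and the use of the population standard deviation, the rounding up of the percentage, and keeping at least one item are all assumptions.

```python
import math
import statistics


def select_cti(scores, threshold=1.0):
    """Return indices whose score exceeds mean + threshold * std (assumed population std)."""
    mean = statistics.fmean(scores)
    std = statistics.pstdev(scores)
    cutoff = mean + threshold * std
    return [i for i, score in enumerate(scores) if score > cutoff]


def select_cci(scores, k=-5):
    """Return indices of the top-k scores if k > 0, or of the top (-k)% if k < 0."""
    if k == 0:
        raise ValueError("k must be non-zero")
    n = len(scores)
    if k > 0:
        keep = min(k, n)
    else:
        # Assumption: round the percentage up and keep at least one item.
        keep = max(1, math.ceil(n * (-k) / 100))
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:keep])
```

Under these assumptions, a larger CTI value keeps fewer items, and with 40 scores the default CCI of -5 would keep the 2 highest-scoring ones.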

Advanced Feature 1

If you already have LLM generations (e.g. LLaMA2 with standard prompt) in the data file, like:

[
  {
    "question": "YOUR QUESTION",
    "docs": [
      {
        "title": "TITLE OF RETRIEVED DOC",
        "text": "TEXT OF RETRIEVED DOC"
      },
      {
        "title": "TITLE OF RETRIEVED DOC",
        "text": "TEXT OF RETRIEVED DOC"
      }
    ],
    "output": "LLM GENERATED ANSWER"
  },
  {
    "question": "YOUR QUESTION",
    "docs": [
      {
        "title": "TITLE OF RETRIEVED DOC",
        "text": "TEXT OF RETRIEVED DOC"
      },
      {
        "title": "TITLE OF RETRIEVED DOC",
        "text": "TEXT OF RETRIEVED DOC"
      }
    ],
    "output": "LLM GENERATED ANSWER"
  }
]

put it in the data_input_with_ans/ folder and specify --f_with_ans to get the answer attribution for the LLM outputs you already have:

mkdir -p data_input_with_ans
python mirage.py --f data_input_with_ans/EXAMPLE_WITH_ANS.json --config configs/llama2_standard_prompt.yaml --f_with_ans
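Before running with --f_with_ans, it may help to confirm that every record actually carries an answer. The check below is a minimal sketch under the assumption that each record needs a non-empty string in its `output` field. The helper name `find_missing_outputs` is hypothetical and not part of the repository.

```python
import json


def find_missing_outputs(records):
    """Return the indices of records that lack a non-empty string 'output' field."""
    missing = []
    for index, record in enumerate(records):
        output = record.get("output")
        if not isinstance(output, str) or not output.strip():
            missing.append(index)
    return missing


def check_file(path):
    """Load a JSON data file and report which records have no usable answer."""
    with open(path, encoding="utf-8") as handle:
        records = json.load(handle)
    return find_missing_outputs(records)
```

An empty list from `check_file("data_input_with_ans/EXAMPLE_WITH_ANS.json")` would mean every record has an answer to attribute.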

Advanced Feature 2

After a first run, you should have:

  • a JSON file containing LLM outputs in data_input_with_ans/, whether generated by our code or provided by you, e.g. EXAMPLE_WITH_ANS.json.
  • a JSON file in the res_AA/ folder with MIRAGE-attributed answers, named like res_AA/EXAMPLE.json.mirage_CTI_X_CCI_X.
  • multiple JSON files saving model internals obtained by MIRAGE at internal_res/, like internal_res/EXAMPLE_WITH_ANS-0.json, internal_res/EXAMPLE_WITH_ANS-1.json, etc.

If you plan to try different CTI and CCI thresholds, pass the data file with LLM outputs as --f and specify --f_with_ans and --only_cite. This lets you try different combinations of CTI and CCI thresholds using the existing model internal files instead of analyzing model internals from scratch, saving you precious time. :)

python mirage.py --f data_input_with_ans/EXAMPLE_WITH_ANS.json --config configs/llama2_standard_prompt.yaml --f_with_ans --only_cite --CTI X --CCI X
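When comparing several threshold combinations, the command above can be generated in a loop. The sketch below only builds and prints the command lines, using the flags shown in this README. The helper name `build_sweep_commands` and the example threshold values are illustrative assumptions, not recommended settings.

```python
import itertools
import shlex


def build_sweep_commands(data_file, config_file, cti_values, cci_values):
    """Build one mirage.py command (as an argument list) per CTI/CCI combination."""
    commands = []
    for cti, cci in itertools.product(cti_values, cci_values):
        commands.append(
            [
                "python", "mirage.py",
                "--f", data_file,
                "--config", config_file,
                "--f_with_ans",
                "--only_cite",
                "--CTI", str(cti),
                "--CCI", str(cci),
            ]
        )
    return commands


if __name__ == "__main__":
    sweep = build_sweep_commands(
        "data_input_with_ans/EXAMPLE_WITH_ANS.json",
        "configs/llama2_standard_prompt.yaml",
        cti_values=[1, 2],    # illustrative values only
        cci_values=[-5, 3],   # illustrative values only
    )
    for command in sweep:
        print(shlex.join(command))
```

Each argument list can be passed to `subprocess.run(command, check=True)` from the repository root once the first run has produced the files in internal_res/.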
