dominiksinsaarland/document-level-FEVER
Evidence Selection as a Token-Level Prediction Task

This repo contains the code and models for the paper "Evidence Selection as a Token-Level Prediction Task" (Stammbach, 2021).

Installation

Assuming Anaconda and Linux, the environment can be installed with the following commands:

conda create -n FEVER_bigbird python=3.6
conda activate FEVER_bigbird

conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch
pip install -r requirements.txt

Models

The models (PyTorch checkpoints) can be downloaded here:

Run the models on sample data

python src/main.py --do_predict --model_name sentence-selection-bigbird-base --eval_file sample_data.jsonl --predict_filename predictions_sentence_retrieval.csv

sample_data.jsonl points to a file where each line is a JSON object describing one (claim, Wiki-page) pair, with the following fields:

  • id # the claim ID
  • claim # the claim
  • page # the page title
  • sentences # a list -- essentially the "lines" in the official FEVER wiki-pages for a given document (where the document is split by "\n")
  • label_list # a list, 1 if a sentence is part of any annotated evidence set for a given claim, 0 otherwise
  • sentence_IDS # a list, np.arange(len(sentences))
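The fields above can be assembled as follows. This is a minimal sketch with hypothetical claim text and sentences; only the field names come from the format description:

```python
import json
import numpy as np

# one hypothetical (claim, Wiki-page) pair in the input format
example = {
    "id": 137334,  # hypothetical claim ID
    "claim": "The 49ers finished the 2014 season 8-8.",
    "page": "2014_San_Francisco_49ers_season",
    "sentences": [
        "The 2014 season was the 65th season of the San Francisco 49ers.",
        "The team finished with a record of 8-8.",
    ],
}
# the second sentence is part of an annotated evidence set
example["label_list"] = [0, 1]
# sentence_IDS is simply np.arange(len(sentences))
example["sentence_IDS"] = np.arange(len(example["sentences"])).tolist()

# one JSON object per line
with open("sample_data.jsonl", "w") as f:
    f.write(json.dumps(example) + "\n")
```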

The output is a dataframe storing, for each sentence predicted by the model:

  • claim_id
  • page_sentence # a tuple (Wikipage_Title, sentence_ID), for example ('2014_San_Francisco_49ers_season', 3)
  • y # 1 if label_list above was 1, 0 otherwise
  • predictions # token-level predictions for this sentence
  • score # np.mean(predictions), model is confident that this sentence is evidence if score > 0
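The score/decision rule can be sketched as below. The prediction values are made up for illustration; the column names and the score > 0 threshold come from the format description above:

```python
import numpy as np
import pandas as pd

# hypothetical prediction rows in the output format described above;
# "predictions" holds the token-level predictions for one sentence
rows = [
    {"claim_id": 137334,
     "page_sentence": ("2014_San_Francisco_49ers_season", 3),
     "y": 1,
     "predictions": [1.2, 0.8, 2.1]},
    {"claim_id": 137334,
     "page_sentence": ("2014_San_Francisco_49ers_season", 7),
     "y": 0,
     "predictions": [-0.9, -1.4]},
]
df = pd.DataFrame(rows)

# score = np.mean(predictions) per sentence
df["score"] = df["predictions"].apply(np.mean)

# the model is confident a sentence is evidence if score > 0
evidence = df[df["score"] > 0]
```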

Re-train the models

Point --train_file and --eval_file to files in the format described above, and add the --do_train flag:

python src/main.py --do_train --do_predict --model_name sentence-selection-bigbird-base --eval_file sample_data.jsonl --train_file sample_data.jsonl --predict_filename predictions_sentence_retrieval.csv

The pipeline

  • takes a first pass over all (claim, WikiPage) pairs, where WikiPages are predicted by (Hanselowski et al., 2018) and the FEVER baseline
  • extracts all sentences it is confident are evidence in that pass; the model input is [CLS] claim [SEP] WikiPage [SEP]
  • retrieves conditioned evidence as explained in (Stammbach and Neumann, 2019)
  • retrieves hyperlinks from the evidence sentences and takes a second pass over all (claim, hyperlink) pairs, where the model input is [CLS] claim, evidence_sentence [SEP] HyperlinkPage [SEP]
  • sorts all predicted evidence sentences for a claim in descending order of score
  • takes the five highest-scoring sentences for each claim and concatenates them
  • predicts a label for each (claim, retrieved_evidence) pair using the RTE model (trained with an outdated Hugging Face sequence classification demo script)
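The sorting and top-5 concatenation steps can be sketched as follows. The scores, sentence texts, and the `sentence_text` column are hypothetical; the sort-descending-and-take-five logic comes from the pipeline description:

```python
import pandas as pd

# hypothetical per-sentence scores for one claim after both retrieval passes
preds = pd.DataFrame({
    "claim_id": [1] * 7,
    "page_sentence": [("Page_A", i) for i in range(7)],
    "sentence_text": [f"sentence {i}" for i in range(7)],  # hypothetical column
    "score": [0.3, 2.1, -0.5, 1.7, 0.9, 0.1, 1.2],
})

def top_evidence(group, k=5):
    # sort in descending order of score, keep the k highest-scoring sentences
    top = group.sort_values("score", ascending=False).head(k)
    # concatenate them into a single retrieved_evidence string
    return " ".join(top["sentence_text"])

# retrieved_evidence per claim; each (claim, retrieved_evidence) pair
# would then go to the RTE model for label prediction
evidence = {cid: top_evidence(g) for cid, g in preds.groupby("claim_id")}
```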

MultiHop

To generate the multihop dataset, we need to download fever.db; see how to obtain it here

After the first prediction pass, we can retrieve multihop pages by running

python src/retrieve_multihop_evidence.py --db_file fever.db --predictions predictions_sentence_retrieval.csv --fever_data dev.jsonl --outfile_name multi_evidence_sample_data.jsonl

Afterwards, we can score these sentences the same way as before

python src/main.py --do_predict --model_name sentence-selection-bigbird-base --eval_file multi_evidence_sample_data.jsonl --predict_filename predictions_multihop_sentence_retrieval.csv

Questions

If anything does not work or is unclear, please don't hesitate to contact the authors.
