Biomedical Slot Filling with Dense Passage Retrieval

Accompanying code for our paper [1]. This code is intended only for replicating our results.
The data, indices and BioSF slot filling dataset can be found here.
The BioSF dataset has been built using the following publicly available RE datasets:
Our finetuned models are available through Hugging Face:
- https://huggingface.co/healx/biomedical-dpr-qry-encoder
- https://huggingface.co/healx/biomedical-dpr-ctx-encoder
- https://huggingface.co/healx/biomedical-slot-filling-reader-base
- https://huggingface.co/healx/biomedical-slot-filling-reader-large
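The two DPR encoders listed above can presumably also be loaded directly with the `transformers` library, outside of the evaluation scripts. The following is a minimal sketch and not part of this repository: it assumes `transformers` and `torch` are installed and that the checkpoints load through the generic `AutoTokenizer` and `AutoModel` classes. The `MODEL_IDS` mapping and the `load_encoder` helper are hypothetical names introduced for illustration; the reader checkpoints are not covered here.

```python
# Hugging Face model IDs of the two DPR encoders listed in this README.
MODEL_IDS = {
    "query_encoder": "healx/biomedical-dpr-qry-encoder",
    "context_encoder": "healx/biomedical-dpr-ctx-encoder",
}


def load_encoder(role):
    """Download and return (tokenizer, model) for one of the roles above.

    Assumption: the checkpoint can be loaded with the generic Auto* classes.
    """
    # Imported lazily so the mapping can be inspected without transformers.
    from transformers import AutoModel, AutoTokenizer

    model_id = MODEL_IDS[role]
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model
```

Calling `load_encoder("query_encoder")` downloads the checkpoint on first use and caches it locally for later calls.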
Although we also provide the code to train a DPR or a reader model, we focus here only on replicating our evaluation experiments. Assuming that we have created a virtual environment and installed the requirements, we can then run either
PYTHONPATH=. python biomedical_slot_filling/scripts/train_eval_reader.py --eval
or
PYTHONPATH=. python biomedical_slot_filling/scripts/evaluate_retrieval.py
The above scripts will download the necessary files from Google Drive and the finetuned models from Hugging Face, and then perform the relevant evaluation.
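For readers new to dense retrieval, the sketch below illustrates what a retrieval evaluation typically measures. It ranks passages by their inner product with a query embedding, which is how DPR scores passages, and then checks whether a gold passage appears in the top k results. The hit@k metric, the helper names, and the toy vectors are assumptions made for illustration; this README does not state which metrics the evaluation script reports.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def rank_passages(query_vec, passage_vecs):
    """Return passage ids sorted by descending inner product with the query."""
    scores = {pid: dot(query_vec, vec) for pid, vec in passage_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)


def hit_at_k(ranked_ids, gold_ids, k):
    """Return 1.0 if any gold passage is among the top k results, else 0.0."""
    return float(any(pid in gold_ids for pid in ranked_ids[:k]))


# Toy embeddings, made up for illustration (not real model outputs).
passages = {
    "p1": [0.9, 0.1, 0.0],
    "p2": [0.1, 0.8, 0.1],
    "p3": [0.0, 0.2, 0.9],
}
query = [0.2, 0.7, 0.1]

ranking = rank_passages(query, passages)
print(ranking)                          # ['p2', 'p1', 'p3']
print(hit_at_k(ranking, {"p2"}, k=1))   # 1.0
```

In a real run the vectors would come from the context and query encoders, and the scores would be averaged over all queries in the evaluation set.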
To train a slot filling reader, we can run:
PYTHONPATH=. python biomedical_slot_filling/scripts/train_eval_reader.py --model-name-or-path dmis-lab/biobert-base-cased-v1.2 --train
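If training needs to be launched from Python, for example to try several base checkpoints in a loop, the command above can be wrapped as in the following sketch. It assumes the current working directory is the repository root. The helper names `build_train_command` and `run_training` are hypothetical, and only the flags shown in the command above are used.

```python
import os
import subprocess
import sys


def build_train_command(model_name_or_path):
    """Return the argv list for the reader training command shown above."""
    return [
        sys.executable,
        "biomedical_slot_filling/scripts/train_eval_reader.py",
        "--model-name-or-path",
        model_name_or_path,
        "--train",
    ]


def run_training(model_name_or_path, dry_run=True):
    """Run the training command, or only return it when dry_run is True."""
    cmd = build_train_command(model_name_or_path)
    if dry_run:
        return cmd
    # Mirror the `PYTHONPATH=.` prefix used in the shell command.
    env = dict(os.environ, PYTHONPATH=".")
    subprocess.run(cmd, env=env, check=True)
    return cmd


# Dry run: build and print the command without starting training.
cmd = run_training("dmis-lab/biobert-base-cased-v1.2", dry_run=True)
print(" ".join(cmd[1:]))
```

Passing `dry_run=False` starts the actual training run and raises `subprocess.CalledProcessError` if the script exits with a non-zero status.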
Note: A large portion of our code is adapted from the following two repositories: