Download the data from PubMedQA and follow its steps for splitting the dataset.
Copy the files pqal_fold0/train_set.json, pqal_fold0/dev_set.json, test_set.json and test_ground_truth.json to ../../data/PubMedQA/raw
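The copy step can be sketched as a small shell function. This is only an illustration: `copy_pubmedqa_files` and `SRC_DIR` are hypothetical names, and it assumes that the split PubMedQA data sits in one directory containing `pqal_fold0/` and the two test files, and that the four files are meant to land directly in `raw` without the `pqal_fold0` subfolder.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this repo): copy the four PubMedQA files
# into the raw data directory.
copy_pubmedqa_files() {
    local src_dir="$1"    # assumed: directory holding pqal_fold0/ and test_set.json
    local dest_dir="$2"   # e.g. ../../data/PubMedQA/raw
    local f

    # Fail early, before creating anything, if an expected file is missing.
    for f in pqal_fold0/train_set.json pqal_fold0/dev_set.json \
             test_set.json test_ground_truth.json; do
        if [ ! -f "$src_dir/$f" ]; then
            echo "missing: $src_dir/$f" >&2
            return 1
        fi
    done

    mkdir -p "$dest_dir"
    # Assumption: the files are copied flat, dropping the pqal_fold0/ prefix.
    cp "$src_dir/pqal_fold0/train_set.json" \
       "$src_dir/pqal_fold0/dev_set.json" \
       "$src_dir/test_set.json" \
       "$src_dir/test_ground_truth.json" \
       "$dest_dir/"
}

# Usage (SRC_DIR is a placeholder for wherever the split data lives):
# copy_pubmedqa_files "$SRC_DIR" ../../data/PubMedQA/raw
```

Checking for all four files first means a partial download is reported before the destination directory is created.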
Then preprocess the data with:
bash preprocess.sh # for BioGPT
or
bash preprocess_large.sh # for BioGPT-Large
We provide our model fine-tuned on this task. See here
You can run inference and evaluate the model on the test set with:
bash infer.sh # for BioGPT
or
bash infer_large.sh # for BioGPT-Large
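Because the BioGPT and BioGPT-Large scripts differ only by the `_large` suffix, the two variants can be driven from one model name. A minimal sketch follows; `script_suffix` is a hypothetical helper, not something shipped with the repo, and the usage line assumes it is run from the directory that contains the scripts.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a model name to the script suffix used above.
script_suffix() {
    case "$1" in
        BioGPT)       echo "" ;;
        BioGPT-Large) echo "_large" ;;
        *)            echo "unknown model: $1" >&2; return 1 ;;
    esac
}

# Usage (not executed here): preprocess, then run inference and evaluation.
# suffix=$(script_suffix BioGPT-Large) \
#     && bash "preprocess${suffix}.sh" \
#     && bash "infer${suffix}.sh"
```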