This repo contains code for work done at Simula and UiO on fine-tuning LLMs, comparing the standard full fine-tuning approach with parameter-efficient fine-tuning (PEFT) techniques on several benchmarks. The work is based on the paper and accompanying repository by Jiang et al. (paper, codebase).
There are several requirements for running this project. Most importantly, the QuixBugs, Defects4J, and HumanEval-Java benchmarks have to be downloaded and installed. Furthermore, jesper needs to be extracted from the repo in order to compile the Java code.
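The snippet below is one way to fetch the three benchmarks, assuming their usual upstream locations (QuixBugs and Defects4J from their public GitHub repositories, HumanEval-Java from Jiang et al.'s codebase); adjust paths and versions to your setup:

```bash
# QuixBugs (assumed upstream: jkoppel/QuixBugs)
git clone https://github.com/jkoppel/QuixBugs.git

# Defects4J (upstream: rjust/defects4j); its README lists the Perl prerequisites
git clone https://github.com/rjust/defects4j.git
(cd defects4j && cpanm --installdeps . && ./init.sh)
export PATH="$PATH:$PWD/defects4j/framework/bin"

# HumanEval-Java is distributed with Jiang et al.'s codebase (assumed: lin-tan/clm)
git clone https://github.com/lin-tan/clm.git
```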
Afterwards, we can install the dependencies from `requirements.txt` with pip and run the selected code.
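For example, from the repository root (optionally inside a virtual environment):

```bash
python -m venv .venv && source .venv/bin/activate  # optional but recommended
pip install -r requirements.txt
```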
There are several scripts that can be run; example invocations are sketched below the list:
- Fine-tuning of various models using `finetune.py`
- Benchmarking of various models using `benchmarks/validate.py`
- Preparing model inputs and outputs using `benchmarks/generate_inputs.py` and `benchmarks/generate_outputs.py`
- Post-processing of results using `benchmarks/analyse.py`
- PEFT fine-tuning with `lora.py` and `ia3.py`
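A plausible end-to-end benchmarking run is sketched below. The script paths come from the list above, but the ordering and the argument-free invocations are assumptions; each script defines its own CLI, so check its arguments before running:

```bash
# Prepare the prompts the model will see for each benchmark bug...
python benchmarks/generate_inputs.py

# ...and let the (fine-tuned) model generate candidate patches.
python benchmarks/generate_outputs.py

# Compile and test the candidate patches against the benchmarks' test suites.
python benchmarks/validate.py

# Summarise and post-process the validation results.
python benchmarks/analyse.py
```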
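Fine-tuning is invoked the same way; these are argument-free sketches, and the scripts will most likely need model and training options passed on the command line:

```bash
# Standard full fine-tuning of a base model.
python finetune.py

# Parameter-efficient alternatives: LoRA adapters and (IA)^3 scaling vectors.
python lora.py
python ia3.py
```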