Skip to content

Latest commit

 

History

History
328 lines (278 loc) · 10.7 KB

README.md

File metadata and controls

328 lines (278 loc) · 10.7 KB

Astraios: Parameter-Efficient Instruction Tuning Code Language Models

Astraios

This repository provides an overview of all components from the paper Astraios: Parameter-Efficient Instruction Tuning Code Large Language Models.

Overview

Data CommitPackFT+OASST Filtered version of CommitPack and OASST for high-quality commit messages that resemble instructions
Model Astraios-1B Collection of StarCoderBase-1B models instruction tuned on CommitPackFT + OASST with different tuning methods
Astraios-3B Collection of StarCoderBase-3B (3B parameters) models instruction tuned on CommitPackFT + OASST with different tuning methods
Astraios-7B Collection of StarCoderBase-7B (7B parameters) models instruction tuned on CommitPackFT + OASST with different tuning methods
Astraios-16B Collection of StarCoderBase-16B (16B parameters) models instruction tuned on CommitPackFT + OASST with different tuning methods
Evaluation BigCloneBench Dataset for clone detection; We use 2,000 samples for evaluation
Devign Dataset for defect detection; We use 2,000 samples for evaluation
HumanEvalPack Extension of OpenAI's HumanEval to cover 3 scenarios across 6 languages
ReCode Dataset for the robustness of code generation, covering 4 variants
Asleep At The Keyboard Datasets for security of code generation; We use DoW for evaluation

PEFT

Setup: Run the bash code below to set up the PEFT methods used in our work. We additionally implement AdapterH, AdapterP and Parallel methods based on the peft==0.6.0.dev0. For more information, please refer to the peft folder.

pip install git+https://github.com/bigcode-project/astraios#subdirectory=peft

Notes:

  • As Prefix Tuning does not work for StarCoder training, we do not evaluate this method.
  • For any configuration issues, please refer to the original PEFT.

Evaluation

  1. Setup: Run the bash code below to set up the evaluation repository.
git clone -b astraios https://github.com/bigcode-project/bigcode-evaluation-harness
cd bigcode-evaluation-harness
pip install -q -r requirements.txt
accelerate config
  1. Run: All evaluation scripts are in evaluation folder. Run each script via bash.

We use astraios-1b-lora as an example and use the bash code to run the following tasks:

  • Clone Detection
accelerate launch main.py \
--model bigcode/starcoderbase-1b  \
--peft_model bigcode/astraios-1b-lora \
--tasks clone_detection \
--do_sample False \
--batch_size 1 \
--save_generations \
--trust_remote_code \
--save_generations_path generations_clone_detection_astraios-1b-lora.json \
--max_length_generation 512
  • Defect Detection
accelerate launch main.py \
--model bigcode/starcoderbase-1b  \
--peft_model bigcode/astraios-1b-lora \
--tasks clone_detection \
--do_sample False \
--batch_size 1 \
--save_generations \
--trust_remote_code \
--save_generations_path generations_defect_detection_astraios-1b-lora.json \
--max_length_generation 512
  • HumanEvalSynthesize-Python
accelerate launch main.py \
--model bigcode/starcoderbase-1b  \
--peft_model bigcode/astraios-1b-lora \
--tasks humanevalsynthesize-python \
--do_sample True \
--temperature 0.2 \
--n_samples 20 \
--batch_size 5 \
--allow_code_execution \
--save_generations \
--trust_remote_code \
--prompt octocoder \
--save_generations_path generations_humanevalsynthesizepython_astraios-1b-lora.json \
--metric_output_path evaluation_humanevalsynthesizepython_astraios-1b-lora.json \
--max_length_generation 2048
  • HumanEvalFix-Python
accelerate launch main.py \
--model bigcode/starcoderbase-1b  \
--peft_model bigcode/astraios-1b-lora \
--tasks humanevalfixtests-python \
--do_sample True \
--temperature 0.2 \
--n_samples 20 \
--batch_size 1 \
--allow_code_execution \
--save_generations \
--trust_remote_code \
--prompt octocoder \
--save_generations_path generations_humanevalfixpython_astraios-1b-lora.json \
--metric_output_path evaluation_humanevalfixpython_astraios-1b-lora.json \
--max_length_generation 2048
  • HumanEvalExplain-Python
accelerate launch main.py \
--model bigcode/starcoderbase-1b \
--peft_model bigcode/astraios-1b-lora \
--tasks humanevalexplaindescribe-python \
--generation_only \
--do_sample True \
--temperature 0.2 \
--n_samples 20 \
--batch_size 5 \
--allow_code_execution \
--save_generations \
--trust_remote_code \
--prompt octocoder \
--save_generations_path generations_humanevalexplaindescribe-python_astraios-1b-lora.json \
--max_length_generation 2048 

accelerate launch main.py \
--model bigcode/starcoderbase-1b  \
--peft_model bigcode/astraios-1b-lora \
--tasks humanevalexplainsynthesize-python \
--do_sample True \
--temperature 0.2 \
--n_samples 1 \
--batch_size 1 \
--allow_code_execution \
--save_generations \
--trust_remote_code \
--prompt octocoder \
--load_data_path generations_humanevalexplainsynthesize-python_astraios-1b-lora.json \
--save_generations_path generations_humanevalexplainsynthesize-python_astraios-1b-lora.json \
--metric_output_path evaluation_humanevalexplainpython_astraios-1b-lora.json \
--max_length_generation 2048
  • ReCode-Format
accelerate launch main.py \
--model bigcode/starcoderbase-1b  \
--peft_model bigcode/astraios-1b-lora \
--tasks perturbed-humaneval-format-num_seeds_5 \
--do_sample False \
--batch_size 1 \
--allow_code_execution \
--save_generations \
--trust_remote_code \
--n_samples 1 \
--batch_size 1 \
--allow_code_execution \
--save_generations \
--trust_remote_code \
--prompt octocoder \
--save_generations_path generations_perturbed-humaneval-format-num_seeds_5_astraios-1b-lora.json \
--metric_output_path evaluation_perturbed-humaneval-format-num_seeds_5_astraios-1b-lora.json \
--max_length_generation 1024
  • AATK-DoW
accelerate launch main.py \
--model bigcode/starcoderbase-1b  \
--peft_model bigcode/astraios-1b-lora \
--tasks asleep_completion \
--do_sample True \
--temperature 0.2 \
--n_samples 20 \
--batch_size 1 \
--generation_only \
--save_generations \
--trust_remote_code \
--prompt octocoder \
--save_generations_path generations_asleep_completion_astraios-1b-lora.json \
--metric_output_path evaluation_asleep_completion_astraios-1b-lora.json \
--max_length_generation 1024

Note:

  • When evaluating FFT models, --peft_model should be removed and FFT model names need to pass with --model, e.g.:
accelerate launch main.py \
--model bigcode/astraios-1b-fft  \
--tasks clone_detection \
--do_sample False \
--batch_size 1 \
--save_generations \
--trust_remote_code \
--save_generations_path generations_clone_detection_astraios-1b-fft.json \
--max_length_generation 512

Training

PEFT

The finetuning python script is at finetune.py. Coresponding PEFT configurations are stored in peft_config.py. To train all models with PEFT, run the bash code:

sh run_peft.sh

Note:

  • --gradient_accumulation_steps 32 is for a single GPU. If the model is trained with 8 GPUs, gradient_accumulation_steps should be adjusted to 4.

FFT

To train all models with FFT, run the bash code:

sh run_fft.sh

Note:

  • --gradient_accumulation_steps 32 is for a single GPU. If the model is trained with 8 GPUs, gradient_accumulation_steps should be adjusted to 4.

Outputs

Outputs are under their corresponding subfolders in outputs folder.

Note:

  • Some outputs may be missing as they were not saved initially.

Visuals

Figures:

  • All figures are created via this colab notebook.

  • The figure of Astraios is generated via DALL-E with the prompt of A fantasy-inspired, adorable illustration of Astraios, the Greek god of dusk. The setting is a serene evening landscape with a gradient sky transition.

Licenses

Everything is licensed as permissively as possible to us.

The evaluation repository Code Generation LM Evaluation Harness is licensed under the Apache-2.0 license.

PEFT library is licensed under Apache-2.0 license.

All Astraios models are licensed under the same license as StarCoder (Commercial except for use cases deemed harmful).

The remaining code originally created in this repository is licensed under the MIT License.

Todo List

  • Organize the file names in outputs/ folder.
  • Organize the bash script in evaluation/ folder.
  • Merge the PR to bigcode-evaluation-harness.

Citation

@article{zhuo2024astraios,
      title={Astraios: Parameter-Efficient Instruction Tuning Code Large Language Models}, 
      author={Terry Yue Zhuo and Armel Zebaze and Nitchakarn Suppattarachai and Leandro von Werra and Harm de Vries and Qian Liu and Niklas Muennighoff},
      journal={https://arxiv.org/abs/2401.00788},
      year={2024}
}