This repository contains the starter code for the LLM-Merging competition.
- Please do not specify any device_id in the code because the device_id might not hold in our setup. If you need to specify a device_id in your setup, one solution is to use environment variables like
export CUDA_VISIBLE_DEVICES=0
- Please do not specify any filepaths because they may not be the same in our setup. If you need to specify the HuggingFace cache, one solution is to use environment variables like
export HUGGINGFACE_HUB_CACHE=/tmp/
and then access this path in Python via
path=os.environ["HUGGINGFACE_HUB_CACHE"]
- When running
tar
on this repoLLM-Merging
to submit it, please ensure this directory is calledLLM-Merging
and not renamed to any directories. This can cause issues when evaluating your submissions.
The library was tested on CUDA 10.1 on an A6000.
conda env create -f environment.yml --name llm-merging
conda activate llm-merging
export PYTHONPATH=`pwd`
Authentication tokens are required for certain models like Llama2, which require users to agree to specific terms. You can find the authentication token here.
export HF_AUTH_TOKEN=""
Do not modify any files other than the new file you create and setup.py
. Doing so can result in the grounds for invalidating your submission. If you need to change code in other files, feel free to open a pull request.
-
To add a new merging method, create a new file in
llm_merging/merging
.This file should implement
__init__.py
andmerge.py
functions and extendllm_merging/merging/Merges
. Seellm_merging/merging/FlanT5Avg.py
orllm_merging/merging/LlamaAvg.py
for examples. -
Modify
setup.py
and add an entry with the merging method inllm_merging.merging.Merges
.For example, the entry
llama_avg = llm_merging.merging.LlamaAvg:LlamaAvg
indicates the method is calledllama_avg
and the file is atllm_merging/merging/LlamaAvg
.Any additional required libraries can be specified in
setup.py
.
python llm_merging/setup.py install
python llm_merging/main.py -m {merging_method}
The datasets (CosmosQA and XSum) are mainly included to ensure the merging method (with evaluation on those datasets) runs in under the 1-hour time limit. Our results on llama_avg
are {"cosmos_qa": {"accuracy": 0.234}, "xsum": {"rouge1": 0.123, "rouge2": 0.023, "rougeL": 0.093, "rougeLsum": 0.102}}
, which run in about 25 minutes on our A6000.
After modifying the file, tar the file into a tarball using the command:
tar -cvf {merging_method}.tar LLM-Merging
Submit the tar file using this form
The leaderboard of the submitted solutions can be found here. Please note that your submission might not appear on the leaderboard immediately, as it is updated every few days. If you encounter any issues, please contact us.
Note: This submission method is only temporary and another automatic submission method should be comming soon.