This is the repository for the official implementation of *Detecting Training Data of Large Language Models via Expectation Maximization*.
`pip install -r requirements.txt`
We mainly use WikiMIA and OLMoMIA (which will be uploaded to 🤗 soon; alternatively, you can create your own using our scripts) as our benchmark datasets.
You may use MIMIR or your own dataset by modifying the `prepare_data` function in `data_utils.py` to return a list of dictionaries in the format `{"input": text, "label": label}`, where `label` is either 1 (member) or 0 (non-member).
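For illustration, a custom branch of `prepare_data` might look like the sketch below; the dataset name, file name, and source field names are placeholders, and only the returned format is what the scripts expect.

```python
# Illustrative sketch only: the dataset name, file, and source field names are
# hypothetical; the returned format is the one described above.
import json

def prepare_data(dataset_name):
    if dataset_name == "my_dataset":  # hypothetical custom dataset
        with open("my_dataset.jsonl") as f:
            records = [json.loads(line) for line in f]
        # Each item: {"input": <text>, "label": 1 (member) or 0 (non-member)}
        return [{"input": r["text"], "label": int(r["is_member"])} for r in records]
    # ... existing dataset branches (WikiMIA, OLMoMIA, etc.) remain unchanged
```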
Running `python run.py` (or `emmia.py`) with `--target_model ${MODEL} --dataset_name ${DATASET}` performs a membership inference attack on a target model `${MODEL}` and a target dataset `${DATASET}`, and stores membership scores in `output/${MODEL}/${DATASET}/score/${METHOD}.jsonl` for each `${METHOD}`.
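For example, a run on WikiMIA might look like the command below; the target model is only an illustration, and the exact dataset identifier accepted by `--dataset_name` depends on `prepare_data` in `data_utils.py`.

```bash
# Illustrative invocation; the model ID and dataset identifier are examples.
python run.py --target_model EleutherAI/pythia-6.9b --dataset_name WikiMIA
```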
For OLMo models, you should specify `--olmo_step` to select an intermediate checkpoint.
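For instance, something along these lines; the model ID and step value are placeholders and should match the checkpoint you want to evaluate.

```bash
# Illustrative: evaluate an intermediate OLMo checkpoint on OLMoMIA.
python run.py --target_model allenai/OLMo-7B --dataset_name OLMoMIA --olmo_step 100000
```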
You can specify the methods to use with the `-m` argument. The default baseline methods (used when the `-m` argument is not specified) are `Loss`, `Ref`, `Zlib`, `Min-K`, and `Min-K++`.
For `Ref`, you need to specify a reference model with the `--ref_model` argument. The default reference model is `EleutherAI/pythia-70m` for WikiMIA and `stabilityai/stablelm-base-alpha-3b-v2` for OLMoMIA.
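Putting the two together, a baseline run on WikiMIA could look like the following; `${MODEL}` is your target model, and the reference model shown is simply the WikiMIA default mentioned above.

```bash
# Run the default baselines explicitly, with a reference model for Ref.
python run.py --target_model ${MODEL} --dataset_name WikiMIA \
    -m Loss Ref Zlib Min-K Min-K++ \
    --ref_model EleutherAI/pythia-70m
```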
You can apply ReCaLL using a prefix built by concatenating randomly selected n shots, where n is given by the `--num_shots` argument. Depending on where the shots come from, there are three different methods: `ReCaLL-Rand` (from the entire dataset), `ReCaLL-RandM` (from members in the dataset), and `ReCaLL-RandNM` (from non-members in the dataset). For example, you can add arguments like `-m ReCaLL-RandM ReCaLL-Rand ReCaLL-RandNM --num_shots "[1,2,4,8,12]"`.
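As a full command, this might look like the sketch below, reusing the `${MODEL}`/`${DATASET}` placeholders from above.

```bash
# ReCaLL with random prefixes drawn from the whole dataset, members, and
# non-members, sweeping over several numbers of shots.
python run.py --target_model ${MODEL} --dataset_name ${DATASET} \
    -m ReCaLL-Rand ReCaLL-RandM ReCaLL-RandNM \
    --num_shots "[1,2,4,8,12]"
```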
With the `-m ReCaLL-all` argument, you can apply ReCaLL to all data in the dataset by using each example in the dataset as a prefix. After that, you can apply the average baselines (`Avg`, `AvgP`) with the `-m Avg` argument and the `TopPref` baseline with the `-m TopPref -n $n` argument in `emmia.py`.
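A possible two-step workflow is sketched below; the value passed to `-n` is only an example.

```bash
# Step 1: compute ReCaLL scores using every example in the dataset as a prefix.
python run.py --target_model ${MODEL} --dataset_name ${DATASET} -m ReCaLL-all

# Step 2: derive the average baselines (Avg, AvgP) and the TopPref baseline.
python emmia.py --target_model ${MODEL} --dataset_name ${DATASET} -m Avg
python emmia.py --target_model ${MODEL} --dataset_name ${DATASET} -m TopPref -n 8
```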
You can run EM-MIA by specifying initialization method(s) and prefix score update function(s), such as `-i Loss Min-K++_20 -p AUC-ROC`, in `emmia.py`.
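For example, using the arguments shown above:

```bash
# EM-MIA initialized from Loss and Min-K++_20 scores, with AUC-ROC as the
# prefix score update function.
python emmia.py --target_model ${MODEL} --dataset_name ${DATASET} \
    -i Loss Min-K++_20 -p AUC-ROC
```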
To compare different MIA methods, you can plot ROC curves and calculate the evaluation metrics AUC-ROC and TPR @ k% FPR (k = 0.1, 1, 5, 10, 20). You can also get score statistics and draw histograms for members and non-members. You can specify the methods to compare by their names or their prefixes, e.g., `python eval.py output/${MODEL}/${DATASET} -m Loss Zlib Min-K_20 Min-K++_20 -p Ref ReCaLL-Avg ReCaLL-Rand --keep_used`.
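If you want to compute the same metrics outside of `eval.py`, a minimal sketch is given below; it is not the repo's implementation and assumes each line of a score file contains `score` and `label` fields.

```python
# Minimal sketch (not eval.py): AUC-ROC and TPR @ k% FPR from a score file,
# assuming each JSON line has "score" (higher = more likely member) and "label".
import json
import numpy as np
from sklearn.metrics import roc_curve, auc

def evaluate(score_file, fpr_thresholds=(0.001, 0.01, 0.05, 0.10, 0.20)):
    labels, scores = [], []
    with open(score_file) as f:
        for line in f:
            record = json.loads(line)
            labels.append(record["label"])  # 1 = member, 0 = non-member
            scores.append(record["score"])
    fpr, tpr, _ = roc_curve(labels, scores)
    metrics = {"AUC-ROC": auc(fpr, tpr)}
    for k in fpr_thresholds:
        # TPR at the operating point with the largest FPR not exceeding k
        metrics[f"TPR@{k * 100:g}%FPR"] = float(tpr[fpr <= k].max())
    return metrics

# e.g. evaluate("output/<model>/<dataset>/score/Loss.jsonl")
```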
This codebase is adapted from the following repositories: Min-K%, Min-K%++, and ReCaLL.
⭐ If you find our work (paper, implementation, and datasets) helpful, please consider citing our paper:
@article{kim2024detecting,
  title={Detecting Training Data of Large Language Models via Expectation Maximization},
  author={Kim, Gyuwan and Li, Yang and Spiliopoulou, Evangelia and Ma, Jie and Ballesteros, Miguel and Wang, William Yang},
  journal={arXiv preprint arXiv:2410.07582},
  year={2024}
}
For any inquiries, please open an issue or contact the authors directly.