SLUG

Targeted Unlearning with Single Layer Unlearning Gradient (ICML 2025)

Abstract

Machine unlearning methods aim to remove sensitive or unwanted content from trained models, but typically demand extensive model updates at significant computational cost while potentially degrading model performance on both related and unrelated tasks. We propose Single Layer Unlearning Gradient (SLUG) as an efficient method to unlearn targeted information by updating a single critical layer using a one-time gradient computation. SLUG uses layer importance and gradient alignment metrics to identify the optimal layer for targeted information removal while preserving the model utility. We demonstrate the effectiveness of SLUG for CLIP, Stable Diffusion, and vision-language models (VLMs) in removing concrete (e.g., identities and objects) and abstract concepts (e.g., artistic styles). On the UnlearnCanvas benchmark, SLUG achieves comparable unlearning performance to existing methods while requiring significantly less computational resources. Our proposed approach offers a practical solution for targeted unlearning that is computationally efficient and precise.

SLUG framework

Overview of our proposed Single Layer Unlearning Gradient (SLUG) framework. Given an unlearning query, such as removing an identity like Elon Musk, we first curate or generate a forget set containing relevant data and a retain set with data points we want to preserve. Using these datasets, we calculate and store the model gradients. Based on these gradients, we identify the important layers to update for unlearning. We then take a step along the forget gradients of a single layer and evaluate the model's unlearning performance. To determine a suitable step size $\lambda$, we employ a binary search. After unlearning, the specified concepts are effectively erased while retaining the model's overall utility.
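The last two steps of the overview (a step along the forget gradient of one layer, then a binary search over the step size $\lambda$) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the repository's implementation: the layer names, the toy `forget_score`, the threshold of 1.0, and the helper names `step_single_layer` and `search_step_size` are all hypothetical, and the sketch assumes the forgetting criterion is monotone in $\lambda$. The sign convention (adding $\lambda$ times the stored gradient) is also an assumption; flip it if the gradient is defined the other way.

```python
from typing import Callable, Dict, List

# Toy stand-in for model parameters: layer name -> flattened values.
Weights = Dict[str, List[float]]


def step_single_layer(weights: Weights, forget_grad: Weights, layer: str, lam: float) -> Weights:
    """Return a copy of `weights` where only `layer` moves by `lam` along its forget gradient."""
    updated = {name: list(values) for name, values in weights.items()}
    updated[layer] = [w + lam * g for w, g in zip(weights[layer], forget_grad[layer])]
    return updated


def search_step_size(is_forgotten: Callable[[float], bool], lo: float, hi: float, iters: int = 20) -> float:
    """Binary search for the smallest step size in [lo, hi] that satisfies `is_forgotten`.

    Assumes `is_forgotten` is False below some threshold and True above it.
    """
    if not is_forgotten(hi):
        raise ValueError("upper bound does not achieve forgetting; increase `hi`")
    for _ in range(iters):
        mid = (lo + hi) / 2
        if is_forgotten(mid):
            hi = mid  # forgetting achieved, try a smaller step
        else:
            lo = mid  # not enough, try a larger step
    return hi


# Hypothetical two-layer model and precomputed forget gradients.
weights = {"visual.proj": [1.0, 2.0], "text.proj": [0.5, 0.5]}
forget_grad = {"visual.proj": [-1.0, -1.0], "text.proj": [0.3, 0.3]}


def forget_score(w: Weights) -> float:
    """Toy proxy for forget-set accuracy (lower means more forgotten)."""
    return sum(w["visual.proj"])


def is_forgotten(lam: float) -> bool:
    candidate = step_single_layer(weights, forget_grad, "visual.proj", lam)
    return forget_score(candidate) <= 1.0


lam = search_step_size(is_forgotten, lo=0.0, hi=4.0)
unlearned = step_single_layer(weights, forget_grad, "visual.proj", lam)
print(lam, unlearned)
```

Because the gradients are computed once and stored, each candidate step size only costs one evaluation of the updated model, which is what makes a search over $\lambda$ cheap compared with iterative fine-tuning.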

Examples of Unlearning on Stable Diffusion

Qualitative evaluation of unlearning the copyrighted characters Iron Man and Mickey Mouse, which can potentially be used for unauthorized content generation, from Stable Diffusion (SD). Our method precisely unlearns the copyright-protected concepts from SD while largely preserving image generation quality on other concepts.

📋 Requirements

To install requirements:

conda env create -f environment.yml

Datasets (put under data folder):

LAION-400M, the training set of the CLIP model, from which we sample the forget set and the retain set. First download the parquet files, then use img2dataset to download the images. The image-text pairs are stored in tar files such as 00000.tar, 00001.tar, and so on. We provide data samples here.

ImageNet 2012. We use the ImageNet validation set to evaluate the general performance of the CLIP model. Request access through the official site here. Download and unzip ILSVRC2012_img_val.tar under data/ImageNet/, then run bash valprep.sh to prepare the dataset.

CelebA. We sample the identities to forget from the CelebA dataset. The dataset is available here, or on Google Drive from the CelebA authors. Ask the CelebA dataset authors for the names of the identities.

Update data_root in src/clip/a0_eval_celeba.py to the absolute path where you stored the experimental data.
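For the LAION-400M download step above, a call to img2dataset from Python might look like the sketch below. It is an illustration rather than the repository's own command: the folder names, process and thread counts, and image size are assumptions to adjust for your machine, and the column names `URL` and `TEXT` are those used by the LAION-400M metadata files. The helper names `laion_download_kwargs` and `run_download` are hypothetical.

```python
from pathlib import Path


def laion_download_kwargs(parquet_dir: str, output_dir: str) -> dict:
    """Build keyword arguments for img2dataset.download for LAION-400M metadata."""
    return {
        "url_list": str(Path(parquet_dir)),      # folder holding the downloaded .parquet files
        "input_format": "parquet",
        "url_col": "URL",                        # LAION-400M column with the image URL
        "caption_col": "TEXT",                   # LAION-400M column with the caption
        "output_format": "webdataset",           # writes 00000.tar, 00001.tar, ...
        "output_folder": str(Path(output_dir)),
        "processes_count": 4,                    # assumption: tune to your CPU count
        "thread_count": 32,                      # assumption: tune to your bandwidth
        "image_size": 256,                       # assumption: resize target in pixels
    }


def run_download(kwargs: dict) -> None:
    """Start the download. Requires `pip install img2dataset` and network access."""
    from img2dataset import download  # imported lazily so the sketch loads without the package
    download(**kwargs)


# Hypothetical paths; point them at your own data folder.
kwargs = laion_download_kwargs("data/laion/laion400m-meta", "data/laion/laion400m")
print(kwargs)
# run_download(kwargs)  # uncomment to actually fetch the images
```

The `webdataset` output format is what produces the numbered tar shards, each accompanied by a parquet file and a `_stats.json` file.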

Data folder structure

The data folder is structured as:

data
├── celeba
│   ├── img_align_celeba
│   │   ├── 010905.jpg
│   │   ├── 010906.jpg
│   │   └── ...
│   └── frequent_celebs.txt
├── ImageNet
│   └── val
│       ├── n01440764
│       │   ├── ILSVRC2012_val_00000293.JPEG
│       │   ├── ILSVRC2012_val_00002138.JPEG
│       │   └── ...
│       ├── n01443537
│       └── ...
└── laion
    └── laion400m
        ├── 00000_stats.json
        ├── 00000.parquet
        └── 00000.tar
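Before running the scripts, it can help to confirm that the data folder matches this layout. The check below is a small sketch, not part of the repository; the helper name `missing_entries` is hypothetical, and the demo builds a throwaway directory instead of touching real data.

```python
import tempfile
from pathlib import Path
from typing import List

# Top-level entries taken from the folder structure above.
EXPECTED = [
    "celeba/img_align_celeba",
    "celeba/frequent_celebs.txt",
    "ImageNet/val",
    "laion/laion400m",
]


def missing_entries(data_root: str) -> List[str]:
    """Return the expected relative paths that do not exist under `data_root`."""
    root = Path(data_root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]


# Demo on a temporary folder where only the CelebA part has been prepared.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "celeba" / "img_align_celeba").mkdir(parents=True)
    (Path(tmp) / "celeba" / "frequent_celebs.txt").write_text("")
    missing = missing_entries(tmp)

print(missing)
```

On a real setup, call `missing_entries("data")` from the repository root and fix anything it reports.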

📝 Unlearning procedure

Prepare the forget and retain sets. Given an unlearning task, we first curate a forget set containing relevant image-text pairs, then sample the retain set from the original training set (e.g., one shard of LAION). The script for curating a forget set from the LAION dataset is src/clip/a0_create_tar.py.

Calculate forget and retain gradient.

Update the paths passed to the arguments --train-data, --forget-data, and --imagenet-val in scripts/run_compute_grad.sh, then run

bash scripts/run_compute_grad.sh

This generates the forget gradient file and stores it in SLUG/results/grads.

Perform the single-layer, single-gradient update by running

bash scripts/run_clip_slug.sh

This generates the Pareto-front plots, cosine similarity matrices, and step-size search log, all stored in SLUG/results/clip.

Run the comparison methods

bash scripts/run_clip_comparison.sh
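The Pareto-front plots produced by the SLUG step relate to how a single layer is chosen from the stored gradients using layer importance and gradient alignment. The sketch below shows one plausible way to compute such a front in plain Python. The metric definitions are assumptions made for illustration (importance as the forget-gradient norm relative to the weight norm, alignment as the cosine similarity between forget and retain gradients); consult the paper and `scripts/run_clip_slug.sh` for the definitions actually used. Layer names and values are made up.

```python
import math
from typing import Dict, List, Tuple

Vec = List[float]


def norm(v: Vec) -> float:
    return math.sqrt(sum(x * x for x in v))


def cosine(a: Vec, b: Vec) -> float:
    return sum(x * y for x, y in zip(a, b)) / (norm(a) * norm(b))


def layer_metrics(weights: Dict[str, Vec], forget: Dict[str, Vec],
                  retain: Dict[str, Vec]) -> Dict[str, Tuple[float, float]]:
    """Per layer: (importance, alignment). Both definitions are assumptions of this sketch."""
    return {
        name: (norm(forget[name]) / norm(weights[name]), cosine(forget[name], retain[name]))
        for name in weights
    }


def pareto_front(metrics: Dict[str, Tuple[float, float]]) -> List[str]:
    """Layers not dominated by another layer with higher importance and lower alignment."""
    front = []
    for name, (imp, ali) in metrics.items():
        dominated = any(
            o_imp >= imp and o_ali <= ali and (o_imp > imp or o_ali < ali)
            for other, (o_imp, o_ali) in metrics.items() if other != name
        )
        if not dominated:
            front.append(name)
    return sorted(front)


# Hypothetical flattened weights and gradients for three layers.
weights = {"a": [1.0, 0.0], "b": [2.0, 0.0], "c": [1.0, 1.0]}
forget = {"a": [2.0, 0.0], "b": [1.0, 0.0], "c": [1.0, 1.0]}
retain = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, -1.0]}

metrics = layer_metrics(weights, forget, retain)
front = pareto_front(metrics)
print(metrics, front)
```

In this toy example, layer `b` drops out because layer `c` is more important at the same alignment, while `a` and `c` trade importance against alignment and both stay on the front.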

Unlearning other celebrity name / object concept

Create the forget set dataset file

python src/clip/a0_create_tar.py --name [celebrity name/object concept]

This creates a directory under data/laion/laion400m containing the images from the LAION shard that are associated with the given celebrity name or concept, and a .tar file containing the selected images at data/tar_files/{concept_name}.tar.

Repeat the unlearning procedure with the created .tar file to generate the unlearning gradient, then perform unlearning.
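To make the forget-set curation step concrete, the sketch below filters a webdataset-style tar (files such as `000000001.jpg` and `000000001.txt` sharing a key) down to the samples whose caption mentions a concept. It is a simplified illustration using only the standard library and is not the logic of `src/clip/a0_create_tar.py`; the helper names, the sample keys, and the case-insensitive substring match are assumptions. It reads the whole shard into memory, which is fine for a demo but not for large shards.

```python
import io
import tarfile
from typing import Dict, List


def build_tar(samples: Dict[str, Dict[str, bytes]]) -> bytes:
    """Build a small in-memory tar from {key: {extension: payload}} for the demo."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for key, files in samples.items():
            for ext, payload in files.items():
                info = tarfile.TarInfo(name=f"{key}.{ext}")
                info.size = len(payload)
                tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def filter_by_caption(src: bytes, concept: str) -> bytes:
    """Keep all files of every sample whose .txt caption contains `concept`."""
    with tarfile.open(fileobj=io.BytesIO(src), mode="r") as tar:
        members = [(m.name, tar.extractfile(m).read()) for m in tar.getmembers() if m.isfile()]
    keep = {
        name.rsplit(".", 1)[0]
        for name, data in members
        if name.endswith(".txt") and concept.lower() in data.decode("utf-8", "ignore").lower()
    }
    out = io.BytesIO()
    with tarfile.open(fileobj=out, mode="w") as tar:
        for name, data in members:
            if name.rsplit(".", 1)[0] in keep:
                info = tarfile.TarInfo(name=name)
                info.size = len(data)
                tar.addfile(info, io.BytesIO(data))
    return out.getvalue()


def list_names(blob: bytes) -> List[str]:
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r") as tar:
        return tar.getnames()


# Two made-up samples; the image bytes are placeholders.
shard = build_tar({
    "000000001": {"jpg": b"\xff\xd8fake", "txt": b"A photo of Elon Musk on stage"},
    "000000002": {"jpg": b"\xff\xd8fake", "txt": b"a red bicycle"},
})
forget_names = list_names(filter_by_caption(shard, "elon musk"))
print(forget_names)
```

The resulting tar keeps image and caption together under the same key, so it can be passed to a gradient computation the same way as a regular shard.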

Unlearning experiment on Stable diffusion

Before starting, generate the necessary dataset files and gradient files following the steps described in Unlearning procedure. Then run the Jupyter notebook notebooks/experiment_stable_diffusion.ipynb.

Unlearning experiment on Vision-language models

Before starting, generate the necessary dataset files and gradient files following the steps described in Unlearning procedure. Then run the Jupyter notebook notebooks/experiment_vision_language.ipynb.

Evaluation on UnlearnCanvas

First clone UnlearnCanvas repository under ./data

cd data
git clone https://github.com/OPTML-Group/UnlearnCanvas.git

Download the UnlearnCanvas dataset and pretrained models following the instructions in the UnlearnCanvas repository. The UnlearnCanvas dataset folder is structured as:

data
└── UnlearnCanvas
    └── data
        ├── Abstractionism
        │   ├── Architectures
        │   │   ├── 1.jpg
        │   │   ├── 2.jpg
        │   │   └── ...
        │   ├── Bears
        │   ├── Birds
        │   └── ...
        ├── Artist_Sketch
        └── ...

Generate .tar dataset files by running:

cd src/clip
python a0_create_tar_ucanvas.py

Then follow the gradient computation step of the Unlearning procedure ("Calculate forget and retain gradient") to generate the gradient files for the forget set:

cd [BACK TO SLUG/]
bash scripts/run_compute_grad_uncanvas.sh

Lastly, run UnlearnCanvas evaluation:

bash scripts/run_uncanvas.sh

Citation

@inproceedings{cai2025targeted,
  title={Targeted Unlearning with Single Layer Unlearning Gradient},
  author={Zikui Cai and Yaoteng Tan and M. Salman Asif},
  booktitle={Forty-second International Conference on Machine Learning},
  year={2025},
  url={https://openreview.net/forum?id=6Ofb0cGXb5}
}
