sodaball/AdversarialDefense_EnsembleLearning

Use ensemble learning to defend against FGSM attacks on the CIFAR-10 dataset.

This repository implements defenses for CIFAR10 classifiers against the FGSM (Fast Gradient Sign Method) and PGD (Projected Gradient Descent) attacks. The defense is an ensemble, here called the Triplet Network: three independently trained models make each prediction jointly by voting. Combining the predictions of multiple models improves overall accuracy and robustness.
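
For reference, FGSM perturbs an input one step along the sign of the loss gradient: x_adv = x + epsilon * sign(grad_x L(x, y)). A minimal PyTorch sketch of the attack (illustrative only, not the repository's own code):

import torch
import torch.nn.functional as F

def fgsm_attack(model, images, labels, epsilon=0.03):
    """One-step FGSM: perturb inputs along the sign of the loss gradient."""
    images = images.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)
    loss.backward()
    adv = images + epsilon * images.grad.sign()
    return adv.clamp(0, 1).detach()  # keep pixels in the valid [0, 1] range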

Preparation

Create the following folders to store the respective datasets and models:

  • data: Original CIFAR10 dataset

  • data_adv: Adversarial samples

  • data_adv_adv: Adversarial samples generated from the adversarially trained network (step 4 below)

  • data_model: Weights of network models trained on the original dataset

  • data_model_adv: Weights of network models trained with adversarial training

  • data_model_f3: Weights of the enhanced adversarially trained network, the third member of the Triplet Network

Running the Project

  1. First, run 1_train.py and 2_acc.py in the 0_Learning directory using VGG16 and the CIFAR10 dataset to generate the network model data_model.pth.
  2. Then, use the trained network data_model.pth as input and run 0_generate.py in the 1_Adversary directory to generate adversarial samples, which will be stored in data_adv.
  3. Next, use the trained network data_model.pth and the adversarial samples as the training and test sets to run 0_train.py in the 2_Advertraining directory to generate the network model data_model_adv.pth.
  4. Use the adversarially trained network data_model_adv.pth to run 0_generate.py in the 1_Adversary directory again to generate more adversarial samples, which will be stored in data_adv_adv.
  5. Finally, use the adversarial samples generated in step 2 and step 4 as the training and test sets to run 0_train.py in the 3_train_f3 directory to generate an enhanced adversarial training network model, which will be saved in data_model_f3.
  6. Use the networks generated in steps 1, 3, and 5 to form the Triplet Network, which makes each decision jointly by majority voting (see the sketch after this list).
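
The voting in step 6 can be pictured with the following sketch (illustrative; it assumes three loaded PyTorch models in eval mode and a batch of input tensors, and is not taken verbatim from the repository):

import torch

@torch.no_grad()
def triplet_vote(models, images):
    """Hard-voting ensemble: each model predicts a class; the majority wins."""
    preds = torch.stack([m(images).argmax(dim=1) for m in models])  # (3, batch)
    return preds.mode(dim=0).values  # most frequent class per sample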

Results

Accuracy:

  • Original Network:
    • Test clean samples: 0.8112
    • Test adversarial samples: 0.1549
  • Adversarially Trained Network:
    • Test clean samples: 0.7527
    • Test adversarial samples: 0.95488
  • Enhanced Adversarially Trained Network (trained on the adversarial samples from both the original and the adversarially trained networks):
    • Test clean samples: 0.7788
    • Test adversarial samples: 0.96594
  • Triplet Network:
    • Test clean samples: 0.8161
    • Test adversarial samples: 0.94066

0_Learning

1_train.py

python 1_train.py --data_choose 0 --model_choose 0 --save_path '../data_model/data_model.pth'

Trains a VGG16 network on the CIFAR10 dataset and saves the trained model in the data_model folder.
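
In outline, this is a standard supervised training loop on CIFAR10. A sketch, assuming torchvision's VGG16 with a 10-class head (the actual architecture, transforms, and hyperparameters are selected by the script's flags):

import torch
import torch.nn.functional as F
import torchvision
from torch.utils.data import DataLoader
from torchvision import transforms

# Illustrative setup only; the repo's --model_choose/--data_choose flags
# may select a different model variant or preprocessing.
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = torchvision.models.vgg16(num_classes=10).to(device)
train_set = torchvision.datasets.CIFAR10('../data', train=True, download=True,
                                         transform=transforms.ToTensor())
loader = DataLoader(train_set, batch_size=128, shuffle=True)
opt = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

for epoch in range(30):                      # epoch count is a placeholder
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        loss = F.cross_entropy(model(x), y)
        opt.zero_grad()
        loss.backward()
        opt.step()

torch.save(model.state_dict(), '../data_model/data_model.pth')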

Results of 1_train.py:

loss: 0.30944
Training set accuracy: 0.90374
Test set accuracy: 0.8112

Results of 2_acc.py:

Training set accuracy: 0.93818
Validation set accuracy: 0.8112

1_Adversary

0_generate.py

Uses the FGSM attack method with an epsilon value of 0.03.

For the train dataset using FGSM:

python 0_generate.py --model_choose 0 --data_choose 0 --if_train True --adversary 'fgsm' --save_path '../data_model/data_model.pth' --img_path '../data_adv/adv_fgsm_0.03_train.h5py'

For the test dataset using FGSM:

python 0_generate.py --model_choose 0 --data_choose 0 --if_train False --adversary 'fgsm' --save_path '../data_model/data_model.pth' --img_path '../data_adv/adv_fgsm_0.03_test.h5py'

The generated adversarial samples are stored in the data_adv folder in h5py format.
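
A sketch of how such a file might be written (illustrative; the dataset keys 'data' and 'label' are assumptions, not necessarily the repository's actual layout):

import h5py
import numpy as np

# Placeholder arrays standing in for a generated adversarial batch.
adv_images = np.zeros((10, 3, 32, 32), dtype=np.float32)
labels = np.zeros((10,), dtype=np.int64)

# The keys 'data' and 'label' are assumptions about the file layout.
with h5py.File('../data_adv/adv_fgsm_0.03_train.h5py', 'w') as f:
    f.create_dataset('data', data=adv_images)
    f.create_dataset('label', data=labels)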

2_acc.py

Testing the accuracy on the FGSM adversarial samples generated from the train dataset:

python 2_acc.py --img_path '../data_adv/adv_fgsm_0.03_train.h5py' --save_path '../data_model/data_model.pth' --model_choose 0 --data_choose 0

Result:

Accuracy: 0.15486

Testing the accuracy on the FGSM adversarial samples generated from the test dataset:

python 2_acc.py --img_path '../data_adv/adv_fgsm_0.03_test.h5py' --save_path '../data_model/data_model.pth' --model_choose 0 --data_choose 0

Result:

Accuracy: 0.15492
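
For intuition, evaluation against an h5py adversarial set can look like the following sketch (illustrative; the file keys are assumptions, as above):

import h5py
import torch

@torch.no_grad()
def h5py_accuracy(model, path, batch_size=256, device='cpu'):
    """Accuracy of `model` on samples stored in an h5py file.
    The keys 'data'/'label' are assumptions about the file layout."""
    with h5py.File(path, 'r') as f:
        x = torch.from_numpy(f['data'][:]).float()
        y = torch.from_numpy(f['label'][:]).long()
    correct = 0
    for i in range(0, len(x), batch_size):
        xb, yb = x[i:i + batch_size].to(device), y[i:i + batch_size].to(device)
        correct += (model(xb).argmax(dim=1) == yb).sum().item()
    return correct / len(x)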

2_Advertraining

0_train.py

Performs adversarial training using the FGSM adversarial samples generated from the train and test datasets.

python 0_train.py --data_choose 0 --model_choose 0 --train_adv '../data_adv/adv_fgsm_0.03_train.h5py' --test_adv '../data_adv/adv_fgsm_0.03_test.h5py' --save_path_ori '../data_model/data_model.pth' --save_path_adv '../data_model_adv/data_model_adv.pth'

The model trained using the adversarial samples generated by FGSM is saved in data_model_adv.
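
In outline, adversarial training here fine-tunes the network (initialized from data_model.pth) on a mix of clean and adversarial batches. A sketch of one such training step (the 50/50 mixing is an assumption; 0_train.py defines the actual recipe):

import torch
import torch.nn.functional as F

def adversarial_training_step(model, opt, clean_x, clean_y, adv_x, adv_y):
    """One step of adversarial training on a mixed clean/adversarial batch.
    The equal mix is an assumption, not necessarily the repo's scheme."""
    x = torch.cat([clean_x, adv_x])
    y = torch.cat([clean_y, adv_y])
    loss = F.cross_entropy(model(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()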

Result:

loss: 0.22066
Training set accuracy: 0.9316
Adversarial sample test set accuracy: 0.95488
Original sample test set accuracy: 0.75270

3_train_f3

0_train.py

Trains the enhanced network on the adversarial samples from both data_adv and data_adv_adv (step 5 above); the resulting model is saved in data_model_f3.

Result:

loss: 0.16096
Training set accuracy: 0.95075
Adversarial sample test set accuracy: 0.96594
Original sample test set accuracy: 0.77880

1_acc.py

Evaluates the Triplet Network (the three models voting jointly) on clean and adversarial test samples.

Result:

Test set clean sample accuracy: 0.81610
Test set adversarial sample accuracy: 0.94066
