This repository implements defense algorithms for the CIFAR10 dataset against FGSM (Fast Gradient Sign Method) and PGD (Projected Gradient Descent) attacks. The defense is a Triplet Network, an ensemble of three models: by combining their predictions (Ensemble Defense), we aim to improve overall accuracy and robustness.
Create the following folders to store the respective datasets and models:

- `data`: Original CIFAR10 dataset
- `data_adv`: Adversarial samples
- `data_adv_adv`: Enhanced adversarial samples
- `data_model`: Weights of network models trained on the original dataset
- `data_model_adv`: Weights of network models trained with adversarial training
- `data_model_f3`: Weights of the enhanced adversarial-training network models used by the Triplet Network
- First, run `1_train.py` and `2_acc.py` in the `0_Learning` directory using VGG16 and the CIFAR10 dataset to generate the network model `data_model.pth`.
- Then, use the trained network `data_model.pth` as input and run `0_generate.py` in the `1_Adversary` directory to generate adversarial samples, which are stored in `data_adv`.
- Next, use the trained network `data_model.pth` and the adversarial samples as the training and test sets to run `0_train.py` in the `2_Advertraining` directory to generate the network model `data_model_adv.pth`.
- Use the adversarially trained network `data_model_adv.pth` to run `0_generate.py` in the `1_Adversary` directory again to generate further adversarial samples, which are stored in `data_adv_adv`.
- Finally, use the adversarial samples generated in steps 2 and 4 as the training and test sets to run `0_train.py` in the `3_train_f3` directory to generate an enhanced adversarially trained network model, which is saved in `data_model_f3`.
- Use the networks generated in steps 1, 3, and 5 to form the Triplet Network for joint decision-making (voting).
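The voting step above can be sketched as a majority vote over the class predictions of the three networks. This is a minimal illustration with tiny stand-in classifiers (the repo uses three VGG16 models), and the tie-breaking behavior of `torch.mode` may differ from the repository's actual rule:

```python
import torch
import torch.nn as nn

def ensemble_predict(models, x):
    """Majority vote over the per-sample class predictions of several models."""
    # votes has shape (n_models, batch); mode picks the most frequent class.
    votes = torch.stack([m(x).argmax(dim=1) for m in models])
    return votes.mode(dim=0).values

# Tiny stand-in classifiers for illustration only.
torch.manual_seed(0)
models = [nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10)) for _ in range(3)]
for m in models:
    m.eval()

x = torch.randn(4, 3, 32, 32)  # a batch of CIFAR10-sized inputs
with torch.no_grad():
    pred = ensemble_predict(models, x)
print(pred.shape)  # torch.Size([4])
```

In the repository this vote would be taken across the models from steps 1, 3, and 5.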
Accuracy:

- Original Network:
  - Clean test samples: 0.8112
  - Adversarial test samples: 0.1549
- Adversarially Trained Network:
  - Clean test samples: 0.7527
  - Adversarial test samples: 0.95488
- Enhanced Adversarially Trained Network (trained on the adversarial samples from both the original network and the adversarially trained network):
  - Clean test samples: 0.7788
  - Adversarial test samples: 0.96594
- Triplet Network:
  - Clean test samples: 0.8161
  - Adversarial test samples: 0.94066
```shell
python 1_train.py --data_choose 0 --model_choose 0 --save_path '../data_model/data_model.pth'
```

Trains a VGG16 network on the CIFAR10 dataset and saves the trained model in the `data_model` folder.
Results of `1_train.py`:

- Loss: 0.30944
- Training set accuracy: 0.90374
- Test set accuracy: 0.8112

Results of `2_acc.py`:

- Training set accuracy: 0.93818
- Validation set accuracy: 0.8112
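The accuracy computation presumably performed by `2_acc.py` can be sketched as a standard evaluation loop. The tiny model and synthetic batches below are stand-ins so the sketch runs without downloading CIFAR10; the actual script's details may differ:

```python
import torch

def accuracy(model, loader):
    """Fraction of correctly classified samples over a loader of (x, y) batches."""
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for images, labels in loader:
            preds = model(images).argmax(dim=1)
            correct += (preds == labels).sum().item()
            total += labels.size(0)
    return correct / total

# Synthetic stand-ins for the real VGG16 and CIFAR10 loaders.
torch.manual_seed(0)
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
data = [(torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))) for _ in range(2)]
acc = accuracy(model, data)
print(acc)
```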
Uses the FGSM attack method with an epsilon value of 0.03.
For the train dataset using FGSM:

```shell
python 0_generate.py --model_choose 0 --data_choose 0 --if_train True --adversary 'fgsm' --save_path '../data_model/data_model.pth' --img_path '../data_adv/adv_fgsm_0.03_train.h5py'
```

For the test dataset using FGSM:

```shell
python 0_generate.py --model_choose 0 --data_choose 0 --if_train False --adversary 'fgsm' --save_path '../data_model/data_model.pth' --img_path '../data_adv/adv_fgsm_0.03_test.h5py'
```

The generated adversarial samples are stored in the `data_adv` folder in `h5py` format.
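FGSM itself is a one-step attack: perturb each input by `eps` in the direction of the sign of the loss gradient. A minimal sketch of generating and storing such samples follows; the tiny model is a stand-in for VGG16, and the HDF5 dataset names `"data"`/`"label"` are illustrative assumptions, not taken from the repo:

```python
import h5py
import torch
import torch.nn.functional as F

def fgsm(model, images, labels, eps=0.03):
    """One-step FGSM: x_adv = x + eps * sign(d loss / d x), clamped to [0, 1]."""
    images = images.clone().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)
    loss.backward()
    return (images + eps * images.grad.sign()).clamp(0, 1).detach()

# Tiny stand-in model and batch for illustration.
torch.manual_seed(0)
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
x = torch.rand(4, 3, 32, 32)
y = torch.randint(0, 10, (4,))
x_adv = fgsm(model, x, y, eps=0.03)

# Store the perturbed images and labels in HDF5, as 0_generate.py does.
with h5py.File("adv_fgsm_0.03_demo.h5py", "w") as f:
    f.create_dataset("data", data=x_adv.numpy())
    f.create_dataset("label", data=y.numpy())
print((x_adv - x).abs().max().item() <= 0.03 + 1e-6)  # True
```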
Testing the accuracy on the train dataset after FGSM:

```shell
python 2_acc.py --img_path '../data_adv/adv_fgsm_0.03_train.h5py' --save_path '../data_model/data_model.pth' --model_choose 0 --data_choose 0
```

Result: accuracy 0.15486.

Testing the accuracy on the test dataset after FGSM:

```shell
python 2_acc.py --img_path '../data_adv/adv_fgsm_0.03_test.h5py' --save_path '../data_model/data_model.pth' --model_choose 0 --data_choose 0
```

Result: accuracy 0.15492.
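Scoring a stored adversarial set amounts to reading the HDF5 file back and running the model over it. This sketch writes a tiny demo file first so it is self-contained; the dataset names `"data"`/`"label"`, the file name, and the stand-in model are all illustrative assumptions:

```python
import h5py
import numpy as np
import torch

torch.manual_seed(0)
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))

# Write a tiny demo file so the sketch runs without the real data_adv folder.
with h5py.File("adv_demo.h5py", "w") as f:
    f.create_dataset("data", data=np.random.rand(8, 3, 32, 32).astype("float32"))
    f.create_dataset("label", data=np.random.randint(0, 10, 8))

# Read the samples back and compute accuracy, as 2_acc.py presumably does.
with h5py.File("adv_demo.h5py", "r") as f:
    x = torch.from_numpy(f["data"][:])
    y = torch.from_numpy(f["label"][:])

with torch.no_grad():
    acc = (model(x).argmax(dim=1) == y).float().mean().item()
print(acc)
```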
Trains using the adversarial samples generated by FGSM on the train and test datasets:

```shell
python 0_train.py --data_choose 0 --model_choose 0 --train_adv '../data_adv/adv_fgsm_0.03_train.h5py' --test_adv '../data_adv/adv_fgsm_0.03_test.h5py' --save_path_ori '../data_model/data_model.pth' --save_path_adv '../data_model_adv/data_model_adv.pth'
```

The model trained on the adversarial samples generated by FGSM is saved in `data_model_adv`.
Result:

- Loss: 0.22066
- Training set accuracy: 0.9316
- Adversarial test set accuracy: 0.95488
- Clean test set accuracy: 0.75270
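The adversarial-training step can be sketched as a loop whose loss mixes clean batches with their precomputed FGSM counterparts. The 50/50 loss weighting, the stand-in model, and the random "adversarial" batch below are assumptions for illustration; the actual weighting in `0_train.py` may differ:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
opt = torch.optim.SGD(model.parameters(), lr=0.01)

clean = torch.rand(8, 3, 32, 32)
# Stand-in for a precomputed FGSM batch loaded from data_adv.
adv = (clean + 0.03 * torch.randn_like(clean).sign()).clamp(0, 1)
labels = torch.randint(0, 10, (8,))

for _ in range(3):  # a few illustrative steps
    opt.zero_grad()
    # Equal-weight mix of clean and adversarial cross-entropy losses.
    loss = (0.5 * F.cross_entropy(model(clean), labels)
            + 0.5 * F.cross_entropy(model(adv), labels))
    loss.backward()
    opt.step()
print(f"final loss: {loss.item():.4f}")
```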
Result (enhanced adversarial training, `3_train_f3`):

- Loss: 0.16096
- Training set accuracy: 0.95075
- Adversarial test set accuracy: 0.96594
- Clean test set accuracy: 0.77880
Result (Triplet Network voting):

- Clean test set accuracy: 0.81610
- Adversarial test set accuracy: 0.94066