This repository reproduces the methods from several semi-supervised learning papers.
Before running the code, install the required packages:
```
pip3 install torch==1.1.0
pip3 install torchvision==0.3.0
pip3 install tensorflow  # TensorBoard is used for logging
```
Use the following command to unpack the data and generate the labeled-data path files:
```
python3 -m semi_supervised.core.utils.cifar10
```
To reproduce the temporal ensembling and Pi model results from *Temporal Ensembling for Semi-Supervised Learning*, run
```
CUDA_VISIBLE_DEVICES=0 python3 -m semi_supervised.experiments.tempens.cifar10_test
CUDA_VISIBLE_DEVICES=0 python3 -m semi_supervised.experiments.pi.cifar10_test
```
To reproduce the result from *Mean teachers are better role models*, run
```
CUDA_VISIBLE_DEVICES=0 python3 -m semi_supervised.experiments.mean_teacher.cifar10_test
```
Note: this code has not been tested on multiple GPUs, so there is no guarantee that the results hold in a multi-GPU setting.
Number of Labeled Data | 1000 | 2000 | 4000 | All labels |
---|---|---|---|---|
Pi model (from SNTG) | 68.35 ± 1.20 | 82.43 ± 0.44 | 87.64 ± 0.31 | 94.44 ± 0.10 |
Pi model (this repository) | 69.62 ± 1.30 | 82.92 ± 0.53 | 87.93 ± 0.23 | --- |
Tempens model (from SNTG) | 76.69 ± 1.01 | 84.36 ± 0.39 | 87.84 ± 0.24 | 94.40 ± 0.10 |
Tempens model (this repository) | 78.52 ± 1.17 | 84.76 ± 0.42 | 88.17 ± 0.24 | 94.72 ± 0.15 |
Mean Teacher (from Mean teachers) | 78.45 | 84.27 | 87.69 | 94.06 |
Mean Teacher (this repository) | 80.42 ± 1.03 | 85.24 ± 0.66 | 88.44 ± 0.31 | 94.48 ± 0.11 |
We report the mean and standard deviation over 10 runs with different random seeds (1000 to 1009).
Many semi-supervised learning papers share common training strategies. This section introduces some of them.
You can find how `rampup_value` and `rampdown_value` are computed in `semi_supervised/core/utils/fun_utils.py`.
The learning-rate curve is shown in the figure below.
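The exact schedules live in `semi_supervised/core/utils/fun_utils.py`; as an illustrative sketch (the function names and the defaults `base_lr=0.003`, `rampup_length=80`, `rampdown_epochs=350` are assumptions, not necessarily this repository's values), a common form combines the exponential rampup from Laine & Aila with a cosine rampdown:

```python
import math

def sigmoid_rampup(current, rampup_length):
    # exp(-5 * (1 - t)^2) rampup from "Temporal Ensembling" (Laine & Aila):
    # rises smoothly from exp(-5) ~= 0.0067 toward 1.0 over `rampup_length` epochs.
    if rampup_length == 0:
        return 1.0
    current = max(0.0, min(float(current), rampup_length))
    phase = 1.0 - current / rampup_length
    return math.exp(-5.0 * phase * phase)

def cosine_rampdown(current, rampdown_length):
    # Half-cosine decay from 1.0 down to 0.0, as in the Mean Teacher paper.
    current = max(0.0, min(float(current), rampdown_length))
    return 0.5 * (math.cos(math.pi * current / rampdown_length) + 1.0)

def learning_rate(epoch, base_lr=0.003, rampup_length=80, rampdown_epochs=350):
    # Ramp the learning rate up at the start of training,
    # then decay it back toward zero near the end.
    return base_lr * sigmoid_rampup(epoch, rampup_length) * cosine_rampdown(epoch, rampdown_epochs)
```

The product of the two factors gives the rise-then-fall shape seen in the figure.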
Many semi-supervised methods use the Adam optimizer with beta1 = 0.9 and beta2 = 0.999; during training, beta1 is changed dynamically.
The curve of beta1 is shown in the figure below.
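A sketch of how beta1 can be scheduled (the Gaussian rampdown form and the 0.9 to 0.5 endpoints follow Laine & Aila's reference code; the defaults here are assumptions, and this repository's actual values are in `fun_utils.py`):

```python
import math

def gaussian_rampdown(epoch, num_epochs, rampdown_length=50):
    # Returns 1.0 for most of training, then decays toward 0.0
    # over the final `rampdown_length` epochs (Laine & Aila's form).
    if epoch >= num_epochs - rampdown_length:
        ep = (epoch - (num_epochs - rampdown_length)) * 0.5
        return math.exp(-(ep * ep) / rampdown_length)
    return 1.0

def adam_beta1(epoch, num_epochs, beta1_start=0.9, beta1_end=0.5):
    # Interpolate beta1 from 0.9 early in training down to 0.5 at the end.
    rd = gaussian_rampdown(epoch, num_epochs)
    return rd * beta1_start + (1.0 - rd) * beta1_end
```

With a PyTorch optimizer this would be applied once per epoch, e.g. by setting `param_group['betas'] = (adam_beta1(epoch, num_epochs), 0.999)` on each of `optimizer.param_groups`.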
Some methods use a dynamically changing weight to balance the supervised loss and the unsupervised (consistency) loss.
The curve of the consistency weight is shown in the figure below.
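A sketch of the consistency-weight schedule (the sigmoid rampup is the standard form in these papers; `max_weight=100.0` and `rampup_length=80` are assumed defaults, not necessarily this repository's):

```python
import math

def sigmoid_rampup(current, rampup_length):
    # exp(-5 * (1 - t)^2) rampup, rising from ~0 toward 1 over `rampup_length` epochs.
    if rampup_length == 0:
        return 1.0
    current = max(0.0, min(float(current), rampup_length))
    phase = 1.0 - current / rampup_length
    return math.exp(-5.0 * phase * phase)

def consistency_weight(epoch, max_weight=100.0, rampup_length=80):
    # Total loss: supervised_loss + consistency_weight(epoch) * consistency_loss.
    # Starting near zero lets the network first fit the labeled data before
    # the consistency term dominates.
    return max_weight * sigmoid_rampup(epoch, rampup_length)
```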
- Mean Teacher
- Pi Model
- Temporal Ensembling Model
- VAT
- More...