Self-supervised learning is a way to use unlabelled data in a supervised fashion by extracting the supervision signal from the data itself.
2 tasks:
- Pretext task (== self-supervised task): an artificial task used to learn representations.
- Downstream task: the main task.
We don't really care about performance on the pretext task.
Open question: are there studies on how performance on the pretext task affects performance on the downstream task?
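As a concrete illustration of a pretext task, here is a minimal sketch (a toy example of my own, not taken from any paper) of rotation prediction: the labels are manufactured from the unlabelled images themselves.

```python
import numpy as np

def make_rotation_pretext(images):
    """Turn unlabelled images into a labelled pretext dataset:
    each image is rotated by 0/90/180/270 degrees and the number of
    quarter-turns becomes the label -- no human annotation needed."""
    xs, ys = [], []
    for img in images:
        for k in range(4):          # k quarter-turns
            xs.append(np.rot90(img, k))
            ys.append(k)
    return np.stack(xs), np.array(ys)

# 10 random "unlabelled" 8x8 images yield 40 labelled pretext examples
unlabelled = np.random.rand(10, 8, 8)
X, y = make_rotation_pretext(unlabelled)
```

A network trained to predict `y` from `X` never sees a human label, yet must learn features (edges, object orientation) that transfer to the downstream task.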
The difference from generative models is in their goals:
- generative: generate well, i.e. model the data faithfully
- self-supervised: find features useful for the downstream task
Semi-supervised learning is a type of machine learning that uses a small amount of labelled data (LD) and a larger amount of unlabelled data (UD). 2 main strategies:
- self-training (LD -> UD): train a model on LD; use it to create pseudo-labels for UD; train a new model on LD and the pseudo-labelled UD (notes 1, 2)
- self-supervised learning (UD -> LD): use UD to learn useful features; then train a model on LD using those features
Note 1: a more sophisticated option is to use only the most confident pseudo-labels in the third step.
Note 2: EfficientNet was trained this way according to [1] (sec. 4.3).
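The self-training recipe above, including the variant that keeps only the most confident pseudo-labels, can be sketched with a toy nearest-centroid classifier (the classifier choice and all names here are mine, just for illustration):

```python
import numpy as np

def centroid_fit(X, y):
    """Nearest-centroid classifier: one mean vector per class."""
    classes = np.unique(y)
    return classes, np.stack([X[y == c].mean(axis=0) for c in classes])

def centroid_predict(X, classes, centroids):
    """Return predicted labels and a confidence score (margin between
    the two closest centroids) for each point."""
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    d_sorted = np.sort(d, axis=1)
    return classes[d.argmin(axis=1)], d_sorted[:, 1] - d_sorted[:, 0]

def self_train(X_l, y_l, X_u, keep_frac=0.5):
    # step 1: train on the labelled data only
    classes, cent = centroid_fit(X_l, y_l)
    # step 2: pseudo-label the unlabelled data
    y_pseudo, conf = centroid_predict(X_u, classes, cent)
    # step 3: retrain on LD plus the most confident pseudo-labelled UD
    keep = conf >= np.quantile(conf, 1.0 - keep_frac)
    return centroid_fit(np.vstack([X_l, X_u[keep]]),
                        np.concatenate([y_l, y_pseudo[keep]]))

# toy data: two well-separated 2-d clusters, mostly unlabelled
rng = np.random.default_rng(0)
X_l = np.vstack([rng.normal(0, 0.3, (5, 2)), rng.normal(3, 0.3, (5, 2))])
y_l = np.array([0] * 5 + [1] * 5)
X_u = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(3, 0.3, (50, 2))])
classes, centroids = self_train(X_l, y_l, X_u)
```

The retrained centroids are estimated from 10 real labels plus ~50 pseudo-labels, which is the whole point: UD sharpens a model that LD alone could only roughly fit.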
Types of self-supervised models [1] (sec 3)
- Generative models (autoencoders, flow-based, auto-regressive)
- Contrastive models (i.e. discriminative models)
- Generative-contrastive (= adversarial) models
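For the contrastive family, the standard training objective is the InfoNCE loss (used e.g. by SimCLR). A minimal numpy sketch, assuming `z1[i]` and `z2[i]` are embeddings of two augmented views of the same example:

```python
import numpy as np

def info_nce(z1, z2, temperature=0.1):
    """InfoNCE loss: z1[i] and z2[i] form a positive pair; every other
    row in the batch serves as a negative."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature                 # pairwise cosine similarities
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))              # positives sit on the diagonal
```

Minimizing this pulls the two views of each example together while pushing apart all other pairs, which is exactly the discriminative (rather than generative) objective.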
[1] Liu et al., 2020. Self-supervised Learning: Generative or Contrastive. arXiv.