Commit ebad963
Author: Alexey Gavryushin
Initial commit (0 parents)
26 files changed: +4361, -0 lines

README.md (+52 lines)
# deep-clustering-recombination-framework

Recombination framework for deep-learning–based clustering methods, based on "An ontology for systematization and recombination of deep-learning–based clustering methods".

## How to use

1. Configure one or more deep-learning–based clustering methods using JSON files that follow the JSON schema in ``schema/method.json``. Instructions can be found in the ``description`` properties of the respective JSON properties and objects in the schema files. For definitions of the terms used there, and for further elaboration, see the aforementioned ontology.
2. Run ``main.py`` with the configuration files of the methods to process as arguments (all paths are interpreted relative to the location of ``main.py``). Any directories passed as arguments are searched non-recursively, and any JSON files found in them are processed.
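The directory-expansion behavior in step 2 can be sketched as follows; ``collect_config_paths`` is a hypothetical helper for illustration, not the actual code of ``main.py``:

```python
from pathlib import Path

def collect_config_paths(args):
    """Expand command-line arguments into a list of JSON configuration files.

    Directories are searched non-recursively (only JSON files directly
    inside them are picked up); plain file arguments are taken as-is.
    """
    paths = []
    for arg in args:
        p = Path(arg)
        if p.is_dir():
            # Non-recursive: glob, not rglob.
            paths.extend(sorted(p.glob("*.json")))
        else:
            paths.append(p)
    return paths
```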

## Arguments for ``main.py``

``--one_log_file``: path to a single log file to use (if omitted, a separate log file is created for each processed method)

``--no_log_files``: do not create log files

``--no_log_timestamps``: do not prefix log messages with timestamps

``--resume_on_error``: if an exception is raised while processing a method, continue with the next method instead of crashing

## Cluster assignment strategies

The following cluster assignment strategies are currently implemented:

* based on the output of a sample-space classifier
* based on the output of a feature-space classifier
* classical clustering in feature space after training (using k-means)
* based on feature-space centroids calculated during training (using soft assignments)
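For the centroid-based strategy, a minimal pure-Python sketch of soft assignment via the Student's t kernel (as used in DEC, https://arxiv.org/abs/1511.06335) looks as follows; the function name and ``alpha`` parameter are illustrative, not the framework's API:

```python
def soft_assignments(z, centroids, alpha=1.0):
    """Soft-assign one feature vector z to clusters with a Student's t kernel:
    q_j is proportional to (1 + ||z - mu_j||^2 / alpha) ** (-(alpha + 1) / 2),
    normalized so the assignments over all clusters sum to 1.
    """
    sims = []
    for mu in centroids:
        dist_sq = sum((zi - mi) ** 2 for zi, mi in zip(z, mu))
        sims.append((1.0 + dist_sq / alpha) ** (-(alpha + 1.0) / 2.0))
    total = sum(sims)
    return [s / total for s in sims]
```

A sample sitting on a centroid receives most of the assignment mass, while distant centroids receive heavy-tailed but small weights.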

## Trainable mappings

In general, trainable mappings can be constructed with layers from any class in the ``torch.nn`` module of the ``torch`` Python package.
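One way such a layer configuration entry might be turned into a ``torch.nn`` layer is a name lookup on the module, sketched below under the assumption that ``torch`` is installed; ``build_layer`` is a hypothetical illustration, not the framework's API:

```python
import torch.nn as nn

def build_layer(cfg):
    """Instantiate a torch.nn layer from a config dict such as
    {"type": "Linear", "in_features": 784, "out_features": 500}.
    Any class in torch.nn can be looked up by its "type" name.
    """
    kwargs = {k: v for k, v in cfg.items() if k not in ("type", "name")}
    # JSON lists become tuples where a size argument is expected.
    if "unflattened_size" in kwargs:
        kwargs["unflattened_size"] = tuple(kwargs["unflattened_size"])
    return getattr(nn, cfg["type"])(**kwargs)

# The first few layers of the encoder configured below:
encoder = nn.Sequential(
    build_layer({"type": "Flatten"}),
    build_layer({"type": "Linear", "in_features": 784, "out_features": 500}),
    build_layer({"type": "ReLU"}),
)
```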

## Design patterns

JSON schema definitions of design patterns, as described in the ontology, can be found in ``schema/design_patterns``. Currently, the following design patterns are implemented:

* training a feature extractor through reconstruction of samples
* training a feature extractor by using adversarial interpolation
* facilitating the training of a feature extractor by using layer-wise pretraining (variant using a denoising autoencoder)
* learning transformation-invariant feature representations by using contrastive learning and data augmentation (variant using SimCLR)
* learning invariance of soft assignments to transformations by using assignment statistics vectors and data augmentation
* encouraging cluster formation by minimizing the divergence between the current cluster assignment distribution and a derived target distribution
* encouraging cluster formation by reinforcing the current assignment of samples to clusters (variants in feature space, in sample space using a decoder, and based on soft assignments)
* preventing cluster degeneracy by maximizing the entropy of soft assignments
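The divergence-minimization pattern derives its target distribution by sharpening the current soft assignments, as in DEC. A minimal pure-Python sketch, with an illustrative function name:

```python
def target_distribution(q):
    """Derive the sharpened target distribution P from soft assignments Q
    (one row per sample, one column per cluster), DEC-style:
    p_ij = (q_ij**2 / f_j) / sum over j' of (q_ij'**2 / f_j'),
    where f_j = sum over i of q_ij (soft cluster frequencies).
    Minimizing KL(P || Q) then pulls samples toward confident assignments.
    """
    n_clusters = len(q[0])
    freq = [sum(row[j] for row in q) for j in range(n_clusters)]
    p = []
    for row in q:
        # Squaring emphasizes confident assignments; dividing by the
        # frequency counteracts the dominance of large clusters.
        weights = [row[j] ** 2 / freq[j] for j in range(n_clusters)]
        total = sum(weights)
        p.append([w / total for w in weights])
    return p
```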

## Datasets

Note that the recombination framework is currently limited to processing datasets from the ``torchvision.datasets`` module of the ``torchvision`` Python package.

## Exemplary method configurations

Some configurations of deep-learning–based clustering methods, as discussed in "An ontology for systematization and recombination of deep-learning–based clustering methods", can be found in subdirectories of the ``configurations`` directory. Details can be found in each method's configuration file.

* ``dec_cc_hybrid``: This method uses the standard ``784-500-500-2000-10`` encoder (introduced in https://arxiv.org/abs/1511.06335) as its feature extractor. As its cluster assignment strategy, it computes soft assignments based on a Student's t kernel measuring the similarity between feature representations and feature-space centroids. Furthermore, it uses the "training a feature extractor through reconstruction of samples" design pattern during its pretraining phase. During its finetuning phase, it uses the "learning transformation-invariant feature representations by using contrastive learning and data augmentation", "learning invariance of soft assignments to transformations by using assignment statistics vectors and data augmentation", and "preventing cluster degeneracy by maximizing the entropy of soft assignments" design patterns (all three from the method in https://arxiv.org/abs/2009.09687), as well as the "encouraging cluster formation by minimizing the divergence between the current cluster assignment distribution and a derived target distribution" design pattern.\
Achieves a performance of ACC 0.978 (97.8 ± 0.2), NMI 0.944 (94.4 ± 1.0), ARI 0.952 (95.2 ± 0.4) evaluated on MNIST-Test after training on MNIST-Train, and ACC 0.583 (58.3 ± 1.5), NMI 0.633 (63.3 ± 0.6), ARI 0.470 (47.0 ± 1.5) on Fashion-MNIST-Test after training on Fashion-MNIST-Train. The values in parentheses are means and standard deviations over 10 runs. Configuration files are provided for evaluation on both MNIST-Test and Fashion-MNIST-Test.
* ``deep_k_means``: Recreation of the DKM-a method (introduced in https://arxiv.org/abs/1806.10069) on MNIST. This method uses the aforementioned ``784-500-500-2000-10`` encoder as its feature extractor. As its cluster assignment strategy, it computes soft assignments based on a Gaussian kernel measuring the similarity between feature representations and feature-space centroids, using an annealed inverse temperature as described in https://arxiv.org/abs/1806.10069. Furthermore, it uses the feature-space–based variant of the "encouraging cluster formation by reinforcing the current assignment of samples to clusters" design pattern, as well as the "training a feature extractor through reconstruction of samples" design pattern, during its single training phase.
* ``layer_wise_pretraining``: A method trained solely using the denoising-autoencoder–based variant of the "facilitating the training of a feature extractor by using layer-wise pretraining" design pattern. The method uses the "classical clustering in feature space after training" cluster assignment strategy, with the aforementioned ``784-500-500-2000-10`` encoder as its feature extractor. It is intended to test the correct implementation of the layer-wise pretraining design pattern and can serve as a basis for constructing a method that follows a pretraining-finetuning training schedule.
* ``adversarial_interpolation_pretraining``: A method trained solely using the "training a feature extractor by using adversarial interpolation" design pattern, with the aforementioned ``784-500-500-2000-10`` encoder as its feature extractor. The method uses the "classical clustering in feature space after training" cluster assignment strategy. It is intended to test the correct implementation of the adversarial interpolation design pattern and can serve as a basis for constructing a method that follows a pretraining-finetuning training schedule.
@@ -0,0 +1,136 @@
{
  "$schema": "../../schema/method.json",
  "name": "adversarial_interpolation_pretraining",
  "mappings": [
    {
      "name": "encoder",
      "type": "feature_extractor",
      "layers": [
        { "name": "enc_flatten_1", "type": "Flatten" },
        { "name": "enc_linear_2", "type": "Linear", "in_features": 784, "out_features": 500 },
        { "name": "enc_relu_3", "type": "ReLU" },
        { "name": "enc_linear_4", "type": "Linear", "in_features": 500, "out_features": 500 },
        { "name": "enc_relu_5", "type": "ReLU" },
        { "name": "enc_linear_6", "type": "Linear", "in_features": 500, "out_features": 2000 },
        { "name": "enc_relu_7", "type": "ReLU" },
        { "name": "enc_linear_8", "type": "Linear", "in_features": 2000, "out_features": 10 }
      ]
    },
    {
      "name": "decoder",
      "type": "design_pattern_specific",
      "layers": [
        { "name": "dec_linear_1", "type": "Linear", "in_features": 10, "out_features": 2000 },
        { "name": "dec_relu_2", "type": "ReLU" },
        { "name": "dec_linear_3", "type": "Linear", "in_features": 2000, "out_features": 500 },
        { "name": "dec_relu_4", "type": "ReLU" },
        { "name": "dec_linear_5", "type": "Linear", "in_features": 500, "out_features": 500 },
        { "name": "dec_relu_6", "type": "ReLU" },
        { "name": "dec_linear_7", "type": "Linear", "in_features": 500, "out_features": 784 },
        { "name": "dec_unflatten_8", "type": "Unflatten", "dim": 1, "unflattened_size": [1, 28, 28] }
      ]
    },
    {
      "name": "critic",
      "type": "design_pattern_specific",
      "layers": [
        { "type": "Flatten" },
        { "type": "Linear", "in_features": 784, "out_features": 500 },
        { "type": "ReLU" },
        { "type": "Linear", "in_features": 500, "out_features": 500 },
        { "type": "ReLU" },
        { "type": "Linear", "in_features": 500, "out_features": 2000 },
        { "type": "ReLU" },
        { "type": "Linear", "in_features": 2000, "out_features": 10 },
        { "type": "Unflatten", "dim": 1, "unflattened_size": [1, 10] },
        { "type": "AvgPool1d", "kernel_size": 10 },
        { "type": "Flatten" }
      ]
    }
  ],
  "phases": [
    {
      "name": "pretraining",
      "order": 1,
      "exit_criteria": { "iterations": 50000 },
      "save_mapping_parameters": [
        { "mapping_name": "encoder", "saving_interval": 1000, "path_to_file_or_dir": "pretrained/", "keep_old_files": true },
        { "mapping_name": "decoder", "saving_interval": 1000, "path_to_file_or_dir": "pretrained/", "keep_old_files": true },
        { "mapping_name": "critic", "saving_interval": 1000, "path_to_file_or_dir": "pretrained/", "keep_old_files": true }
      ],
      "design_patterns": [
        {
          "pattern": "training_feature_extractor_through_reconstruction_of_samples",
          "encoder_name": "encoder",
          "decoder_name": "decoder",
          "loss_optimizer_group_name": "ae_optimizer_group",
          "loss_report_interval": 500
        },
        {
          "pattern": "training_feature_extractor_by_using_adversarial_interpolation",
          "encoder_name": "encoder",
          "decoder_name": "decoder",
          "critic_name": "critic",
          "autoencoder_loss_optimizer_group_name": "ae_optimizer_group",
          "critic_loss_optimizer_group_name": "critic_optimizer_group",
          "autoencoder_loss_weight": 0.5,
          "loss_report_interval": 500
        }
      ],
      "optimizers": [
        {
          "type": "SGD",
          "group_name": "ae_optimizer_group",
          "lr": 0.001,
          "momentum": 0.9,
          "trained_mappings": ["encoder", "decoder"]
        },
        {
          "type": "SGD",
          "group_name": "critic_optimizer_group",
          "lr": 0.001,
          "momentum": 0.9,
          "trained_mappings": ["critic"]
        }
      ]
    },
    {
      "name": "evaluation",
      "order": 2,
      "exit_criteria": { "iterations": 1 },
      "performance_evaluation_interval": 1
    }
  ],
  "cluster_assignment_strategy": {
    "type": "feature_representation_centroid_similarity",
    "similarity_measure": "student_t",
    "use_centroids_during_phases": ["evaluation"],
    "centroid_initialization_strategy": "classical_clustering",
    "centroid_initialization_classical_clustering_method": "k_means",
    "centroid_recalculation_strategy": "fixed_centroids"
  },
  "datasets": [
    {
      "name": "MNIST-Train",
      "dataset": "MNIST",
      "root": "../../datasets/mnist",
      "train": true,
      "download": true,
      "batch_size": 256,
      "num_clusters": 10,
      "phases": ["pretraining"]
    },
    {
      "name": "MNIST-Test",
      "dataset": "MNIST",
      "root": "../../datasets/mnist",
      "train": false,
      "download": true,
      "batch_size": 256,
      "num_clusters": 10,
      "phases": ["evaluation"],
      "reinitialize_mappings": false
    }
  ],
  "training_device": "cuda_if_available"
}
@@ -0,0 +1,182 @@
{
  "$schema": "../../schema/method.json",
  "name": "dec_cc_hybrid_fashion_mnist",
  "mappings": [
    {
      "name": "encoder",
      "type": "feature_extractor",
      "layers": [
        { "name": "enc_flatten_1", "type": "Flatten" },
        { "name": "enc_linear_2", "type": "Linear", "in_features": 784, "out_features": 500 },
        { "name": "enc_relu_3", "type": "ReLU" },
        { "name": "enc_linear_4", "type": "Linear", "in_features": 500, "out_features": 500 },
        { "name": "enc_relu_5", "type": "ReLU" },
        { "name": "enc_linear_6", "type": "Linear", "in_features": 500, "out_features": 2000 },
        { "name": "enc_relu_7", "type": "ReLU" },
        { "name": "enc_linear_8", "type": "Linear", "in_features": 2000, "out_features": 10 }
      ]
    },
    {
      "name": "decoder",
      "type": "design_pattern_specific",
      "layers": [
        { "name": "dec_linear_1", "type": "Linear", "in_features": 10, "out_features": 2000 },
        { "name": "dec_relu_2", "type": "ReLU" },
        { "name": "dec_linear_3", "type": "Linear", "in_features": 2000, "out_features": 500 },
        { "name": "dec_relu_4", "type": "ReLU" },
        { "name": "dec_linear_5", "type": "Linear", "in_features": 500, "out_features": 500 },
        { "name": "dec_relu_6", "type": "ReLU" },
        { "name": "dec_linear_7", "type": "Linear", "in_features": 500, "out_features": 784 },
        { "name": "dec_unflatten_8", "type": "Unflatten", "dim": 1, "unflattened_size": [1, 28, 28] }
      ]
    },
    {
      "name": "instance_level_contrastive_head",
      "type": "design_pattern_specific",
      "prior_mapping_name": "encoder",
      "layers": [
        { "type": "Linear", "in_features": 10, "out_features": 512 },
        { "type": "ReLU" },
        { "type": "Linear", "in_features": 512, "out_features": 128 }
      ]
    }
  ],
  "phases": [
    {
      "name": "pretraining",
      "order": 1,
      "exit_criteria": { "iterations": 50000 },
      "save_mapping_parameters": [
        { "mapping_name": "encoder", "saving_interval": 1000, "path_to_file_or_dir": "pretrained/", "keep_old_files": true },
        { "mapping_name": "decoder", "saving_interval": 1000, "path_to_file_or_dir": "pretrained/", "keep_old_files": true }
      ],
      "design_patterns": [
        {
          "pattern": "training_feature_extractor_through_reconstruction_of_samples",
          "encoder_name": "encoder",
          "decoder_name": "decoder",
          "loss_report_interval": 500
        }
      ],
      "optimizers": [
        {
          "type": "SGD",
          "lr": 0.001,
          "momentum": 0.9,
          "trained_mappings": ["encoder", "decoder"]
        }
      ]
    },
    {
      "name": "finetuning",
      "order": 2,
      "exit_criteria": { "iterations": 100000 },
      "performance_evaluation_interval": 500,
      "save_mapping_parameters": [
        { "mapping_name": "encoder", "saving_interval": 1000, "path_to_file_or_dir": "./", "keep_old_files": true },
        { "mapping_name": "instance_level_contrastive_head", "saving_interval": 1000, "path_to_file_or_dir": "./", "keep_old_files": true }
      ],
      "save_centroids": { "path_to_file_or_dir": "./", "saving_interval": 1000, "keep_old_files": true },
      "design_patterns": [
        {
          "pattern": "learning_feature_representations_by_using_contrastive_learning_and_data_augmentation",
          "contrastive_learning_head_name": "instance_level_contrastive_head",
          "batch_augmentation_name_1": "batch_augmentation_1",
          "batch_augmentation_name_2": "batch_augmentation_2",
          "temperature_parameter": 0.5,
          "loss_report_interval": 500
        },
        {
          "pattern": "learning_invariance_to_transformations_by_using_assignment_statistics_vectors_and_data_augmentation",
          "batch_augmentation_name_1": "batch_augmentation_1",
          "batch_augmentation_name_2": "batch_augmentation_2",
          "temperature_parameter": 1.0,
          "loss_report_interval": 500
        },
        {
          "pattern": "encouraging_cluster_formation_by_minimizing_divergence_between_current_and_target_cluster_assignment_distribution",
          "loss_weight": 0.5,
          "target_distribution_recalculation_interval": 140,
          "loss_report_interval": 500
        },
        {
          "pattern": "preventing_cluster_degeneracy_by_maximizing_entropy_of_soft_assignments",
          "batch_augmentation_name": "batch_augmentation_1",
          "loss_report_interval": 500
        },
        {
          "pattern": "preventing_cluster_degeneracy_by_maximizing_entropy_of_soft_assignments",
          "batch_augmentation_name": "batch_augmentation_2",
          "loss_report_interval": 500
        }
      ],
      "optimizers": [
        {
          "type": "SGD",
          "lr": 0.001,
          "momentum": 0.9,
          "trained_mappings": ["encoder", "instance_level_contrastive_head"],
          "optimizes_centroids": true
        }
      ]
    },
    {
      "name": "evaluation",
      "order": 3,
      "exit_criteria": { "iterations": 1 },
      "performance_evaluation_interval": 1
    }
  ],
  "cluster_assignment_strategy": {
    "type": "feature_representation_centroid_similarity",
    "similarity_measure": "student_t",
    "use_centroids_during_phases": ["finetuning", "evaluation"],
    "centroid_initialization_strategy": "classical_clustering",
    "centroid_initialization_classical_clustering_method": "k_means",
    "centroid_recalculation_strategy": "recalculation_by_design_pattern"
  },
  "datasets": [
    {
      "name": "Fashion-MNIST-Train",
      "dataset": "FashionMNIST",
      "root": "../../datasets/fashion_mnist",
      "train": true,
      "download": true,
      "batch_size": 256,
      "num_clusters": 10,
      "phases": ["pretraining", "finetuning"],
      "batch_augmentations": [
        {
          "name": "batch_augmentation_1",
          "transforms": [
            { "type": "RandomPerspective", "distortion_scale": 0.2, "p": 1.0 },
            { "type": "RandomAffine", "degrees": 0, "translate": [0.1, 0.1], "scale": [0.75, 1.25] },
            { "type": "ColorJitter", "brightness": [0.7, 1.15], "contrast": [0.85, 1.15] },
            { "type": "ToTensor" }
          ]
        },
        {
          "name": "batch_augmentation_2",
          "transforms": [
            { "type": "RandomPerspective", "distortion_scale": 0.2, "p": 1.0 },
            { "type": "RandomAffine", "degrees": 0, "translate": [0.1, 0.1], "scale": [0.75, 1.25] },
            { "type": "ColorJitter", "brightness": [0.7, 1.15], "contrast": [0.85, 1.15] },
            { "type": "ToTensor" }
          ]
        }
      ]
    },
    {
      "name": "Fashion-MNIST-Test",
      "dataset": "FashionMNIST",
      "root": "../../datasets/fashion_mnist",
      "train": false,
      "download": true,
      "batch_size": 256,
      "num_clusters": 10,
      "phases": ["evaluation"],
      "reinitialize_mappings": false
    }
  ],
  "training_device": "cuda_if_available"
}
