Code implementing the paper "Measuring the Intrinsic Dimension of Objective Landscapes".
I reproduce, with minor variations, the overall results of the paper in PyTorch, and extend the research to a different projection matrix. As part of this work, I provide the results obtained with this implementation and almost all of the references (and resources I found useful) used to complete it (see also the PDF report for further references).
All results follow the closest implementation I could manage of the specific architectures shown in the paper. The projections used to obtain them are the Dense and Fastfood ones (sparse projections gave me problems in PyTorch; see the issue linked in the references), following the same rules as the paper.
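To make the technique concrete, below is a minimal sketch of subspace training with a dense random projection, following the paper's formulation theta = theta_0 + P theta_d, where only the d-dimensional vector theta_d is trained. The class and variable names are illustrative and not taken from this repository; it assumes PyTorch >= 2.0 for `torch.func.functional_call`.

```python
import torch
import torch.nn as nn
from torch.func import functional_call  # PyTorch >= 2.0


class DenseSubspaceWrapper(nn.Module):
    """Sketch of subspace training: theta = theta_0 + P @ theta_d.

    Only theta_d (d values) is trainable; theta_0 and the random projection P
    stay fixed. Illustrative only; names and details are not from this repo.
    """

    def __init__(self, model: nn.Module, d: int):
        super().__init__()
        self.model = model
        # theta_0: the frozen initial parameters.
        self.theta0 = {n: p.detach().clone() for n, p in model.named_parameters()}
        for p in model.parameters():
            p.requires_grad_(False)
        # One dense random projection per parameter tensor, with columns
        # normalized to unit length as in the paper. Note the O(D * d) memory
        # cost, which is why the paper also uses Fastfood-style projections.
        self.P = {}
        for n, p0 in self.theta0.items():
            P = torch.randn(p0.numel(), d)
            self.P[n] = P / P.norm(dim=0, keepdim=True)
        # The only trainable parameters: the subspace coordinates theta_d,
        # initialized to zero so training starts exactly at theta_0.
        self.theta_d = nn.Parameter(torch.zeros(d))

    def forward(self, x):
        # Compute theta = theta_0 + P @ theta_d on the fly and run the model
        # with those parameters, so gradients flow back only to theta_d.
        params = {
            n: p0 + (self.P[n] @ self.theta_d).view_as(p0)
            for n, p0 in self.theta0.items()
        }
        return functional_call(self.model, params, (x,))
```

Only `theta_d` is handed to the optimizer (e.g. `torch.optim.SGD([wrapper.theta_d], lr=...)`); the Fastfood and FastJL variants replace the explicit matrix `P` with a structured transform so it never has to be stored.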
As you can see from the plots, there is no mean or standard deviation for the measured data. This is first because the released code is seeded (and, I hope, reproducible), and second because providing those statistics would require running every computation at least three times; training some of these architectures is very time-consuming, and with the limited computing resources at my disposal this is the best I can do.
All plots below can be explored further with the code in id_plots.ipynb. All training runs can be started on Windows systems with the *.ps1 scripts; on a *nix system the .ps1 scripts can easily be translated to .sh, as in the provided example. I have moved all these scripts into the folder "automation_scripts", so to launch one you can either copy the desired script into the main directory and start it from there, or modify the paths inside the script. For example:
```bash
# execute these lines from the main directory
cp automation_scripts/<name_of_script>.sh .
# edit the script to run the desired id90 variant or to change hyperparameters
./<name_of_script>.sh   # or: bash <name_of_script>.sh
```
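Each table below reports d_int90: the smallest subspace dimension d whose performance reaches 90% of the corresponding baseline, as defined in the paper ("Author" rows are the values reported by the paper's authors; the other rows come from this implementation). A hypothetical helper, not part of this repository, showing how such a value is read off a sweep of runs:

```python
def dint90(sweep, baseline_accuracy):
    """Smallest subspace dimension whose accuracy reaches 90% of the baseline.

    sweep: iterable of (d, validation_accuracy) pairs, one per training run.
    Illustrative helper only; not part of this repository.
    """
    threshold = 0.9 * baseline_accuracy
    for d, accuracy in sorted(sweep):
        if accuracy >= threshold:
            return d
    return None  # no run in the sweep reached 90% of the baseline
```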
MNIST results:

| Result | Architecture | # Parameters | d_int90 |
|---|---|---|---|
| Author Global | FC | 199210 | 750 |
| Author Local | FC | 199210 | ? |
| Global | FC | 199210 | 600 |
| Local | FC | 199210 | 525 |
| Result | Architecture | # Parameters | d_int90 |
|---|---|---|---|
| Author Global | LeNet | 44426 | 290 |
| Author Local | LeNet | 44426 | 275 |
| Global | LeNet | 44426 | 170 |
| Local | LeNet | 44426 | 160 |
| Result | Architecture | # Parameters | d_int90 |
|---|---|---|---|
| Author Global | Untied LeNet | 286334 | 600 |
| Author Local | Untied LeNet | 286334 | 450 |
| Global | Untied LeNet | 286334 | 350 |
| Local | Untied LeNet | 286334 | 350 |
| Result | Architecture | # Parameters | d_int90 |
|---|---|---|---|
| Author Global | FC LeNet | 3640574 | 2000 |
| Author Local | FC LeNet | 3640574 | 1400 |
| Global | FC LeNet | 3640574 | 900 |
| Local | FC LeNet | 3640574 | 800 |
| Result | Architecture | # Parameters | d_int90 |
|---|---|---|---|
| Author Global | FCTied LeNet | ? | 425 |
| Author Local | FCTied LeNet | ? | 400 |
| Global | FCTied LeNet | 193370 | 400 |
| Local | FCTied LeNet | 193370 | 400 |
CIFAR-10 results:

| Result | Architecture | # Parameters | d_int90 |
|---|---|---|---|
| Author Global | FC | 1055610 | 9000 |
| Author Local | FC | 1055610 | 8000 |
| Global | FC | 1051930 | 5000 |
| Local | FC | 1051930 | 10000 |
| FastJL | FC | 1051930 | 4000 |
The results for this case differ because I could not exactly match the architecture's dimensions (I tried several and used the one closest to the paper's).
| Result | Architecture | # Parameters | d_int90 |
|---|---|---|---|
| Author Global | LeNet | 62006 | 1000 |
| Author Local | LeNet | 62006 | 2900 |
| Global | LeNet | 62006 | 600 |
| Local | LeNet | 62006 | 1700 |
| FastJL | LeNet | 62006 | 1250 |
| Result | Architecture | # Parameters | d_int90 |
|---|---|---|---|
| Author Global | Untied LeNet | 658238 | 9000? 2750* |
| Author Local | Untied LeNet | 658238 | 15000 |
| Global | Untied LeNet | 658238 | 10000 |
| Local | Untied LeNet | 658238 | >40000 |
| FastJL | Untied LeNet | 658238 | 2000 |
* in the paper they claim a
I was unable to reproduce these results: my local baseline performs better than the paper's global baseline, so the baseline is skewed and the 90% threshold is very hard to reach with few parameters.
There are also small sparsification and variance problems after the drop in performance.
| Result | Architecture | # Parameters | d_int90 |
|---|---|---|---|
| Author Global | FC LeNet | 16397726 | 35000 |
| Author Local | FC LeNet | 16397726 | >100000 |
| Global | FC LeNet | 16397726 | 5000 |
| Local | FC LeNet | 16397726 | 27000 |
| FastJL | FC LeNet | 16397726 | 5000 |
Here the results are very different from the paper's.
| Result | Architecture | # Parameters | d_int90 |
|---|---|---|---|
| Author Global | FCTied LeNet | ? | 2500 |
| Author Local | FCTied LeNet | ? | 4500 |
| Global | FCTied LeNet | 297734 | 3000 |
| Local | FCTied LeNet | 297734 | 8000 |
| FastJL | FCTied LeNet | 297734 | 2500 |
Here the FastJL results are very unstable: the architecture reaches the intrinsic dimension and then performance drops. I think this is because sparsification and the differences in parameter variance play an important role here (see the FastJL sketch after the tables).
| Result | Architecture | # Parameters | d_int90 |
|---|---|---|---|
| Author Global | ResNet | 280000? | 1000-2000 |
| Author Local | ResNet | 280000? | 20000-50000 |
| Global | ResNet | 292954 | 1000 |
| Local | ResNet | 292954 | 12000 |
| FastJL | ResNet | 292954 | 1000 |
The implementation of this architecture is very hard to match exactly, so the results differ somewhat from the paper.
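For reference, here is a minimal sketch of an FJLT-style structured projection like the one used for the FastJL rows above: a random sign diagonal, a fast Walsh-Hadamard transform, and sparse Rademacher sampling, applied blockwise to map the d-dimensional subspace vector to a D-dimensional parameter offset. The block construction, sparsity level, and normalization are assumptions for illustration and may differ from what this repository actually implements.

```python
import math
import torch


def fwht(x: torch.Tensor) -> torch.Tensor:
    """Unnormalized fast Walsh-Hadamard transform along the last dimension.

    The last dimension must be a power of two; implemented with out-of-place
    ops so autograd can flow through it.
    """
    n = x.shape[-1]
    if n == 1:
        return x
    half = n // 2
    y1, y2 = fwht(x[..., :half]), fwht(x[..., half:])
    return torch.cat((y1 + y2, y1 - y2), dim=-1)


class FastJLProjection:
    """Sketch of an FJLT-style projection from the d-dim subspace to the
    D-dim parameter space: per block, apply random signs (D), the Hadamard
    transform (H) and a sparse Rademacher mask, then concatenate blocks and
    truncate to D. Sparsity and normalization are illustrative assumptions.
    """

    def __init__(self, d: int, D: int, sparsity: float = 0.1, seed: int = 0):
        self.d, self.D = d, D
        self.n = 1 << math.ceil(math.log2(d))      # block length (power of two)
        self.n_blocks = math.ceil(D / self.n)
        g = torch.Generator().manual_seed(seed)    # fixed seed: the projection must stay frozen
        # Per-block random sign diagonals.
        self.signs = torch.randint(0, 2, (self.n_blocks, self.n), generator=g) * 2.0 - 1.0
        # Sparse +-1 entries: each output coordinate is kept with probability `sparsity`.
        keep = (torch.rand(self.n_blocks, self.n, generator=g) < sparsity).float()
        rademacher = torch.randint(0, 2, (self.n_blocks, self.n), generator=g) * 2.0 - 1.0
        self.sparse = keep * rademacher
        # Rough normalization so the kept coordinates have comparable variance.
        self.scale = 1.0 / math.sqrt(self.n * sparsity)

    def __call__(self, theta_d: torch.Tensor) -> torch.Tensor:
        """Map theta_d (shape [d]) to a parameter offset of shape [D]."""
        pad = theta_d.new_zeros(self.n - self.d)
        x = torch.cat([theta_d, pad])              # zero-pad to the block length
        blocks = []
        for b in range(self.n_blocks):
            y = fwht(self.signs[b] * x)            # H D x
            blocks.append(self.sparse[b] * y * self.scale)
        return torch.cat(blocks)[: self.D]
```

The sparse mask zeroes most coordinates of each block, which is one plausible source of the sparsification and variance issues noted above.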
- https://www.uber.com/en-IT/blog/intrinsic-dimension/
- https://tomroth.com.au/notes/intdim/intdim/
- https://github.com/uber-research/intrinsic-dimension
- https://twitter.com/JevGamper/status/1240335205807816705?s=20
- https://github.com/jgamper/intrinsic-dimensionality
- https://github.com/tnwei/intrinsic-dimension
- https://greydanus.github.io/2017/10/30/subspace-nn/
- https://discuss.pytorch.org/t/locally-connected-layers/26979/2
- https://www.cs.princeton.edu/~runzhey/demo/Geo-Intrinsic-Dimension.pdf
- https://github.com/LangLeon/thesis-intrinsic-dimension
- https://towardsdatascience.com/interesting-projections-where-pca-fails-fe64ddca73e6
- https://www.oreilly.com/library/view/hands-on-convolutional-neural/9781789130331/a33f17be-9d32-4499-aa1c-c1a81e023eb7.xhtml
- https://cs231n.github.io/convolutional-networks/#convert
- https://math.stackexchange.com/questions/995623/why-are-randomly-drawn-vectors-nearly-perpendicular-in-high-dimensions/995678
- https://www.youtube.com/watch?v=Y_Ac6KiQ1t0&t=1s
- https://github.com/josuni/Intrinsic-Dimensionality-for-various-datasets
- https://arxiv.org/pdf/1408.3060.pdf
- https://johnthickstun.com/docs/fast_jlt.pdf
- https://www.cs.technion.ac.il/~nailon/fjlt.pdf
- https://arxiv.org/pdf/2204.01800.pdf
- pytorch/pytorch#88053
- Try another projection, such as the one in https://arxiv.org/abs/1202.3033
- Try other datasets:
- FMNIST
- EMNIST
- FLOWERS102
- ImageNet
- Try other architectures:
- RNN
- LSTM
- Transformer
- Inception
- Implement RL tasks
- Evaluate performance with regularization and dropout