# General model experiments #3
### Experiment
During training I sample cells from each well to create uniform tensors; this is a requirement for creating batches of data larger than 1 (a sketch of the sampling step follows after the takeaways below). It is generally known in contrastive learning/metric learning that a large batch size is essential for good model performance. This is also something that I showed in the first issue: #1.

### Hypothesis
Model performance will be slightly improved by using all existing cells without extra sampling, because the sampling may pick out cells that are not as representative of the perturbation.

### Main takeaways
From here on, evaluation will be done without sampling, simply by collapsing all cells into a feature representation.
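For reference, a minimal sketch of the per-well subsampling described above, in PyTorch; the function name, tensor shapes, and the default of 800 cells are illustrative assumptions, not the actual training code:

```python
import torch

def sample_well(cells: torch.Tensor, n_cells: int = 800) -> torch.Tensor:
    """Subsample a fixed number of cells from one well so that wells can be
    stacked into a uniform (batch, n_cells, n_features) tensor.

    cells: (num_cells_in_well, n_features) single-cell profiles.
    """
    idx = torch.randint(0, cells.shape[0], (n_cells,))  # sample with replacement
    return cells[idx]
```

Stacking the sampled wells is what makes batch sizes larger than 1 possible; at evaluation time this step is now skipped and all cells in a well are aggregated directly.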
### Model capacity experiment 1
**Goal:** Test if the current training method (model architecture, optimizer, and loss function) is capable of learning to distinguish point sets from each other while their means are the same.

**Background:** The general setup is as follows:
To create the data:
Training process:
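The generation details are not included in the thread; as a hedged reconstruction based on the numbers quoted later (10 covariance classes, 4 samples of 800 points each, rotated ellipses, zero means), the data creation might look roughly like this (the axis variances and function names are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def make_class_covariances(n_classes: int = 10) -> list:
    """One 2D covariance per class: the same ellipse rotated by a
    class-specific angle (axis variances here are illustrative)."""
    covs = []
    for k in range(n_classes):
        theta = np.pi * k / n_classes
        R = np.array([[np.cos(theta), -np.sin(theta)],
                      [np.sin(theta),  np.cos(theta)]])
        D = np.diag([3.0, 0.5])  # major/minor axis variances
        covs.append(R @ D @ R.T)
    return covs

def sample_point_set(cov: np.ndarray, n_points: int = 800) -> np.ndarray:
    """Draw one point set and subtract its mean, so every class has the
    same (zero) mean and only higher-order statistics differ."""
    pts = rng.multivariate_normal(np.zeros(2), cov, size=n_points)
    return pts - pts.mean(axis=0)

covs = make_class_covariances()
dataset = [[sample_point_set(c) for _ in range(4)] for c in covs]  # 4 samples per class
```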
If the model is able to complete this task, we can verify that this architecture is able to learn covariance matrices, and that most likely the feature aggregation model I am proposing is also able to do (or is doing) this for the single-cell level feature data.

### Main takeaways
### Results
**Model performance**
Total model mAP: 0.9589285714285715
**Baseline (mean) performance**
Total baseline (mean) mAP: 0.3086709586709586
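As a side note, a minimal sketch of how a retrieval mAP like the ones above can be computed; the use of scikit-learn and cosine similarity here is an assumption, and the actual evaluation code may differ:

```python
import numpy as np
from sklearn.metrics import average_precision_score
from sklearn.metrics.pairwise import cosine_similarity

def mean_average_precision(embeddings: np.ndarray, labels: np.ndarray) -> float:
    """Treat each sample as a query against all other samples; a candidate
    counts as a positive if it shares the query's label. Assumes every
    label occurs at least twice."""
    sims = cosine_similarity(embeddings)
    aps = []
    for i in range(len(labels)):
        mask = np.arange(len(labels)) != i  # exclude the query itself
        y_true = (labels[mask] == labels[i]).astype(int)
        aps.append(average_precision_score(y_true, sims[i, mask]))
    return float(np.mean(aps))
```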
### Model capacity experiment 2
Continuing the previous idea of learning higher-order statistics (i.e., the covariance matrix and higher) with the current model setup, this experiment aims to repeat the previous one, but on real data. More specifically, the Stain3 data is used as described in #6. I train using plates BR00115134_FS, BR00115125_FS, and BR00115133highexp_FS and validate on the rest.

**Note:** This experiment is not ideal, as I am using the data that is normalized to zero mean and unit variance per feature on the plate level. I then subsequently zero-mean the data on the well level as well, meaning that taking the average profile is useless. I used this data because it is faster than preprocessing all of the data again. I am not too sure what this means for the standard deviation (or covariance matrix) on the well level. What do you think @shntnu?

### Results
Table time!
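For clarity, a sketch of the double normalization described in the note above (plate-level standardization followed by well-level mean-centering); the DataFrame layout and column names are assumptions:

```python
import pandas as pd

def normalize(df: pd.DataFrame, feature_cols: list) -> pd.DataFrame:
    """Standardize features per plate, then subtract the per-well mean.
    After the second step the average (mean) profile of a well carries
    no signal, which is why the baseline is expected to look random."""
    df = df.copy()
    # zero mean, unit variance per feature within each plate
    df[feature_cols] = df.groupby("plate")[feature_cols].transform(
        lambda x: (x - x.mean()) / x.std())
    # zero mean per feature within each well (variances are left untouched)
    df[feature_cols] = df.groupby(["plate", "well"])[feature_cols].transform(
        lambda x: x - x.mean())
    return df
```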
My notes are below
^^^ This is fine
Perfect, because you've not scaled to unit variance (otherwise you'd be looking for structure in the correlation matrix, instead of the covariance matrix)
That worked out well! I've not peeked into the results but I am eagerly looking forward to your talk tomorrow where you might discuss more 🎉
Ok, I couldn't contain my excitement so I looked at the results :D I just picked one at random, and focused only on
This is fantastic, right?!
Yes, I think so too! I think this is proof that we can learn more than the mean. The mAP of the benchmark looks random (I believe it should be ~1/30, but the math around mAP is not as intuitive to me as the precision at K :) ).
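On the random-baseline intuition: under a uniformly random ranking, the expected precision at every cutoff equals the overall positive rate, so the expected AP is approximately the fraction of positives among the candidates. A hedged sketch (the exact class counts behind the ~1/30 guess are not spelled out in the thread):

```latex
% N candidates, R of them positive, ranked uniformly at random:
% E[precision@k] = R/N for every k, hence
\mathbb{E}\left[\mathrm{AP}_{\mathrm{random}}\right] \approx \frac{R}{N}
```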
I didn't understand – you're already learning the covariance because you only mean-subtract wells, right?
Yes, you're right. I meant seeing if we can learn third-order interactions. Probably easiest if we discuss it tomorrow.
Ah, by fixing you mean factoring out – got it. For that, you'd spherize instead of mean-subtracting. That will be totally shocking if it works!! Even second order is pretty awesome (IMO, unless there's something trivial happening here https://broadinstitute.slack.com/archives/C025JFCBQAK/p1650466733918839?thread_ts=1649774854.681729&cid=C025JFCBQAK)
PS – unless something super trivial is happening here that we haven't caught, I think you might be pretty close to having something you can write up. Let's get together with @johnarevalo for his advice, maybe next week.
Worth giving it a shot ;) Sounds good!
For the toy data, also standardize after rotating and see what happens then. The idea is that we don’t yet know if it is learning covariance or just standard deviation |
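A minimal sketch of the suggested check: standardize each feature dimension after rotating, so all classes share the same means and per-dimension standard deviations, and only the correlation (off-diagonal) structure is left to learn. The function name is illustrative:

```python
import numpy as np

def standardize_dims(points: np.ndarray) -> np.ndarray:
    """Zero mean and unit variance per feature dimension. Distinguishing
    the classes afterwards requires the correlation structure, not just
    per-dimension standard deviations."""
    return (points - points.mean(axis=0)) / points.std(axis=0)
```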
### Repeat of the same experiment with standardized feature dimensions
Based on @shntnu's previous comment.

### Main takeaways
Total model mAP: 0.93
Total baseline (mean) mAP: 0.33
The model is still beating the baseline (random) performance.

### mean Average Precision scores
Total model mAP: 0.9282512626262627
Total baseline (mean) mAP: 0.32838989713989714
Great! And I verified, as a sanity check, that the baseline hasn't changed (much) from before (#3 (comment)). The model results don't change much either (correlation vs. covariance; details below).

BTW, you show 10 ellipses but you have 16 rows in your results. Why is that?

**Correlation** (from the most recent results, #3 (comment)): Total model mAP: 0.9282512626262627

**Covariance** (from the results 12 days ago, #3 (comment)): Total model mAP: 0.9589285714285715
Great, thanks for checking! There are 10 classes, but 4 samples (of 800 points each) of each class. The validation set consists of 4 classes, so 16 samples in total. I report all samples individually here; normally I take the mean per class (compound).

Interesting to note (perhaps for myself in the future): I had to reduce the learning rate by a factor of 100 (lr: 1e-5) to learn the correlation adequately with the model, compared to learning the covariance (lr: 1e-3).
### Experiment 3 - Sphering the toy data
In this experiment I sample 800*4 points using each covariance matrix class for the distribution, then I sphere this sample and subsequently subsample it to create pairs for training (a sketch of the sphering step follows after the results below). I increase the number of epochs from 40 to 1000, as the model is not able to fit the data otherwise.

### Main takeaways
### Regularization 0.01 - heavy sphering
Total model mAP: 0.2943837412587413
Total baseline (mean) mAP: 0.25055043336293337

**Full tables**

Model
Baseline
### Regularization 0.1 - medium sphering
Total model mAP: 0.6088789682539683
Total baseline (mean) mAP: 0.2500837125837126

**Full tables**

Model
Baseline
### Regularization 0.3 - low sphering
Total model mAP: 0.722172619047619
Total baseline (mean) mAP: 0.25495106745106744

**Full tables**

Model
Baseline
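The exact sphering implementation isn't shown in the thread; here is a hedged sketch of regularized (shrinkage) ZCA whitening that is consistent with the regularization values above, where a small `reg` means heavy sphering and a larger `reg` leaves more covariance structure intact. The shrinkage form is an assumption:

```python
import numpy as np

def sphere(points: np.ndarray, reg: float = 0.01) -> np.ndarray:
    """ZCA-whiten a point set after shrinking its covariance estimate
    toward the identity. reg = 0.01 removes nearly all covariance
    structure (heavy sphering); reg = 0.3 whitens more gently."""
    X = points - points.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    cov = (1 - reg) * cov + reg * np.eye(cov.shape[0])  # shrinkage (assumed form)
    # inverse matrix square root via eigendecomposition
    vals, vecs = np.linalg.eigh(cov)
    W = vecs @ np.diag(vals ** -0.5) @ vecs.T
    return X @ W
```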
Ah, this is expected (and thus, good!) because your data is almost surely fully explained by its second-order moments – because that's how you generated it – and sphering factors that out. The story is different with your real data – there, it will almost surely not be fully explained by second-order moments (although that doesn't mean you can do better).
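To spell out the reasoning: a zero-mean Gaussian is fully determined by its covariance, so whitening each sample with (a regularized estimate of) its own covariance maps every class onto essentially the same distribution, leaving little beyond estimation noise to learn:

```latex
X \sim \mathcal{N}(0, \Sigma), \qquad W = \Sigma^{-1/2}
\quad\Longrightarrow\quad W X \sim \mathcal{N}(0, I)
```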
Perfect! As expected, and it's great that you quantified it in terms of how much more complicated it is.

Note that medium/low sphering should be roughly equivalent to a medium/low value for the major/minor axis.
This issue is used to test more general aspects of model development that are not directly tied to, but are likely still influenced by, the dataset or model hyperparameters in use at the time.