[FLAVA] Change some initialization orders and corresponding tests #105
Conversation
Codecov Report
@@            Coverage Diff             @@
##    gh/ankitade/4/base    #105   +/-  ##
==========================================
  Coverage             ?   93.00%
==========================================
  Files                ?       47
  Lines                ?     2758
  Branches             ?        0
==========================================
  Hits                 ?     2565
  Misses               ?      193
  Partials             ?        0

Continue to review the full report at Codecov.
@ankitade has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
- Currently the projections are part of the contrastive loss, which means we need to use FLAVA for pretraining for zero-shot. This is odd, since zero-shot should only involve the core model, not the pretraining model.
- The next PR in this stack tried to fix that, but it broke the tests because it changed the initialization order of several components.
- So that PR is split into two, to make sure the logic changes are not actually breaking anything (see the sketch below):
  1. This PR, which simply changes the initialization order of the codebook and the contrastive loss and updates the test assert values.
  2. The next PR, which makes the projections part of the FLAVA model and does not touch the tests.

Test plan: pytest

Differential Revision: [D37466221](https://our.internmc.facebook.com/intern/diff/D37466221)
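Why do the test assert values change when no logic changes? The tests seed the RNG and check exact output values, so the module constructed first consumes the first random numbers; swapping the construction order changes every initialized parameter. The sketch below is a minimal, self-contained illustration of this effect; `Codebook` and `ContrastiveLossWithProjections` here are hypothetical stand-ins, not the actual torchmultimodal classes.

```python
import torch
from torch import nn


# Illustrative stand-ins for the FLAVA pretraining components; the real
# modules are more involved, but the ordering effect is the same.
class Codebook(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.embedding = nn.Embedding(16, 8)


class ContrastiveLossWithProjections(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.image_projection = nn.Linear(8, 8)
        self.text_projection = nn.Linear(8, 8)


def build(codebook_first: bool):
    # With a fixed seed, whichever module is constructed first draws the
    # first numbers from the global RNG stream.
    torch.manual_seed(0)
    if codebook_first:
        codebook = Codebook()
        loss = ContrastiveLossWithProjections()
    else:
        loss = ContrastiveLossWithProjections()
        codebook = Codebook()
    return codebook, loss


cb_a, _ = build(codebook_first=True)
cb_b, _ = build(codebook_first=False)
# Same architecture, same seed, different weights, purely because the
# construction order changed; hence the hard-coded expected values in the
# tests need to be updated even though the model logic is untouched.
print(torch.allclose(cb_a.embedding.weight, cb_b.embedding.weight))  # False
```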
Stack from ghstack (oldest at bottom):

Test plan: pytest

Differential Revision: D37466221