Feature extraction - ESM embeddings #13

Merged · 13 commits · Jul 26, 2023

Commits on Jul 25, 2023

  1. test(run): esm tests

    jyaacoub committed Jul 25, 2023 · eae44be
  2. fix(dataset): -esm +prot_seq #8

    Storing ESM embeddings won't work since they are 320-d vectors PER amino acid...

    Instead, the better approach is to store just the sequence strings and leave the ESM embedding calculation to the model side of things, as sketched below.

    See #8 for more.
    jyaacoub committed Jul 25, 2023 · 40f40ad
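
A minimal sketch of the dataset-side change this commit describes. The class and field names here are hypothetical, not the repo's actual API; the point is that the dataset stores raw protein sequence strings and defers per-residue embedding to the model:

```python
import torch
from torch.utils.data import Dataset

class ProtSeqDataset(Dataset):
    """Hypothetical dataset that stores raw protein sequences.

    Storing precomputed ESM embeddings is impractical: at 320 dims
    per amino acid, a single 1000-residue protein is a 1000 x 320
    float tensor. Keeping only the sequence string is tiny, and the
    model can run the ESM forward pass itself.
    """
    def __init__(self, records):
        # records: list of (prot_seq, affinity_label) pairs
        self.records = records

    def __len__(self):
        return len(self.records)

    def __getitem__(self, idx):
        prot_seq, label = self.records[idx]
        # Return the raw sequence; tokenization and embedding happen
        # on the model side (see the fix(models) commit below).
        return prot_seq, torch.tensor(label, dtype=torch.float32)
```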

Commits on Jul 26, 2023

  1. fix(models): esm embedding

    ESM embedding now works (a model-side embedding sketch follows the commit list).
    jyaacoub committed Jul 26, 2023 · 33069a6
  2. test(results): kiba run init results

    Not too great; outliers may be ruining training?
    jyaacoub committed Jul 26, 2023 · db0d963
  3. fix(train_test): og_model_opt t/f -> string input

    This has to be done since argparse doesn't handle boolean choices well through the CLI (see the argparse sketch after the commit list).
    jyaacoub committed Jul 26, 2023 · 3469618
  4. Merge branch 'feature_extraction' of https://github.com/jyaacoub/MutDTA

    … into feature_extraction
    jyaacoub committed Jul 26, 2023 · 2cbfbb1
  5. f13b71e
  6. 34155b6
  7. Merge branch 'feature_extraction' of https://github.com/jyaacoub/MutDTA

    … into feature_extraction
    jyaacoub committed Jul 26, 2023 · 519ac8e
  8. f0761b6
  9. 7a1f64b
  10. perf: optimize memory usage during training

    The main thing is to freeze the ESM layers, since the model is too large to train end to end (see the freezing sketch after the commit list).
    jyaacoub committed Jul 26, 2023 · 8038746
  11. chore: update logging

    This is useful since epochs would otherwise be cluttered with transformer tokenizer warning logs (see the logging sketch below the commit list).
    jyaacoub committed Jul 26, 2023 · c34897f
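
For the fix(models) commit above: a minimal sketch of computing per-residue ESM embeddings on the model side. The HuggingFace transformers ESM port and the esm2_t6_8M_UR50D checkpoint are assumptions; that checkpoint matches the 320-d size mentioned in the dataset commit, but the repo may load ESM differently:

```python
import torch
from transformers import AutoTokenizer, EsmModel

# Assumed checkpoint: the smallest ESM-2 model, which emits 320-d
# per-residue embeddings as described in the dataset commit.
tokenizer = AutoTokenizer.from_pretrained("facebook/esm2_t6_8M_UR50D")
esm = EsmModel.from_pretrained("facebook/esm2_t6_8M_UR50D")

def embed_sequences(seqs):
    # Tokenize a batch of protein sequences, padding to the longest.
    toks = tokenizer(seqs, return_tensors="pt", padding=True)
    with torch.no_grad():  # embeddings only; no gradient through ESM
        out = esm(**toks)
    # Shape: (batch, seq_len + special tokens, 320)
    return out.last_hidden_state

emb = embed_sequences(["MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"])
print(emb.shape)  # e.g. torch.Size([1, 35, 320])
```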
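For the fix(train_test) commit: the underlying pitfall is that argparse with `type=bool` calls `bool()` on the raw string, and any non-empty string, including "False", is truthy. A common workaround, sketched here with the commit's `og_model_opt` flag (the exact parser setup is an assumption), is to accept explicit string choices and convert afterwards:

```python
import argparse

parser = argparse.ArgumentParser()
# type=bool would be wrong here: bool("False") == True, since any
# non-empty string is truthy. Accept explicit strings instead.
parser.add_argument("--og_model_opt", choices=["true", "false"],
                    default="false",
                    help="Whether to use the original model options.")

args = parser.parse_args(["--og_model_opt", "false"])
og_model_opt = args.og_model_opt == "true"  # convert to a real bool
print(og_model_opt)  # False
```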
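For the perf commit: freezing in PyTorch means setting `requires_grad = False` on the ESM parameters, so the optimizer allocates no state for them and autograd keeps no gradient buffers, which is where the memory saving comes from. A minimal sketch (the `esm` attribute name is an assumption):

```python
import torch.nn as nn

def freeze_esm(esm_module: nn.Module) -> None:
    """Freeze every ESM parameter so it is excluded from training."""
    for param in esm_module.parameters():
        param.requires_grad = False
    # eval() additionally disables dropout inside the frozen encoder.
    esm_module.eval()

# Hand only the still-trainable parameters to the optimizer, e.g.:
#   optimizer = torch.optim.Adam(
#       (p for p in model.parameters() if p.requires_grad), lr=1e-4)
```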
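For the chore commit: one way to keep epoch output readable is to drop the transformers log level to errors only, which silences the repeated tokenizer warnings. A sketch, assuming the warnings come from the HuggingFace transformers library:

```python
import logging
from transformers import logging as hf_logging

# Silence repeated transformers tokenizer/model warnings that would
# otherwise clutter every training epoch.
hf_logging.set_verbosity_error()

# Keep the project's own log output at INFO level.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)
```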