Feature extraction - ESM embeddings #13

Merged · 13 commits · Jul 26, 2023

Commits on Jul 25, 2023

  1. test(run): esm tests

    jyaacoub committed Jul 25, 2023 · eae44be
  2. fix(dataset): -esm +prot_seq #8

    Storing ESM embeddings won't work since they are 320-d vectors PER amino acid...

    Instead, the better approach is to store just the sequence strings and leave the ESM embedding calculation to the model side of things, as sketched below.

    See #8 for more.
    jyaacoub committed Jul 25, 2023 · 40f40ad
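
A minimal sketch of the dataset-side change this commit describes. The class and field names here are hypothetical, not the repo's actual API; the point is that the dataset stores raw protein sequence strings and defers per-residue embedding to the model:

```python
import torch
from torch.utils.data import Dataset

class ProtSeqDataset(Dataset):
    """Hypothetical dataset that stores raw protein sequences.

    Storing precomputed ESM embeddings is impractical: at 320 dims
    per amino acid, a single 1000-residue protein is a 1000 x 320
    float tensor. Keeping only the sequence string is tiny, and the
    model can run the ESM forward pass itself.
    """
    def __init__(self, records):
        # records: list of (prot_seq, affinity_label) pairs
        self.records = records

    def __len__(self):
        return len(self.records)

    def __getitem__(self, idx):
        prot_seq, label = self.records[idx]
        # Return the raw sequence; tokenization and embedding happen
        # on the model side (see the fix(models) commit below).
        return prot_seq, torch.tensor(label, dtype=torch.float32)
```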

Commits on Jul 26, 2023

  1. fix(models): esm embedding

    ESM embedding now works (a model-side embedding sketch follows the commit list).
    jyaacoub committed Jul 26, 2023 · 33069a6
  2. test(results): kiba run init results

    Not too great; outliers may be ruining training?
    jyaacoub committed Jul 26, 2023 · db0d963
  3. fix(train_test): og_model_opt t/f -> string input

    This has to be done since argparse doesn't handle boolean choices well through the CLI (see the argparse sketch after the commit list).
    jyaacoub committed Jul 26, 2023 · 3469618
  4. Merge branch 'feature_extraction' of https://github.com/jyaacoub/MutDTA

    … into feature_extraction
    jyaacoub committed Jul 26, 2023 · 2cbfbb1
  5. f13b71e
  6. 34155b6
  7. Merge branch 'feature_extraction' of https://github.com/jyaacoub/MutDTA

    … into feature_extraction
    jyaacoub committed Jul 26, 2023 · 519ac8e
  8. f0761b6
  9. 7a1f64b
  10. perf: optimize memory usage during training

    The main thing is to freeze the ESM layers, since the model is too large to train end to end (see the freezing sketch after the commit list).
    jyaacoub committed Jul 26, 2023 · 8038746
  11. chore: update logging

    This is useful since epochs would otherwise be cluttered with transformer tokenizer warning logs (see the logging sketch below the commit list).
    jyaacoub committed Jul 26, 2023 · c34897f
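
For the fix(models) commit above: a minimal sketch of computing per-residue ESM embeddings on the model side. The HuggingFace transformers ESM port and the esm2_t6_8M_UR50D checkpoint are assumptions; that checkpoint matches the 320-d size mentioned in the dataset commit, but the repo may load ESM differently:

```python
import torch
from transformers import AutoTokenizer, EsmModel

# Assumed checkpoint: the smallest ESM-2 model, which emits 320-d
# per-residue embeddings as described in the dataset commit.
tokenizer = AutoTokenizer.from_pretrained("facebook/esm2_t6_8M_UR50D")
esm = EsmModel.from_pretrained("facebook/esm2_t6_8M_UR50D")

def embed_sequences(seqs):
    # Tokenize a batch of protein sequences, padding to the longest.
    toks = tokenizer(seqs, return_tensors="pt", padding=True)
    with torch.no_grad():  # embeddings only; no gradient through ESM
        out = esm(**toks)
    # Shape: (batch, seq_len + special tokens, 320)
    return out.last_hidden_state

emb = embed_sequences(["MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"])
print(emb.shape)  # e.g. torch.Size([1, 35, 320])
```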
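For the fix(train_test) commit: the underlying pitfall is that argparse with `type=bool` calls `bool()` on the raw string, and any non-empty string, including "False", is truthy. A common workaround, sketched here with the commit's `og_model_opt` flag (the exact parser setup is an assumption), is to accept explicit string choices and convert afterwards:

```python
import argparse

parser = argparse.ArgumentParser()
# type=bool would be wrong here: bool("False") == True, since any
# non-empty string is truthy. Accept explicit strings instead.
parser.add_argument("--og_model_opt", choices=["true", "false"],
                    default="false",
                    help="Whether to use the original model options.")

args = parser.parse_args(["--og_model_opt", "false"])
og_model_opt = args.og_model_opt == "true"  # convert to a real bool
print(og_model_opt)  # False
```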
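For the perf commit: freezing in PyTorch means setting `requires_grad = False` on the ESM parameters, so the optimizer allocates no state for them and autograd keeps no gradient buffers, which is where the memory saving comes from. A minimal sketch (the `esm` attribute name is an assumption):

```python
import torch.nn as nn

def freeze_esm(esm_module: nn.Module) -> None:
    """Freeze every ESM parameter so it is excluded from training."""
    for param in esm_module.parameters():
        param.requires_grad = False
    # eval() additionally disables dropout inside the frozen encoder.
    esm_module.eval()

# Hand only the still-trainable parameters to the optimizer, e.g.:
#   optimizer = torch.optim.Adam(
#       (p for p in model.parameters() if p.requires_grad), lr=1e-4)
```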
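For the chore commit: one way to keep epoch output readable is to drop the transformers log level to errors only, which silences the repeated tokenizer warnings. A sketch, assuming the warnings come from the HuggingFace transformers library:

```python
import logging
from transformers import logging as hf_logging

# Silence repeated transformers tokenizer/model warnings that would
# otherwise clutter every training epoch.
hf_logging.set_verbosity_error()

# Keep the project's own log output at INFO level.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)
```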