Cannot run drug-target interaction example. #120

Open
tirohia opened this issue Jan 12, 2024 · 0 comments
tirohia commented Jan 12, 2024

I have the first example in the example usage section working. The second example fails with an AssertionError: fairseq is unable to infer the task type while constructing the TransformerLanguageModelPrompt.

I have downloaded and have accessible the pre-trained model BioGPT-RE-DTI.

The directory "data/KD-DTI/relis-bin" wasn't present. I've found a reference in issue #87 that suggests that this needs to be created by running the preprocess.sh script in examples/RE-DTI.

I've done that, which created the KD-DTI/relis directory. Running this code in a file called biogptTest.py:

import torch
from src.transformer_lm_prompt import TransformerLanguageModelPrompt
m = TransformerLanguageModelPrompt.from_pretrained(
    "checkpoints/RE-DTI-BioGPT",   # directory holding the pre-trained model
    "checkpoint_avg.pt",           # checkpoint file inside that directory
    "data/KD-DTI/relis-bin",       # data directory expected by the task
    tokenizer='moses',
    bpe='fastbpe',
    bpe_codes="data/bpecodes",
    max_len_b=1024,
    beam=1)

now generates this Assertion error:

$ python biogptTest.py 
2024-01-12 13:40:17 | INFO | fairseq.file_utils | loading archive file /data/projects/biogpt/data/checkpoints/RE-DTI-BioGPT
2024-01-12 13:40:17 | INFO | fairseq.file_utils | loading archive file /data/projects/biogpt/data/KD-DTI/relis-bin
Traceback (most recent call last):
  File "/data/projects/biogpt/src/biogptTest.py", line 3, in <module>
    m = TransformerLanguageModelPrompt.from_pretrained(
  File "/data/projects/classifiers/bin/envs/biogpt/lib/python3.10/site-packages/fairseq/models/fairseq_model.py", line 267, in from_pretrained
    x = hub_utils.from_pretrained(
  File "/data/projects/classifiers/bin/envs/biogpt/lib/python3.10/site-packages/fairseq/hub_utils.py", line 73, in from_pretrained
    models, args, task = checkpoint_utils.load_model_ensemble_and_task(
  File "/data/projects/classifiers/bin/envs/biogpt/lib/python3.10/site-packages/fairseq/checkpoint_utils.py", line 432, in load_model_ensemble_and_task
    task = tasks.setup_task(cfg.task)
  File "/data/projects/classifiers/bin/envs/biogpt/lib/python3.10/site-packages/fairseq/tasks/__init__.py", line 43, in setup_task
    task is not None
AssertionError: Could not infer task type from {'_name': 'language_modeling_prompt', 'data': '/data/projects/biogpt/data/KD-DTI/relis-bin', 'sample_break_mode': 'none', 'tokens_per_sample': 1024, 'output_dictionary_size': -1, 'self_target': False, 'future_target': False, 'past_target': False, 'add_bos_token': False, 'max_target_positions': 1024, 'shorten_method': 'none', 'shorten_data_split_list': '', 'pad_to_fixed_length': False, 'pad_to_fixed_bsz': False, 'seed': 1, 'batch_size': None, 'batch_size_valid': None, 'dataset_impl': None, 'data_buffer_size': 10, 'tpu': False, 'use_plasma_view': False, 'plasma_path': '/tmp/plasma', 'source_lang': None, 'target_lang': None, 'max_source_positions': 640, 'manual_prompt': None, 'learned_prompt': 9, 'learned_prompt_pattern': 'learned', 'prefix': False, 'sep_token': '<seqsep>'}. Available argparse tasks: dict_keys(['multilingual_language_modeling', 'language_modeling', 'masked_lm', 'speech_to_text', 'translation', 'simul_speech_to_text', 'simul_text_to_text', 'cross_lingual_lm', 'audio_pretraining', 'audio_finetuning', 'denoising', 'legacy_masked_lm', 'text_to_speech', 'frm_text_to_speech', 'translation_from_pretrained_xlm', 'speech_unit_modeling', 'translation_lev', 'multilingual_masked_lm', 'sentence_prediction', 'sentence_prediction_adapters', 'translation_multi_simple_epoch', 'sentence_ranking', 'translation_from_pretrained_bart', 'hubert_pretraining', 'online_backtranslation', 'multilingual_denoising', 'multilingual_translation', 'semisupervised_translation', 'speech_to_speech', 'dummy_lm', 'dummy_masked_lm', 'dummy_mt']). Available hydra tasks: dict_keys(['multilingual_language_modeling', 'language_modeling', 'masked_lm', 'translation', 'simul_text_to_text', 'audio_pretraining', 'audio_finetuning', 'translation_from_pretrained_xlm', 'speech_unit_modeling', 'translation_lev', 'sentence_prediction', 'sentence_prediction_adapters', 'hubert_pretraining', 'dummy_lm', 'dummy_masked_lm'])

The error suggests that a task type should be defined here, but I can't find any reference to that in the GitHub pages or issues, and none of the available tasks listed in the AssertionError looks particularly relevant to identifying drug-target interactions.
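To make the mismatch in the message concrete, here is a minimal sketch (plain Python, not fairseq code). It only checks whether the task name from the `_name` field appears among the task names the error lists. The list is an abbreviated copy from the error above, and the helper `is_registered` is a hypothetical name used for illustration. If the name is absent, one possible reading is that the module defining that task was never imported and so never registered, but that is an assumption on my part.

```python
# Task name fairseq tried to set up, copied from the '_name' field in the error.
requested_task = "language_modeling_prompt"

# Abbreviated subset of the "Available argparse tasks" listed in the error.
available_tasks = {
    "multilingual_language_modeling",
    "language_modeling",
    "masked_lm",
    "translation",
    "sentence_prediction",
    "dummy_lm",
}


def is_registered(name, registry):
    """Hypothetical helper: True if `name` is one of the listed task names."""
    return name in registry


# The requested task is not among the listed names; a built-in one is.
print(is_registered(requested_task, available_tasks))
print(is_registered("language_modeling", available_tasks))
```

The same check against the full lists in the error gives the same answer for `language_modeling_prompt`: it is in neither the argparse tasks nor the hydra tasks.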

The only other thing I can think of: is there something else that needs to be done in the RE-DTI directory? Do we have to run the training and/or the validation scripts?

Thanks
Ben.
