Cannot run drug-target interaction example. 

I have the first example in the example usage section working. The code from the second example raises an assertion error: fairseq is unable to infer the task type while constructing the TransformerLanguageModelPrompt.

I have downloaded the pre-trained model BioGPT-RE-DTI and it is accessible.

The directory "data/KD-DTI/relis-bin" wasn't present. I found a reference in [issue #87](https://github.com/microsoft/BioGPT/issues/87) suggesting that it needs to be created by running the preprocess.sh script in examples/RE-DTI.

I've done that, which created the KD-DTI/relis directory. Running this code in a file called biogptTest.py:
```
import torch
from src.transformer_lm_prompt import TransformerLanguageModelPrompt
m = TransformerLanguageModelPrompt.from_pretrained(
        "checkpoints/RE-DTI-BioGPT", 
        "checkpoint_avg.pt", 
        "data/KD-DTI/relis-bin",
        tokenizer='moses', 
        bpe='fastbpe', 
        bpe_codes="data/bpecodes",
        max_len_b=1024,
        beam=1)
```
now generates this AssertionError:
```
$ python biogptTest.py 
2024-01-12 13:40:17 | INFO | fairseq.file_utils | loading archive file /data/projects/biogpt/data/checkpoints/RE-DTI-BioGPT
2024-01-12 13:40:17 | INFO | fairseq.file_utils | loading archive file /data/projects/biogpt/data/KD-DTI/relis-bin
Traceback (most recent call last):
  File "/data/projects/biogpt/src/biogptTest.py", line 3, in <module>
    m = TransformerLanguageModelPrompt.from_pretrained(
  File "/data/projects/classifiers/bin/envs/biogpt/lib/python3.10/site-packages/fairseq/models/fairseq_model.py", line 267, in from_pretrained
    x = hub_utils.from_pretrained(
  File "/data/projects/classifiers/bin/envs/biogpt/lib/python3.10/site-packages/fairseq/hub_utils.py", line 73, in from_pretrained
    models, args, task = checkpoint_utils.load_model_ensemble_and_task(
  File "/data/projects/classifiers/bin/envs/biogpt/lib/python3.10/site-packages/fairseq/checkpoint_utils.py", line 432, in load_model_ensemble_and_task
    task = tasks.setup_task(cfg.task)
  File "/data/projects/classifiers/bin/envs/biogpt/lib/python3.10/site-packages/fairseq/tasks/__init__.py", line 43, in setup_task
    task is not None
AssertionError: Could not infer task type from {'_name': 'language_modeling_prompt', 'data': '/data/projects/biogpt/data/KD-DTI/relis-bin', 'sample_break_mode': 'none', 'tokens_per_sample': 1024, 'output_dictionary_size': -1, 'self_target': False, 'future_target': False, 'past_target': False, 'add_bos_token': False, 'max_target_positions': 1024, 'shorten_method': 'none', 'shorten_data_split_list': '', 'pad_to_fixed_length': False, 'pad_to_fixed_bsz': False, 'seed': 1, 'batch_size': None, 'batch_size_valid': None, 'dataset_impl': None, 'data_buffer_size': 10, 'tpu': False, 'use_plasma_view': False, 'plasma_path': '/tmp/plasma', 'source_lang': None, 'target_lang': None, 'max_source_positions': 640, 'manual_prompt': None, 'learned_prompt': 9, 'learned_prompt_pattern': 'learned', 'prefix': False, 'sep_token': '<seqsep>'}. Available argparse tasks: dict_keys(['multilingual_language_modeling', 'language_modeling', 'masked_lm', 'speech_to_text', 'translation', 'simul_speech_to_text', 'simul_text_to_text', 'cross_lingual_lm', 'audio_pretraining', 'audio_finetuning', 'denoising', 'legacy_masked_lm', 'text_to_speech', 'frm_text_to_speech', 'translation_from_pretrained_xlm', 'speech_unit_modeling', 'translation_lev', 'multilingual_masked_lm', 'sentence_prediction', 'sentence_prediction_adapters', 'translation_multi_simple_epoch', 'sentence_ranking', 'translation_from_pretrained_bart', 'hubert_pretraining', 'online_backtranslation', 'multilingual_denoising', 'multilingual_translation', 'semisupervised_translation', 'speech_to_speech', 'dummy_lm', 'dummy_masked_lm', 'dummy_mt']). Available hydra tasks: dict_keys(['multilingual_language_modeling', 'language_modeling', 'masked_lm', 'translation', 'simul_text_to_text', 'audio_pretraining', 'audio_finetuning', 'translation_from_pretrained_xlm', 'speech_unit_modeling', 'translation_lev', 'sentence_prediction', 'sentence_prediction_adapters', 'hubert_pretraining', 'dummy_lm', 'dummy_masked_lm'])
```
The error suggests that a task type should be defined here, but I can't see any reference to that in the GitHub pages or issues, and none of the available tasks listed in the AssertionError seems particularly relevant to identifying drug-target interactions.
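
For context, the task name in the error, `language_modeling_prompt`, does not appear in either list of available tasks. The snippet below is a toy sketch of a decorator-based task registry, showing how an assertion of this kind can arise when the code that registers a task has not run before the lookup. Every name in it (`TASK_REGISTRY`, `register_task`, `setup_task`, `define_prompt_task`, the task classes) is a simplified stand-in written for illustration; it is not fairseq's implementation, and it does not confirm that this is what is happening here.

```python
# Toy stand-in for a decorator-based task registry; NOT fairseq's actual code.
TASK_REGISTRY = {}


def register_task(name):
    """Return a decorator that records a task class under `name`."""
    def decorator(cls):
        TASK_REGISTRY[name] = cls
        return cls
    return decorator


def setup_task(cfg):
    """Look up the task named in cfg['_name']; fail if it was never registered."""
    task_cls = TASK_REGISTRY.get(cfg["_name"])
    assert task_cls is not None, (
        f"Could not infer task type from {cfg}. "
        f"Available tasks: {sorted(TASK_REGISTRY)}"
    )
    return task_cls(cfg)


@register_task("language_modeling")
class LanguageModelingTask:
    def __init__(self, cfg):
        self.cfg = cfg


def define_prompt_task():
    """Hypothetical stand-in for importing the module that defines the custom task."""
    @register_task("language_modeling_prompt")
    class LanguageModelingPromptTask(LanguageModelingTask):
        pass
    return LanguageModelingPromptTask


cfg = {"_name": "language_modeling_prompt"}

# Lookup before the registering code has run: the assertion fires.
try:
    setup_task(cfg)
    failed_before_registration = False
except AssertionError:
    failed_before_registration = True

# Once the registering code has run, the same lookup succeeds.
define_prompt_task()
task = setup_task(cfg)
print(failed_before_registration, type(task).__name__)
```

In this sketch the lookup only succeeds once the registering code has executed, so the order in which modules are imported matters.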

The only other thing I can think of: is there something else that needs to be done in the RE-DTI directory? Do we have to run the training and/or validation scripts?
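
A quick way to rule out missing files before rerunning anything is to check the paths passed to `from_pretrained`. This is a minimal sketch that assumes the paths are resolved relative to the current working directory and that the checkpoint file sits inside the checkpoint directory; `missing_paths` is a hypothetical helper, not part of BioGPT or fairseq.

```python
from pathlib import Path


def missing_paths(paths):
    """Return the subset of `paths` that does not exist on disk."""
    return [p for p in paths if not Path(p).exists()]


# Paths copied from the from_pretrained call above (assumed relative to the cwd).
expected = [
    "checkpoints/RE-DTI-BioGPT/checkpoint_avg.pt",
    "data/KD-DTI/relis-bin",
    "data/bpecodes",
]

for p in missing_paths(expected):
    print(f"missing: {p}")
```

If nothing is printed, all three paths exist and the problem lies elsewhere.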

Thanks
Ben. 



Cannot run drug-target interaction example. #120
