Alphabet error while annotating phage #331

Open
Fabian-Bastiaanssen opened this issue Feb 15, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@Fabian-Bastiaanssen

  • pharokka version: 1.6.1
  • Python version: Python 3.10.8
  • Operating System: Ubuntu 22.04.2 LTS

Description

I was trying to annotate a circular phage with pharokka. The sequence contains only nucleotides and works fine with other tools, so I can't see any reason why this would throw an error.
sequence_203.txt

What I Did

pharokka.py -f -i all_fastas/fastas/sequence_203.fasta -o all_fastas/sequence_203 -d /data/san/data0/databases/Pharokka/20230827 -t 4 -g prodigal-gv --skip_extra_annotations

2024-02-15 15:57:08.713 | INFO | __main__:main:363 - Running PyHMMER on PHROGs.
Traceback (most recent call last):
  File "/data/san/data1/users/fabian/miniforge/envs/pharokka2/bin/pharokka.py", line 499, in <module>
    main()
  File "/data/san/data1/users/fabian/miniforge/envs/pharokka2/bin/pharokka.py", line 364, in main
    best_results_pyhmmer = run_pyhmmer(
  File "/data/san/data1/users/fabian/miniforge/envs/pharokka2/bin/hmm.py", line 34, in run_pyhmmer
    with pyhmmer.easel.SequenceFile(
  File "pyhmmer/easel.pyx", line 6369, in pyhmmer.easel.SequenceFile.__init__
  File "pyhmmer/easel.pyx", line 6363, in pyhmmer.easel.SequenceFile.__init__
ValueError: Could not determine alphabet of file: 'all_fastas/sequence_203/prodigal-gv_aas_tmp.fasta's

@gbouras13
Owner

Hi @Fabian-Bastiaanssen ,

I can reproduce this error.

The reason is that the first CDS is

NDDDDDNDDDDNDDDDNDDDDNDND*

This sequence is so repetitive and low in diversity that, I am pretty sure, pyhmmer can't infer whether it is protein or nucleotide, and so it crashes.
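
To illustrate why a sequence like this is ambiguous, here is a minimal Python sketch. It is not how pyhmmer (or the underlying Easel library) actually guesses the alphabet; it only checks whether every residue is a legal symbol in each alphabet. The names NUCLEOTIDE_CODES, AMINO_ACID_CODES and possible_alphabets are made up for this example. The point is that N and D are both IUPAC nucleotide ambiguity codes and amino acid codes (Asn and Asp), so a CDS made only of those two letters fits either alphabet.

```python
# Illustrative only: a simple membership check, NOT pyhmmer's real heuristic.
# IUPAC nucleotide codes (with ambiguity codes) and the 20 standard amino acids.
NUCLEOTIDE_CODES = set("ACGTURYSWKMBDHVN")
AMINO_ACID_CODES = set("ACDEFGHIKLMNPQRSTVWY")


def possible_alphabets(seq: str) -> list[str]:
    """Return the alphabets in which every residue of `seq` is a valid symbol."""
    # Drop a trailing stop codon marker such as the '*' Prodigal appends.
    residues = set(seq.upper().rstrip("*"))
    matches = []
    if residues <= NUCLEOTIDE_CODES:
        matches.append("nucleotide")
    if residues <= AMINO_ACID_CODES:
        matches.append("protein")
    return matches


first_cds = "NDDDDDNDDDDNDDDDNDDDDNDND*"
print(possible_alphabets(first_cds))  # ['nucleotide', 'protein']
print(possible_alphabets("MKTLLE"))   # ['protein']
```

A real guesser has to rely on residue frequencies rather than plain membership, because even A, C, G and T are valid amino acid codes. With only two distinct letters there is very little signal to work with.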

While I look into fixing the error, you can try one of two things:

  1. Reorient your circular phage to begin elsewhere, which should hopefully work when you annotate it with pharokka (e.g. with Dnaapler https://github.com/gbouras13/dnaapler - you can do this by passing --dnaapler to pharokka)
  2. Use --mmseqs2_only to disable pyhmmer
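
For option 1, this is roughly what reorienting a circular genome means, as a minimal Python sketch. It is not Dnaapler's implementation: Dnaapler chooses the new start by finding a specific gene, whereas the hypothetical rotate_circular helper below just takes a 0-based start index. The idea is that the same circular molecule is written out starting from a different position, so a different CDS ends up first in the predicted proteins.

```python
def rotate_circular(seq: str, start: int) -> str:
    """Rewrite a circular sequence so that it begins at 0-based index `start`."""
    if not seq:
        return seq
    # Wrap the index so any integer maps onto a position in the sequence.
    start %= len(seq)
    # The prefix before `start` moves to the end; no bases are lost or added.
    return seq[start:] + seq[:start]


genome = "ATGCCGTA"  # toy sequence, not the phage from this issue
print(rotate_circular(genome, 3))  # CCGTAATG
```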

George

@gbouras13 gbouras13 added the bug Something isn't working label Feb 16, 2024