Pharokka stuck at running mmseqs search #325

Open
quocviet0908 opened this issue Jan 19, 2024 · 7 comments

@quocviet0908

quocviet0908 commented Jan 19, 2024

  • pharokka version: 1.6.0
  • Python version: 3.10.8
  • Operating System: Ubuntu 20.04

Hello,

Pharokka is very promising for annotating viral genomes. However, when I try to run it on my data, the program always gets stuck at this step:

"2024-01-19 15:15:21.280 | INFO | external_tools:run:50 - Started running mmseqs search -e 1E-05 /home/quocviet/Work/Extra/pharokka_v1.4.0_databases/phrogs_profile_db pharokka_20240119/target_dir/target_seqs pharokka_20240119/mmseqs/results_mmseqs pharokka_20240119/tmp_dir/ -s 8.5 --threads 1 ..."

The installation appeared to succeed without any obvious errors (I installed pharokka via mamba).

The command I used:
time pharokka.py -i mydata.fasta -o pharokka_20240119 -d /home/quocviet/Work/Extra/pharokka_v1.4.0_databases -f

Could you please help me with this? Thank you.

@gbouras13
Owner

Hi @quocviet0908 ,

What version of mmseqs2 is installed? v13.45111? If not, that would be the cause of this issue. Please install it with:

mamba install mmseqs2==13.45111

If yes, then maybe you should give pharokka more threads (not 1) - e.g. -t 16 or -t 8.

If you are annotating one or only a few phages, try --fast as well.

George
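
A minimal sketch of the version check suggested above, for anyone who wants to script it. It assumes the `mmseqs` binary is on `PATH` and that `mmseqs version` prints a string beginning with the release number; the helper names `installed_mmseqs_version` and `version_matches` are hypothetical and not part of pharokka.

```python
import shutil
import subprocess

# Version named in this thread as the one pharokka expects.
REQUIRED_MMSEQS = "13.45111"


def installed_mmseqs_version(binary="mmseqs"):
    """Return the string printed by `<binary> version`, or None if it is not on PATH."""
    path = shutil.which(binary)
    if path is None:
        return None
    result = subprocess.run(
        [path, "version"], capture_output=True, text=True, check=False
    )
    return result.stdout.strip() or None


def version_matches(reported, required=REQUIRED_MMSEQS):
    """True when the reported version string starts with the required version."""
    return reported is not None and reported.strip().startswith(required)


if __name__ == "__main__":
    found = installed_mmseqs_version()
    if version_matches(found):
        print(f"mmseqs2 {found} matches the required version")
    else:
        print(f"mmseqs2 reports {found!r}; expected {REQUIRED_MMSEQS}")
```

If the versions differ, the `mamba install mmseqs2==13.45111` command above is the fix proposed in this thread.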

@quocviet0908
Author

Hi George.

Thank you for your quick response. The version of mmseqs2 is v13.45111.

I tried the --fast option to bypass MMseqs2 and the program ran smoothly, but with that option I will not get CARD or VFDB annotations.

I think the problem is somehow related to mmseqs2, but I'm not sure.

@gbouras13
Owner

Maybe upload the log file and I will try and see what the issue is.

George

@quocviet0908
Author

Hi George,

Please see my attachment. This is the log from when the command was stuck at that step.
logs.zip

Thank you very much.

@iaindhay

iaindhay commented Feb 6, 2024

I'm seeing the same issue (I think): pharokka exits right after calling mmseqs.

2024-02-07 11:36:24.996 | INFO     | external_tools:run:50 - Started running mmseqs search -e 1E-05 /db/pharokka/phrogs_profile_db pharokka_terL/target_dir/target_seqs pharokka_terL/mmseqs/results_mmseqs pharokka_terL/tmp_dir/ -s 8.5 --threads 24 ...
2024-02-07 11:36:25.041 | ERROR    | external_tools:run_tool:94 - Error calling mmseqs search -e 1E-05 /db/pharokka/phrogs_profile_db pharokka_terL/target_dir/target_seqs pharokka_terL/mmseqs/results_mmseqs pharokka_terL/tmp_dir/ -s 8.5 --threads 24 (return code 1)

pharokka v1.6.1; up-to-date database; mmseqs2=13.45111
Runs fine with --fast flag

Edit: from mmseqs_search_XXXX.err:

Could not create symlink of pharokka_terL/tmp_dir//6144855635082578743!
Command line: mmseqs search -e 1E-05 /db/pharokka/phrogs_profile_db pharokka_terL/target_dir/target_seqs pharokka_terL/mmseqs/results_mmseqs pharokka_terL/tmp_dir/ -s 8.5 --threads 24

Thanks

@gbouras13
Owner

Hi @iaindhay ,

Interesting. This looks like it could be a permissions issue on the system you are running on, or a lack of disk space (judging from the MMseqs2 issues, e.g. soedinglab/MMseqs2#171).

George
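
A rough way to test both possibilities before rerunning: the sketch below tries to create a symlink inside a given directory and reports the free space there. It only uses the Python standard library; the function names `can_create_symlink` and `free_gib` are hypothetical, and the directory to check (for example the pharokka output or tmp directory) is an assumption you would adjust to your own run.

```python
import os
import shutil
import tempfile


def can_create_symlink(directory):
    """Try to create a symlink inside `directory`; return True only if it works."""
    try:
        # The scratch directory and everything in it is removed on exit.
        with tempfile.TemporaryDirectory(dir=directory) as scratch:
            target = os.path.join(scratch, "target")
            link = os.path.join(scratch, "link")
            open(target, "w").close()
            os.symlink(target, link)
            return os.path.islink(link)
    except OSError:
        # Covers missing directories, permission errors, and filesystems
        # that do not support symbolic links.
        return False


def free_gib(directory):
    """Free space on the filesystem holding `directory`, in GiB."""
    return shutil.disk_usage(directory).free / 2**30


if __name__ == "__main__":
    where = "."  # assumption: run from the directory that will hold the output
    print(f"symlinks supported: {can_create_symlink(where)}")
    print(f"free space: {free_gib(where):.1f} GiB")
```

If the symlink check returns False, pointing the output directory at a drive that supports symbolic links is one option to try.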

@iaindhay

iaindhay commented Feb 7, 2024

Thanks George. Yes, I just realized it is an issue with creating symbolic links on that drive. The issue is on my end.

I also have one error like #300, where the annotation stops during the post-processing steps with "ValueError: Columns must be same length as key". Strangely, this happens with only one of my genomes. It also appears to occur during the PHROGs post-processing, not the VFDB step as in #300.

2024-02-07 12:20:17.410 | INFO     | external_tools:run:52 - Done running mmseqs search --min-seq-id 0.8 -c 0.4 /db/pharokka/vfdb /1TB/phage/ar1/VFDB_target_dir/target_seqs /1TB/phage/ar1/VFDB/results_mmseqs /1TB/phage/ar1/VFDB_dir/ -s 8.5 --threads 24
2024-02-07 12:20:17.411 | INFO     | external_tools:run:50 - Started running mmseqs createtsv /db/pharokka/vfdb /1TB/phage/ar1/VFDB_target_dir/target_seqs /1TB/phage/ar1/VFDB/results_mmseqs /1TB/phage/ar1/vfdb_results.tsv --full-header --threads 24 ...
2024-02-07 12:20:17.440 | INFO     | external_tools:run:52 - Done running mmseqs createtsv /db/pharokka/vfdb /1TB/phage/ar1/VFDB_target_dir/target_seqs /1TB/phage/ar1/VFDB/results_mmseqs /1TB/phage/ar1/vfdb_results.tsv --full-header --threads 24
2024-02-07 12:20:17.440 | INFO     | __main__:main:363 - Running PyHMMER on PHROGs.
2024-02-07 12:20:25.627 | INFO     | __main__:main:379 - Post Processing Output.
2024-02-07 12:20:25.649 | INFO     | post_processing:create_mmseqs_tophits:2104 - Processing MMseqs2 outputs.
2024-02-07 12:20:25.650 | INFO     | post_processing:create_mmseqs_tophits:2105 - Processing PHROGs output.
Traceback (most recent call last):
  File "/miniconda3/envs/pharokka/bin/pharokka.py", line 499, in <module>
    main()
  File "/miniconda3/envs/pharokka/bin/pharokka.py", line 418, in main
    pharok.process_results()
  File "/miniconda3/envs/pharokka/bin/post_processing.py", line 242, in process_results
    merged_df[["mmseqs_phrog", "mmseqs_top_hit"]] = merged_df[
  File "/miniconda3/envs/pharokka/lib/python3.10/site-packages/pandas/core/frame.py", line 4287, in __setitem__
    self._setitem_array(key, value)
  File "/miniconda3/envs/pharokka/lib/python3.10/site-packages/pandas/core/frame.py", line 4329, in _setitem_array
    check_key_length(self.columns, key, value)
  File "/miniconda3/envs/pharokka/lib/python3.10/site-packages/pandas/core/indexers/utils.py", line 390, in check_key_length
    raise ValueError("Columns must be same length as key")
ValueError: Columns must be same length as key
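
For context, pandas raises this error when a DataFrame with one number of columns is assigned to a different number of column names. The traceback shows a two-column assignment, but the right-hand side is cut off, so the sketch below is only an illustration under an assumption: that the value comes from a string split that does not always yield two parts. The data, the `;` separator, and the helper names are hypothetical and are not pharokka's code.

```python
import pandas as pd


def reproduce_error():
    """Assign a one-column split result to two column names; return the error text."""
    # Hypothetical data: no ';' separator is present, so the split yields one column.
    df = pd.DataFrame({"hit": ["phrog_1", "phrog_2"]})
    try:
        df[["left", "right"]] = df["hit"].str.split(";", expand=True)
    except ValueError as exc:
        return str(exc)
    return None


def split_into_two(df, column, sep=";"):
    """Split `column` at most once and always return exactly two columns."""
    parts = df[column].str.split(sep, n=1, expand=True)
    # reindex adds any missing column as NaN, so the shape is always (rows, 2).
    parts = parts.reindex(columns=[0, 1])
    parts.columns = ["left", "right"]
    return parts


if __name__ == "__main__":
    print(reproduce_error())
    print(split_into_two(pd.DataFrame({"hit": ["a;b", "c"]}), "hit"))
```

Whether this is what happens for the one failing genome would need to be confirmed against the actual PHROGs MMseqs2 output for that run.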
