Miso error #134

Closed
BulutHamali opened this issue May 7, 2024 · 10 comments
Labels
bug Something isn't working

Comments

@BulutHamali

Description of the bug

I keep getting the error below. It has already been mentioned in other issues, but I was not able to fix it. Any help would be appreciated. Thanks.

The exit status of the task that caused the workflow execution to fail was: 1

Error executing process > 'NFCORE_RNASPLICE:RNASPLICE:VISUALISE_MISO:MISO_SASHIMI (2)'

Caused by:
Process NFCORE_RNASPLICE:RNASPLICE:VISUALISE_MISO:MISO_SASHIMI (2) terminated with an error exit status (1)

Command executed:

sashimi_plot --plot-event ENSG00000005302.19 index miso_settings.txt --output-dir sashimi

cat <<-END_VERSIONS > versions.yml
"NFCORE_RNASPLICE:RNASPLICE:VISUALISE_MISO:MISO_SASHIMI":
python: $(python --version | sed "s/Python //g")
misopy: $(python -c "import pkg_resources; print(pkg_resources.get_distribution('misopy').version)")
END_VERSIONS

Command exit status:
1

Command output:
(empty)

Command error:
INFO: Environment variable SINGULARITYENV_NXF_TASK_WORKDIR is set, but APPTAINERENV_NXF_TASK_WORKDIR is preferred
INFO: Environment variable SINGULARITYENV_NXF_DEBUG is set, but APPTAINERENV_NXF_DEBUG is preferred
/usr/local/lib/python2.7/site-packages/matplotlib/cbook/deprecation.py:107: MatplotlibDeprecationWarning: The mpl_toolkits.axes_grid module was deprecated in version 2.1. Use mpl_toolkits.axes_grid1 and mpl_toolkits.axisartist provies the same functionality instead.
warnings.warn(message, mplDeprecation, stacklevel=1)
Traceback (most recent call last):
File "/usr/local/bin/sashimi_plot", line 11, in <module>
sys.exit(main())
File "/usr/local/lib/python2.7/site-packages/misopy/sashimi_plot/sashimi_plot.py", line 276, in main
plot_label=plot_label)
File "/usr/local/lib/python2.7/site-packages/misopy/sashimi_plot/sashimi_plot.py", line 142, in plot_event
%(event_name, pickle_dir)
Exception: Event ENSG00000005302.19 not found in pickled directory index. Are you sure this is the right directory for the event?

Work dir:
/home/hamalibt/Splicesome_Project/RNA_SEQ_ARGLU1KO_VS_MCF7WT/work/6a/df958b424331c70ec5aaa5eeb10caf

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named .command.sh

Command used and terminal output

nextflow run nf-core/rnasplice \
    -profile singularity \
    --input /path/to/samplesheet.csv \
    --contrasts /path/to/contrastsheet.csv \
    --outdir /path/to/output_directory \
    --genome GRCh38 \
    --aligner star_salmon \
    --save_reference \
    --dexseq_dtu \
    --min_samps_gene_expr 6 \
    --min_gene_expr 10 \
    --min_samps_feature_expr 3 \
    --min_feature_expr 10 \
    --min_samps_feature_prop 3 \
    --min_feature_prop 0.1

Relevant files

nextflow.log

System information

N E X T F L O W ~ version 23.10.1, local, Apptainer

@BulutHamali added the bug label May 7, 2024
@jma1991
Collaborator

jma1991 commented May 7, 2024

The error you're encountering likely stems from using a gene identifier that isn't included in your annotation. You've chosen GRCh38 as your genome parameter, which directs the workflow to pull the necessary annotation files from the iGenomes repository. You can verify this by downloading the GTF file linked in the configuration and checking for your gene identifier—you'll find it's missing. In fact, the GRCh38 annotation from iGenomes doesn't include Ensembl identifiers, a known issue across all nf-core workflows that has been discussed before. If you need to use the GRCh38 genome, I recommend providing your own FASTA and GTF files for the annotation.
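
One way to see which identifier style an annotation uses, before launching the workflow, is to print a few of its `gene_id` values. The sketch below is an illustration only: `gtf_gene_ids` is a hypothetical helper name, not part of the pipeline, and it assumes a standard GTF whose attribute column carries a `gene_id "..."` entry. It reads plain or gzip-compressed files.

```shell
# Print the first N distinct gene_id values found in a GTF file.
# gtf_gene_ids is a hypothetical helper, not an nf-core command.
# Assumes each feature line has a gene_id "..." attribute.
gtf_gene_ids() {
    local gtf="$1" n="${2:-5}"
    local reader="cat"
    # Decompress on the fly when the file name ends in .gz
    case "$gtf" in *.gz) reader="gzip -dc" ;; esac
    $reader "$gtf" \
        | sed -n 's/^[^#].*gene_id "\([^"]*\)".*/\1/p' \
        | awk '!seen[$0]++' \
        | head -n "$n"
}
```

For example, `gtf_gene_ids genes.gtf 5` prints up to five distinct identifiers. If they look like gene symbols rather than `ENSG...` accessions, an Ensembl identifier passed for plotting will not be found in that annotation.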

@BulutHamali
Author

Thank you very much. I later used this command:

nextflow run nf-core/rnasplice \
    -profile singularity \
    --input /path/to/samplesheet.csv \
    --contrasts /path/to/contrastsheet.csv \
    --outdir /path/to/output_directory \
    --genome GRCh38 \
    --aligner star_salmon \
    --save_reference \
    --dexseq_dtu \
    --min_samps_gene_expr 6 \
    --min_gene_expr 10 \
    --min_samps_feature_expr 3 \
    --min_feature_expr 10 \
    --min_samps_feature_prop 3 \
    --min_feature_prop 0.1 \
    --fasta /path/to/reference_genome.fa \
    --gtf /path/to/annotation_file.gtf.gz \
    -resume

and then I got this error:

The exit status of the task that caused the workflow execution to fail was: 105

Error executing process > 'NFCORE_RNASPLICE:RNASPLICE:ALIGN_STAR:STAR_ALIGN (LC6_S6)'

Caused by:
Process NFCORE_RNASPLICE:RNASPLICE:ALIGN_STAR:STAR_ALIGN (LC6_S6) terminated with an error exit status (105)

Command executed:

STAR
--genomeDir STARIndex
--readFilesIn input1/LC6_S6_trimmed.fq.gz
--runThreadN 12
--outFileNamePrefix LC6_S6.

--sjdbGTFfile Homo_sapiens.GRCh38.104.gtf
--outSAMattrRGline 'ID:LC6_S6' 'SM:LC6_S6'
--quantMode TranscriptomeSAM --twopassMode Basic --outSAMtype BAM Unsorted --readFilesCommand gunzip -c --runRNGseed 0 --outFilterMultimapNmax 20 --alignSJDBoverhangMin 1 --outSAMattributes NH HI AS NM MD --quantTranscriptomeBan Singleend

if [ -f LC6_S6.Unmapped.out.mate1 ]; then
mv LC6_S6.Unmapped.out.mate1 LC6_S6.unmapped_1.fastq
gzip LC6_S6.unmapped_1.fastq
fi
if [ -f LC6_S6.Unmapped.out.mate2 ]; then
mv LC6_S6.Unmapped.out.mate2 LC6_S6.unmapped_2.fastq
gzip LC6_S6.unmapped_2.fastq
fi

cat <<-END_VERSIONS > versions.yml
"NFCORE_RNASPLICE:RNASPLICE:ALIGN_STAR:STAR_ALIGN":
star: $(STAR --version | sed -e "s/STAR_//g")
samtools: $(echo $(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*$//')
gawk: $(echo $(gawk --version 2>&1) | sed 's/^.*GNU Awk //; s/, .*$//')
END_VERSIONS

Command exit status:
105

Command output:
STAR --genomeDir STARIndex --readFilesIn input1/LC6_S6_trimmed.fq.gz --runThreadN 12 --outFileNamePrefix LC6_S6. --sjdbGTFfile Homo_sapiens.GRCh38.104.gtf --outSAMattrRGline ID:LC6_S6 SM:LC6_S6 --quantMode TranscriptomeSAM --twopassMode Basic --outSAMtype BAM Unsorted --readFilesCommand gunzip -c --runRNGseed 0 --outFilterMultimapNmax 20 --alignSJDBoverhangMin 1 --outSAMattributes NH HI AS NM MD --quantTranscriptomeBan Singleend
STAR version: 2.7.9a compiled: 2021-05-04T09:43:56-0400 vega:/home/dobin/data/STAR/STARcode/STAR.master/source
May 07 16:08:24 ..... started STAR run
May 07 16:08:24 ..... loading genome

Command error:
INFO: Environment variable SINGULARITYENV_NXF_TASK_WORKDIR is set, but APPTAINERENV_NXF_TASK_WORKDIR is preferred
INFO: Environment variable SINGULARITYENV_NXF_DEBUG is set, but APPTAINERENV_NXF_DEBUG is preferred
STAR --genomeDir STARIndex --readFilesIn input1/LC6_S6_trimmed.fq.gz --runThreadN 12 --outFileNamePrefix LC6_S6. --sjdbGTFfile Homo_sapiens.GRCh38.104.gtf --outSAMattrRGline ID:LC6_S6 SM:LC6_S6 --quantMode TranscriptomeSAM --twopassMode Basic --outSAMtype BAM Unsorted --readFilesCommand gunzip -c --runRNGseed 0 --outFilterMultimapNmax 20 --alignSJDBoverhangMin 1 --outSAMattributes NH HI AS NM MD --quantTranscriptomeBan Singleend
STAR version: 2.7.9a compiled: 2021-05-04T09:43:56-0400 vega:/home/dobin/data/STAR/STARcode/STAR.master/source
May 07 16:08:24 ..... started STAR run
May 07 16:08:24 ..... loading genome

EXITING because of FATAL ERROR: Genome version: 20201 is INCOMPATIBLE with running STAR version: 2.7.9a
SOLUTION: please re-generate genome from scratch with running version of STAR, or with version: 2.7.4a

May 07 16:08:24 ...... FATAL ERROR, exiting

Work dir:
/home/hamalibt/Splicesome_Project/RNA_SEQ_ARGLU1KO_VS_MCF7WT/work/2e/8d479b2a30ee2730d2b11db1c60837

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named .command.sh

@jma1991
Collaborator

jma1991 commented May 7, 2024

This is another widespread issue. The index stored on iGenomes is incompatible with the version of the STAR aligner used in the nf-core modules. See the somewhat short discussion here. The solution, as mentioned earlier, is to provide your own FASTA and GTF files so that the workflow generates a compatible index.
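
To check which version a given STAR index was built with before running the aligner, the version string can be read from the index directory. This is a minimal sketch under an assumption: it expects the index to contain a `genomeParameters.txt` file with a `versionGenome` line, as written by STAR when it generates a genome. `star_index_version` is a hypothetical helper name.

```shell
# Print the genome version recorded in a STAR index directory.
# star_index_version is a hypothetical helper, not a STAR command.
# Assumes <index_dir>/genomeParameters.txt has a "versionGenome" line.
star_index_version() {
    local index_dir="$1"
    awk '$1 == "versionGenome" { print $2 }' "$index_dir/genomeParameters.txt"
}
```

Running `star_index_version STARIndex` on the index from the log above would be expected to print the value STAR complained about (20201). When that value is not one the running STAR accepts, the index has to be regenerated, as the error message itself says.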

@BulutHamali
Author

BulutHamali commented May 8, 2024

So I downloaded the human genome from a custom repository (https://github.com/ewels/AWS-iGenomes) and used the following Nextflow command with nf-core's rnasplice pipeline:

nextflow run nf-core/rnasplice \
    -profile singularity \
    --input <input_path>/samplesheet.csv \
    --contrasts <contrasts_path>/contrastsheet.csv \
    --outdir <output_path>/Master_Directory/MCF7/05062024 \
    --genome GRCh37 \
    --aligner star_salmon \
    --save_reference \
    --dexseq_dtu \
    --min_samps_gene_expr 6 \
    --min_gene_expr 10 \
    --min_samps_feature_expr 3 \
    --min_feature_expr 10 \
    --min_samps_feature_prop 3 \
    --min_feature_prop 0.1 \
    --fasta <fasta_path>/genome.fa \
    --gtf <gtf_path>/genes.gtf

I even configured nextflow.config for MISO gene extensions, but encountered this error:

Workflow execution completed unsuccessfully!
The exit status of the task that caused the workflow execution to fail was: 1.

The full error message was:

Error executing process > 'NFCORE_RNASPLICE:RNASPLICE:VISUALISE_MISO:MISO_SASHIMI (2)'

Caused by:
Process NFCORE_RNASPLICE:RNASPLICE:VISUALISE_MISO:MISO_SASHIMI (2) terminated with an error exit status (1)

Command executed:

sashimi_plot --plot-event ENSG00000005302.19 index miso_settings.txt --output-dir sashimi

cat <<-END_VERSIONS > versions.yml
"NFCORE_RNASPLICE:RNASPLICE:VISUALISE_MISO:MISO_SASHIMI":
python: $(python --version | sed "s/Python //g")
misopy: $(python -c "import pkg_resources; print(pkg_resources.get_distribution('misopy').version)")
END_VERSIONS

Command exit status:
1

Command output:
(empty)

Command error:
INFO: Environment variable SINGULARITYENV_NXF_TASK_WORKDIR is set, but APPTAINERENV_NXF_TASK_WORKDIR is preferred
INFO: Environment variable SINGULARITYENV_NXF_DEBUG is set, but APPTAINERENV_NXF_DEBUG is preferred
/usr/local/lib/python2.7/site-packages/matplotlib/cbook/deprecation.py:107: MatplotlibDeprecationWarning: The mpl_toolkits.axes_grid module was deprecated in version 2.1. Use mpl_toolkits.axes_grid1 and mpl_toolkits.axisartist provies the same functionality instead.
warnings.warn(message, mplDeprecation, stacklevel=1)
Traceback (most recent call last):
File "/usr/local/bin/sashimi_plot", line 11, in <module>
sys.exit(main())
File "/usr/local/lib/python2.7/site-packages/misopy/sashimi_plot/sashimi_plot.py", line 276, in main
plot_label=plot_label)
File "/usr/local/lib/python2.7/site-packages/misopy/sashimi_plot/sashimi_plot.py", line 142, in plot_event
%(event_name, pickle_dir)
Exception: Event ENSG00000005302.19 not found in pickled directory index. Are you sure this is the right directory for the event?

Work dir:
/home/hamalibt/my_refs/work/90/89430a1a23400ef603a22d59c6a7b6

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named .command.sh

@jma1991
Collaborator

jma1991 commented May 8, 2024

Can you confirm the gene identifier you’re using is present in the annotation file? Just the output of grep would be sufficient.

@BulutHamali
Author

I've tried various commands like 'grep', 'grep -m 1', 'grep -F', 'LC_ALL=C grep', './search', and 'awk' to search for 'ENSG00000005302.19' in 'genes.gtf', but after about half an hour, there's still no output. Is this normal?

@jma1991
Collaborator

jma1991 commented May 9, 2024

Please try the following command:

grep "ENSG00000005302" genes.gtf

If there is no output, it means your gene identifier is not in the annotation file.
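
Note that the failing event in the log is `ENSG00000005302.19`, while the command above searches for the identifier without its version suffix. Some annotations store the version inside `gene_id` and others keep it in a separate attribute, so it can help to test both forms. The following is a sketch only; `gtf_has_gene` is a hypothetical helper name, and it assumes an uncompressed GTF in which identifiers appear in double quotes.

```shell
# Report whether a gene identifier occurs in a GTF, trying the ID as given
# and then without its ".N" version suffix.
# gtf_has_gene is a hypothetical helper, not part of the pipeline.
gtf_has_gene() {
    local gene="$1" gtf="$2"
    local base="${gene%%.*}"   # strip everything from the first dot
    # -F: fixed string (the dot is literal); -q: stop at the first match
    if grep -q -F "\"$gene\"" "$gtf"; then
        echo "found: $gene"
    elif grep -q -F "\"$base\"" "$gtf"; then
        echo "found without version suffix: $base"
    else
        echo "not found: $gene"
        return 1
    fi
}
```

For example, `gtf_has_gene ENSG00000005302.19 genes.gtf` reports whether the versioned or the unversioned form is the one present in the file.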

@BulutHamali
Author

It worked out afterwards. It seems that the issue with the 'grep' command was likely related to an HPC problem. Thank you.

@torres-HI

torres-HI commented Aug 22, 2024

Hi,

I have the same error with MISO plotting.

I read that the issue is the notation of the reference genome.
My question is: what would be the best reference genome for working with mouse?
I used Ensembl 38 because it supposedly has all the annotated transcripts, unlike UCSC, which is better for epigenomic work (ATAC-seq, ChIP-seq, etc.) but does not have the transcript annotation.

I need to match the splicing and RNA expression data with ATAC-seq.

I'm open to suggestions, thank you.

@torres-HI

torres-HI commented Aug 22, 2024

Hi again, where can I find the list of genes with alternative splicing, to add to the process?
Thank you.
