Error executing process > 'NFCORE_RNASPLICE:RNASPLICE:EDGER_DEU:SUBREAD_FEATURECOUNTS (SRX12134688)', Caused by: Process NFCORE_RNASPLICE:RNASPLICE:EDGER_DEU:SUBREAD_FEATURECOUNTS (SRX12134688)
terminated with an error exit status (255)
#151
Comments
Hi @tud03125! I was wondering if you could share where you downloaded the reference genome and annotation from? Direct links to the exact downloads would be super helpful. I suspect this might be tied to this issue. Alternatively, I noticed you’ve set the gtf_group_features parameter to 'gene_biotype'. By default, it's set to gene_id, so I was curious—what's the reason for aggregating by biotype instead? Thanks so much! |
BTW, we are discussing this on Slack and did some troubleshooting, see here. I don't have this problem, so I'm curious where this comes from. |
The "annotation.saf" one? The nf-core/rnasplice pipeline created that. The reference genomes are files I downloaded myself. I didn't use the "--genome" parameter (someone suggested that it's outdated). I'll try pulling out all the references here: fasta, gtf, salmon_index, star_index, and ribo_database_manifest. --> Btw, the Salmon index is now "new_salmon_index." The old "salmon_index" was outdated and not compatible with newer versions. So, it should be "--salmon_index /home/tud03125/new_salmon_index" in the scripts. --> Actually, I'm having trouble attaching the FASTA and GTF files here. The error message is "We don’t support that file type. Try again with GIF, JPEG, JPG, MOV, MP4, PNG, SVG, WEBM, CPUPROFILE, CSV, DMP, DOCX, FODG, FODP, FODS, FODT, GZ, JSON, JSONC, LOG, MD, ODF, ODG, ODP, ODS, ODT, PATCH, PDF, PPTX, TGZ, TXT, XLS, XLSX or ZIP." How do you guys upload them? Otherwise, these are the command lines I used to get the files:
--> Also, "new_salmon_index" and "STAR_index" are very large, and I'm getting this error message: "File size too big: 25 MB are allowed, 156 MB were attempted to upload." So, just in case, I used these scripts to build the indexes:
----> STAR --runThreadN 16
----> salmon index -t /home/tud03125/Homo_sapiens.GRCh38.cdna.all.fa -i /home/tud03125/new_salmon_index
Curious question: can it have something to do with "--contrasts"? The reason for asking is that I'm doing nf-core/rnasplice as a practice/test run on normal human samples. So, there's no "treatment" involved, though the samples come from different labs. I've run into problems where a contrast is required, such as with SUPPA and DESeq2. So, instead of "NORMAL_NORMAL" as the contrast, I put "NORMAL_Total_RNA_seq_NORMAL_polyA_RNA" (with "NORMAL_Total_RNA_seq" as treatment and "NORMAL_polyA_RNA" as control). The new contrast sheet would be "/home/tud03125/pipeline/contrast_NORMAL_2.csv" instead. But, for this "featureCounts" issue, I used the prior contrast: "NORMAL_NORMAL." Can the contrast affect "featureCounts"?
And, about the "gene_biotype," that's a good question, and even the Slack guys asked me that. I honestly don't know. I never set that. It was picked up or set by the nf-core/rnasplice pipeline itself. Anyway, the information about my references is here (one is a download link, and the other is instructions, since I can't upload the files themselves here). There are many human liver samples. I can give a few. Do you need some? BTW, this is "Michael Levin," the same guy in the Slack channel. |
*And, the script for Homo_sapiens.GRCh38.cdna.all.fa transcriptome file:
|
Hi @tud03125 , Could you please share your nextflow.log and nextflow.config files from your latest run? I have a feeling the issue might be related to trying to aggregate by gene_biotype, which doesn't seem quite right in this context. Also, just a side note—this pipeline doesn't have a --ribo_database_manifest parameter, so I'm not sure why that's being used. Thanks so much! |
One other thing: when I bypass the featureCounts error by adding the "-F SAF" option inside the ".../subreads/featurecounts.main.nf" file, this new error pops up:
To bypass this one, I also had to change "bam" -> "bam1" and "cond" -> "cond1" inside the "...modules/local/create_bamlist_single.nf" script. |
@jma1991 I'm attaching nextflow.config here (as a ZIP archive, since GitHub doesn't accept CONFIG files): About "nextflow.log," I attached that in the very first message. You mean this one? And, regarding "--ribo_database_manifest," I was facing some sort of error before starting the nf-core/rnasplice run. Not sure what it was, though (it was a while ago). It was related to the "--multiqc_methods_description" parameter: I got an error when using it, and no issues once I removed it. It might be related to that. The thing is that when running nf-core/rnasplice, I often get these warning messages, including one about "--ribo_database_manifest" if I don't include it:
|
Thanks for your response! Could you please also share your nextflow_schema.json file? It seems like there might be a mix-up between the rnaseq and rnasplice pipelines in your setup. If you take a look just below the nf-core graphic in your reply, you'll notice it mentions the pipeline as nf-core/rnaseq. This mix-up could be the cause of your errors. |
@jma1991 This one? With that, could you confirm whether it is actually a mix-up between the rnasplice and rnaseq pipelines? The thing is that I did use the nf-core/rnaseq pipeline on my Temple HPC before venturing into the nf-core/rnasplice one. If that's the cause, would I have to uninstall both rnaseq and rnasplice, and then reinstall rnasplice in order to run it fresh and without any potential bugs? |
I’m not familiar with your HPC setup, but it seems like that might be part of the issue here. Your nextflow_schema.json file looks good, but some things stand out: your nextflow.log file is showing the rnaseq title, the feature counts tool is defaulting to "gene_biotype" from the rnaseq schema, and it’s warning you about parameters that belong to rnaseq rather than rnasplice. These all suggest there might be something unusual with how the workflows are installed, set up, or executed on your HPC. I’d recommend uninstalling both workflows and then running the rnasplice test suite on your HPC to see if that resolves the issue. You can do this with the following command:
Give this a try, and let me know how it goes! |
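For anyone debugging a similar mix-up: one way to see which parameters a run passes that the pipeline does not define is to compare them against the parameter names in nextflow_schema.json. The Python sketch below assumes the nf-core schema layout (parameter groups under "definitions" or "$defs", each with a "properties" map). The function names, the toy schema, and the toy parameters are made up for illustration and are not part of the pipeline.

```python
def schema_param_names(schema: dict) -> set:
    """Collect parameter names from an nf-core style nextflow_schema.json dict."""
    names = set(schema.get("properties", {}))
    # Assumption: parameter groups live under "definitions" (older schemas)
    # or "$defs" (newer schemas), each with its own "properties" map.
    for key in ("definitions", "$defs"):
        for group in schema.get(key, {}).values():
            names.update(group.get("properties", {}))
    return names


def unknown_params(params: dict, schema: dict) -> list:
    """Return the parameter names that the schema does not define, sorted."""
    known = schema_param_names(schema)
    return sorted(name for name in params if name not in known)


# Toy schema and parameters, made up for illustration only.
toy_schema = {
    "definitions": {
        "input_output_options": {"properties": {"input": {}, "contrasts": {}, "outdir": {}}},
        "reference_options": {"properties": {"fasta": {}, "gtf": {}}},
    }
}
toy_params = {
    "input": "samples.csv",
    "fasta": "genome.fa",
    "ribo_database_manifest": "manifest.txt",
}

print(unknown_params(toy_params, toy_schema))
```

With a real run, the schema would be loaded with `json.load` from the pipeline's nextflow_schema.json, and the parameters would come from the command line or a params file. Any name the check reports is a hint that the parameter belongs to a different workflow.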
@jma1991 Just so I don't mess it up, how do I properly "uninstall both workflows"? And "running the rnasplice test suite on your HPC," is that where your code (below) comes in?
|
I’m not too familiar with running Nextflow in an HPC environment, especially when it comes to your specific setup. I’d recommend reaching out to your IT department for guidance—they’ll have the best insights to help you out! |
@jma1991 Just deleted nf-core/rnasplice, nf-core/rnaseq, nextflow, and everything with it. Started clean! Then, re-installed nextflow and the nf-core/rnasplice within it, and then attempted this test you've suggested:
It's still running (not finished). But from what you can see here so far, what do you think?
|
That looks good! Can’t see any warnings so far 👍 |
@jma1991 The test just got completed:
Now, I'll re-run the PBS with this, and see how it goes. |
@jma1991 Right now, the PBS script is about to finish, but with an error message. So far, the execution_trace text file shows this:
I've had this problem before. After hearing that rnasplice was mixed up with rnaseq, I deleted everything and tried test runs instead. The very first one was from the last message. It used this:
The next test was this one (also successful):
But then, it was this test run that replicates the error:
And, the specific error message is this:
Just in case, here's the .nextflow.log for the test run that reproduces the MISO_SASHIMI error: I've done various test runs, such as removing "--fasta," "--gtf," both indexes, and combinations of those parameters. I can post them here if you're interested. Gene IDs like "ENSG00000004961" or "ENSG00000005302" are present. I'm attaching snapshots of "ENSG00000004961" inside "genes_to_filenames.shelves," inside MISO's index, and inside the work directory. But otherwise, I honestly don't know what's producing the MISO_SASHIMI error. It's very similar to this report: #150. The differences are that 1) I'm using human (Homo sapiens) data and 2) the reference files were downloaded from "ftp://ftp.ensembl.org/pub/release-108/" (the same ones as in the messages above). |
Thank you so much for the detailed response! I noticed in your snapshot document that the gene identifiers include the Ensembl version suffix (e.g., ENSG00000004961.1 instead of just ENSG00000004961). Could you please rerun your last test and provide the results for these specific identifiers: ENSG00000004961.15, ENSG00000005022.6, and ENSG00000005302.19? Thanks again! |
Hi there! I did some testing on my machine, and I believe I’ve found a solution for the issue. If you're working with a gene annotation that includes a gene version field, like this:
MISO will append the gene version to the end of the gene identifier when creating the MISO index files. So, you'll need to include the version identifier in your nextflow.config file, like this:
To test this, I downloaded the GRCh38 reference genome and its annotation from the Ensembl FTP server. I then ran the workflow using these for the fasta and gtf parameters, and it worked as expected! |
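To illustrate the point about versioned identifiers, here is a minimal Python sketch that reads the attribute column of a GTF line and builds the "gene_id.gene_version" form described above. It only mimics the naming as described in this thread and is not MISO's own code. The function name is hypothetical, and the coordinates and source field in the example line are placeholders. Only the gene ID and version come from this issue.

```python
import re

# Matches GTF attributes of the form: key "value"
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def versioned_gene_id(gtf_line: str) -> str:
    """Return gene_id with gene_version appended, if the GTF line has one."""
    attribute_column = gtf_line.rstrip("\n").split("\t")[8]
    attrs = dict(ATTR_RE.findall(attribute_column))
    gene_id = attrs["gene_id"]
    version = attrs.get("gene_version")
    return f"{gene_id}.{version}" if version else gene_id


# Placeholder coordinates and source; gene ID and version are from this thread.
example = (
    "X\tensembl_havana\tgene\t100\t200\t.\t+\t.\t"
    'gene_id "ENSG00000004961"; gene_version "15";'
)

print(versioned_gene_id(example))  # ENSG00000004961.15
```

An annotation without a gene_version attribute yields the plain gene_id, which would explain why the unversioned identifiers work with some GTF files and not others.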
@jma1991 Can you tell me what's the right way of modifying the nextflow.config file? Reason for asking is that I was told that I shouldn't actually modify the config file itself, but by using "-c." So if I'm to use a different nextflow.config file, use "-c nextflow_humanliver.config" or something like that? And, regarding your other question: "Could you please rerun your last test and provide the results for these specific identifiers: ENSG00000004961.15, ENSG00000005022.6, and ENSG00000005302.19?," like adding "--miso_genes ENSG00000004961.15, ENSG00000005022.6, and ENSG00000005302.19" parameter into my script? |
You have a lot of flexibility when it comes to configuring the pipeline! For more detailed guidance, check out this link. Personally, I find it easiest to download the workflow repository, open the nextflow.config file in a text editor, make my changes, and run it from there. But if you only need to tweak a parameter or two, you can simply pass them directly on the command line (e.g., --fasta genome.fa). |
@jma1991 I see. I also have this idea: so, I ran this code:
Based on that, I can just add this one parameter into my script:
Right? |
The
|
Very valuable lessons I'm learning here! Also, before your recent message, I made a test run (though in a shell script):
And, as it turns out, the run was successful. I'll put the nextflow.log here. But also, I'll try that run with my PBS script.
|
@jma1991 One curious question: when running nf-core/rnaseq, this ENSG00000004961 and ENSG00000004961.15 issue wasn't a thing then. Never an error about gene versions. But, why in nf-core/rnasplice? |
I think it’s because, as far as I remember, the rnaseq workflow doesn’t require users to provide a list of genes for downstream analysis. If it did, it would need to ensure that the user enters the correct identifiers. In the rnasplice workflow, the user’s GTF file must be converted to GFF3 for MISO indexing. During this conversion from GTF to GFF3, the gene_id and gene_version values are combined. If you're satisfied with the solution, would you like me to go ahead and close the issue? |
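Since the gene list is supplied by the user, a quick pre-flight check can catch unversioned identifiers before a long run. The Python sketch below maps each requested ID to the versioned ID present in an index and reports the ones it cannot resolve. The function name is hypothetical and the check is not something the pipeline itself performs. The three versioned IDs are the ones mentioned in this thread, and "ENSG00000000001" is a made-up ID used only to show the failure case.

```python
def resolve_gene_ids(requested, indexed):
    """Map requested gene IDs to the (possibly versioned) IDs in the index.

    Returns (resolved, missing). An unversioned ID is resolved only when
    exactly one indexed ID shares its base identifier.
    """
    by_base = {}
    for full_id in indexed:
        by_base.setdefault(full_id.split(".")[0], []).append(full_id)

    resolved, missing = [], []
    for gene in requested:
        if gene in indexed:
            resolved.append(gene)  # already matches the index exactly
        elif len(by_base.get(gene, [])) == 1:
            resolved.append(by_base[gene][0])  # add the missing version suffix
        else:
            missing.append(gene)  # absent from the index, or ambiguous
    return resolved, missing


indexed = {"ENSG00000004961.15", "ENSG00000005022.6", "ENSG00000005302.19"}
requested = ["ENSG00000004961", "ENSG00000005302.19", "ENSG00000000001"]

resolved, missing = resolve_gene_ids(requested, indexed)
print(resolved)  # ['ENSG00000004961.15', 'ENSG00000005302.19']
print(missing)   # ['ENSG00000000001']
```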
Let's wait until 2-3 PM EST. Right now, the PBS pipeline is still running. But so far, things are looking good, especially for "MISO_SASHIMI" (last parts of the trace text file below):
I want to check all the way to the end, particularly with seeing the results of "NFCORE_RNASPLICE:RNASPLICE:DEXSEQ_DEU:DEXSEQ_EXON." That one was once labeled "FAILED" when I was already having issues with "MISO_SASHIMI." But, after that last change and then with that last test run, that issue never came up. So, I want to check if it's the same for this real, PBS run, if that's alright with you. |
Sorry. It's still running. I'm not seeing any issues so far. It's just that the pipeline hasn't completed yet, successfully or unsuccessfully. |
@jma1991 OK. My PBS just finished running. But, I'm getting this error message instead:
It's odd, since the test run was fine. Why does this happen, then? Just in case, here's the .nextflow.log: |
The relevant part of the error is
Each process in Nextflow can be assigned resources (threads, memory, and time) dynamically. Your process hit its dynamically allocated time limit, so Nextflow terminated it with an error. To fix this, you can either:
Give one of those solutions a try and let me know how you get on. |
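As a rough illustration of what "dynamically allocated" means here, the Python sketch below models one common pattern: the requested time grows with each retry attempt and is capped at a configured maximum. This is an assumption about how such configs typically scale resources, not a copy of this pipeline's settings. The function name and all of the numbers are hypothetical.

```python
def requested_hours(base_hours: float, attempt: int, max_hours: float) -> float:
    """Time requested for a given attempt: base * attempt, capped at max."""
    return min(base_hours * attempt, max_hours)


# Hypothetical numbers: an 8 h base request with a 20 h ceiling.
for attempt in (1, 2, 3):
    print(attempt, requested_hours(8, attempt, 20))
```

Under this model, a job that needs more than the cap never gets enough time through retries alone, so either the base request for that process or the cap itself has to be raised.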
@jma1991 Got it. I'll work on it. One question: this is just my test run, not my real experiment. Just in case, here are my 1) input CSV sheet, 2) contrast CSV sheet, and 3) a Word document with details about the samples: |
@jma1991 After following your suggestion to extend the time limit for the DEXSEQ_EXON process, the run was successful!
I'll now re-run it on my real experiments. But otherwise, I think this is good to close this issue. Would that be alright? But also, two other questions:
|
Hi @tud03125 I'm glad to hear your issue has been resolved! I'll go ahead and close it. As for your questions:
|
Description of the bug
I'm facing this error. Can anybody help:
Command used and terminal output
Relevant files
.nextflow.log.zip
System information
Nextflow version 24.04.04
HPC at Temple University, College of Science and Technology department
Either a PBS job or a plain Bash shell script
Singularity
Temple HPC's Linux. I connect using Ubuntu (on Windows Subsystem for Linux)
Using nf-core/rnasplice version 1.0.4.