Error in SUPPA: Clustergroups are assigned incorrectly #131

Closed
spraeger opened this issue Apr 19, 2024 · 4 comments
@spraeger

Description of the bug

Hi,
For one of my contrasts (pdx_41-control_don), the groups parameter for CLUSTEREVENTS_IOI (suppa_clusterevents.nf) is assigned incorrectly, which triggers this error: "ERROR:lib.cluster_tools:Invalid index. Index 6 is smaller than the number of columns in the file (7)." For all other contrasts, the clustergroup assignment works fine.

The related pdx_41-control_don_transcript_diffsplice.psivec file contains seven columns, so the correct grouping would be --groups 1-4,5-7:

transcript_pdx_41_1     transcript_pdx_41_2     transcript_pdx_41_3     transcript_pdx_41_4     transcript_control_don_1        transcript_control_don_2        transcript_control_don_3
ENSG00000290825.1;ENST00000456328.2     1.0     1.0     1.0     1.0     nan     nan     nan
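
For reference, the expected grouping can be read off the psivec header. The following is a minimal Python sketch, not the pipeline's own logic: the function name groups_from_header is hypothetical, and it assumes that each column name ends in "_<replicate number>" and that the columns of one condition sit next to each other.

```python
from itertools import groupby


def groups_from_header(header_line):
    """Build a 1-based range string (e.g. '1-4,5-7') from a psivec header.

    Assumption: every column name ends in '_<replicate number>' and the
    columns belonging to one condition are adjacent in the header.
    """
    columns = header_line.split()
    # Drop the trailing replicate number: transcript_pdx_41_1 -> transcript_pdx_41
    conditions = [name.rsplit("_", 1)[0] for name in columns]
    ranges = []
    start = 1
    for _, run in groupby(conditions):
        size = len(list(run))
        ranges.append(f"{start}-{start + size - 1}")
        start += size
    return ",".join(ranges)


header = "\t".join([
    "transcript_pdx_41_1", "transcript_pdx_41_2",
    "transcript_pdx_41_3", "transcript_pdx_41_4",
    "transcript_control_don_1", "transcript_control_don_2",
    "transcript_control_don_3",
])
print(groups_from_header(header))  # 1-4,5-7
```

For the header shown above, this yields 1-4,5-7, which matches the grouping expected for this contrast.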

It seems that the derivation of the clustergroups for this contrast was never started. The related work directory 0d/4f97644516640eda4e35d88e4dab59 is empty.

(base) -bash-4.2$ grep pdx_41-control_don .nextflow.log | grep CLUSTERGROUPS
~> TaskHandler[jobId: null; id: 164; name: NFCORE_RNASPLICE:RNASPLICE:SUPPA_SALMON:CLUSTERGROUPS_IOI (pdx_41-control_don); status: NEW; exit: -; error: -; workDir: XXX/rnasplice_pdx/work/0d/4f97644516640eda4e35d88e4dab59 started: -; exited: -; ]

However, CLUSTEREVENTS_IOI is executed with --groups 1-3,4-6 which causes the error to occur. Could you please help me to understand why and at which point the assignment --groups 1-3,4-6 is made, as CLUSTERGROUPS does not seem to run? Thank you in advance!
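
A mismatch of this kind can be checked before the clustering step runs. The sketch below is only an illustration and does not reproduce SUPPA's internal validation: the helper name groups_cover_columns is hypothetical, and it assumes the --groups value is a comma-separated list of 1-based, inclusive ranges.

```python
def groups_cover_columns(groups, n_columns):
    """Return True if the ranges in `groups` cover columns 1..n_columns exactly.

    Assumption: `groups` looks like '1-3,4-6', with 1-based inclusive ranges.
    """
    covered = []
    for part in groups.split(","):
        first, last = (int(value) for value in part.split("-"))
        covered.extend(range(first, last + 1))
    return covered == list(range(1, n_columns + 1))


# The psivec file in this report has seven sample columns.
print(groups_cover_columns("1-3,4-6", 7))  # False: column 7 is not covered
print(groups_cover_columns("1-4,5-7", 7))  # True
```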

Command used and terminal output

Command used:
nextflow run \
$(RNASPLICE_DIR) \
--input config/samplesheet_pdx_group.csv \
--contrasts config/contrastsheet_pdx_group.csv \
--outdir workspace/rnasplice_pdx_group_results \
-c config/XXX.config \
--fasta $(GENOMEDIR)/GRCh38.primary_assembly.genome.fa \
--gtf $(GENOMEDIR)/gencode.v43.annotation.gtf \
--star_index $(GENOMEDIR)/genome/index/star \
--salmon_index $(GENOMEDIR)/genome/index/salmon \
--gencode \
--save_reference \
--save_unaligned \
--min_samps_gene_expr 0 \
--min_samps_feature_expr 0 \
--min_samps_feature_prop 0 \
--min_feature_expr 0 \
--min_feature_prop 0 \
--min_gene_expr 0 \
--miso_genes "ENSG00000211899.10, ENSG00000171862.14, ENSG00000004961.15, ENSG00000005302.19, ENSG00000147403.18"


Output:
-[nf-core/rnasplice] Pipeline completed with errors-

ERROR ~ Error executing process > 'NFCORE_RNASPLICE:RNASPLICE:SUPPA_SALMON:CLUSTEREVENTS_IOI (pdx_41-control_don)'

Caused by:
  Process `NFCORE_RNASPLICE:RNASPLICE:SUPPA_SALMON:CLUSTEREVENTS_IOI (pdx_41-control_don)` terminated with an erro$
 exit status (1)

Command executed:

  suppa.py \
      clusterEvents \
      --dpsi pdx_41-control_don_transcript_diffsplice.dpsi \
      --psivec pdx_41-control_don_transcript_diffsplice.psivec \
      --dpsi-threshold 0.05 \
      --eps 0.05 \
      --metric euclidean \
      --min-pts 20 \
      --groups 1-3,4-6 \
      --clustering DBSCAN \
       -o pdx_41-control_don_transcript_cluster
  
  cat <<-END_VERSIONS > versions.yml
  "NFCORE_RNASPLICE:RNASPLICE:SUPPA_SALMON:CLUSTEREVENTS_IOI":
      suppa: $(python -c "import pkg_resources; print(pkg_resources.get_distribution('suppa').version)")
  END_VERSIONS

Command exit status:
  1

Command output:
  (empty)

Command error:
  INFO:    Environment variable SINGULARITYENV_TMPDIR is set, but APPTAINERENV_TMPDIR is preferred
  INFO:    Environment variable SINGULARITYENV_NXF_TASK_WORKDIR is set, but APPTAINERENV_NXF_TASK_WORKDIR is preferred
  INFO:    Environment variable SINGULARITYENV_NXF_DEBUG is set, but APPTAINERENV_NXF_DEBUG is preferred
  WARNING: Skipping mount /var/apptainer/mnt/session/etc/resolv.conf [files]: /etc/resolv.conf doesn't exist in container
  ERROR:lib.cluster_tools:Invalid index. Index 6 is smaller than the number of columns in the file (7).

Work dir:
 XXX/rnasplice_pdx/work/df/888c5d2dcf001bf157d5ddbaffafbb

Relevant files

contrastsheet_pdx_group.csv
samplesheet_pdx_group.csv

System information

CentOS Linux release 7.9.2009 (Core), LSF Cluster
Nextflow version 23.10.1
$(RNASPLICE_DIR) in the pipeline call refers to a fork of nf-core/rnasplice v1.0.2 that increases alignment resources (https://github.com/dkoppstein/rnasplice/tree/increase_sam)

spraeger added the bug label on Apr 19, 2024
@jma1991
Collaborator

jma1991 commented Apr 20, 2024

Hey @spraeger

Thanks for reporting your issue. The error arises because the channel which contains the groups parameter is out of sync with the channels which feed the dpsi and psivec parameters. This occurs because Nextflow processes are not guaranteed to return results in the order they arrive from the input channel. This is easily overlooked, and I can only apologise. I've prototyped a solution and will try and get it posted tomorrow as a hot fix for you to test with your data.

James
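
The out-of-sync behaviour described in this comment can be illustrated with a small Python analogy. This is not the pipeline's Nextflow code and not the actual fix: the second contrast name (other-contrast), its file name, and the variable names are hypothetical. It only shows why pairing two streams by position fails when they arrive in different orders, while pairing by a shared key (the contrast name) does not. Nextflow has key-based operators such as join for this purpose; whether the fix uses one is not stated in this thread.

```python
# Items as (contrast, value) pairs; the two "channels" arrive in different orders.
psivec_channel = [
    ("pdx_41-control_don", "pdx_41-control_don_transcript_diffsplice.psivec"),
    ("other-contrast", "other-contrast_transcript_diffsplice.psivec"),  # hypothetical
]
groups_channel = [
    ("other-contrast", "1-3,4-6"),  # hypothetical contrast, finished first
    ("pdx_41-control_don", "1-4,5-7"),
]

# Positional pairing: each file silently gets whichever groups value arrived
# at the same position, so pdx_41-control_don receives 1-3,4-6.
by_position = [
    (contrast, psivec, groups)
    for (contrast, psivec), (_, groups) in zip(psivec_channel, groups_channel)
]

# Key-based pairing: look up the groups value by contrast name instead.
groups_by_contrast = dict(groups_channel)
by_key = [
    (contrast, psivec, groups_by_contrast[contrast])
    for contrast, psivec in psivec_channel
]

print(by_position[0][2])  # 1-3,4-6 (wrong for this contrast)
print(by_key[0][2])       # 1-4,5-7 (matches the seven-column psivec file)
```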

@jma1991
Collaborator

jma1991 commented Apr 21, 2024

Hello @spraeger,

I've submitted a pull request with the proposed fix. Could you please test it and let me know if it resolves the issue for you? I'll need to hold off on merging and releasing it until it undergoes a second code review. Thanks!

@spraeger
Author

Hi @jma1991,

Thanks a lot for the clarification and the prompt action! I have tested your patch and it resolves the issue for my data.

@jma1991
Collaborator

jma1991 commented May 9, 2024

Hey @spraeger

We have just released 1.0.4, which has this issue fixed. Thanks for bringing it to our attention.

jma1991 closed this as completed on May 9, 2024