iguide_evaluation failed #90

Open
cedarice opened this issue Dec 19, 2024 · 7 comments

@cedarice

cedarice commented Dec 19, 2024

Whether I use the BLAT or the BWA aligner, the following error is reported during the iguide_evaluation step, when the evaluate_incorp_data.R script runs.

Error in `$<-.data.frame`(`*tmp*`, annotation, value = " ") : 
  replacement has 1 row, data has 0
Calls: $<- -> $<-.data.frame
@cnobles
Owner

cnobles commented Dec 19, 2024

Hi @cedarice,

Could you provide more context around the error? Which dataset are you running (simulation or your own)? Does this happen for a single sample or for each sample? You can use the "--keep-going" option when you run iguide (snakemake) to keep executing other jobs even if one errors out. It will continue to process until dependencies are not met.

Also, have you updated to the recent release (I just put it out, v1.1.1)? I haven't run into this issue in the CI tests or when running it locally.

Happy holidays!

@cedarice
Author

cedarice commented Dec 20, 2024

Thanks for your kind response. The dataset I'm using consists of single-sample data that belongs to me. The received data has already been demultiplexed, with each sample including R1, R2, I1, and I2 fastq files. Below is my config.yml file.

# Run configuration
Run_Name : "B2M-39"
Sample_Info : "sampleInfo/B2M-39.sampleInfo.csv"
Supplemental_Info : "."
Ref_Genome : "hg38"
Aligner : "bwa"
UMItags : FALSE
Abundance_Method : "Fragment"
# Sequence files
Seq_Path : "fastq_files"
R1: "B2M-39.R1.fastq.gz"
R2: "B2M-39.R2.fastq.gz"
I1: "B2M-39.I1.fastq.gz"
I2: "B2M-39.I2.fastq.gz"
# Sequence information
R1_Leading_Trim : "."
R1_Overreading_Trim : "."
R2_Leading_Trim : "."
R2_Leading_Trim_ODN : "."
R2_Overreading_Trim : "."
# Demultiplexing parameters
skipDemultiplexing : TRUE
# Report
suppFile : FALSE

@cnobles
Owner

cnobles commented Dec 20, 2024

Would you mind providing some more information:

  • How many reads are associated with the one sample?
  • Are there different barcodes in your sampleInfo file for the same sample or is it just a single barcode?
  • Do the read names have the standard Illumina format or have they been modified?
  • Have you modified the config file (processing parameters) at all from the simulation.config.yml?
  • Which version of the software are you running (check with iguide version subcommand)?

@cedarice
Author

cedarice commented Dec 20, 2024

Would you mind providing some more information:

  • How many reads are associated with the one sample?

1309186 reads

  • Are there different barcodes in your sampleInfo file for the same sample or is it just a single barcode?

Just a single barcode. Below is the sampleInfo.csv:

sampleName,barcode1,barcode2,gRNA
B2M-39,GCTTGTCA,GTATGTTC,B2M
  • Do the read names have the standard Illumina format or have they been modified?

Standard Illumina format.

  • Have you modified the config file (processing parameters) at all from the simulation.config.yml?

The config.yml above shows the fields I modified from simulation.config.yml. In addition, I modified the Target_Sequences and On_Target_Sites fields.

  • Which version of the software are you running (check with iguide version subcommand)?
iguide -v
iguide v1.1.1+4c8a25a

@cedarice
Author

Here are the last few lines of the eval.log file.

Number of alignments: 180731

Table of uniquely aligned template counts:
 Type           Counts
 PileUp          622  
 Paired         1843  
 Target_Matched  318  
 Combined       2373  

Total number of alignments: 180731

On / Off target alignment counts:

Off-target 
      2373 
Error in `$<-.data.frame`(`*tmp*`, annotation, value = " ") : 
  replacement has 1 row, data has 0
Calls: $<- -> $<-.data.frame

@cnobles
Owner

cnobles commented Dec 20, 2024

I'm not completely sure, but I suspect the error is caused by a mismatched join in the data analysis of the evaluation. This issue is independent of the aligner and is likely just a naming issue. By convention (if you want to check the docs, please look at the user guide sections on sample info and the three "s"s), a dash after the name indicates a replicate sample. In your case, the software is reading the specimen as "B2M" and the replicate as "39". When it does its pooling, it drops the replicate info and looks for specimen "B2M".

Alternatively, it could be looking for the supporting info file ("supp_info"), and the absence of one could lead to a mismatch in an upstream join. Would you mind trying either to add a supp_info file (remember that with the way you've named it, "B2M" would be the specimen) or to see whether changing the name resolves the issue (change "B2M-39" to something like "B2M039")?
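
To make the suspected naming mismatch concrete, here is a minimal Python sketch. It is not iGUIDE's code: the helper `split_sample_name`, the choice to split at the last dash, and the `rows_by_name` table are all assumptions made for illustration. Only the sampleInfo contents come from this thread. It shows how a lookup keyed by the full sample name can come back empty once the replicate suffix has been dropped.

```python
import csv
import io


def split_sample_name(sample_name):
    """Split 'specimen-replicate' at the last dash (assumed convention)."""
    specimen, dash, replicate = sample_name.rpartition("-")
    if not dash:
        # No dash: treat the whole name as the specimen, with no replicate.
        return sample_name, None
    return specimen, replicate


# sampleInfo contents as posted earlier in this thread.
sample_info = io.StringIO(
    "sampleName,barcode1,barcode2,gRNA\n"
    "B2M-39,GCTTGTCA,GTATGTTC,B2M\n"
)

# Pooling step (hypothetical): keep only the specimen part of each name.
specimens = sorted(
    {split_sample_name(row["sampleName"])[0] for row in csv.DictReader(sample_info)}
)

# Hypothetical downstream table that is still keyed by the full sample name.
rows_by_name = [{"name": "B2M-39", "alignments": 2373}]

# Filtering by specimen finds nothing, because "B2M-39" != "B2M".
matched = [row for row in rows_by_name if row["name"] in specimens]

print(specimens)
print(matched)
```

If something like this is happening, the empty result would be the zero-row table that the later column assignment fails on. A name without a dash, such as "B2M039", is kept whole by this helper, so the lookup key and the table key would agree.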

Feel free to also send these files to me at [email protected], and I can help with review.

@cedarice
Author

Thank you very much for your reply. Unfortunately, I still encountered the same error after following your suggestions: changing B2M-39 to B2M39 and providing the supp_info file. I will send the sequencing files to you shortly. Could you please run a test for me? Thank you!
