I am trying to analyze BCR sequencing data by processing it with IMGT/HighV-QUEST and then using immunarch for further analysis. I should note that the BCR data come from rainbow trout (Oncorhynchus mykiss) and that I am using the BCR pipeline, so I am not sure whether this is causing any of the issues I am encountering. This is my first time reporting a bug, so I apologize if there is too much or too little information, or if I missed anything! Overall, the package is working beautifully, except for a couple of issues:
First, when using repLoad on the .airr files that were exported from IMGT, I receive a number of errors, copied below. The airr file is directly downloaded from IMGT, with no modifications:
Processing "" ...
-- [1/1] Parsing "/Users/benjamingarcia/Documents/imgt_airr/import/PBS_24.tsv" -- airr
[!] Removed 10553 clonotypes with no nucleotide and amino acid CDR3 sequence.
[!] Warning: found NAs in clonal counts. Setting them to 1's.
== Step 2/3: checking metadata files and merging files... ==
Processing "" ...
-- Metadata file not found; creating a dummy metadata...
The organization of the O. mykiss VDJ loci is different to that of mammals, so it is certainly possible that this is causing some of the issues with the "not logical" values being reported.
The next issue I have encountered arises with repGermline. The code and outcome are listed below:
repGermline(l12_all$data, .species = 'OncorhynchusMykiss')
Error in map2():
ℹ In index: 1.
ℹ With name: PBS_24.
Caused by error in merge_reference_sequences():
! After merging with reference, the data from sample PBS_24 is empty.
There were no valid alleles in the data!
Run rlang::last_trace() to see where the error occurred.
Warning messages:
1: In validate_mandatory_columns(., sample_name) :
437 rows from 5796 in sample PBS_24 were dropped because of missing values in mandatory columns FR1.nt, CDR1.nt, FR2.nt, CDR2.nt, FR3.nt, CDR3.nt, FR4.nt!
2: In merge_reference_sequences(., reference, "V", species, sample_name) :
Genes or alleles Oncmyk_Ar IGHV1D-1201 F, Oncmyk_Ar IGHV11-28-301 F, Oncmyk_Ar IGHV6-3101 ORF, Oncmyk_Ar IGHV8-39-201 F, Oncmyk_Sw IGHV1-1801 F, Oncmyk_Sw IGHV11-2501 F, Oncmyk_Ar IGHV10D-702 F, Oncmyk_Sw IGHV1-4201 F, Oncmyk_Ar IGHV1-2102 F, Oncmyk_Ar IGHV1-39-501 F, Oncmyk_Ar IGHV1-1301 F, Oncmyk_Sw IGHV6D-7601 F, Oncmyk_Ar IGHV8-4601 F, Oncmyk_Ar IGHV3-2002 F, Oncmyk_Sw IGHV1D-7301 F, Oncmyk_Sw IGHV12D-5601 F, Oncmyk_Sw IGHV3D-3001 ORF, Oncmyk_Ar IGHV1D-14-301 F, Oncmyk_Ar IGHV7D-17-102 F, Oncmyk_Ar IGHV1-47-401 F, Oncmyk_Ar IGHV16-3701 F, Oncmyk_Ar IGHV9-1502 F, Oncmyk_Ar IGHV9D-202 F, Oncmyk_Sw IGHV15-4801 P, Oncmyk_Sw IGHV1-2101 P, Oncmyk_Ar IGHV6-402 F, Oncmyk_Sw IGHV16-1401 ORF, Oncmyk_Sw IGHV1-201 F, Oncmyk IGHV8-502 F, Oncmyk_Sw IGHV6D-4001 F, Oncmyk_Ar IGHV8-1902 ORF, Oncmyk_Ar IGHV2-801 F, Oncmyk_Ar IGHV2-2802 F, Oncmyk_Ar IGHV11-47-501 ORF, Oncmyk_Sw IGHV9-2301 F, Oncmyk_Sw IGHV9-1501 F, Oncmyk_Ar IGHV12D-7102 F, Oncmyk_Ar IGHV16-39-30 [... truncated]
I have tried going through and modifying the names to remove the "Oncmyk_Sw " and "Oncmyk_Sw" from the names, as well as everything except for the core portion of the locus name, to no avail.
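For reference, a renaming step of the kind described above can be sketched in base R as follows. The helper name `strip_imgt_call` is hypothetical, and the sketch assumes the calls look like the ones in the warning (a leading `Oncmyk`, `Oncmyk_Sw` or `Oncmyk_Ar` token and a trailing `F`, `ORF` or `P` label). Whether the stripped names then match the names in the reference that repGermline uses is not confirmed.

```r
# Hypothetical helper: reduce an IMGT-style V call to its core gene/allele name.
# Assumes the layout "<species/strain prefix> <name> <functionality>".
strip_imgt_call <- function(calls) {
  calls <- trimws(calls)
  # Drop a leading species/strain token such as "Oncmyk", "Oncmyk_Sw", "Oncmyk_Ar".
  calls <- sub("^Oncmyk(_[A-Za-z]+)?\\s+", "", calls, perl = TRUE)
  # Drop a trailing functionality label (F, ORF or P).
  calls <- sub("\\s+(F|ORF|P)$", "", calls, perl = TRUE)
  calls
}

# Example with names copied from the warning message.
strip_imgt_call(c("Oncmyk_Sw IGHV1-1801 F",
                  "Oncmyk_Ar IGHV6-3101 ORF",
                  "Oncmyk IGHV8-502 F"))
```

Comparing the unique stripped names against the allele names in the reference would show directly whether any of them overlap.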
Thank you for any help you are able to offer, and please let me know if there is any further information/data that you need to help diagnose this!
Ben
I have imported both formats, including the IMGT-exported AIRR formatted data, and it seems like that may have solved some of the issues. However, I am still running into one main error when loading the data:
[!] Warning: found NAs in clonal counts. Setting them to 1's.
I think the error may initially have been caused by my use of pRESTO as the pre-IMGT cleaning method, which collapses duplicate reads and writes the counts into the sequence name. I have removed the counts from the names and put them into the "consensus_count" column of the AIRR-formatted file. Is there another column where I should put these counts so that they are loaded into the "clones" entry of the immunarch object?
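One thing that may be worth trying: the AIRR Rearrangement schema defines a `duplicate_count` field alongside `consensus_count`. The sketch below copies the counts across, on the unverified assumption that immunarch's AIRR parser fills the clone counts from `duplicate_count`. The helper name `add_duplicate_count` and the file paths are hypothetical. All columns are read as text so that nothing else in the file is reinterpreted.

```r
# Hypothetical helper: add a duplicate_count column to an AIRR TSV,
# filled from consensus_count wherever duplicate_count is absent or empty.
# Assumption: the loader reads clone counts from duplicate_count.
add_duplicate_count <- function(in_path, out_path) {
  airr <- read.delim(in_path, check.names = FALSE, quote = "",
                     colClasses = "character", na.strings = "")
  stopifnot("consensus_count" %in% names(airr))
  if (!"duplicate_count" %in% names(airr)) {
    airr$duplicate_count <- NA_character_
  }
  # Only fill rows that have no duplicate_count yet.
  missing <- is.na(airr$duplicate_count)
  airr$duplicate_count[missing] <- airr$consensus_count[missing]
  # Empty fields are written back as empty strings.
  write.table(airr, out_path, sep = "\t", quote = FALSE,
              row.names = FALSE, na = "")
  invisible(airr)
}
```

After writing the new file, reloading it with repLoad and checking whether the "found NAs in clonal counts" warning is still printed would show whether the assumption holds.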