Pathovar level #5
Hi, The lowest taxonomic level the pipeline currently recognizes is the subspecies level. To define the input sequences and synonyms manually, you could use the offline option: add the selected input assemblies to the /primerdesign/species_name/genomic_fna directory and pass the synonyms to the --exception option. However, this will still search for specific primers at the species level, because the specificity check uses the species name to differentiate between target and off-target sequences. For the issues in detail:
If the database and the assembly names follow the same pattern as Pseudomonas syringae above, I could probably implement this quickly, because it is the same pattern as for the subspecies level. Alternatively, manually downloading the sequences, building a custom database, and adding the pathovar (pv.) keyword could be a solution.
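The manual staging step mentioned in this comment could be sketched as below. This is only an illustration, not part of the pipeline: the function name `stage_assemblies` and the example file names are hypothetical, and the only detail taken from the thread is the `/primerdesign/species_name/genomic_fna` directory layout.

```python
import shutil
from pathlib import Path


def stage_assemblies(fasta_files, target_dir):
    """Copy hand-picked assembly FASTA files into target_dir.

    Hypothetical helper: target_dir is assumed to be the
    <species_name>/genomic_fna directory that is read in offline mode.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)  # create the directory if missing
    staged = []
    for src in map(Path, fasta_files):
        if not src.is_file():
            raise FileNotFoundError(f"assembly not found: {src}")
        dest = target / src.name
        shutil.copy2(src, dest)  # keep the original file untouched
        staged.append(dest)
    return staged
```

A call might look like `stage_assemblies(["my_assembly_1.fna", "my_assembly_2.fna"], "/primerdesign/species_name/genomic_fna")`, with `species_name` replaced by the actual target directory name. Synonyms would still be passed separately through the --exception option, as described above.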
Hi there, Thanks for your suggestion. To answer your questions, I've been trying a few things:
I'll keep you posted with further results.
Hi,
Since the pipeline was designed for species-specific primers, it is not intended to produce primers for subgroups; however, this could be implemented at some point. I have an idea for solving the specificity problem, and it could work if I am able to implement the pathovar keyword. I will try this and report back.
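To illustrate why a pathovar keyword matters for the specificity check, here is a minimal sketch of the difference between matching on the species name alone and matching on species plus "pv." keyword. It is not the pipeline's actual implementation; the function `is_target`, the regular expression, and the example description strings are all assumptions made for illustration.

```python
import re

# Assumed naming pattern: "<Genus> <species> pv. <pathovar> ..."
PV_PATTERN = re.compile(r"\bpv\.?\s+(\w+)", re.IGNORECASE)


def is_target(description, species, pathovar=None):
    """Return True if a sequence description belongs to the target group.

    With pathovar=None only the species name is compared, so every
    pathovar of that species counts as a target. With a pathovar given,
    the description must also carry the matching "pv." keyword.
    """
    desc = description.lower()
    if species.lower() not in desc:
        return False
    if pathovar is None:
        return True
    match = PV_PATTERN.search(desc)
    return match is not None and match.group(1) == pathovar.lower()
```

With a species-only check, `is_target("Pseudomonas syringae pv. avellanae", "Pseudomonas syringae")` is true, so an avellanae sequence is treated as a target even when primers are wanted for pv. coryli only. Passing `pathovar="coryli"` makes the same description an off-target.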
Hi,
example clean run:
There can still be problems with sequences in the BLAST database; however, at least the target is now defined at the pathovar level. A possible solution could then be to create a custom database, as outlined in this example.
Hi there, Thanks for the new release, I think it is now working better than before. I think I may have gotten good primers for pv. coryli and pv. avellanae, but I have to double-check the candidate amplicons/templates against the BLAST wgs and refseq databases, since I used the nt database and not all the sequences are present there. Lastly, going back to the two clusters of Pseudomonas avellanae: I put the FASTA files belonging to cluster 2 inside excludedassemblies/Pseudomonas_avellanae, left the desired ones in /genomic_fna, and set offline mode with no download of new genomes, unfortunately still with the nt database. It should work this way, right? Thanks a lot
Hi there,
Thanks for this great container, very helpful.
However, I have some trouble at the species and pathovar levels.
Second issue, pathovar level. I guess the pipeline would also work at the pathovar level, but:
The first pathovar has only 2 incomplete genomes, and it doesn't pass the quality control (even if I tell the pipeline to skip it) because there is no gff file (obviously).
28 Jul 2020 08:23:49: > Error 1:
28 Jul 2020 08:23:49: > Error: No .gff files found for QualityControl rRNA
The second pathovar has a name similar to the first species, and thus the program encounters this synonym and starts the species-level primer design.
Thus, I was wondering if I can somehow define the input target sequences by specifying a path or an accession number (and not only the species) to avoid issues 1) and 3).
(I have no solution for issue 2.)
Thanks