Error running with local files #7

Open
pedroleao2 opened this issue Jul 12, 2022 · 0 comments

Hi.
I'm trying to run FlaGs using local genomes, and I'm having some trouble creating the input files.
I was reading the paper and have a straightforward question: is it possible to create our own database from sequences that are not available in NCBI (by generating the GFF3 and FASTA files), and also to use as input sequences that are not in NCBI (that is, sequences without an NCBI accession number)? Has anyone done this?

I tried, but it failed.
This is the error that I am receiving:
The submitted query might include characters not found in NCBI protein accessions eg. > , # , ! etc. Please provide correct format, Thanks!

Here is an example of one of the FASTA files created from my own sequences:

>Meg22_1618_scaffold_2kb_scaffold_1561:53726-55866_1
MKTKSDWLNKIHQGDALEVLKQMPDNFVDCIITSPPYWGLRYYGESTFKVWDGDPNCEHE
WQFQEGMRYRGGTKNSIGNFKDHLHFTQRFAFCKKCGAWYGQLGLEPTLEMYIDHLLQIT
AELKRVLKDTGVMYWNHGDCYGGSNCGRYDWRETASISRSELYRYKPSPQSKLKPKCLAL
QNYRLILRMIDEQDWILRNIVIWYKPNHLPDSVKDRFTRAYEPIFMLVKNKKYWFDLDAV
RVEYESDTMIELLNGHNSEEFIIGKNPGDLWTIPVQPFKDAHFATFPPRLIEPMIKSSCP
RWVCKKCGKPRERIIERTKVIKQSEPKPYTADTEFITHGTYDSTLHAVAIRKCIGWTDCG
CNAGWEAGIVLDPFMGSGTVAIVAQRLGRNWIGIELNPDYIEIANKRLEQEFGLFHNK
>Meg22_1618_scaffold_2kb_scaffold_1561:53726-55866_2
MMINYGDLDNPYTWRELRSKSRGLRYFEIITIKLYNNGRRTMINNEDKVRLAKQDSSR
>Meg22_1618_scaffold_2kb_scaffold_1561:53726-55866_3
MYMLRHVIREELARLMYNELTKENESRKEFEELNEFELTDYYNLADRILVIFDKYGVRI
>Meg22_1618_scaffold_2kb_scaffold_1561:53726-55866_4
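
For reference, the error message mentions characters that are not found in NCBI protein accessions. The sketch below lists which characters in FASTA header IDs fall outside a letters/digits/underscore/period pattern. That pattern, and the helper name `unusual_characters`, are assumptions made for illustration; this is not FlaGs' actual validation rule, and the sequences in the example are shortened.

```python
import re

# Assumption: accession-like IDs use only letters, digits, underscores and
# periods. This is an illustrative pattern, not the check FlaGs performs.
ACCESSION_LIKE = re.compile(r"^[A-Za-z0-9_.]+$")


def unusual_characters(fasta_text):
    """Map each FASTA header ID to the characters outside the assumed pattern."""
    flagged = {}
    for line in fasta_text.splitlines():
        if not line.startswith(">"):
            continue
        parts = line[1:].split()
        if not parts:
            continue  # header line with no ID
        seq_id = parts[0]
        if not ACCESSION_LIKE.match(seq_id):
            flagged[seq_id] = sorted(set(re.findall(r"[^A-Za-z0-9_.]", seq_id)))
    return flagged


# Headers copied from the FASTA excerpt; sequences shortened for brevity.
example = """>Meg22_1618_scaffold_2kb_scaffold_1561:53726-55866_1
MKTKSDWLNKIH
>Meg22_1618_scaffold_2kb_scaffold_1561:53726-55866_2
MMINYGDLDNPY
"""

for seq_id, chars in unusual_characters(example).items():
    print(seq_id, chars)
```

With the headers above, the flagged characters are the colon and the hyphen that come from the scaffold coordinates. Whether these are what triggers the error is not confirmed here.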

And here is an example of a GFF file, edited from the Prodigal output:

##gff-version  3
# Sequence Data: seqnum=1;seqlen=2140;seqhdr="Meg22_1618_scaffold_2kb_scaffold_1561:53726-55866"
# Model Data: version=Prodigal.v2.6.3;run_type=Metagenomic;model="11|Candidatus_Amoebophilus_asiaticus_5a2|B|35.0|11|0";gc_cont=35.00;transl_table=11;uses_sd=0
Meg22_1618_scaffold_2kb_scaffold_1561:53726-55866       Prodigal_v2.6.3 CDS     3       1256    113.9   -       0       ID=Meg22_1618_scaffold_2kb_scaffold_1561_53726-55866_1;partial=10;start_type=ATG;rbs_motif=None;rbs_spacer=None;gc_cont=0.381;conf=100.00;score=113.89;cscore=120.19;sscore=-6.31;rscore=-4.87;uscore=-4.91;tscore=3.47;
Meg22_1618_scaffold_2kb_scaffold_1561:53726-55866       Prodigal_v2.6.3 CDS     1213    1389    6.4     -       0       ID=Meg22_1618_scaffold_2kb_scaffold_1561_5372
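
One thing visible in the two excerpts is that the FASTA headers keep the colon from the scaffold name (`..._1561:53726-55866_1`), while the `ID=` attribute in the GFF uses an underscore there (`..._1561_53726-55866_1`). A minimal sketch for comparing the two sets of identifiers follows. It assumes, without confirmation from the FlaGs documentation, that the FASTA ID and the GFF `ID` attribute are expected to match; the function names are hypothetical and the example data is shortened from the excerpts.

```python
def fasta_ids(fasta_text):
    """Collect the first whitespace-delimited token of every FASTA header."""
    ids = set()
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                ids.add(parts[0])
    return ids


def gff_ids(gff_text):
    """Collect the ID= attribute from column 9 of each GFF3 feature line."""
    ids = set()
    for line in gff_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        fields = line.split(None, 8)  # tolerate tabs or runs of spaces
        if len(fields) < 9:
            continue  # truncated or malformed feature line
        for attr in fields[8].split(";"):
            if attr.startswith("ID="):
                ids.add(attr[3:])
    return ids


# Shortened from the excerpts: one FASTA record and one GFF feature line.
fasta = ">Meg22_1618_scaffold_2kb_scaffold_1561:53726-55866_1\nMKTKSDWLNKIH\n"
gff = (
    "##gff-version  3\n"
    "Meg22_1618_scaffold_2kb_scaffold_1561:53726-55866\tProdigal_v2.6.3\tCDS\t"
    "3\t1256\t113.9\t-\t0\t"
    "ID=Meg22_1618_scaffold_2kb_scaffold_1561_53726-55866_1;partial=10;\n"
)

print("only in FASTA:", fasta_ids(fasta) - gff_ids(gff))
print("only in GFF:  ", gff_ids(gff) - fasta_ids(fasta))
```

If both differences come back empty, the identifiers agree; here each side reports one ID that the other lacks, differing only in the colon versus the underscore.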

Thank you for your help.
