Updated scripts and readme
vragh committed Dec 18, 2022
1 parent 853c1bc commit 22c235e
Showing 81 changed files with 1,113 additions and 20 deletions.
4 changes: 3 additions & 1 deletion README.md
@@ -126,7 +126,9 @@ done;
21a_rscript_ccsel_main.R - execute via Rscript, inputs must be set within the script (pay heed to the comments), must have 21b_rscript_ccsel_auxfunc.R alongside in the same directory to function properly.
17_rscript_annotscmp_mainjob.R - execute via Rscript, inputs must be set within the script.
23_rscript_seqstats.R and 24_rscript_ccvissum_mainjob.R - execute via Rscript, see comments in script. These two generate the final tables, plots, and figures.
-25_of_refcomp_prep.R and 26_of_refcomp_mainjob.R - execute via Rscript, see comments in script. These together run an analysis that generates a Venn diagram comparing the actual OrthoFinder run used to detect clock proteins in the paper with an expanded run (with the same parameters) with additional reference proteomes. These additional reference proteomes are basically all the "reference" quality metazoan proteomes in UniProt. These must be downloaded manually, and then 25_of_refcomp_prep.R can be executed on these. Subsequently, 26_of_refcomp_mainjob.bash can be executed to run OrthoFinder, and then 27_rscript_of_refcomp_mainjob.R to generate the Venn diagram.
+25_rscript_of_refcomp_prep.R and 27_rscript_of_refcomp_mainjob.R - execute via Rscript, see comments in script. These together run an analysis that generates a Venn diagram comparing the actual OrthoFinder run used to detect clock proteins in the paper with an expanded run (with the same parameters) with additional reference proteomes. These additional reference proteomes are basically all the "reference" quality metazoan proteomes in UniProt. These must be downloaded manually, and then 25_of_refcomp_prep.R can be executed on these to format the headers and filenames properly for analysis with OrthoFinder. Subsequently, 26_of_refcomp_mainjob.bash can be executed to run OrthoFinder, and then 27_rscript_of_refcomp_mainjob.R can be executed to generate the Venn diagram.
+28a_ncbicomp_mainjob_noslurm.bash and 28b_rscript_ncbicomp_mainjob.R - these together take the rscript_ccsel/cc_cand_sel_pub_table.csv file created by 21a_rscript_ccsel_main.R and the files cc_queries.fasta and species_taxids.csv (place these one level above the outputs/ directory within which rscript_ccsel/cc_cand_sel_pub_table.csv should sit; both are available in this GitHub repository under ncbicomp_files/) to search against NCBI's transcriptome and genome assemblies for the species involved in this study to see if any of the proteins for which no candidates were found using our workflow perhaps have matches in the NCBI data. 28a_ncbicomp_mainjob_noslurm.bash uses entrez esearch and fastq-dump (installed in a conda environment called entrez_conda) to get the assemblies, and uses MMseqs2 (installed in a conda environment called mmseqs2_conda) to search against them with cc_queries.fasta which just contains the 10 bait protein sequences used in this study. Note: the bash script here is NOT a Slurm script.
+29_rscript_ccseqcomp_mainjob.R - used to take all the clock protein categories wherein a species had more than one candidate and investigate whether these sequences are paralogs or the result of sequence variation. Needs the contents of rscript_ccsel/fas_by_type which is generated by 21a_rscript_ccsel_main.R. This produces some MSA FASTA files and visualizations (these MSAs and visualizations can be found in this GitHub repository under circadian_clock_candidates/multiple_candidate_msa) as well as a histogram of pairwise sequence identity values (this is supplementary file S10).
```

- As can be seen from some of the script names above (e.g., `22_rscript_ccsel_main.R`), quite a few scripts use the `R` programming language (`v3.6.0` or greater) and concomitant packages. An exhaustive list of packages used in these scripts (and package versions) can be found in the file `r_packages_used.csv` in the `scripts/` directory. Accessing the scripts through `RStudio` should highlight missing packages automatically and present an option to install them automatically; this does not work for the `Bioconductor` packages nor packages from `GitHub` (e.g., `seqvisr`).
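The README text above says that `29_rscript_ccseqcomp_mainjob.R` produces MSA FASTA files and a histogram of pairwise sequence identity values. As an illustration only, the following Python sketch shows one way such values could be computed from an aligned FASTA file like the ones added in this commit. It is not the author's R implementation. The function names (`read_aligned_fasta`, `pairwise_identity`, `all_pairwise_identities`) are hypothetical, and the identity definition used here (matching residues divided by columns where neither sequence has a gap) is an assumption; the actual script may use a different denominator.

```python
from itertools import combinations


def read_aligned_fasta(text):
    """Parse aligned FASTA text into {header: sequence}, joining wrapped lines."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


def pairwise_identity(seq_a, seq_b, gap="-"):
    """Fraction of identical residues over columns where neither sequence is gapped.

    Assumption: gap-containing columns are ignored entirely. Other
    conventions (e.g. dividing by alignment length) give different values.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    return matches / compared if compared else 0.0


def all_pairwise_identities(records):
    """Return {(header_a, header_b): identity} for every unordered pair."""
    return {
        (a, b): pairwise_identity(records[a], records[b])
        for a, b in combinations(records, 2)
    }


if __name__ == "__main__":
    # Toy alignment, not taken from the repository.
    toy = ">seq1\nMK-TA\nLL\n>seq2\nMKQTA\nLV\n>seq3\n--QTG\nLL\n"
    for pair, identity in all_pairwise_identities(read_aligned_fasta(toy)).items():
        print(pair, round(identity, 3))
```

Applied to one of the multi-candidate MSA files, the resulting values could be pooled across proteins and species and binned into a histogram; near-identical pairs would point toward sequence variation and strongly divergent pairs toward paralogs, though any threshold between the two is a judgment call the source does not specify.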
@@ -0,0 +1,32 @@
>Acartia_clausii__PDP1e__TRINITY_DN5258_c0_g1_i1.p1_ORFtype_complete_len_415
--------MDTPPTGRNILQRVGTPAPTK-------QERR----------SKSSSPSPERNMNMKQENPTFLVPPLWKDS
LSLENI-ANDFLDSNVTGEVMKLDDFLKELQSSDVQENM------------EDLCGVQSGQGQGMLAQNQDQQMHDVRNQ
QDPHNRYNL----IRPANIESLRESSDHNPHQQLLQDPHGHQHQQQQQLGHQHHS-----------SRQLQQQQDHHGAG
GISGPVRPPIMHHVGKPPELYNNHNLQENNNNSGIPQASVRPLVRQNLVSPDDHLQLHDDRLS--------PESKTSLLA
SKRKERMRTTDDEDSEDDTRSFNGYTSTVCVNFSQDDLRLATIPGQGQDFDPATRRFSEEELKPQPIIRKRKKQFVPEEL
KNNKYWAKRCKNNEAAKRSREARRLKENQIAMRARFLEEENSALKSEVDNLKKENVDLKQMMVALEEK------------
LNQ---------LIDKR-------------
>Acartia_clausii__PDP1e__TRINITY_DN58787_c0_g1_i2.p1_ORFtype_complete_len_324
----------MVVDVKKLMGRVDCLQQNQWNMDY---DGAMKPLKGKEGNNWNSFPNS---------QTAFLGPQLWEKK
ISMSN-VDQDFK--NWQGYR-GWN---------ETQQNCQ------------NV-----------GSY-----------Q
--D-----QQQQHLQAMEQ-----------GFQTLQQGFQTLHQQQIQEWAFNPPKIEPAKMMDT---------------
----------------SPHSH----HQQSNSSNPSPP-------------------------------------------
----CRMETK----------ND-----DVEFKVSDDDLALAIVPG--AQFDPKTRCFSAEELKPQPIIRKRAKHFTAEND
KDDRYWEKRSKNNIAARRSREARRLKENQIALRAAYLEKQNMHLKVTLKRLNIENAHIKVNVDHLLARIVEKQKSMEAER
INNLNNNLVNNQIINNSNNCSTTNQLQMAN
>Acartia_clausii__PDP1e__TRINITY_DN4264_c0_g1_i4.p1_ORFtype_complete_len_394
MQAEAWTGHTQGMTIRDILEKVDLFNVNVQNSQKGTEKHLSKGQDGKTIIE-TND--P---------SSAYLGPKLWDRQ
ITLDLGLDLDSS--ADEGSISGLS-------CGPRSYGCLSPADGAAGASGIDIGGLSPG-GSGLGSSGMSPVFSKISIE
VEPKERQRSVSSDLQVMNM-----------EEFLAENGLSLD---------MDPNL--PLE-ICAQT-NIQTNTEIRD--
-----VKPS----------LQ----MTRPNVIMNVPKVENREVS---VVTAKRGLEADDDVDDPSQVTVVKRASNDFLYA
ESKRARLEREKAEKK----RKF-----ELELEFDPQDLALATVPG--AQFDPSTRTFDVDELRPQPIIRKRKRIYVPDDN
KDDRYWNNRVKNNVAARRSREARRLKENQIALRAAYLEKENRVIKMELEDVKFDNTKLATERDILKMK-LAKYEKM----
------------------------------
>Acartia_clausii__PDP1e__TRINITY_DN4005_c0_g1_i1.p1_ORFtype_complete_len_327
MEGTVWNNV----TMNELLDRLQ-------PKQPAMETAA--TLKAKLEAEPNIA--P---------ESAYLGPKLWQKP
ISLQE---------------------------------------------------------------------------
--------FNEDDFFVMNI-----------EDFLSENDLDKQHLDRA--MKCDESSPEPDEMRCASPASLSFKSPISPEG
SSMCSERPSFVLS--PTPATP----SPRPGAATPSP----RPGV---IVSTKKEVNTDNDIHQ-----QLGQERNSFLYA
ESKRAKMEREKAEKK----RKL-----EAEMEFAPEDLALATVPG--ADFDPRERSFNIEELKPQPIIRKRSKIYVPDSQ
KDERYWEKRSTNNFAARRSREARRLKENQIALRAAYLERQNNMLKRDLEDTKFENSKLAMERDILKKK-LEKYESMQ---
------------------------------
Binary file not shown.
@@ -0,0 +1,10 @@
>Acartia_clausii__RORa__TRINITY_DN50346_c0_g1_i1.p1_ORFtype_internal_len_200
-------------------------------PVSATNTNTP-------------------QGQLDPA-AGFVDSTTAFPN
V-GAGSNGRPTPPPSLNIQE---DTIERLQNDPERIRQHLSETVLSAHSLTCLMDNAQIQGAWINQMNPAFLIQFKSKSA
EDLWMTAAQKLTDVITQIIEFAKMLPGFLKFPQEDQIVLLKAGSFELALLRMSRYYCVEKKAVLFMDQLLPMEAFLSTGN
TCEMKLVSQIFEFVR
>Acartia_clausii__RORa__TRINITY_DN52624_c0_g1_i1.p1_ORFtype_internal_len_119
VITCEGCKGFFRRSQSQACVTNYQCPRQKNCVVDRVNRNRCQFCRLQKCLALGMSRDAVKFGRMSKKQREKVEEEVSFHQ
ANQRGQNRQPGNSPDSSLVEPPSSTETLFPSNPQFSQQH-----------------------------------------
--------------------------------------------------------------------------------
---------------
Binary file not shown.
@@ -0,0 +1,24 @@
>Acartia_tonsa__PDP1e__TRINITY_DN7306_c0_g1_i1.p1_ORFtype_complete_len_412
ME--------TPPLGRNILQRVG-----TPAN-----TKQERKSRSESPSPERNMNMKQENPTFLVPPLWKDSLSLENIA
NDFLDSN-------------------------------------------------VTGEVMKLDDFLKELQSSDVQEDM
-DDLCGVPGGQGMLQQPQMHDVRNQQDPHNRLYNLIRPANIDSLRESNSDHNQHQQLQGPHGHGQQQLPQHHQSSRQLHQ
QQQQDLHGGGGSGGPVRPPIMHHVGKPPELYNNHNLQENNNSGIPQASVRPLVRQNMVSPENHM--LQNDRLD----RLD
RLSPESKKSLA--SKRKERMRTTSEDSEDDSRSFSGYTSTVCVNFSQDDLRLATIPGQVQDFDPATRRFSEEELKPQPII
RKRKKQFVPEELKNNKYWAKRSKNNEAAKRSREARRLKENQIAMRARFLEEENLSLKNEVENLKKENGDLKQMMVALE--
-EKLNQLVDKR
>Acartia_tonsa__PDP1e__TRINITY_DN18110_c0_g1_i4.p1_ORFtype_complete_len_393
MAEAVWNGHTQGMTIRDILEKVDLFNVSVLNSNQKGTEKQHLSKDTDGK---TIIKVNDPSSAYLGPKLWDKQISLDLGL
-D-LDSSADEGAAGPSGSGERNSGDASGYGVSSGMSPGFAGISIQGSPKERARSTSSDLQVMNMEEFLAENGLS---LDI
G-DS--LPINE----------------------VCSGPSVLDY-E------------TSPQGHTNINK------T--L--
-EF----------RDVKPAIT----RPNVIMNVP----------------KVERQVAAAPKRPITAVLDDTEDDCDGEID
DPVPNKKPSNEFLYAESKRARLEREKAE-KKRKF-----ELELEFAPEDLALATVPGA--QFDPSTRTFDVEELRPQPII
RKRKRIYVPDDAKDEKYWNCRIKNNVAARRSREARRLKENQIALRAAYLEKENRVLKQELDDVKFENTKLATERDILKMK
LAKFEQMM---
>Acartia_tonsa__PDP1e__TRINITY_DN7042_c1_g1_i1.p1_ORFtype_complete_len_316
MEETVWNSV----TMKELLDRLEPVKAPVPMNKQSMETAATLKAKLEAE---P---NIAPESAYLGPKLWQKPISLQE--
-----------------------------------------------------FNEDDFFVMNIEDFLTENDMD---RSK
LDRA--MKCNEGS--------------PEPEEMRCTSMANLSL-R-------------SPVGSPQQSAM-----S--P--
-EM----------QAV--PSP----RPGVIVSTK----------------K--------------EVINDV------Q-I
KPVPE---TNSFLYAESKRARLEREKME-KKRKM-----EEEIEFAAEDLALATVPGV--DFDPKQRSFDVEELKPQPII
RKRAKIYVPDERKDDRYWEKRCKNNYAARRSREARRLKENQIALRAAYLERQNNMLKRDLEETKFDNTKLAMERDILKKK
LEKYESML---
Binary file not shown.
@@ -0,0 +1,8 @@
>Acartia_tonsa__RORa__TRINITY_DN56378_c0_g1_i1.p1_ORFtype_internal_len_106
------------------------------------------------------------------------------IR
QHLSETV--------------------------LSAHTLTCLMDTAQ--IQQAWCNNINPAFLIQFKSKSADDLWMTAAQ
KLTDVITQIIEFAKMLPGFLKFPQEDQIVLLKAGSFELALLRMSRYYCIEKK
>Acartia_tonsa__RORa__TRINITY_DN51721_c0_g1_i1.p1_ORFtype_internal_len_160
CGDKSSGVHYGVITCEGCKGFFRRSQSQACVTNYQCPRQKNCVVDRVNRNRCQFCRLQKCLALGMSRDAVKFGRMSKKQR
EKVEEEVSFHQAAHHRQQRAPGNSPDSSLVEPPSSTETL--FPSNSQFSQQQTYTEFYGSQFPSQFGATSFDEFV-----
---DST---TNFA---------------------------------------
Binary file not shown.
@@ -0,0 +1,21 @@
>Calanus_helgolandicus__PDP1e__TRINITY_DN4162_c0_g1_i1.p1_ORFtype_complete_len_406
----MDGR-------ESTSRTILQKIGSAPVIRRKESESEETNDFDQEVRATNTQKSK--------MNMKQEGPFLVPQL
WKDSISLENITNDFLES---NVTGEVMNLDDFLKELQV--NELAQQTDELQRHVQTPSHLSQMPHHSP-----PNHRMGH
MQQEDNRFQP-LIRPANIETLKEPSHPVRPSIMHHVGKPPELHMMQSTSHMDQMGDTHVSLPHREGMQNTLSSKDHTGHG
IPQASVRPLVRPVIMSSGKHEQ---EHDLSRLEDSKPPASKRMRNMSEGSAYTDDEDDDGSHHNFGGFTSTVCVNFSQDD
LRLATIPGQEGDFDPATRRFSEEELKPQPIIRKRKKQFVPDELKNNKYWAKRCKNNEAAKRSREARRLKENQIAMRARFL
EEENTALKGEVEHLKKENSDLKQMMMALEEKLNQMTDSR---
>Calanus_helgolandicus__PDP1e__TRINITY_DN254_c0_g1_i1.p1_ORFtype_complete_len_338
----MDGNRNGNGWHGMTIRDILEKVDLSNVSSANEGGSGKLETQKSPV-ITVSSLPKLKPATSVIQTSNPTSAYLGPKL
WEKSITISNLGLEDEEEEEEEPYCDVMNMEEFLAENNIKMDIME----ER---SPESIQTIEVPVYSPSSFLD-------
SPQ-PSPQSPPTS-D--------IKPPQRPSIIMA---PK---------------REHTAETKKEIL-------------
---------------PKGENTFLYAESKRAKLEREK--EERRRK-------------------------LEVELDFAPED
LALATVPGL--DFDPKERAFDMDELRPQPIIKKRQKLFVPDENKDDRYWDKREKNNVAARRSREARRLKENQIALRAAYL
EKENKVLKTELDGSNFDNTKLATERDILKRKLSQYESFALHK
>Calanus_helgolandicus__PDP1e__TRINITY_DN2815_c0_g1_i1.p1_ORFtype_complete_len_327
MAGVVHQSSHPQGWNGMTIKELLDKVDIDGPKPQE-----------------ISSKAKHEPHHS----PQQHSAYLGPKL
WDKPISLQQFQED---------DFFVMNIEEFLAENNLQVGNRSNHGSEA---SPEP---EELSCTSPSAVMYPNHGQGH
SPHTHGPVSPPMASDHRVPIINVPSPAQRPSIIVS---PK---------------DN---SPKRNLL-------------
---------------PKGGNEFLYTESKRAKLEREK--EERRLR-------------------------NQVDIDFAPED
IALATVPGA--DFDPRERAFDVDELRPQPIIRKRPKIFVGEDSKDDRYWAKRCKNNVAARRSREARRLKENQIALRAAYL
ERENTGLKKDLEDANFVNSKLAMERDILKQKLSKYETFPPR-
Binary file not shown.
@@ -0,0 +1,24 @@
>Calanus_helgolandicus__PER__TRINITY_DN15555_c0_g1_i1.p1_ORFtype_complete_len_818
MLSQDGCTSGSGSRLGSGSREGIEDAYKPPALTEELLVIHNRDMEKNMLTKFKEAKRIGETRFRKDGGKFRAAQESGPKH
VPRDIIKLKEDKRESGRRNHSSKPEVHLKISVPDNVKKDDQRSLAFNDFHWTSSSGMGTPLDINGISQKADHAGNISFPA
SHIIQSTLPGGSYQNFIQVPAVYVPIPHPGSIPPEMLPNQVPIGLTMDGSSQNLIPVHISDPVAGFINPMTKTKVSDSES
DSVRETAIHQSKLPDCLAVNSCMELLQHQDKLQERDTPSSLIKNRHIRIQSRDGSHHTSIKGEPGSALESNASAGASIKE
QHFTTLALKITDNECSSSQSLYSFVHSSDENNRNDFTAASYSSPNGSPEDSKIISPRLVARPLLSEPFWNENVKLDDHLI
YKYQLQQHDIEDVLRKDRAKILKMEVPNLVDEQLMEMLADLDFDYQGIDVLFDEDK--DSEEETTSADESSG-ENDDKSH
RVKMSKDKLYMEKLNIFMEENAPFPNSETNSFNKTTGYFKSDLSNVAYGKTSESSRNLDSFKTFENSQLSGLSYGNPSSS
FKNPESSDTFENSQSCSGVSQNVCTQTIFEKVFENAVEKSSGSRNQHGDNHSFTSNRPQDKRFRYGRNSDSSSAPEPEKN
KGEMDQVSTKTKKGKHKSVKTPVEVICSDNTSSDEQTMKSNEVTASRSIKVLPCERSPHEDNHKTQQGKRFRYGSPQASE
EKDVTQKKKEKKKSESLSKAGFSKTRGQDPRLKQSTDESESSVEYLITSSSVEAGPSSDDQAEKKSEANPDISDMSISNQ
SSVKEGSDNQSEHGSDSNILE
>Calanus_helgolandicus__PER__TRINITY_DN28421_c0_g1_i1.p1_ORFtype_3prime_partial_len_143
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------MTNNLI
FNYQMETQNLGDVLMKDQEALKKMQQPEIVEDQLKELFTELEDGVELEEFLVDFECPPTDEDTETSENEQSGIELEEKIF
R--NRRKKAHLEKMNIFMEAEAPFPMPDSLRLT---------------------------VHKRENSQLYGSPYYSES--
--------TFH---SISGG-------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
---------------------
Binary file not shown.
@@ -0,0 +1,16 @@
>Centropages_hamatus__CRY1__TRINITY_DN5759_c0_g1_i12.p1_ORFtype_complete_len_533
---MEPVNILWFRNGLRLHDNISLHHAVKDTTAKFLPLFIFDGETPVTKKCQYNKMQFLLECLEDLDEQLREQGSKLYCI
KGSPAEVFRKLSKKLKIEKVCFDQDCEPIWLERDNSVKNFCGSHRIEVIESIGGTLWDPLEIIEANGGTPPLTYAQFCHV
SSGLGSPRRPVDDIDFKTVRFADVSENLMAELGFYTTVPEAHTLGFIKESHHQKVYKGGERRALKYFRRRIVAESDAFVD
GSFLPNRRDPDILCNPKSLSPDLKFGCLSIKTFYWAIQDAFDGVYQGSPPAKTNFCIVSQLIWREFFYAMSTNNPFYGEI
KRNPICIRVPWYKDDQALAIFLAGKTGFPFIDAGIRQLKNEGWIHHTVRNALSMFLTRGDLWLSWEHGLDLFLDYLIDAD
WAVSAGNWMWVSSSAFEKALNSTFDLDPRIYGRRVDAHGEYIKRYIPELKNYPCEYLYDPSSAPLEVQQKAGCIIGQDYP
AAMVDHDKVSIQNRKNMQDLRDELMRKFNEQ-PPHIKPSDENEVKNFFRLEINGEDP
>Centropages_hamatus__CRY1__TRINITY_DN11471_c0_g1_i15.p1_ORFtype_complete_len_528
MEGKEAVNILWFRNGLRLHDNESLHIAASDEKVKVLPLFIWDGETPVTRMSAFNKVEFLVECLQDLDDQLQKVGSNLYCV
RGQPVAVFEKLHKAFKVKKLCFDQDCEPIWLERDNAVKNFCVKKKIKVCESIGATLWNPLKIIEANDGIPPLTYSIFTHV
TEAVGPPRRPQPNLDLSKVNFAKLEDNLQVKLNCFNKVPQPEDLGFTKTT-EKKVYKGGETRALKFFNRRIQNEKEAFLD
GSFLPNIRDPDILNPPKSLSPDLKFGCLSVKTFYWAIMDAFKEVHEGNPPPS--HVIVSQLIWREFFYTMSANNPFYAEI
FRNPICIDVPWGKDEELLNKYKKGETGYPFIDAGLRQLMQEGWTHHVVRNALSMFLTRGDLWLSWEPGFQLFMEKQIDAD
WAVCAGNWMWVSSSAFEQALNVSFSLDPRYYGRRFDPHGKYIKKWLPELKNFPEEYIYTPWTASLEVQEEAGCIIGKDYP
FTIVNHDETVVYNTKMMNKLQQLLMKKYNQQEPEHIKPSGDAEVKNFFRIE------
Binary file not shown.
@@ -0,0 +1,12 @@
>Centropages_hamatus__PDP1e__TRINITY_DN1673_c0_g1_i8.p1_ORFtype_complete_len_315
MAVQSLWSGLHMKELLDRIEVPTKMESGPIEKQGKVEVEMMPTAESAFLGPKLWDKSISLQQLNEDDFFVMNIDDFLAEN
DLQKDKFGKALKENEKSPEPDEMRCTNLNNLNIKPDSP-S-D--TNEENSMDSILPQAWMEVLNAP-SPRPGVIVSTKDC
KRNMLPKGENTFLYAESKRAKMEREKEEKMRR--------------LEISMDFAPEDLALATVPG--ADFDPKERSFDVE
ELRPQPIIRKRPKIYVPDMEKDDKYWDKRGKNNVAARRSREARRLKENQIALRAAYLERENNVLKRNLEDSSFENSKLVM
ELEILKKKLAKYENTA
>Centropages_hamatus__PDP1e__TRINITY_DN34183_c0_g1_i1.p1_ORFtype_5prime_partial_len_216
--------------------------------------------------------------------------------
-------------------------------NLYPHQHPVQDVQIHHQPPKEHGIPQASVRPLVRPVMPSPEKF-----E
RSNSTDYGPNSE----KRTMSTHSDDEDDMEDGSVEGDEDGFQGGFSSSLVNFSEDDLRLATIPGQGQEFDPSTRRFAEE
ELKPQPIIRKRKKQFVPDELKNNKYWAKRCKNNEAAKRSREARRLKENQIAMRARYLEEENNALKGEMEGLKKENTDLKQ
MMVALEEKLNQLAESR
Binary file not shown.
@@ -0,0 +1,38 @@
>Centropages_hamatus__TIM__TRINITY_DN22370_c1_g1_i6.p1_ORFtype_complete_len_1277
MNNHVMFGVSDGSTHLGQKVCNKYVIQSGTAATLEIMNRQLNEDDKTLRTYRRALAFSNVIEKDLIPIMLNCNDEPLVFS
SLLKLFVNLTLPIECIFPAGE-NLSTSSKQTVILLNNVLSKCKVLCSNGQFSWVLMGHLSALCQGA--KESKESISIIHN
CLLLIRNVLHIPETVSSMKSGC-GQNQIIWNLFSNNLDKVMLELIGHKSASVWCSTIVHIIAILFKDQYGENMAKLLNS-
--SLAESSTDGESNTSPNDGSPYRNPESDGNSDDIQPTPGQSDAGYPENFQPYPRSRSESVSSLEVFPNASPAQSIISND
DLDSDEKFGKD------------------------QEWGPA-GLPEGANKGPHSKEEGKARDLEKKETERAATTQIRWGD
KPEWSSSGIESIGAW-DQ-----GRSSEGRSEVCAEEE-ARKKPC----------------------WSAYREAR-GAED
KET---KQRENGFG--VNS-----GESSEGEPPEKKSIHENTDRTLHVQFKRNPRPEMQLCHKDKRGPAGPKV------S
DMESVSDSSETMGVPKKTLEDGSKFAVGFVSHISTIDNERQAALEDSTSSNDDDMLKERYFKSRAHLIKPKPRPGVTLSL
KAKSELRRMKLLKQVQENRIRCLSMQSI-IGDKEIAELLKEFTVAFLITGYPKLVGDLK--DMLSEKHSNSLDHSNLLWL
LSYFLQFASQIELGLDQIGSVLSVQTIAFLTFEGVDVVEVLELAA-REPGTDISPHLRRIHLVVTAIREFLQTVNSYSAF
KHLDSSEKAQLEQVQVQIASMKGLRQLLLLLMRTYNPEIQSVQYLADVVVCNHLVVSSMKADNNSSELTLKTHLTQFANT
ELMRQYGRLLVDYRTNSEVVNDCIFTMMHHVAGDLEAPHTLHIPSVLNTFSTIWEEGLDICEDWFDLIEYIIQMFLQAMR
NTPHSCAANIVDNMDSTQVLDGCGMTGNQASQLFWVFSQVENMEDPVGSLIEVYRQTDHIVLSRLAVIQSLLSHGVITHA
MYMNFIYMNSVMPHTHIDQVESLIAEVGSVHNTDGMNTDEETDAREINTESNQLDNHQVNHQLDNSLPEINLNTTPGNNS
QKTNLNAEIKTKMNANNNPREISSQEMSSKGNPQEIEEINVLKECLIKQGQYCLISWVQEVLLDACRVKLNPETLIPEPD
SFTNESVPFYYNHAKQSIPLVPYNRSQYQGLQTESFIFLLHKLGFLLPADVGKAYPRIPYFWSAEQMHEVASKLGPLREG
EMILNSSMKRKLPSVSSEQELNIKMKKESPDVNSTEESSKNEPTQFMDTGLDGMDLTGET-DRQVGVVR----ASWLQFA
KLSV
>Centropages_hamatus__TIM__TRINITY_DN9829_c0_g1_i8.p1_ORFtype_complete_len_1123
MGDHVVFGVGESVAQLGHQIGSRYMIHDQTSAILETMNRRLNEDDTNLWTYRRALAFSNVIEKDLIPILMNSNEDKVVFN
AVVKLLVNLSLQAESILPVHVMNNYPAGKQTIHEINSALAKTKALFTNSQFSLVLMKNLSELSQKIDSKQHEHDIVSISN
CLVLIRNILHIPSKEGTSKSGAGGQNQMIWNLFVQNIDKVILQLISHRSSSVWCTTIVQIIAVLYKDQHVVELERLVKVF
LDNSLESSGDDESNTGGAD-------------------------------QSWPSSR-NSISSLEVFP-ASPSPTIRSRG
DEDETQQWYPDEEIITDDDAEAREISVECPGIGTLVEGGPCEGRPGGEGRNEKSGKVG---------A-------ERSGK
GRSWSNSMDERSGKGRSRGNSMDERSDKGKSGSNSLDERSDKGKSGSNSLDERSDKGKSGSNSMDESFKTMTEAKRIITD
LVTYSGQGRSYVFKDMVSSSSSSDEQETTEEPPEKKLILEDRNKQKKIKFRRKVRTATDLSRKVCKGPSSSIEQQCDNKS
GWGSSQDSNENTGVPKKTNGIDCSTKPGFGP-GKQYDTDHQADIEDSVSSSETNN--YLHWAPRSHPVKPKPRPPALFTP
EERMLHRRQKILKLSHENMVRVKALQNHSAKDKDISELLREFTVTFLIAGYSKLVQDLLTKFMAKEQGRISMDKSHFLWL
VTYFLKFACPLEIGLEEIGSVLSVETIGYIIYEGVEIVETLKLASTRLPRSDTSPELRRMHLVVTAIREYLQVLYNYINI
NN-KTSNKQHLEQLKEQAGQMKALRQMLLLFIRSYNQETQSYQYLADLVVCNSLVLNNMES----STTELRNHMTQFANK
ELMEQYGHLLLDHTQNSVQVNDCVLTMLHHVSGDLEAPHTLYIPSILKTFVALSEEGGALYEDWVDLIHFTIQTFLQRMV
GG-----------------------------------------------EVEADPPTE----------------------
---------------HQTQV-------G---RTAGVEL-------------DTLNN------------------------
------------------------------------NEVHTLTNQIIKEGQGSLITWLQEFFIQVARVKLHPDRLAPAEG
QIL-DPVVFFYNYSQQSIPLIPFSTLQSKGCQTELFNRLLLELGFILPVDSGKVYPRIPHCWSAQHLYSVATQLGPLTDS
QLTVADLTA----------------LKEKTWYSSTSMQPLEESMDMMDVGLTSLELPNDRGQKQVARLQAGSGSAWMLLA
NLSK
Binary file not shown.
@@ -0,0 +1,20 @@
>Corystes_sp__CYC__TRINITY_DN36011_c0_g1_i1.p1_ORFtype_internal_len_155
PCSPIVLNYLIPMSFPNCPNLFFLSLSFPLFSYHSKLIYRSLIAFPPEQEVKGSSGRMFGLGNFDYPSKYRSECSSIASY
SSDNGSKKRRGSFLDSNDEDADSIKIPRTSGEWSKRQNHSEIEKRRRDKMNTYIMELSSIIPVCTSRKLDKLTVL-----
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
---------------------------------
>Corystes_sp__CYC__TRINITY_DN7745_c1_g2_i1.p1_ORFtype_3prime_partial_len_615
---------------------------------------------------------MFGLGNFDYP-KFRSECSSIASY
SSDNGSKKRRGSFLDSNDDEGDSIKIPRTSGEWNKRQNHSEIEKRRRDKMNTYIMELSSIIPVCTSRKLDKLTVLRMAVQ
HMKMLRGSLNSYTEGHYKPAFLSDDELKNLILQPLQAADGFLFVVGCDRGRILYVSESVYQTLHHTQGELLGTSWFDILH
PKDLTKVKEQLSCSDISRRERLVDAKTLLPVKTDVPQGLTKLCPGSRRAFFCRMRCKSAPVLKEEADSSTGCPKKKSKSQ
SSDKKYSVIHFTGYLKSWAPTKDPLEEDSGSDSESCNLSCLVAVGRVHQPLLSSGAQDAARFGRTVPPQSIDFISKHTSD
GKFVFIDQRASLLLGWLPQELLGSSMYEYFHQDDIPFLADTHRSTLQSSESCNTQVYRFRTKDGSFVRLQSVWWTFKNPW
TKDIEYIISKNSVVTSEAGLVESTMANDSVSQSFNSFNEFLSSPGNCIPDTPPPSNSASSNTNNRLIGGGMHAGKIGRQI
ADEVLDSQRRNDSASNSPVSPFEGILGTGASDRSFATLLRSDMTAHRSNVKNNVLLSSNTSSCSGSDTSRGPQLNSTTTT
TPNNNHHHNNHNHNHHHNNHHHHTNNSNIPASD
Binary file not shown.