
Tutorial

Nadja Brait edited this page Jul 18, 2024 · 1 revision

Here, we demonstrate an example run of detectEVE on a small dataset. We assume the tool has already been installed and the default databases have been set up (see Installation and database setup).

Make sure that the detectEVE environment is activated and up to date:

mamba activate detectEVE
mamba env update --file workflow/envs/env.yaml

We will be using the following files from the examples folder: ATLV01_cut.fna and Flavi_ref.fasta. The folder and its files are downloaded together with detectEVE and are located in path/to/detectEVE/examples.


In this tutorial, we run an EVE screening with one of two viral protein databases:

  • RVDB(80) - provided during database setup (see Installation and database setup)
  • a small custom database of Flavivirus sequences - Flavi_ref.fasta from the examples folder

If you would like to try the custom database, you first need to build a binary DIAMOND database file from Flavi_ref.fasta. If you only want to use RVDB.dmnd, skip this step:

cd path/to/detectEVE/examples
diamond makedb --in Flavi_ref.fasta --db Flavi
mv Flavi.dmnd ../databases
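Before editing the configuration, it can help to confirm that the database file ended up where detectEVE will look for it. The following is a minimal sketch, not part of detectEVE: check_dmnd is a hypothetical helper that only tests that the file has the .dmnd extension, exists and is non-empty.

```shell
# Hypothetical helper (not part of detectEVE): check that a DIAMOND
# database file has the .dmnd extension, exists and is non-empty.
check_dmnd() {
    db_file="$1"
    case "$db_file" in
        *.dmnd) ;;
        *) echo "not a .dmnd file: $db_file" >&2; return 1 ;;
    esac
    if [ ! -s "$db_file" ]; then
        echo "missing or empty: $db_file" >&2
        return 1
    fi
    echo "ok: $db_file"
}
```

From the examples folder, it could be called as check_dmnd ../databases/Flavi.dmnd. This does not validate the file contents; running diamond dbinfo --db ../databases/Flavi.dmnd should additionally print a summary of the database.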

Next, the config.yaml file needs to be adjusted to point to Flavi.dmnd. Skip this step if you would like to use RVDB.dmnd:

  1. Open config.yaml with your favourite text editor and change the db entry from rvdb80.dmnd to Flavi.dmnd:
  # NOTE: with custom databases make sure to set: taxonlist: ""  (empty string)
  #db: "rvdb80.dmnd"
  db: "Flavi.dmnd"
  2. If the custom database was not built with DIAMOND taxonomy options (as is the case with Flavi.dmnd), change the taxonlist setting to "" (empty string):
  taxonlist: ""
  #taxonlist: "--taxonlist 2732396,2731342" # screen only for Orthornaviridae and Monodnaviridae
  # taxonlist: "" <<< use with custom dbs that haven't been built with a diamond taxonomy !
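The two manual edits can also be scripted. The following is a minimal sketch, not part of detectEVE: set_custom_db is a hypothetical helper that assumes the active (uncommented) db and taxonlist keys each sit on their own line, as in the snippets shown here, and that the database name is a plain file name without | or & characters. It prints the modified configuration to standard output and leaves the original file untouched.

```shell
# Hypothetical helper (not part of detectEVE): print a copy of a config
# file in which the active db entry points to the given database and the
# active taxonlist entry is set to an empty string.
# Commented-out lines (starting with '#') are left unchanged.
set_custom_db() {
    config="$1"
    db_name="$2"
    sed -E \
        -e "s|^([[:space:]]*)db:.*|\1db: \"${db_name}\"|" \
        -e "s|^([[:space:]]*)taxonlist:.*|\1taxonlist: \"\"|" \
        "$config"
}
```

For example, set_custom_db config.yaml Flavi.dmnd > config.custom.yaml writes the result to a new file (the name config.custom.yaml is arbitrary), which can be reviewed before it replaces config.yaml. Note that any trailing comment on the rewritten lines is dropped.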

We now conduct an EVE search against our target viral database using the example genome assembly file ATLV01_cut.fna:

cd /path/to/detectEVE
./detectEVE examples/ATLV01_cut.fna

Unless otherwise specified, detectEVE creates an output folder named detectEVE-(time). If --cores is not specified, detectEVE uses all available cores. Here are some examples of additional parameters:

# examples
./detectEVE -o test examples/ATLV01_cut.fna # save output in folder 'test'
./detectEVE --snake '--cores 8' examples/ATLV01_cut.fna # use a maximum of 8 threads 

# see help page for more options
./detectEVE -h

After some time, you will receive your output files with the following EVE hits:

ATLV01_cut-validatEVEs.tsv after RVDB.dmnd search:

eve_id  confidence      eve_score       suggests        because locus   top_evalue      top_pident      top_desc        top_viral_desc  top_viral_lineage       max_count_phylum
ATLV01_cut_EVE001       high    96      viral (19), maybe-viral (1)     VDB (13), UDB Viruses (6), uncharacterized protein (1)  ATLV01019207.1_3580-4280:-      3.01e-73        54.7    polyprotein [Karumba virus] acc=YP_009388577.1  polyprotein [Karumba virus] acc=YP_009388577.1  k__Viruses;K__Orthornavirae;p__Kitrinoviricota;c__Flasuviricetes;o__Amarillovirales;f__Flaviviridae;g__unclassified Flaviviridae genus;s__Karumba virus Kitrinoviricota
ATLV01_cut_EVE002       high    95      viral (19), maybe-viral (1)     VDB (13), UDB Viruses (5), glycoprotein protein (1), viral (1)  ATLV01019207.1_2615-3409:-      1.8099999999999996e-112 60.5    putative glycoprotein [Anopheles darlingi virus] acc=QBK47202.1         putative glycoprotein [Anopheles darlingi virus] acc=QBK47202.1         k__Viruses;K__Orthornavirae;p__Negarnaviricota;c__Monjiviricetes;o__Mononegavirales;f__Xinmoviridae;g__Madalivirus;s__Madalivirus amazonaense   Negarnaviricota
ATLV01_cut_EVE003       high    94      viral (15), maybe-viral (1)     VDB (12), UDB Viruses (3), glycoprotein protein (1)     ATLV01019207.1_1147-1368:-      3.46e-18        48.6    putative glycoprotein [Gambie virus] acc=AOR51379.1     putative glycoprotein [Gambie virus] acc=AOR51379.1     k__Viruses;K__Orthornavirae;p__Negarnaviricota;c__Monjiviricetes;o__Mononegavirales;f__Xinmoviridae;g__Gambievirus;s__Gambievirus senegalense   Negarnaviricota

ATLV01_cut-validatEVEs.tsv after Flavi.dmnd search:

eve_id  confidence      eve_score       suggests        because locus   top_evalue      top_pident      top_desc        top_viral_desc  top_viral_lineage       max_count_phylum
ATLV01_cut_EVE001       high    96      viral (19), maybe-viral (1)     VDB (13), UDB Viruses (6), uncharacterized protein (1)  ATLV01019207.1_3580-4280:-      3.1e-76 54.7    polyprotein [Karumba virus]     Genome polyprotein (Fragment) n=3 Tax=unclassified Flaviviridae  RepID=A0A1C9U5I9_9FLAV k__Viruses;K__Orthornavirae;p__Kitrinoviricota;c__Flasuviricetes;o__Amarillovirales;f__Flaviviridae;g__unclassified Flaviviridae genus;s__unclassified Flaviviridae species     unclassified root phylum
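
To get a quick overview of such a result table, the hits can be filtered by confidence level on the command line. The following is a minimal sketch, not part of detectEVE: list_eves is a hypothetical helper that assumes the file is tab-separated and that its header row contains the column names eve_id, confidence, eve_score and locus, as shown above.

```shell
# Hypothetical helper (not part of detectEVE): print eve_id, eve_score
# and locus for all hits with the requested confidence (default: high).
# Columns are looked up by name from the header row, not by position.
list_eves() {
    tsv="$1"
    level="${2:-high}"
    awk -F '\t' -v level="$level" '
        NR == 1 {
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        $(col["confidence"]) == level {
            print $(col["eve_id"]) "\t" $(col["eve_score"]) "\t" $(col["locus"])
        }
    ' "$tsv"
}
```

It could be called as list_eves path/to/ATLV01_cut-validatEVEs.tsv high, where the path depends on the output folder chosen for the run.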