LoadingOntologies
Ontologies describe the traits that can be characterized (measured, estimated, etc.) in a given crop. If an appropriate ontology does not exist for the crop in question, it has to be created. This can be a long task, as consensus is often hard to reach, especially in larger projects.
Recommended tools for creating and editing ontologies are DAGEdit or Protege.
If a suitable ontology is available, it can be loaded into Breedbase. Currently, only loading from the backend is supported for the initial loads. The ontology needs to be available in .obo format, which both DAGEdit and Protege can produce.
The ontology consists of separate types of terms: traits, methods, scales, and variables. In Breedbase, often only the traits and variables are loaded. A trait is an abstract description of the character under consideration, whereas a variable is a combination of a trait, a method, and a scale. Variables are the only entities that can have associated measurements.
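The relationship between the term types can be pictured with a small data model. This is only an illustrative sketch: the class names, field names, and example values are hypothetical and do not reflect how Breedbase or Chado store terms.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Trait:
    name: str  # abstract description of the character


@dataclass(frozen=True)
class Method:
    name: str  # how the trait is observed


@dataclass(frozen=True)
class Scale:
    name: str  # unit or category scheme used to record the value


@dataclass
class Variable:
    """A trait combined with a method and a scale."""
    trait: Trait
    method: Method
    scale: Scale
    # Only variables carry measurements; traits, methods and scales do not.
    measurements: List[float] = field(default_factory=list)

    def record(self, value: float) -> None:
        self.measurements.append(value)


# Hypothetical example values, for illustration only.
height_cm = Variable(Trait("plant height"), Method("ruler measurement"), Scale("cm"))
height_cm.record(152.0)
```

In this sketch a measurement can only be attached to a `Variable`, mirroring the rule that traits on their own cannot hold data.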
When the -u option is not used, the name and db name of the ontology are read from the command line (-n and -s options) and inserted into the database.
The dbname is often of the form CO_NNN for CropOntology ontologies. The dbname is used as a prefix for the numeric code of the term, such as GO:0001234 or CO_332:0063636.
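To make the term ID structure concrete, the following minimal sketch splits an ID into its dbname prefix and numeric code. The function name `split_accession` is hypothetical and is not part of Breedbase or Chado; the sketch assumes the prefix and code are separated by a single colon, as in the examples above.

```python
from typing import Tuple


def split_accession(accession: str) -> Tuple[str, str]:
    """Split a term ID such as 'CO_332:0063636' into (dbname, numeric code)."""
    db_name, sep, code = accession.partition(":")
    if not sep or not db_name or not code.isdigit():
        raise ValueError(f"not a dbname:code term ID: {accession!r}")
    # The code is kept as a string so that leading zeros are preserved.
    return db_name, code


print(split_accession("CO_332:0063636"))
print(split_accession("GO:0001234"))
```

The dbname part returned here is the value that would be passed with the -s option when loading the ontology.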
The cvname has to match the cv name in the obo file.
To load the ontology, use the script gmod_load_cvterms.pl in the Chado repo at Chado/chado/bin/gmod_load_cvterms.pl:
perl gmod_load_cvterms.pl -s CO_NNN -n cvname -u -v -H breedbase_db -D breedbase -p password -r postgres -d Pg file.obo
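Before running the load, it can help to check which cv names the .obo file actually declares, so that the value passed with -n matches. The sketch below assumes that the cv name corresponds to the `default-namespace:` header tag or to `namespace:` tags in the term stanzas of the OBO file; that mapping is an assumption and should be confirmed against the loader. The helper name `obo_namespaces` and the sample text are hypothetical.

```python
def obo_namespaces(obo_text: str) -> set:
    """Collect the values of default-namespace: and namespace: tags."""
    names = set()
    for line in obo_text.splitlines():
        line = line.strip()
        for tag in ("default-namespace:", "namespace:"):
            if line.startswith(tag):
                # Drop any trailing '! comment' and surrounding whitespace.
                value = line[len(tag):].split("!", 1)[0].strip()
                if value:
                    names.add(value)
    return names


# Hypothetical sample content; a real file would be read from disk.
sample = """format-version: 1.2
default-namespace: cassava_trait

[Term]
id: CO_NNN:0000001
name: example trait
namespace: cassava_trait
"""

print(obo_namespaces(sample))
```

If the set contains more than one name, or a name different from the one intended for -n, the file is worth inspecting before loading.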
The loaded ontology has to be indexed for certain features, such as the ontology browser, to work correctly:
perl gmod_make_cvtermpath.pl -c cvname -v -D breedbase -H breedbase_db -u postgres -p password
The variables can have associated limits, which will be exported to the Fieldbook app. There are two main types of variables, qualitative and numeric. Numeric variables can have lower and upper limits, whereas qualitative variables can specify category names.
This metadata can be specified in an Excel (.xls) file with the following columns:
trait_name trait_format trait_default_value trait_minimum trait_maximum trait_categories trait_details
The loading script is in the sgn repository, under bin/:
perl load_trait_props.pl -H breedbase_db -D breedbase -I inputfile.xls
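A quick sanity check of the rows before loading can catch obvious mistakes. The sketch below is not part of load_trait_props.pl; the function name `check_trait_row` is hypothetical, and it assumes that trait_format holds the values numeric or qualitative and that each spreadsheet row has already been read into a dictionary keyed by column name. No assumption is made about how category names are delimited inside trait_categories.

```python
COLUMNS = [
    "trait_name", "trait_format", "trait_default_value",
    "trait_minimum", "trait_maximum", "trait_categories", "trait_details",
]


def check_trait_row(row: dict) -> list:
    """Return a list of human-readable problems found in one row."""
    problems = []
    missing = [c for c in COLUMNS if c not in row]
    if missing:
        problems.append("missing columns: " + ", ".join(missing))

    fmt = str(row.get("trait_format") or "").strip().lower()
    if fmt == "numeric":
        low, high = row.get("trait_minimum"), row.get("trait_maximum")
        try:
            # Limits are optional; compare them only when both are given.
            if low not in (None, "") and high not in (None, ""):
                if float(low) > float(high):
                    problems.append("trait_minimum is greater than trait_maximum")
        except (TypeError, ValueError):
            problems.append("trait_minimum and trait_maximum must be numbers")
    elif fmt == "qualitative":
        if not str(row.get("trait_categories") or "").strip():
            problems.append("qualitative trait has no trait_categories")
    return problems
```

An empty list means the row passed these basic checks; it does not guarantee that the loading script will accept it.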
Not all ontologies in the system are displayed in the ontology browser on the website by default.
The ontologies that are supposed to be displayed have to be configured in the sgn_local.conf file, using the onto_root_namespaces parameter, for example:
onto_root_namespaces GO (Gene Ontology), PO (Plant Ontology), SO (Sequence Ontology), COMP (Composed Variables)
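The value of this parameter is a comma-separated list in which each entry pairs a dbname prefix with a display name in parentheses. The following sketch only illustrates that structure; the function name `parse_root_namespaces` is hypothetical, and this is not how Breedbase itself reads the setting. It assumes display names contain no commas.

```python
import re


def parse_root_namespaces(value: str) -> dict:
    """Map each prefix to its display name, e.g. {'GO': 'Gene Ontology'}."""
    result = {}
    for entry in value.split(","):
        match = re.fullmatch(r"\s*(\S+)\s*\((.*)\)\s*", entry)
        if match:
            result[match.group(1)] = match.group(2).strip()
    return result


value = ("GO (Gene Ontology), PO (Plant Ontology), "
         "SO (Sequence Ontology), COMP (Composed Variables)")
print(parse_root_namespaces(value))
```

A newly loaded ontology would be listed here with the dbname given to the -s option at load time, assuming the prefix corresponds to that dbname.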
To create combinations of terms between two parent ontologies, such as combinations of trait variables with time terms, post-composed terms can be defined.
Both parent ontologies have to be loaded into the database.
The post-composed variables can currently be created on the post-composing page at the URL /tools/compose.
To specify which terms can be post-composed, the cvprop table in the database has to be populated correctly. For example, in the fixture database, this table contains:
cxgn_fixture=# select * from cvprop;
 cvprop_id | cv_id | type_id | value | rank
-----------+-------+---------+-------+------
         1 |    58 |   77542 |       |    0
         2 |    64 |   77541 |       |    0
         3 |    61 |   77545 |       |    0
         4 |    59 |   77543 |       |    0
         5 |    16 |   77540 |       |    0
         6 |    62 |   77546 |       |    0
(6 rows)
A cv_id of 16 is the cassava_trait ontology, and a cv_id of 62 is the cxgn_time_ontology. The type_ids refer to entries in the composable_cvtypes ontology and specify the type of each ontology, which determines how the ontologies can be combined. For example, 77540 is the trait_ontology, whereas 77546 specifies the time_ontology:
cxgn_fixture=# select cvterm_id, cv_id, name from cvterm where cv_id=63;
 cvterm_id | cv_id | name
-----------+-------+-------------------------
     77540 |    63 | trait_ontology
     77541 |    63 | composed_trait_ontology
     77542 |    63 | object_ontology
     77543 |    63 | attribute_ontology
     77544 |    63 | method_ontology
     77545 |    63 | unit_ontology
     77546 |    63 | time_ontology
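The two listings are linked through cvprop.type_id and cvterm.cvterm_id. As a sketch, the join below reproduces the fixture rows shown above in an in-memory SQLite database and resolves each cv_id to its composable type. The table definitions are deliberately simplified assumptions (the real Chado tables have more columns and constraints), and the empty value column is represented here as NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified stand-ins for the Chado tables, for illustration only.
    CREATE TABLE cvprop (cvprop_id INTEGER, cv_id INTEGER, type_id INTEGER,
                         value TEXT, rank INTEGER);
    CREATE TABLE cvterm (cvterm_id INTEGER, cv_id INTEGER, name TEXT);
""")

# Rows copied from the fixture database listings.
conn.executemany(
    "INSERT INTO cvprop VALUES (?, ?, ?, NULL, 0)",
    [(1, 58, 77542), (2, 64, 77541), (3, 61, 77545),
     (4, 59, 77543), (5, 16, 77540), (6, 62, 77546)],
)
conn.executemany(
    "INSERT INTO cvterm VALUES (?, 63, ?)",
    [(77540, "trait_ontology"), (77541, "composed_trait_ontology"),
     (77542, "object_ontology"), (77543, "attribute_ontology"),
     (77544, "method_ontology"), (77545, "unit_ontology"),
     (77546, "time_ontology")],
)

# Resolve each configured cv to the name of its composable type.
rows = conn.execute("""
    SELECT cvprop.cv_id, cvterm.name
    FROM cvprop
    JOIN cvterm ON cvterm.cvterm_id = cvprop.type_id
    ORDER BY cvprop.cvprop_id
""").fetchall()
composable_types = dict(rows)
print(composable_types)
```

In this fixture data no cv is registered with the method_ontology type, so that type does not appear in the result.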