LivingAtlas: Additional fields for SpeciesListPipeline (ARGA) #865
base: dev
Update to GBIF latest dev
Initial commit adding new field `presentInCountry`, set via a species list.
Refactored code to remove references to trait names in IndexRecordTransform, so new traits only need to be added to IndexFields and managed-schema going forward.
Changed error to warning and a few other minor changes
ARGA-94 - ARGA list-based fields
-import java.util.ArrayList;
-import java.util.Iterator;
-import java.util.List;
+import java.util.*;
Just noticed this - IDE did this and it might break coding rules?
@@ -73,4 +74,7 @@ public interface IndexFields {
 String GGBN_TERMS_LOAN = "http://data.ggbn.org/schemas/ggbn/terms/Loan";
 String LOAN_DESTINATION_TERM = "http://data.ggbn.org/schemas/ggbn/terms/loanDestination";
 String LOAN_IDENTIFIER_TERM = "http://data.ggbn.org/schemas/ggbn/terms/loanIdentifier";
+String AUS_TRAITS_FIRE_RESPONSE = "fire_response";
Convert from snake case to camel case.
@@ -51,6 +51,7 @@ public interface IndexFields {
 String POINT_0_02 = "point-0.02";
 String POINT_0_1 = "point-0.1";
 String POINT_1 = "point-1";
+String PRESENT_IN_COUNTRY = "presentInCountry";
Wondering why we need this. Can't the data just provide `countryCode`?
It's the field name, so data will look like `presentInCountry:Australia` or `presentInCountry:Italy`. Could use a country code I suppose (`presentInCountry:AU`), but the data comes from the `region` field in the species list, which uses the full name, so that would require an additional lookup.
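For illustration only, the additional lookup mentioned above could look roughly like the following. This is a minimal sketch, not code from this PR: the class name `CountryCodeLookup` and method `toIsoCode` are hypothetical, and it assumes the `region` values match the English country names known to the JDK's `java.util.Locale` data, which may not hold for every species list.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper (not part of the PR): maps a full English country
// name, as found in a species list `region` column, to an ISO 3166-1
// alpha-2 code using only the JDK's built-in locale data.
public class CountryCodeLookup {

    // Lower-cased English display name -> two-letter ISO country code
    private static final Map<String, String> NAME_TO_CODE = new HashMap<>();

    static {
        for (String code : Locale.getISOCountries()) {
            Locale locale = new Locale.Builder().setRegion(code).build();
            String name = locale.getDisplayCountry(Locale.ENGLISH);
            NAME_TO_CODE.put(name.toLowerCase(Locale.ROOT), code);
        }
    }

    // Returns the ISO code for a country name, or empty if the name is
    // null or not recognised.
    public static Optional<String> toIsoCode(String regionName) {
        if (regionName == null) {
            return Optional.empty();
        }
        String key = regionName.trim().toLowerCase(Locale.ROOT);
        return Optional.ofNullable(NAME_TO_CODE.get(key));
    }

    public static void main(String[] args) {
        System.out.println(toIsoCode("Australia").orElse("unknown")); // prints "AU"
        System.out.println(toIsoCode("Italy").orElse("unknown"));     // prints "IT"
    }
}
```

Names that the JDK does not recognise (alternative spellings, sub-national regions) would come back empty, which is one reason indexing the full name directly is the simpler option here.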
I've renamed the field to `taxonPresentInCountry` now, based on feedback from Dave M. This avoids confusion with the DwC term `country`.
Pls assign to @djtfmartin.
This PR contains changes for additional fields to be processed during the `specieslist` phase of the `index` pipeline. There are two sets of fields:
- `locatedInCountry`: a string field containing a single value, the (ISO) country name, populated via a species list. The idea is that the taxon is known to be located in that country. ARGA uses this for the large percentage of records that have no location data. The config var `includePresentInCountry` defaults to `false`.
- Traits: fields populated via species lists that are of type `COMMON_TRAIT` and contain the additional columns `traitName` and `traitValue`. Traits defined in `IndexFields.java` and the (SOLR) `schema.xml` will be indexed as multi-value, but traits from any additional lists (added without changes to those files) will still be indexed as dynamic fields. The config var `includeTraits` defaults to `false`.
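
The trait handling described above can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the PR's implementation: the class `TraitFieldSketch`, the `TraitRow` type, the `SCHEMA_TRAITS` set and the `dynamic_` prefix are all hypothetical names chosen for the example. It only shows the routing idea: traits known to the schema go to a named multi-value field, anything else falls back to a dynamic field, and nothing is emitted unless `includeTraits` is enabled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch (not the PR's code) of routing traitName/traitValue
// rows from a species list into index fields.
public class TraitFieldSketch {

    // Assumption: trait names that have a constant and a multi-valued schema entry.
    static final Set<String> SCHEMA_TRAITS = new HashSet<>(Arrays.asList("fire_response"));

    // Assumption: illustrative prefix for fields not declared in the schema.
    static final String DYNAMIC_PREFIX = "dynamic_";

    // One row of a trait list: the two additional columns.
    static final class TraitRow {
        final String traitName;
        final String traitValue;

        TraitRow(String traitName, String traitValue) {
            this.traitName = traitName;
            this.traitValue = traitValue;
        }
    }

    // Groups trait values by target field name; returns an empty map when
    // traits are disabled (mirroring a default-false config flag).
    static Map<String, List<String>> toIndexFields(List<TraitRow> rows, boolean includeTraits) {
        Map<String, List<String>> fields = new LinkedHashMap<>();
        if (!includeTraits) {
            return fields;
        }
        for (TraitRow row : rows) {
            String field = SCHEMA_TRAITS.contains(row.traitName)
                    ? row.traitName
                    : DYNAMIC_PREFIX + row.traitName;
            fields.computeIfAbsent(field, k -> new ArrayList<>()).add(row.traitValue);
        }
        return fields;
    }

    public static void main(String[] args) {
        List<TraitRow> rows = Arrays.asList(
                new TraitRow("fire_response", "resprouts"),
                new TraitRow("fire_response", "fire_killed"),
                new TraitRow("leaf_length", "12"));
        System.out.println(toIndexFields(rows, true));
        System.out.println(toIndexFields(rows, false));
    }
}
```

With this shape, adding a new schema-backed trait means touching only the set of known names (and the schema), which is the same maintenance property the refactoring commit aims for.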