Indexing and Searching the Time Period of the Content
As of version 1.2.4, the Geoportal Server supports indexing and searching the time period of data content described in a metadata document. There are many ways to describe the time period of a data resource: does the data resource have one continuous time range, is it a set of observations with many time ranges, or is it a single date? Some metadata standards also allow general descriptions to characterize time, such as "Present" or "Now". These have been accommodated in the Geoportal Time Period of Content search.
Your geoportal may or may not have a user interface that supports querying of time content; however, you can still search the time period of content through the geoportal's REST API. This topic will describe how to execute time-based queries in the Geoportal Search page and also the backend details of how time period search is supported in the geoportal web application.
The index in the geoportal that can be used for querying time period is called "timeperiod". Datetime strings are ISO 8601. If a time zone is not supplied, the JVM default for the web server is used (this applies to documents that are indexed as well).
Bracket characters, [ ], are used for an "intersecting" timeframe. A document intersects when any of its intervals intersects the query interval. Use this when you want to discover data resources that cover a specific period of time and it is acceptable for the time periods to overlap only partially. For example, if the period of a data source starts in 2001 and ends in 2005, and your search criteria run from the year 2002 through 2004, the record is returned.
Curly brace characters, { }, are used for a "fully within" timeframe. A document is fully within when the document's minimum lower boundary and maximum upper boundary are both within the query interval. Use this when every resource you discover must fall entirely inside the period you specify. For example, if the period of a data source starts in 2007 and ends in 2011, and your search criteria run from 2006 through 2009, the record is not returned because it is not fully within the 2006-2009 timeframe.
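The difference between the two operations can be sketched in a few lines of Python. This is only an illustration of the definitions above, not the geoportal's own implementation; the function names and the use of whole years as the unit are assumptions made for the example.

```python
from typing import List, Tuple

# (lower, upper), both inclusive; any comparable unit works (years here)
Interval = Tuple[int, int]

def intersects(doc_intervals: List[Interval], query: Interval) -> bool:
    """True when any document interval overlaps the query interval."""
    q_lo, q_hi = query
    return any(lo <= q_hi and hi >= q_lo for lo, hi in doc_intervals)

def within(doc_intervals: List[Interval], query: Interval) -> bool:
    """True when the document's minimum lower boundary and maximum
    upper boundary both lie inside the query interval."""
    if not doc_intervals:
        return False
    q_lo, q_hi = query
    doc_lo = min(lo for lo, _ in doc_intervals)
    doc_hi = max(hi for _, hi in doc_intervals)
    return q_lo <= doc_lo and doc_hi <= q_hi

# The two examples from the text
print(intersects([(2001, 2005)], (2002, 2004)))  # True
print(within([(2007, 2011)], (2006, 2009)))      # False
```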
Example queries:
This query is an intersecting timeframe, from the first millisecond of 2008 to the last millisecond of 2012, because no specific days or months are given. The Z stands for Zulu time (UTC):
timeperiod:[2008Z TO 2012Z]
This query is intersecting, from the first millisecond of June 2008 to the last millisecond of July 2012:
timeperiod:[2008-06 TO 2012-07]
This query is intersecting, from the first millisecond of 2008 to the present moment:
timeperiod:[2008 TO *]
This query is intersecting, with no specific beginning time period but ending the last millisecond of 2012:
timeperiod:[* TO 2012]
This query is intersecting, from the first millisecond of 2008 to a very specific millisecond on July 1, 2012:
timeperiod:[2008 TO 2012-07-01T01:00:00-08:00]
This query is for records that are fully within the first millisecond of the year 2000 to the last millisecond of the year 2012:
timeperiod:{2000 TO 2012}
Two additional property names can be used to explicitly choose an operation: timeperiod.intersects and timeperiod.within. Note that there is no REST field with either of these names; instead, the '.intersects' and '.within' suffixes extend timeperiod to make the operation explicit. This is especially useful in CS-W, for example:
<ogc:PropertyIsBetween>
<ogc:PropertyName>timeperiod.within</ogc:PropertyName>
<ogc:LowerBoundary>2000</ogc:LowerBoundary>
<ogc:UpperBoundary>2012</ogc:UpperBoundary>
</ogc:PropertyIsBetween>
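If such a filter fragment needs to be generated by a client, it could be built along the following lines. This is a minimal sketch using Python's standard library; the helper name between_filter is hypothetical, and the fragment would still need to be embedded in a complete CS-W GetRecords request.

```python
import xml.etree.ElementTree as ET

# Namespace of the OGC Filter Encoding elements used in the example above
OGC_NS = "http://www.opengis.net/ogc"
ET.register_namespace("ogc", OGC_NS)

def between_filter(property_name: str, lower: str, upper: str) -> str:
    """Build an ogc:PropertyIsBetween fragment shaped like the example above."""
    root = ET.Element(f"{{{OGC_NS}}}PropertyIsBetween")
    ET.SubElement(root, f"{{{OGC_NS}}}PropertyName").text = property_name
    ET.SubElement(root, f"{{{OGC_NS}}}LowerBoundary").text = lower
    ET.SubElement(root, f"{{{OGC_NS}}}UpperBoundary").text = upper
    return ET.tostring(root, encoding="unicode")

print(between_filter("timeperiod.within", "2000", "2012"))
```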
A document can have many time intervals. Each time interval is stored within a pair of boundary fields. A boundary (Lucene NumericField - Long) represents milliseconds since the epoch (January 1, 1970, 00:00:00 UTC). Upper boundaries represent the final millisecond of the interval. For the time interval 2011, the first millisecond of 2011 is stored as the lower boundary, the final millisecond of 2011 is stored as the upper boundary.
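To make the boundary arithmetic concrete, the sketch below expands a partial date into a lower and upper boundary in epoch milliseconds. It is an illustration only: the function names are hypothetical, and it assumes UTC, whereas the geoportal uses the JVM default time zone when none is supplied.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def to_millis(dt: datetime) -> int:
    """Whole milliseconds since the epoch (integer arithmetic, no floats)."""
    return (dt - EPOCH) // timedelta(milliseconds=1)

def boundaries(value: str):
    """Expand 'YYYY', 'YYYY-MM' or 'YYYY-MM-DD' into (lower, upper) millis.

    The upper boundary is the final millisecond of the interval.
    UTC is assumed here for simplicity.
    """
    parts = [int(p) for p in value.split("-")]
    if len(parts) == 1:
        start = datetime(parts[0], 1, 1, tzinfo=timezone.utc)
        end = datetime(parts[0] + 1, 1, 1, tzinfo=timezone.utc)
    elif len(parts) == 2:
        year, month = parts
        start = datetime(year, month, 1, tzinfo=timezone.utc)
        # First day of the following month, rolling over in December
        end = datetime(year + month // 12, month % 12 + 1, 1, tzinfo=timezone.utc)
    elif len(parts) == 3:
        start = datetime(*parts, tzinfo=timezone.utc)
        end = start + timedelta(days=1)
    else:
        raise ValueError("expected YYYY, YYYY-MM or YYYY-MM-DD")
    return to_millis(start), to_millis(end) - 1

print(boundaries("2011"))  # (1293840000000, 1325375999999)
```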
- timeperiod.l.0 timeperiod.u.0 - lower/upper boundaries for the document
- timeperiod.l.1 timeperiod.u.1 - lower/upper boundaries for interval [n]
- timeperiod.l.2 timeperiod.u.2
- timeperiod.l.3 timeperiod.u.3
- timeperiod.l.[n] timeperiod.u.[n]
- timeperiod.imeta - metadata per time interval
- timeperiod.meta - summary metadata
- timeperiod.num - the number of intervals for the document (NumericField)
- isDeterminate
- is1Determinate
- isIndeterminate
- is1Indeterminate
- isUnknown
- hasDeterminate
- hasUnknown
- hasNow
- hasLowerNow
- hasUpperNow
- wasInvalid
- wasEmpty
Some additional queries:
Return all documents that have 2 or more intervals:
timeperiod.num:[2 TO *]
Return all documents that have fully determinate intervals:
timeperiod.meta:isDeterminate
During the process of indexing a document, values associated with the property timeperiod.analyze will be analyzed to determine the time period intervals for the document. The values for analysis are expected to follow a specific format.
For FGDC documents:
- tp.position.date.fgdctime.time
- tp.begin.date.fgdctime.time.end.date.fgdctime.time
<property xpath="/metadata/idinfo/timeperd/timeinfo/mdattim/sngdate">
<property meaning="timeperiod.analyze" xpathType="STRING"
xpath="concat('tp.position.',caldate,'.fgdctime.',time)"/>
</property>
For documents that describe time with gml:TimePeriod (for example, ISO 19139 metadata):
- tp.position.datetime.indeterminate.value
- tp.begin.datetime.indeterminate.value.end.datetime.indeterminate.value
<property xpath="//gml:TimePeriod">
<property meaning="timeperiod.analyze" xpathType="STRING"
xpath="concat('tp.begin.',gml:beginPosition,'.indeterminate.',gml:beginPosition/@indeterminatePosition,'.end.',gml:endPosition,'.indeterminate.',gml:endPosition/@indeterminatePosition)"/>
</property>
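The string that the concat() expression above produces can be mimicked in Python to see what the analyzer receives. This is a sketch of the string format only, not of the analyzer itself; the function and parameter names are hypothetical. In XPath 1.0 a missing node converts to an empty string, which the empty defaults imitate.

```python
def analyze_value(begin: str = "", begin_indeterminate: str = "",
                  end: str = "", end_indeterminate: str = "") -> str:
    """Mirror the gml:TimePeriod concat() expression shown above.

    Missing elements or attributes become empty strings, as they
    would in an XPath 1.0 concat().
    """
    return ("tp.begin." + begin
            + ".indeterminate." + begin_indeterminate
            + ".end." + end
            + ".indeterminate." + end_indeterminate)

# A range that starts on a known date and is still ongoing
print(analyze_value(begin="2010-04-16", end_indeterminate="now"))
# tp.begin.2010-04-16.indeterminate..end..indeterminate.now
```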
Recognized indeterminates are the following words when put in metadata time elements:
- unknown
- now
- present
- after
- before
- now and present are equivalent.
- Documents containing an unknown indeterminate will not be matched by the "within" operation.
- Documents containing multiple intervals plus a now or present indeterminate will only be matched by the "within" operation if the intervals are sequential and the indeterminate has been declared for the end of the highest range. For example: 2008-08-01..2009-08-31, 2009-09-01..2010-04-15, 2010-04-16..now
- Documents containing an after or before indeterminate will not have their time periods indexed by default.
The following parameters can be configured within gpt.xml:
- timeperiod.allowAfterAndBefore - if true then accept dates associated with the after and before indeterminates (default=false)
- timeperiod.allowOpenEndedRange - if true then treat an open ended range (empty or 'unknown') as a single date (default=true)
- timeperiod.maxIntervalsPerDocument - the maximum number of intervals to index per document, documents that exceed the maximum will not have their time periods indexed (default=50)
<parameter key="timeperiod.allowAfterAndBefore" value="false"/>
<parameter key="timeperiod.allowOpenEndedRange" value="true"/>
<parameter key="timeperiod.maxIntervalsPerDocument" value="50"/>
Example REST URL:
http://your_server/geoportal/rest/find/document?f=xjson&searchText=timeperiod.imeta%3Adeterminate.0
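Because time period queries contain colons, brackets and spaces, they need to be URL-encoded before being placed in searchText. The sketch below shows one way to assemble such a URL with Python's standard library; the helper name find_url is hypothetical, and your_server is the same placeholder used in the example above.

```python
from urllib.parse import quote, urlencode

def find_url(base: str, search_text: str, fmt: str = "xjson") -> str:
    """Build a REST find URL, percent-encoding the query text.

    quote_via=quote encodes spaces as %20 rather than '+'.
    """
    params = urlencode({"f": fmt, "searchText": search_text}, quote_via=quote)
    return base + "/rest/find/document?" + params

base = "http://your_server/geoportal"
print(find_url(base, "timeperiod.imeta:determinate.0"))
print(find_url(base, "timeperiod:[2008 TO 2012]"))
```

The first call reproduces the example URL above; the second encodes an intersecting range query.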