-
Notifications
You must be signed in to change notification settings - Fork 39
api
PUT /dbname
will create the dbname
database. Example using curl:
curl -X PUT http://localhost:3133/dbname
GET /_all_dbs
will list all known databases on the server. Example:
curl http://localhost:3133/_all_dbs
yields
["dbname"]
GET /dbname
will give you some info about the database.
Example using curl:
curl http://localhost:3133/dbname
should produce the following output:
{"deleted_count":0,
"doc_count":3329786,
"header_pos":750080000,
"last_seq":3329788,
"space_used":210578761}
In the above case, you'll notice that header_pos
is way beyond
space_used
, indicating this database may benefit from compaction.
POST /dbname/_compact
will clean up extra space in dbname
.
Example using curl:
curl -X POST http://localhost:3133/dbname/_compact
Online compaction works in a (mostly) non-blocking way. If you exceed
a database's maxOpQueue
during compaction, writes to that database
will begin blocking. If you feel that this may affect you, adjust
maxOpQueue
appropriately and/or compact during a low period.
Queries will never be blocked.
DELETE /dbname
will delete the dbname
database. Example using
curl:
curl -X DELETE http://localhost:3133/dbname
To store a JSON document with a system-generated timestamp, just
POST
to /dbname
. Example:
curl -X POST -d @sample.json http://localhost:3133/testdb
If you perform the same post as above, but add a ts
parameter to the
URL, e.g.:
curl -d@/tmp/test1.json 'http://localhost:3133/t?ts=1346189075374651880'
the document will be stored with your user-specified timestamp. Several input formats are available. All of the following will be stored with the same key (± resolution as specified):
-
2012-08-28T21:24:35.37465188Z
- RFC3339 (this is the canonical format) -
1346189075374651880
- nanoseconds since 1970-1-1
-
1346189075374
- milliseconds since 1970-1-1, common in java
-
1346189075
- seconds since 1970-1-1, common in unix -
2012-08-28T21:24:35Z
- RFC3339 -
Tue, 28 Aug 2012 21:24:35 +0000
- RFC1123 + numeric timezone -
Tue, 28 Aug 2012 21:24:35 UTC
RFC1123 -
Tue Aug 28 21:24:35 UTC 2012
- Unix date -
Tue Aug 28 21:24:35 2012
- ansi C timestamp -
Tue Aug 28 21:24:35 +0000 2012
- ruby datestamp
These formats are also available for posting data, but are generally more useful for querying.
2012-08-28T21:24
This will be considered the same as 2012-08-28T21:24:00Z
2012-08-28T21
This will be considered the same as 2012-08-28T21:00:00Z
2012-08-28
This will be considered the same as 2012-08-28T00:00:00Z
2012-08
This will be considered the same as 2012-08-01T00:00:00Z
2012
This will be considered the same as 2012-01-01T00:00:00Z
Querying data in seriesly is generally about determining what happened within a given time window. When formulating a query, there are a few things you need to consider:
- time range
- grouping
- keys
- aggregation
For the quickstarters, I'm going to start an example:
Example query: http://localhost:3133/testdb/_query?from=2012&to=2013&group=3600000&ptr=/data/children/0/data/num_comments&ptr=/data/children/0/data/score&ptr=/data/children/0/data/score&reducer=avg&reducer=avg&reducer=count
This asks to consider data from 2012
to 2013
, grouped by hour. We
pull the average /data/children/0/data/num_comments
and both the average
and total /data/children/0/data/score
. The resulting object will be
in the following form:
{"1346050800000":[1232,2675,1001]}
Query parameters are passed in as regular URL form parameters.
Timestamps are stored in UTC in the following format:
2012-08-27T07:52:05.151331069Z
from
and to
represent starting and ending points. You can use any
time format acceptable for storing samples for querying
(e.g. 1346189075374651880
for nanosecond granularity epoch time or
2012
to indicate the start of the year 2012).
group
is the number of milliseconds over which the data are chunked
together. For example, if you want the hourly aggregations, you pass
group=3600000
.
Field selection (the parts of the document you're interested in) is
done by pairs of JSON Pointer references and reducer function
names. You specify as many pointers as you want as ptr
params, but
each one MUST have a corresponding reducer
specified.
Example:
ptr=/some/sub/field&reducer=avg
Documents can be filtered on an exact string match of a field by
specifying a filter key (jsonpointer) as f
with a corresponding
value of fv
. Multiple f
and fv
pairs may be specified and
documents will match when all filters are satisfied.
Example:
f=/some/field&fv=important
-
any
- pull an arbitrary value from the group -
count
- the number of non-null values in the group -
sum
- the sum of numeric values from the group -
sumsq
- the sum of the squares of the numeric values from the group -
max
- the maximum numeric value in the group -
min
- the minimum numeric value in the group -
avg
- the average numeric value in the group (considering only numeric values in the count) -
c_min
- minimum per-second rate of a counter stat -
c_avg
- avg per-second rate of a counter stat -
c_max
- max per-second rate of a counter stat -
identity
- the entire group verbatim
Here's a sort of visual example of how querying works. For this
query, I've asked for the count
of /a
, the avg
of /c/e
and
the min
of /b/2
grouped in 5 minute windows (300
seconds).
The key in the result does not represent the actual key that will be emitted. I used a human-readable time representation for illustration purposes only. Had this been an actual query, the timestamps would've all been absolute and emitted as the number of milliseconds since UNIX epoch.
If you know an individual document to retrieve, you can ask for it by its single authoritative key. e.g.
% curl http://localhost:3133/1013A51E00000035/2005-07-10T02:38:46Z
{"r": 24.06}
Although often not required, you can retrieve all of a range of docs
using the GET /dbname/_all
. This takes a from
and a to
parameter as described in query above.
The results come back as a single object with the key as the timestamp and the value as stored.