Asynchronous queries

Dave Reynolds edited this page Oct 6, 2015 · 2 revisions

Outline of asynchronous query support

Request an asynchronous query with parameter _async.

The query will be queued for future execution. The queue mechanism is pluggable (QueueManager), with plugins for a local in-process queue and for AWS SQS.
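As a rough illustration of the pluggable queue idea, here is a minimal Python sketch. Only the name QueueManager comes from this outline; the method names `enqueue` and `dequeue`, the class `LocalQueueManager`, and the choice to queue query ids are assumptions made for the example. An SQS-backed plugin would implement the same two methods.

```python
import queue
from abc import ABC, abstractmethod
from typing import Optional


class QueueManager(ABC):
    """Hypothetical plug-in interface; the method names are assumptions."""

    @abstractmethod
    def enqueue(self, query_id: str) -> None:
        """Schedule the query with this id for future execution."""

    @abstractmethod
    def dequeue(self, timeout: float = 0.0) -> Optional[str]:
        """Return the next query id, or None if nothing is queued."""


class LocalQueueManager(QueueManager):
    """In-process plugin backed by the standard library queue."""

    def __init__(self) -> None:
        self._queue: "queue.Queue[str]" = queue.Queue()

    def enqueue(self, query_id: str) -> None:
        self._queue.put(query_id)

    def dequeue(self, timeout: float = 0.0) -> Optional[str]:
        try:
            if timeout > 0:
                # Wait up to `timeout` seconds for an item.
                return self._queue.get(timeout=timeout)
            # Non-blocking read when no timeout is requested.
            return self._queue.get_nowait()
        except queue.Empty:
            return None
```

Keeping the interface this small means the front end only ever calls `enqueue`, while worker processes only call `dequeue`.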

The response is a redirect to a status page such as /async/{id}.

The id would be a collision-free hash of the query so that cached or precomputed versions of the "same" query would return the same id.
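One possible way to derive such an id, sketched in Python: serialize the query parameters in a canonical form and hash the result. The use of SHA-256, the JSON normalization, and the decision to ignore the control flags `_async` and `_persist` when hashing are all assumptions for illustration. Note that SHA-256 is collision-resistant rather than strictly collision-free.

```python
import hashlib
import json

# Assumption: control flags do not change what is being asked,
# so they are left out of the id.
CONTROL_PARAMS = frozenset({"_async", "_persist"})


def normalize_query(params: dict) -> str:
    """Serialize the query so that equivalent requests give identical text."""
    filtered = {k: v for k, v in params.items() if k not in CONTROL_PARAMS}
    # Sorted keys and fixed separators remove ordering and whitespace differences.
    return json.dumps(filtered, sort_keys=True, separators=(",", ":"))


def query_id(params: dict) -> str:
    """Hex digest of the normalized query, usable in /async/{id}."""
    return hashlib.sha256(normalize_query(params).encode("utf-8")).hexdigest()
```

With this scheme two requests that differ only in parameter order, or in whether `_async` is present, map to the same id and can therefore share a cached result.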

The status page can return JSON or HTML, giving the status and, when ready, the download link.

Status can be one of queued, preparing, ready.

Once the status is ready, the status page (whether JSON or HTML) will include the URL for the download. Possibly the HTML version should redirect to the download?
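The JSON form of the status page might be built along these lines. This is a sketch only: the three status values come from the outline above, but the field names `id`, `status` and `download` are assumptions, since the outline does not fix the JSON layout.

```python
import json
from enum import Enum
from typing import Optional


class Status(str, Enum):
    QUEUED = "queued"
    PREPARING = "preparing"
    READY = "ready"


def status_document(query_id: str, status: Status,
                    download_url: Optional[str] = None) -> str:
    """Build the JSON status body; field names are hypothetical."""
    doc = {"id": query_id, "status": status.value}
    if status is Status.READY:
        # The download link is only meaningful once the result exists.
        if download_url is None:
            raise ValueError("a ready status needs a download URL")
        doc["download"] = download_url
    return json.dumps(doc, sort_keys=True)
```

The same document could be written to the result store, so that the status page and the stored status share one format.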

Caching and results

Query results are put in a distributed store so that they can be downloaded from any front-end server. On AWS this will be S3.

The download link can be a direct S3 link.

Each query id would correspond to an S3 folder containing:

| File | Purpose |
| --- | --- |
| status.json | Status of the processing and the download link, in the same format as the status page JSON response. |
| query.json | Normalized serialization of the request query. |
| download.{format} | The query result, possibly in multiple serializations. |
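The folder layout in the table can be expressed as a small helper that maps a query id to object keys. The file names come from the table; the function name `result_keys`, the `async` prefix and the default `csv` format are placeholders chosen for the example.

```python
def result_keys(query_id: str, fmt: str = "csv", prefix: str = "async") -> dict:
    """Object keys for one query's folder; prefix and format are assumptions."""
    folder = f"{prefix}/{query_id}"
    return {
        "status": f"{folder}/status.json",
        "query": f"{folder}/query.json",
        "download": f"{folder}/download.{fmt}",
    }
```

Calling the helper once per serialization (for example `fmt="csv"` and `fmt="json"`) gives one download key per format inside the same folder.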

A flag _persist in the query could force the query result to be cached persistently. This is a means to precompute results and have them remain available even if the cache gets cleared. It could be implemented by storing such results under a different S3 folder tree.
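A minimal sketch of the separate-folder-tree idea: choose the storage prefix from the _persist flag, so that clearing the cache tree leaves persisted results untouched. The prefix names and the accepted flag values are assumptions, as the outline does not specify either.

```python
# Hypothetical folder trees; clearing CACHE_PREFIX would not touch PERSISTENT_PREFIX.
CACHE_PREFIX = "async/cache"
PERSISTENT_PREFIX = "async/persistent"

# Assumption: which flag values count as "on" is not specified in the outline.
TRUTHY = frozenset({"true", "1", "yes"})


def folder_prefix(params: dict) -> str:
    """Pick the folder tree for a query based on its _persist flag."""
    persist = str(params.get("_persist", "")).lower() in TRUTHY
    return PERSISTENT_PREFIX if persist else CACHE_PREFIX
```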