Skip to content

ElasticSearch SQL Plugin Overview

Jo Cook edited this page Jun 5, 2018 · 1 revision

ElasticSearch SQL frontend

Overview

A simple web-based front-end that provides SQL-based access for querying ElasticSearch. This could be used by people who are not so familiar with the basic ElasticSearch query syntax and interface when wanting to build custom queries in the Geonetwork catalogue.

Pros and Cons

  • (+/-) It's relatively easy to install but uses additional technologies (node), which adds to the complexity of a catalogue- there's more to maintain and keep up to date, more to ensure starts up when Geonetwork starts up, and so on.
  • (-) You need to open up another port, which may be a limiting factor for security-conscious organisations- if it's open to the outside world it may need penetration testing.
  • (-) It seems to be possible to create or update indices, although possibly not to delete in the version we would need to use with Geonetwork 3.4.2. This could be dangerous!
  • (?) What would be the impact on the rest of the catalogue of running resource-heavy queries in this interface?
  • (-) The front-end is quite limited, so there is no option to save queries for later use, or even a history. There's an issue in the repository for this but it has had no response.
  • (+) The project does seem to be under active development, with commits in the last few days (as of June 4th 2018)

Installation Prerequisites

Installation

Ensure you install the version that corresponds to the version of ElasticSearch used in Geonetwork. For Geonetwork 3.4.2 this is version 5.6.1

https://github.com/NLPchina/elasticsearch-sql

  • Install the sql front-end as a plugin in ElasticSearch, by running the following command in your ElasticSearch/bin directory using the command-line:

    ./bin/elasticsearch-plugin install https://github.com/NLPchina/elasticsearch-sql/releases/download/5.6.1.0/elasticsearch-sql-5.6.1.0.zip

  • Download the web front-end from https://github.com/NLPchina/elasticsearch-sql/releases/download/5.4.1.0/es-sql-site-standalone.zip and unzip it to the plugin folder that has been created by the above step. Note that it doesn't seem to matter that the version of the web front-end does not match the version of the plugin!

  • Run the command in the command-line to install the web front-end:

    cd site-server
    npm install express --save
  • You will need to change the default port that the web front-end runs as by editing the file sql/site-server/site-configuration.json as the default is to use the port that Geonetwork runs on.
  • You will need to edit your ElasticSearch configuration to allow cross-site scripting by editing the elasticsearch.yml config file in a text editor.
  • Add the following lines at the bottom of the file and restart ElasticSearch:
    http.cors.enabled: true
    http.cors.allow-origin: "*"
    http.cors.allow-headers: X-Requested-With,X-Auth-Token,Content-Type,Content-Length
  • Run the web front-end using by running the following command in the command-line (while in the site-server directory):

    node node-server.js

  • Test the plugin is working by visiting the following URL: http://localhost:9200/_sql?sql=select * from records limit 5 (records is one of the Geonetwork indices. This command selects the first 5 documents, just to check it's working)

  • Visit the web front-end. If running on your local computer this will be on http://localhost:8088 (or whatever port you chose in site-configuration.json)

  • Edit the URL in the top-right corner of the page to point at the URL that you are running ElasticSearch on, which is generally http://localhost:9200

Usage

There are two ways of using the SQL Plugin:

  • Using the REST API, eg using Curl or a URL in a browser. This method is accessible via the URL for ElasticSearch, generally port 9200. See the testing URL above. Using this method will return the results in json format.
  • Using the web interface, accessible at the port you configured in site-configuration.json above.

Show commands

There are some meta-commands that you can use to explore the index, which are documented at https://github.com/NLPchina/elasticsearch-sql/wiki/Show-Commands. These are a wrapper around the "mapping" request in ElasticSearch, see https://www.elastic.co/guide/en/elasticsearch/reference/5.6/mapping.html

Some useful examples:

  • show *: shows you all the indices in your cluster
  • show records: shows you the mappings for the "records" index
  • show records/_default_: shows you the mappings for the "_default_" type in the "records" index

Additional information

You can run an "Explain" on your query, and also save the results as a comma-separated text file (csv).

To utilise ElasticSearch "scan and scroll", to efficiently retrieve large resultsets, use the following "hint" in your SQL:

/*! USE_SCROLL*/

eg:

SELECT /*! USE_SCROLL*/ * FROM records

The yellow "step forward" and "step to end" buttons are then enabled, otherwise they do nothing.

Clone this wiki locally