JDBC plugin feeder mode as an alternative to the deprecated Elasticsearch River API
Now that rivers are deprecated, I want to show how to set up an alternative method for indexing data from a JDBC source into Elasticsearch.
The setup is very easy.
In this example, I assume you are on Linux and want to connect to a MySQL DB.
- Install Elasticsearch as usual.
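  If you are starting from scratch, a minimal sketch of this step (assuming the tar.gz distribution and the ~es/elasticsearch-1.4.4 path used in the feeder script below; the exact download URL depends on the release you pick):

    # download the elasticsearch-1.4.4 tar.gz from the official download page, then:
    tar xzf elasticsearch-1.4.4.tar.gz -C ~es/
    ~es/elasticsearch-1.4.4/bin/elasticsearch -d    # start a node in the background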
- Install the JDBC plugin with the ./bin/plugin tool and add a JDBC driver jar as usual. This means you must install the JDBC plugin on all nodes of the cluster, and you must restart all nodes to activate the plugin cluster-wide.
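  A sketch of this step for a single node (the plugin download URL and the exact MySQL Connector/J jar name are assumptions; check the plugin's README for the release matching your Elasticsearch version):

    # install the JDBC plugin from its release zip (URL is a placeholder)
    ./bin/plugin --install jdbc --url <URL of the JDBC plugin zip for your Elasticsearch version>
    # copy the MySQL JDBC driver into the plugin directory; the feeder script's classpath below includes it
    cp mysql-connector-java-5.1.33-bin.jar $ES_HOME/plugins/jdbc/    # jar name/version is an example
    # then restart the node; repeat on every node of the cluster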
- Now we decide not to use the river API, but instead the feeder mechanism, which works by starting a separate JVM.
- Set a bash variable ES_HOME to the home directory of Elasticsearch (if it is not already set).
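  For example (the path is simply the one used in the feeder script below; adjust it to your installation):

    export ES_HOME=~es/elasticsearch-1.4.4    # adjust to your installation path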
- Create a JDBC feed script directory

    mkdir -p $ES_HOME/bin/feeder/
- Create a JDBC feed script $ES_HOME/bin/feeder/mysql.sh

    #!/bin/bash
    # ES_HOME required to detect elasticsearch jars
    export ES_HOME=~es/elasticsearch-1.4.4
    echo '
    {
        "elasticsearch" : {
             "cluster" : "elasticsearch",
             "host" : "localhost",
             "port" : 9300
        },
        "type" : "jdbc",
        "jdbc" : {
            "url" : "jdbc:mysql://localhost:3306/test",
            "user" : "",
            "password" : "",
            "sql" : "select *, order_id as _id from orders"
        }
    }
    ' | java \
        -cp "${ES_HOME}/lib/*:${ES_HOME}/plugins/jdbc/*" \
        org.xbib.elasticsearch.plugin.jdbc.feeder.Runner \
        org.xbib.elasticsearch.plugin.jdbc.feeder.JDBCFeeder
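  In the SQL statement above, order_id as _id makes the order id the Elasticsearch document id: the plugin treats result-set columns named like document metadata fields as control values. A variant that also chooses the target index and type could look like this (a sketch only; myindex and mytype are placeholders, and you should verify that your plugin version supports the _index and _type control columns):

    "sql" : "select *, order_id as _id, 'myindex' as _index, 'mytype' as _type from orders"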
- Set the execution attribute with

    chmod +x $ES_HOME/bin/feeder/mysql.sh
- Create a log4j.properties file and add it to the classpath, e.g. $ES_HOME/lib/log4j.properties

    log4j.rootLogger=DEBUG, out
    log4j.appender.out=org.apache.log4j.ConsoleAppender
    log4j.appender.out.layout=org.apache.log4j.PatternLayout
    log4j.appender.out.layout.ConversionPattern=[%d{ABSOLUTE}][%-5p][%-25c][%t] %m%n
- Start the feeder with

    $ES_HOME/bin/feeder/mysql.sh

  You will now see debug messages showing how the feeder starts up, connects to the specified JDBC source and the Elasticsearch node, and performs the heavy work.
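  To check that documents actually arrived, you can query the node (a sketch; since the job definition above does not set an index name, this assumes the plugin's default index "jdbc"):

    # assumes the default index name "jdbc"
    curl 'localhost:9200/jdbc/_search?pretty'

  Because the feeder is just a shell script running in its own JVM, you can also run it periodically outside the cluster, for example from cron (an illustrative crontab entry, not part of the plugin):

    # example: run the feeder every 15 minutes; schedule and log path are placeholders
    */15 * * * * ~es/elasticsearch-1.4.4/bin/feeder/mysql.sh >> /tmp/mysql-feeder.log 2>&1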
Voilà!