-
Notifications
You must be signed in to change notification settings - Fork 708
An example of parent child indexing with JDBC importer
Let's assume you want to create a single index in Elasticsearch with two types, one for parent documents, one for children.
In your DB, you have the following simple tables:
create table parent_table (
id integer,
message varchar(32)
)
create table children_table (
id integer,
parent_id integer,
message varchar(32)
)
insert into parent_table values (1, "I am the parent")
insert into children_table value (1, 1, "I am child one")
insert into children_table value (2, 1, "I am child two")
commit
The two tables should be represented in Elasticsearch by a parent/child
relationship in the index myindex
.
Here are the commands. We are extra careful and delete myindex
so we are
sure we have an empty index.
curl -XDELETE 'localhost:9200/myindex'
We create the children mapping. It requires a pointer in _parent to the parent type.
curl -XPUT 'localhost:9200/myindex' -d '
{
"mappings": {
"mychildren" : {
"_parent" : {
"type" : "myparents"
}
}
}
}'
Now we can run the JDBC importer for the parents
echo '
{
"type" : "jdbc",
"jdbc" : {
"url" : "jdbc:mysql://localhost:3306/test",
"user" : "",
"password" : "",
"sql" : "select \"myindex\" as _index, \"myparents\" as _type, id as _id, message from parent_table",
"elasticsearch" : {
"cluster" : "elasticsearch",
"host" : "localhost",
"port" : 9300
},
"index" : "myindex",
"type" : "myparents"
}
}
' | java \
-cp "${lib}/*" \
-Dlog4j.configurationFile=${bin}/log4j2.xml \
org.xbib.tools.Runner \
org.xbib.tools.JDBCImporter
Note the extra "index" and "type" parameters. There is a flaw in the JDBC importer that requires the definitions.
Now we can run the JDBC importer for the children
echo '
{
"type" : "jdbc",
"jdbc" : {
"url" : "jdbc:mysql://localhost:3306/test",
"user" : "",
"password" : "",
"sql" : "select \"myindex\" as _index, \"mychildren\" as _type, id as _id, parent_id as _parent, message from children_table",
"elasticsearch" : {
"cluster" : "elasticsearch",
"host" : "localhost",
"port" : 9300
},
"index": "myindex",
"type" : "mychildren"
}
}
' | java \
-cp "${lib}/*" \
-Dlog4j.configurationFile=${bin}/log4j2.xml \
org.xbib.tools.Runner \
org.xbib.tools.JDBCImporter
and we refresh the index finally with
curl -XGET 'localhost:9200/myindex/_refresh'
which is not required because JDBC importer refreshes already, but we are extra careful.
Let's search for the parent of the two children indexed:
curl -XPOST 'localhost:9200/myindex/_search?pretty' -d '
{
"query": {
"has_child": {
"type": "mychildren",
"query": {
"match": {
"_all": "child"
}
}
}
}
}'
And the expected result is
{"_shards":{"total":10,"successful":5,"failed":0}}{
"took" : 38,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 1,
"max_score" : 1.0,
"hits" : [ {
"_index" : "myindex",
"_type" : "myparents",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"message" : "I am the parent"
}
} ]
}
}