Description
The driver doesn't generate optimal AQL queries.
For example, the simplest Gremlin query:
g.V().hasLabel("person").next()
in a graph that has 3 vertex collections,
generates the following AQL query:
(FOR v in UNION(
(FOR x1 IN @@col1 RETURN x1),
(FOR x2 IN @@col2 RETURN x2),
(FOR x3 IN @@col3 RETURN x3 )
)
RETURN v
)
against db, with bind vars: {@col1=documents, @col2=person, @col3=prison}
This returns ALL vertices in ArangoDB (which can be very slow for huge collections), and the hasLabel("person") filter is only applied afterwards, in Java.
However, it could easily be transformed to:
FOR v IN person
return v
so that only vertices from the requested collection are returned.
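A minimal sketch of how the query could be narrowed, assuming a vertex label maps directly to a collection name (as "person" does above). `LabelQueryBuilder` and its fields are hypothetical names for illustration, not part of the driver:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the driver: builds an AQL query that only
// scans the vertex collections matching the requested labels.
final class LabelQueryBuilder {
    final String aql;
    final Map<String, Object> bindVars;

    private LabelQueryBuilder(String aql, Map<String, Object> bindVars) {
        this.aql = aql;
        this.bindVars = bindVars;
    }

    static LabelQueryBuilder build(List<String> vertexCollections, List<String> labels) {
        // Assumption: label == collection name. With no label filter,
        // fall back to every vertex collection of the graph.
        List<String> targets = new ArrayList<>();
        for (String collection : vertexCollections) {
            if (labels.isEmpty() || labels.contains(collection)) {
                targets.add(collection);
            }
        }
        Map<String, Object> bindVars = new LinkedHashMap<>();
        if (targets.isEmpty()) {
            // No collection can match, so no collection needs to be scanned.
            return new LabelQueryBuilder("FOR v IN [] RETURN v", bindVars);
        }
        if (targets.size() == 1) {
            // Single collection: no UNION needed (UNION takes at least two arrays).
            bindVars.put("@col1", targets.get(0));
            return new LabelQueryBuilder("FOR v IN @@col1 RETURN v", bindVars);
        }
        StringBuilder sb = new StringBuilder("FOR v IN UNION(");
        for (int i = 0; i < targets.size(); i++) {
            int n = i + 1;
            if (i > 0) {
                sb.append(", ");
            }
            sb.append("(FOR x").append(n).append(" IN @@col").append(n)
              .append(" RETURN x").append(n).append(")");
            bindVars.put("@col" + n, targets.get(i));
        }
        sb.append(") RETURN v");
        return new LabelQueryBuilder(sb.toString(), bindVars);
    }
}
```

With the three collections above and the label "person", this would produce `FOR v IN @@col1 RETURN v` with `@col1` bound to `person`; without a label filter it falls back to the same UNION shape the driver emits today.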
However, it all depends on whether we can change how Gremlin executes the steps (e.g. with some custom traversal?) or whether we can get the query through to the Arango driver...
EDIT: it should be possible using:
Traversal Strategies: A TraversalStrategy can be used to alter a traversal prior to its execution. A typical example is converting a pattern of g.V().has('name','marko') into a global index lookup for all vertices with name "marko". In this way, a O(|V|) lookup becomes an O(log(|V|)). Please review TinkerGraphStepStrategy for ideas.
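To illustrate the kind of rewrite such a strategy would perform, here is a simplified model in plain Java. `StepModel` and `LabelFoldingSketch` are hypothetical stand-ins and NOT TinkerPop classes; a real strategy would implement TinkerPop's `TraversalStrategy` and operate on the actual traversal steps, in the way TinkerGraphStepStrategy does. The idea is only to show a hasLabel filter being folded into the preceding V step before execution, so the query builder can see the labels:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a traversal step; NOT a TinkerPop class.
final class StepModel {
    final String kind;                              // e.g. "V", "hasLabel", "out"
    final List<String> labels = new ArrayList<>();  // labels attached to this step

    StepModel(String kind, String... labels) {
        this.kind = kind;
        for (String label : labels) {
            this.labels.add(label);
        }
    }
}

final class LabelFoldingSketch {
    // Rewrites the step list before execution: a hasLabel step that directly
    // follows a V step without labels is removed, and its labels are stored
    // on the V step. Everything else is copied unchanged.
    static List<StepModel> apply(List<StepModel> steps) {
        List<StepModel> out = new ArrayList<>();
        for (StepModel step : steps) {
            StepModel prev = out.isEmpty() ? null : out.get(out.size() - 1);
            boolean foldable = step.kind.equals("hasLabel")
                    && prev != null
                    && prev.kind.equals("V")
                    && prev.labels.isEmpty();
            if (foldable) {
                prev.labels.addAll(step.labels);
            } else {
                // Copy the step so the input list is left untouched.
                out.add(new StepModel(step.kind, step.labels.toArray(new String[0])));
            }
        }
        return out;
    }
}
```

Only the first hasLabel after V is folded in this sketch. A second one is left in place as a regular filter, because chained hasLabel steps intersect rather than union their labels.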