-
Notifications
You must be signed in to change notification settings - Fork 118
Optimization Examples Using Ngt Command
Further improvement of the search performance can be achieved by additional processing as follows.
The default number of initial edges for each node in a default graph (ANNG) is a fixed value 10. To optimize the number, first, the objects should be inserted without building indexes as follows. Although the number of the initial edges can be specified with "-E", for the optimization, graph and tree indexes are not built by specifying an invalid value 0.
ngt create -d 300 -D c -E 0 fasttext.anng objects.ssv
Next, execute the optimization command as follows. This command extracts the optimal number of initial edges for the inserted objects and just set the number into the index profile.
ngt optimize-#-of-edges fasttext.anng
Then, build indexes as follows. Although this command is for appending objects, when any object file is not specified, the existing objects in the object repository is indexed into the indexes.
ngt append fasttext.anng
The next command finally optimizes the search parameters about the explored edges and memory prefetch for the existing indexes. This command does not modify the index data structure.
ngt optimize-search-parameters fasttext.anng
Since an accuracy table which converts expected accuracies into epsilons is generated as well, an expected accuracy value instead of an epsilon value can be specified for search as follows after this optimization.
ngt search -n 10 -a 0.9 fasttext.anng queries.ssv
Above-mentioned ANNG optimization will bring adequate search performance improvement for ANNG. Refining ANNG (RANNG) also improves the search performance, because it improves accuracy of neighboring nodes for each node by searching with each node. However, it is optional because it needs a long processing time. The following command to refine an ANNG must be executed after building the indexes.
ngt refine-anng fasttext.anng fasttext.ranng
Since further more improvement of the search performance by using an ONNG can be expected, how to generate ONNG is described.
ONNG generation requires an ANNG with more edges than default initial edges as the excess edges are removed to optimize the graph. ONNG requires an ANNG at least with edges that are optimized as mentioned above. As a result, an ONNG from the ANNG can provide almost the best performance. However, if the best performance is needed, a larger number than the optimized number may be specified. The following example shows the creation of an ANNG with 100 edges.
ngt create -d 300 -D c -E 100 -S -2 fasttext.anng-100 objects.ssv
An ONNG can be converted from an ANNG with the reconstruct-graph commands as follows to improve the search performance further. Please note that objects should not be removed from ONNG, because it causes inconsistency in the indexes.
ngt reconstruct-graph -m S -o 10 -i 100 fasttext.ranng fasttext.onng
Here, -m S means that pass adjustment and dynamic degree adjustment optimization are executed, and -o and i specify the number of outgoing and incoming edges, respectively. In this example, the number of incoming edges is the same as the number of edges for ANNG creation, i.e., 100. However, since the number of real edges of an ANNG is more than the specified number, a larger number of incoming edges can be specified.
An ONNG can be searched in the same manner as an ANNG as follows.
ngt search -n 10 fasttext.onng queries.ssv
Even if the search time increases compared with the result of the first ANNG, the search accuracy of the ONNG should be higher. Therefore, since the search_range_coefficient for an ANNG needs to be increased to acquire higher accuracies, the search time of an ANNG increases more than that of an ONNG.
Command line tool
Python
C++