Do not cache term queries. #21566


Merged: 2 commits into elastic:master on Nov 16, 2016
Conversation

@jpountz (Contributor) commented Nov 15, 2016

There have been reports that the query cache did not manage to speed up search
requests when the query includes a large number of different sub-queries, since
a single request may exhaust the whole history (256 queries) while the query
cache only starts caching queries once they appear multiple times in the
history (#16031). On the other hand, increasing the size of the query cache is
a bit controversial (#20116), so this pull request proposes a different
approach: never cache term queries, and do not add them to the history of
queries either. The reasoning is that these queries should be fast anyway,
regardless of caching, so taking them out of the equation should not cause any
slowdown. At the same time, because they are no longer added to the cache
history, other queries have a greater chance of being cached.

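The approach described above can be sketched as a minimal plain-Java model (illustrative only; these are stand-in classes, not the actual Lucene `QueryCachingPolicy` or Elasticsearch API): a usage-tracking policy only caches queries it has seen repeatedly in its history, and the wrapper short-circuits term queries so that they are neither cached nor recorded.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-ins for the real Lucene query classes, for illustration only.
interface Query {}

final class TermQuery implements Query {
    final String term;
    TermQuery(String term) { this.term = term; }
    @Override public boolean equals(Object o) {
        return o instanceof TermQuery && ((TermQuery) o).term.equals(term);
    }
    @Override public int hashCode() { return term.hashCode(); }
}

// Stands in for any more expensive query type.
final class PhraseQuery implements Query {}

// Caches a query only once it has appeared at least twice in the history.
class UsageTrackingPolicy {
    final Map<Query, Integer> history = new HashMap<>();
    boolean shouldCache(Query query) {
        int timesSeen = history.merge(query, 1, Integer::sum);
        return timesSeen >= 2;
    }
}

// The policy from this PR: term queries are fast anyway, so they are neither
// cached nor added to the history, leaving history slots for costlier queries.
class TermQuerySkippingPolicy extends UsageTrackingPolicy {
    @Override boolean shouldCache(Query query) {
        if (query instanceof TermQuery) {
            return false; // skip entirely: no cache entry, no history entry
        }
        return super.shouldCache(query);
    }
}
```

Note the design point: because skipped term queries never enter the 256-slot history, they cannot evict the counts that other, more expensive queries need to accumulate before becoming cacheable.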
@s1monw (Contributor) commented Nov 15, 2016

Can we get a simple unit test that tests our policy wrapper?

// This setting is an escape hatch in case not caching term queries would slow some users down.
// Do not document.
public static final Setting<Boolean> INDEX_QUERY_CACHE_TERM_QUERIES_SETTING =
    Setting.boolSetting("index.queries.cache.term_queries", true, Property.IndexScope);
Member:

It seems like it would make more sense if the setting being false meant that term queries were not cached. Right now just looking at the name, it seems like setting it to true would mean that term queries are cached.

Contributor:

which is the right thing from a BWC perspective?

Member:

> which is the right thing from a BWC perspective?

I don't understand, not caching term queries is correct from the BWC perspective?

@jpountz (Contributor, Author) commented Nov 15, 2016

@s1monw @dakrone I pushed a commit.

@dakrone (Member) left a comment:

LGTM

}

public void testDoesNotPutTermQueriesIntoTheHistory() {
boolean[] used = new boolean[1];
Member:

Why the boolean array here?

Contributor (Author):

I needed a wrapper around a boolean so that it could be modified from inside the anonymous class below. I used to use AtomicBoolean for this, but I recently saw others do it with a one-element array, which I liked more, so I stole the habit.
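For readers unfamiliar with the trick: a local variable captured by an anonymous class (or lambda) must be effectively final in Java, so a one-element array works as a cheap mutable holder; `java.util.concurrent.atomic.AtomicBoolean` would do the same job. A minimal sketch (class and method names are illustrative):

```java
class BooleanHolderDemo {
    // Returns true: the anonymous class mutates the array cell, while the
    // `used` reference itself stays final, as Java requires for locals
    // captured by anonymous classes.
    static boolean flagWasSet() {
        final boolean[] used = new boolean[1];
        Runnable callback = new Runnable() {
            @Override public void run() { used[0] = true; }
        };
        callback.run();
        return used[0];
    }
}
```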

@s1monw (Contributor) left a comment:

LGTM 2

@s1monw (Contributor) commented Nov 16, 2016

@jpountz and I discussed backporting this to 2.4 without changing the default. It would give us some flexibility if users hit problems, and it's very, very low risk. It will also allow for more feedback regarding the setting. /cc @clintongormley

@jpountz jpountz merged commit 00de8e0 into elastic:master Nov 16, 2016
@jpountz jpountz deleted the do_not_cache_term_queries branch November 16, 2016 09:02
jpountz added 3 commits referencing this pull request Nov 16, 2016; each commit message repeats the PR description above.
@jpountz (Contributor, Author) commented Nov 16, 2016

OK, I pushed the new setting to the 6.x, 5.x, 5.0, and 2.4 branches; the default is to cache term queries on 2.4 and 5.0 and not to cache them on 5.x and 6.x.

jasontedor added a commit that referenced this pull request Nov 16, 2016
* master: (22 commits)
  Add proper toString() method to UpdateTask (#21582)
  Fix `InternalEngine#isThrottled` to not always return `false`. (#21592)
  add `ignore_missing` option to SplitProcessor (#20982)
  fix trace_match behavior for when there is only one grok pattern (#21413)
  Remove dead code from GetResponse.java
  Fixes date range query using epoch with timezone (#21542)
  Do not cache term queries. (#21566)
  Updated dynamic mapper section
  Docs: Clarify date_histogram bucket sizes for DST time zones
  Handle release of 5.0.1
  Fix skip reason for stats API parameters test
  Reduce skip version for stats API parameter tests
  Strict level parsing for indices stats
  Remove cluster update task when task times out (#21578)
  [DOCS] Mention "all-fields" mode doesn't search across nested documents
  InternalTestCluster: when restarting a node we should validate the cluster is formed via the node we just restarted
  Fixed bad asciidoc in boolean mapping docs
  Fixed bad asciidoc ID in node stats
  Be strict when parsing values searching for booleans (#21555)
  Fix time zone rounding edge case for DST overlaps
  ...
@synhershko (Contributor) commented

OK, so just double-checking that the default is not to have this optimization enabled for 2.4, as the release notes say something else...

@jpountz (Contributor, Author) commented Nov 28, 2016

Indeed, the 2.4 branch got the setting but kept the same default behaviour. I suspect the issue you mention with the release notes is that they use the title of this PR as a description?

@synhershko (Contributor) commented Nov 28, 2016

@jpountz Likely so.

Also, a bit of context on why I care about this one: see my comments in one of the original issues. In addition, we are currently seeing the following with 2.3.2 on a 15 GB machine (7 GB heap, default query cache settings):

    "query_cache": {
      "memory_size": "20.8mb",
      "memory_size_in_bytes": 21869016,
      "total_count": 1828019410,
      "hit_count": 66825483,
      "miss_count": 1761193927,
      "cache_size": 3666,
      "cache_count": 2960592,
      "evictions": 2956926
    }

The high number of evictions and the low percentage of used memory suggest an issue, and I'm wondering whether we are experiencing what you've described here, or whether it could be something else?
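To put rough numbers on those stats, the ratios implied by the JSON above can be computed directly (an illustrative calculation; the variable names mirror the stat fields):

```java
class QueryCacheStatsMath {
    public static void main(String[] args) {
        long totalCount = 1_828_019_410L;  // "total_count": cache lookups
        long hitCount   = 66_825_483L;     // "hit_count"
        long cacheCount = 2_960_592L;      // "cache_count": entries ever cached
        long evictions  = 2_956_926L;      // "evictions"

        // ~3.7% of lookups hit the cache; everything else misses.
        double hitRatio = (double) hitCount / totalCount;

        // ~99.9% of entries that were ever cached have been evicted.
        double evictionRatio = (double) evictions / cacheCount;

        System.out.printf("hit ratio: %.1f%%%n", 100 * hitRatio);
        System.out.printf("eviction ratio: %.1f%%%n", 100 * evictionRatio);
    }
}
```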

@jpountz (Contributor, Author) commented Nov 28, 2016

> The high number of evictions and low percentage of used memory suggests an issue

The number of evictions might not be an issue: evictions happen naturally with read-write indices because of merges. So if your cluster has some read-write indices, this might just indicate that few filters meet the required criteria to be considered worth caching. If your indices are all read-only however, then this is indeed concerning.

The low memory usage, however, might indeed be related to this issue.

@sundarv85 commented

@jpountz I'm using Elasticsearch 5.2.0. I see in your commit that one can re-enable caching of term queries by setting index.queries.cache.term_queries to true in the index settings. In our case we only have terms in our queries, so we want term queries to be cached too. I tried the following gist:

https://gist.github.com/sundarv85/5aec5d8a50958bee5891e10d11b87fae

But afterwards I do not see the query_cache grow. Am I missing something here?

RickyLau added a commit to ElasticsearchClub/elasticsearch that referenced this pull request Mar 11, 2019
6 participants