Remove fulltext index on posts #828

user12986714 · 2020-11-21T02:38:34Z

Reference: Do we really need fulltext index on metasmoke?

ArtOfCode- · 2020-11-21T21:37:08Z

[status-declined]. If it's a LIKE search you need, tick the LIKE box. The fulltext index saves vast amounts of time and server processing for basic searches.

user12986714 · 2020-11-22T01:15:15Z

@ArtOfCode- The issue with the fulltext index is not its speed (it is blazingly fast), nor that it does not support CJK characters; the issue is that fulltext search does not do what people (generally) mean. If I search bbb ccc ddd on metasmoke, I expect posts containing bbb ccc ddd. However, the fulltext search returns posts containing any one of bbb, ccc or ddd.

Moreover, when people search spam, they expect posts containing the string spam. However, with fulltext search, a post that contains only the whole word \bspammer\b, and not the whole word \bspam\b, will not be matched.
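The two mismatches described above can be illustrated with a toy model. This is only a sketch: the function names are hypothetical, the tokenizer is a plain regex, and it does not reproduce what the database's fulltext engine actually does (stopwords, minimum word length and relevance ranking are ignored). It assumes the fulltext mode matches a post when any query word appears in it as a whole word, and that the LIKE mode matches when the whole query string appears as a substring.

```python
import re


def tokens(text):
    """Split text into a set of lowercase word tokens (toy tokenizer)."""
    return set(re.findall(r"\w+", text.lower()))


def any_term_match(query, post):
    """Toy fulltext model: true if ANY query word is a whole word in the post."""
    return bool(tokens(query) & tokens(post))


def substring_match(query, post):
    """Toy LIKE '%query%' model: true if the whole query string occurs in the post."""
    return query.lower() in post.lower()


if __name__ == "__main__":
    # A post with only one of the three words: the two models disagree.
    print(any_term_match("bbb ccc ddd", "only bbb here"))
    print(substring_match("bbb ccc ddd", "only bbb here"))

    # A post with "spammer" but not the standalone word "spam".
    print(any_term_match("spam", "a known spammer"))
    print(substring_match("spam", "a known spammer"))
```

In this model the any-term search accepts a post that shares a single word with the query, while the substring search is the one that finds spam inside spammer.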

Consequently, the way fulltext search operates is very confusing and will surprise most users. It operates more like Google: when we search best spam in stackexchange on Google, we don't expect only pages containing exactly best spam in stackexchange; Google would, for example, also return a page titled best spam posted on stackexchange.

However, on metasmoke, we have been using our old search system for years, and thus users expect their queries to return only posts containing exactly best spam in stackexchange. It is obvious from recent complaints about search inaccuracy that those users haven't noticed the difference between how search operated in the past and how it operates now.

This is not about the fulltext index not supporting CJK; even if all our posts were in ASCII, this would still be a problem.

As a real example:

1. If I search nullpointerexception, I get 191 results.
2. If I search java.lang.nullpointerexception, I get 7062 results.

How can it be the case that the second query yields more results than the first? Behind the scenes, the second query string is tokenized into java, lang and nullpointerexception (in fact, if you search java lang nullpointerexception, you will get the same 7062 results), and any post containing any one of java, lang or nullpointerexception is returned.
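A minimal sketch of why the longer query returns more results, under the assumption stated above that punctuation splits the query into separate words and that a post matches if it contains any of them. The `tokenize` and `search` helpers and the sample posts are hypothetical, made up for illustration; they are not metasmoke code or data.

```python
import re


def tokenize(text):
    """Split on anything that is not a letter, digit or underscore (toy tokenizer)."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def search(query, posts):
    """Return every post that shares at least one token with the query."""
    terms = set(tokenize(query))
    return [post for post in posts if terms & set(tokenize(post))]


# Made-up sample posts, for illustration only.
POSTS = [
    "Got a NullPointerException in my servlet",
    "Best Java tutorials",
    "lang is short for language",
    "java.lang.NullPointerException at line 3",
    "Unrelated post",
]

if __name__ == "__main__":
    print(tokenize("java.lang.nullpointerexception"))
    print(len(search("nullpointerexception", POSTS)))
    print(len(search("java.lang.nullpointerexception", POSTS)))
    print(len(search("java lang nullpointerexception", POSTS)))
```

Because each extra token can only add matching posts under any-term semantics, a more specific-looking query can never return fewer results than one of its own tokens searched alone.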

ArtOfCode- · 2020-11-22T06:52:03Z

@user12986714 I'm aware of the limitations of FT index searching, but the fact remains that it's saving significant processing time for the simple searches. As I said, if it's the old behaviour you need, tick the box for it - that's why it's there. Removing the index is not the correct solution.

tripleee · 2020-11-24T08:00:07Z

Removing the full-text index may be the wrong solution, but there is a problem, and the result is that the behavior of search is basically erratic. Having a FAQ somewhere which explains why it's broken is far from sufficient.

Perhaps an adequate fix would be to replace the LIKE button with a "Full-text search" checkbox which is enabled by default, probably with a slightly more verbose description of how it breaks search.

user12986714 · 2020-11-29T19:04:50Z

Edited FAQ to explain more on FT index.

user12986714 added 2 commits November 20, 2020 21:36

Remove fulltext index on posts (746e1b8)

Fix migration (01baa56)

ArtOfCode- closed this Nov 21, 2020
