Commit
Experiment with dropping stopwords from Lucene
to enable phrase searches like @awmarrs’ “not found attached”. Currently “not” is treated as a stopword, so it is dropped from both the indexed source data and query strings.
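A toy sketch of why the stopword list gets in the way of this search. This is not Lucene and not the hsg-project configuration: `ENGLISH_STOPWORDS` is a small made-up sample, the two sentences in `DOCS` are invented, and `analyze`, `contains_phrase`, and `search` are hypothetical helpers written only for this illustration.

```python
# Toy illustration only: not Lucene, and not the actual index configuration.
# ENGLISH_STOPWORDS is a small hypothetical sample, not Lucene's real list.
ENGLISH_STOPWORDS = frozenset({"a", "an", "and", "is", "not", "of", "the", "to"})

# Invented sample sentences; only the first contains the exact phrase.
DOCS = [
    "the enclosure was not found attached",
    "the enclosure was found attached",
]
QUERY = "not found attached"


def analyze(text, stopwords=frozenset()):
    """Lowercase, split on whitespace, and drop any token in `stopwords`."""
    return [tok for tok in text.lower().split() if tok not in stopwords]


def contains_phrase(doc_tokens, query_tokens):
    """Return True if query_tokens occurs as a contiguous run in doc_tokens."""
    n = len(query_tokens)
    if n == 0:
        return False
    return any(
        doc_tokens[i:i + n] == query_tokens
        for i in range(len(doc_tokens) - n + 1)
    )


def search(query, docs, stopwords=frozenset()):
    """Apply the same analysis to query and documents, then phrase-match."""
    query_tokens = analyze(query, stopwords)
    return [d for d in docs if contains_phrase(analyze(d, stopwords), query_tokens)]
```

With `ENGLISH_STOPWORDS` passed to `search`, the query reduces to "found attached" and both sample sentences match, so the exact phrase cannot be told apart from its opposite. With the default empty stopword set, only the sentence that actually contains "not found attached" matches.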
fc3f8b3
Here's @awmarrs' search before:
and after (on my local machine):
The first hit in the "after" results contains the exact search string, as this screenshot from https://history.state.gov/historicaldocuments/frus1943v01/d888 shows:
More good news: this change didn't alter the time required to index the FRUS collection. Deploying hsg-project from scratch took the same amount of time as before: 20 minutes.
Fantastic!