Optimize entry filtering based on search script #1054

Open
hahn-kev opened this issue Sep 14, 2024 · 2 comments

Comments

@hahn-kev
Collaborator

Right now, when filtering entries, we just match against the fields we care about, e.g. lexeme form, citation form, and gloss.

However, if the vernacular WS is Thai and the search text contains only Latin characters, then we could skip searching fields that contain Thai, based on their writing system. The reverse is also true: if the search text is in the Thai script, then we can skip searching any English text fields.
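A minimal sketch of that idea, assuming each searchable field value carries the script of its writing system. The `FieldValue` type, the `detectScripts` and `matchingFields` helpers, and the rough Unicode ranges are all hypothetical illustrations, not the project's actual data model or API:

```typescript
// Hypothetical types: these names are illustrative, not the project's real model.
type Script = "Latn" | "Thai";

interface FieldValue {
  field: string;  // e.g. "lexemeForm", "citationForm", "gloss"
  script: Script; // script of the field's writing system (assumed to be known)
  text: string;
}

// Rough detection by Unicode block range (an assumption for this sketch;
// real code would probably cover more scripts and edge cases).
const SCRIPT_PATTERNS: { script: Script; pattern: RegExp }[] = [
  { script: "Thai", pattern: /[\u0E00-\u0E7F]/ },
  { script: "Latn", pattern: /[A-Za-z\u00C0-\u024F]/ },
];

// Returns every known script that appears at least once in the query.
function detectScripts(query: string): Script[] {
  return SCRIPT_PATTERNS.filter((p) => p.pattern.test(query)).map((p) => p.script);
}

// Returns the field values that contain the query, skipping fields whose
// writing system uses a script that does not occur in the query.
function matchingFields(values: FieldValue[], query: string): FieldValue[] {
  const scripts = detectScripts(query);
  const needle = query.toLowerCase();
  return values.filter(
    (v) =>
      // If no known script is detected (e.g. digits only), search every field.
      (scripts.length === 0 || scripts.indexOf(v.script) !== -1) &&
      v.text.toLowerCase().indexOf(needle) !== -1
  );
}
```

With a Thai lexeme form and an English gloss on the same entry, a query of `chick` would only be compared against the gloss, and a query of `ไก่` only against the lexeme form.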

@megahirt
Contributor

megahirt commented Sep 14, 2024 via email

@rmunn
Contributor

rmunn commented Sep 23, 2024

If we have a way to display results progressively, i.e. with Svelte stores or Svelte 5 $state variables, then it might be possible to get the best of both worlds: fast results that only look at fields that match the writing system searched for, plus a second, slower query that searches all fields in case there are runs of other-language text, such as an English-language note containing the text "the word ไก่ is one of the first words that Thai kids learn in school."

The second search results would need to be merged in a way that removes duplicate results, of course. Or, wait, if the second search is set up to search only fields whose writing systems do NOT match the writing system of the input, then it would be guaranteed not to return duplicate results. (There could be multiple matches within one lexical entry, but the user probably wants to see that the text appears in multiple fields within that entry, as that's useful information in the search results.)
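The two-pass approach could be sketched roughly as follows. Everything here is a hypothetical illustration: the `FieldValue` and `Match` types, the `progressiveSearch` function, and the `onResults` callback are assumptions, and a real second pass would presumably run asynchronously rather than inline as it does in this sketch:

```typescript
// Hypothetical types for illustration only.
type Script = "Latn" | "Thai";

interface FieldValue {
  entryId: string;
  field: string;
  script: Script; // script of the field's writing system (assumed to be known)
  text: string;
}

interface Match {
  entryId: string;
  field: string;
}

// Searches only the field values accepted by `include`.
function searchFields(
  values: FieldValue[],
  query: string,
  include: (v: FieldValue) => boolean
): Match[] {
  const needle = query.toLowerCase();
  return values
    .filter((v) => include(v) && v.text.toLowerCase().indexOf(needle) !== -1)
    .map((v) => ({ entryId: v.entryId, field: v.field }));
}

// Reports results in two phases. The phases look at disjoint sets of fields,
// so the same (entry, field) pair can never be reported twice.
function progressiveSearch(
  values: FieldValue[],
  query: string,
  queryScript: Script,
  onResults: (phase: "fast" | "slow", matches: Match[]) => void
): void {
  // Phase 1: only fields whose script matches the query's script.
  onResults("fast", searchFields(values, query, (v) => v.script === queryScript));
  // Phase 2: only fields whose script does NOT match the query's script.
  onResults("slow", searchFields(values, query, (v) => v.script !== queryScript));
}
```

In a Svelte UI, the `onResults` callback could append each batch to a store or a `$state` array so that the fast matches render first. An entry can still show up in both phases when the text occurs in several of its fields, which is the behaviour described above.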
