Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Exclude proper noun result from notification #62

Conversation

Aethelflaed
Copy link
Contributor

First of all, thanks for the app, it's great!

I've just noticed a quirk, where it sometimes display a pseudo-meaning for a proper noun match in the notification:
Screenshot_20240207_133758

This quick patch makes a difference:
Screenshot_20240207_133156

Now I understand that it's a potentially breaking change, as it would suddenly stop finding a meaning if the only meaning that exists is a proper noun.

Maybe I could conditionally enable this behavior with a switch?

We'd probably need a dedicated settings page at this point though, which I don't mind providing in another PR if you want

@tirkarthi
Copy link
Owner

Thanks for the feedback and creating a pull request. This has came up in one of the app store reviews too where human gave the meaning "surname". My initial thoughts would be to use a priority where "Proper noun" has the last priority. Meanings with "Proper noun" form around 10% of the items so filtering them out might filter out genuine cases.

sqlite> select count(*) from dictionary where lexical_category = "Proper noun";
count(*)  
----------
117817    
sqlite> select count(*) from dictionary where lexical_category != "Proper noun";
count(*)  
----------
1041376

Current query where surname is the top result :

select * from dictionary where word = "dell" collate nocase;
id          word        lexical_category  etymology_no  definition_no  definition
----------  ----------  ----------------  ------------  -------------  ----------
49083       Dell        Proper noun       0             0              surname   
49084       Dell        Proper noun       0             1              place: uni
921266      dell        Noun              0             0              A valley, 
921267      dell        Noun              1             0              (obsolete)

Change to use a case statement to have definitions with "Proper noun" as low priority

sqlite> select case when lexical_category like "Proper noun" then 2 else 1 end as priority, * from dictionary where word = "dell" collate nocase order by priority;
priority    id          word        lexical_category  etymology_no  definition_no  definition                                                                                                                                                                                                                                               
----------  ----------  ----------  ----------------  ------------  -------------  --------------------------------------------------------------------------------------------------------------------                                                                                                                                     
1           921266      dell        Noun              0             0              A valley, especially in the form of a natural hollow, small and deep.<ref name= "isbn0-19-861271-0">Brown, Lesley , The New shorter Oxford English dictionary on historical principles , Clarendon , Oxford Eng. , 1993 ,  , 0-19-861271-0 ,  ,  , </ref>
1           921267      dell        Noun              1             0              (obsolete) A young woman; a wench.                                                                                                                                                                                                                       
2           49083       Dell        Proper noun       0             0              surname                                                                                                                                                                                                                                                  
2           49084       Dell        Proper noun       0             1              place: unincorporated community, s/Montana, c/USA

While playing around I also found a way where it can do like match on definitions so this enables pushing down meanings with "(obsolete)" which are mostly outdated and present for historical reasons.

sqlite> select case when definition like "%obsolete%" then 3 when lexical_category like "Proper noun" then 2 else 1 end as priority, * from dictionary where word = "dell" collate nocase order by priority;
priority    id          word        lexical_category  etymology_no  definition_no  definition                                                                                                                                                                                                                                               
----------  ----------  ----------  ----------------  ------------  -------------  --------------------------------------------------------------------------------------------------------------------                                                                                                                                     
1           921266      dell        Noun              0             0              A valley, especially in the form of a natural hollow, small and deep.<ref name= "isbn0-19-861271-0">Brown, Lesley , The New shorter Oxford English dictionary on historical principles , Clarendon , Oxford Eng. , 1993 ,  , 0-19-861271-0 ,  ,  , </ref>
2           49083       Dell        Proper noun       0             0              surname                                                                                                                                                                                                                                                  
2           49084       Dell        Proper noun       0             1              place: unincorporated community, s/Montana, c/USA                                                                                                                                                                                                        
3           921267      dell        Noun              1             0              (obsolete) A young woman; a wench.

Related : #52

@tirkarthi
Copy link
Owner

Another point adding as a note to myself is that I added collate nocase so that even "Dell" can give meanings for "dell" and vice versa. It seems there are two entries in Wiktionary and "Dell" comes up first. I guess there are languages where capitalization/case matters and maybe this needs to be revisited in future.

https://en.wiktionary.org/wiki/Dell#English
https://en.wiktionary.org/wiki/dell#English

@Aethelflaed
Copy link
Contributor Author

I updated the PR with your SQL query giving proper nouns a lower priority, it makes more sense indeed!

While playing around I also found a way where it can do like match on definitions so this enables pushing down meanings with "(obsolete)" which are mostly outdated and present for historical reasons.

That's an idea, but I'm guessing they most likely already appear after the other meanings on wiktionary. Also, in that case it might be better to match (obsolete) and make that a language-dependent string.

Another point adding as a note to myself is that I added collate nocase so that even "Dell" can give meanings for "dell" and vice versa. It seems there are two entries in Wiktionary and "Dell" comes up first.

It makes a lot of sense to me, I didn't even question it. Dell is a very good example indeed, it just came up in a book I was reading and I never would have guessed it could have a meaning :)

I guess there are languages where capitalization/case matters and maybe this needs to be revisited in future.

As far as I know, even when it matters there might be some speaker that do not do it. Also, considering many language also capitalize the first letter of a sentence, the case can probably be safely ignored if we want to provide the user with the definition they are looking for.

For example, a cooking recipe in English might have a sentence like "French the ...", and a user would be looking for french and not French

Copy link
Owner

@tirkarthi tirkarthi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I tested the patch locally and it works fine improving results for the users. Thanks @Aethelflaed .

@tirkarthi tirkarthi merged commit 70adf38 into tirkarthi:main Feb 10, 2024
@tirkarthi
Copy link
Owner

@Aethelflaed Since you seem to have tested the patch locally, if you faced any setup or build issues regarding the codebase or general feedback I would be happy to take them. Thanks.

@Aethelflaed Aethelflaed deleted the feature/exclude-proper-noun-from-notification branch February 10, 2024 08:22
@Aethelflaed
Copy link
Contributor Author

@tirkarthi sorry I can't really help you with that! It's my first time working on Android after something like 10 years, I didn't have a very good memory of the build system and it seems it only got worse since then (Android studio is constantly downloading stuff for gradle apparently, it's a real issue for me as I'm traveling and using my phone connection :/)

After I let Android Studio do whatever it is it needs doing (it doesn't seem like there's an easy way of knowing what it does, apart from a few characters at the bottom of the screen), then it just works, so I guess that's not too bad :)

Maybe refresh the readme though, this section is confusing and not needed (although it's nice to have a local copy of the database, but for that you can just copy the link used in the source)

@tirkarthi
Copy link
Owner

Thanks, makes sense. The section was before CDN when the only database supported was English added to assets. I will refresh the whole readme. The changes in this PR will go as part of the next release this month end or so.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants