feat: Added Polish brands, stores and labels #1184

Open
wants to merge 5 commits into
base: main

ArturLange

What

  • I added regexes for stores and brands popular in Poland, and links to popular Polish labels.

@ArturLange requested a review from a team as a code owner on August 28, 2023 09:18
@Jagrutiti changed the title from "Add Polish brands, stores and labels" to "feat: Add Polish brands, stores and labels" on Aug 28, 2023
@Jagrutiti changed the title from "feat: Add Polish brands, stores and labels" to "feat: Added Polish brands, stores and labels" on Aug 28, 2023
@@ -257,6 +257,7 @@ toupargel||Toupargel
tropicana orange juice||Tropicana
tropicana||Tropicana
Tropicana||Tropicana
TYMBARK||Tymbark
Collaborator

This is probably not obvious, but this file is used to match IDs returned by Google Cloud Vision to Open Food Facts IDs. Is TYMBARK an ID returned by Google Cloud Vision?

Author

You're right, that wasn't obvious to me. I thought the entry was supposed to match the way the name is written on the packaging.
In that case, I shouldn't add entries without knowing what Google Cloud Vision returns, is that right?
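
To illustrate the point raised in this thread, here is a minimal Python sketch of an exact-match lookup over `detected ID||Open Food Facts ID` lines. It is not Robotoff's actual loader: the function name `load_id_mapping`, the assumption that the left side is the detected ID, and the assumption that matching is exact and case-sensitive are all inferred from the review comment and the hunk above.

```python
def load_id_mapping(lines):
    """Build a lookup from the left side of `||` to the right side.

    Assumption: left = ID returned by the detector, right = Open Food Facts ID.
    """
    mapping = {}
    for raw in lines:
        line = raw.strip()
        if not line or "||" not in line:
            continue  # skip blank or malformed lines
        detected_id, _, off_id = line.partition("||")
        mapping[detected_id] = off_id
    return mapping


# Entries copied from the diff hunk in this thread.
entries = [
    "tropicana orange juice||Tropicana",
    "tropicana||Tropicana",
    "Tropicana||Tropicana",
    "TYMBARK||Tymbark",
]
mapping = load_id_mapping(entries)

# Under the exact-match assumption, an entry only helps if its left side is
# character-for-character what the detector returns.
print(mapping.get("TYMBARK"))
print(mapping.get("tymbark"))
```

Under these assumptions, an entry such as `TYMBARK||Tymbark` is only useful if `TYMBARK` is an ID the detector really emits, which is what the reviewer is asking.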

@@ -114,6 +115,7 @@ en:sustainable-palm-oil||sustainable palm oil
en:sustainable-seafood-msc||www.msc.org
en:sustainable-seafood-msc||pêche durable msc
en:utz-certified||utz certified
en:vegan||suitable for vegans
Collaborator

As vegan is already there you can just add "vegans" as a pattern.

@@ -1,22 +1,49 @@
7-eleven
Action
Collaborator

Hmm, it will introduce many false positives; it's quite a common word.
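
A rough Python sketch of why a common word is risky as a store pattern. It assumes, hypothetically, that each line of the store file is either `Name` or `Name||pattern`, that a bare name is matched literally, and that matching is case-insensitive on word boundaries; `parse_store_line` and `find_stores` are illustrative names, not Robotoff functions.

```python
from __future__ import annotations

import re


def parse_store_line(line: str) -> tuple[str, re.Pattern[str]]:
    """Split a `Name` or `Name||pattern` line into (store name, compiled regex)."""
    name, sep, pattern = line.strip().partition("||")
    # Assumption: when no pattern is given, the name itself is matched literally.
    source = pattern if sep else re.escape(name)
    return name, re.compile(rf"\b{source}\b", re.IGNORECASE)


def find_stores(ocr_text: str, lines: list[str]) -> list[str]:
    """Return the store names whose pattern occurs anywhere in the OCR text."""
    parsed = (parse_store_line(line) for line in lines)
    return [name for name, regex in parsed if regex.search(ocr_text)]


store_lines = ["7-eleven", "Action", "Aldi||balta mare"]
# Ordinary packaging text that has nothing to do with the Action store chain.
ocr_text = "Take action against food waste: recycle this pack"
print(find_stores(ocr_text, store_lines))
```

With these assumptions, the word "action" in unrelated packaging text is enough to tag the product with the Action store.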

AhorraMás
AhorraMás||ahorramas
Albertsons
Albert Heijn
Alcampo
Aldi
Aldi||asia green garden
Aldi||balta mare
Aldi||biscotto
Collaborator

Same concern about false positives here, maybe.

data/ocr/store_regex.txt: two outdated review threads, both resolved
@raphael0202 self-requested a review on August 31, 2023 08:58
@ArturLange
Author

@raphael0202 I removed Netto from store_regex.txt after your comments - it's also a common word on packaging.
Is it ok if I just leave the various own brands of Netto?

Netto||bergio
Netto||corsarro
Netto||happy vege
Netto||miletto
Netto||paradiso
Netto||quallio
Netto||rajski sad
Netto||sztuka mięsa
Netto||toremo

@raphael0202
Collaborator

Stores are applied automatically, that's the issue... I'm not a big fan of this store system we put in place; I would much rather deduce the store from the brand in Robotoff (see #1225).
If you don't mind, I would rather not include additional stores in this file, as it takes time to clean the DB once false positives are introduced.
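
As a purely hypothetical illustration of deducing the store from the brand, the Netto own brands listed earlier in this conversation could live in a brand-to-store table that is consulted only after a brand has been identified. This Python sketch does not describe what #1225 proposes; `OWN_BRAND_TO_STORE` and `store_from_brand` are invented names.

```python
from __future__ import annotations

# Own brands taken from the list posted in this conversation.
NETTO_OWN_BRANDS = (
    "bergio", "corsarro", "happy vege", "miletto", "paradiso",
    "quallio", "rajski sad", "sztuka mięsa", "toremo",
)

# Hypothetical brand -> store table, keyed by lowercase brand name.
OWN_BRAND_TO_STORE = {brand: "Netto" for brand in NETTO_OWN_BRANDS}


def store_from_brand(brand: str) -> str | None:
    """Return the store implied by an already-identified brand, if any."""
    return OWN_BRAND_TO_STORE.get(brand.strip().lower())


print(store_from_brand("Rajski Sad"))
print(store_from_brand("netto"))
```

The design point is that the lookup runs on a brand that has already been identified, not on raw packaging text, so a common word such as "netto" printed on a label would not by itself assign a store.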

@github-actions github-actions bot added the ⭐ top pull request Top pull request. label Sep 5, 2023
3 participants