This repository has been archived by the owner on Oct 31, 2023. It is now read-only.

Bad outcome in ja-en task #184

Open
josaphjosta opened this issue May 11, 2021 · 1 comment

Comments

@josaphjosta

I am using the provided wiki.ja.vec and wiki.en.vec embeddings, along with the provided dictionaries, but the word translation precision looks strange:

INFO - 05/11/21 17:49:31 - 0:07:19 - 1451 source words - nn - Precision at k = 1: 0.000000
INFO - 05/11/21 17:49:31 - 0:07:19 - 1451 source words - nn - Precision at k = 5: 0.000000
INFO - 05/11/21 17:49:31 - 0:07:19 - 1451 source words - nn - Precision at k = 10: 0.137836

More info is in the attached train.log.

Please help.
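
For readers unfamiliar with the metric: the `nn - Precision at k` lines report how often a correct translation appears among the k nearest target-language neighbours of a source word. The sketch below is a minimal pure-Python illustration of that idea, not the code that produced the log above. The function names and the toy vectors are hypothetical, and it assumes both embedding sets have already been mapped into a shared space. The logged value 0.137836 is consistent with a percentage: 2 hits out of 1451 source words gives 100 * 2 / 1451 ≈ 0.137836.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def precision_at_k(src_vecs, tgt_vecs, gold, k):
    """Percentage of source words with a gold translation in their top-k neighbours.

    src_vecs, tgt_vecs: dict mapping word -> vector (assumed to share one space).
    gold: dict mapping source word -> set of acceptable target words.
    """
    hits = 0
    for src_word, translations in gold.items():
        query = src_vecs[src_word]
        # Rank every target word by similarity to the source word.
        ranked = sorted(tgt_vecs, key=lambda w: cosine(query, tgt_vecs[w]), reverse=True)
        if translations & set(ranked[:k]):
            hits += 1
    return 100.0 * hits / len(gold)


# Toy data (hypothetical): "neko" is deliberately paired with a gold answer
# that is only its second-nearest neighbour.
src = {"neko": [1.0, 0.0], "inu": [0.0, 1.0]}
tgt = {"cat": [0.9, 0.1], "dog": [0.1, 0.9], "car": [0.7, 0.3]}
gold = {"neko": {"car"}, "inu": {"dog"}}

p_at_1 = precision_at_k(src, tgt, gold, k=1)
p_at_2 = precision_at_k(src, tgt, gold, k=2)
```

With a score near zero at every k, as in the log, almost none of the source words land near any of their dictionary translations, which points to a failed alignment rather than a slightly noisy one.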

@williammulianto

williammulianto commented May 11, 2021

Hi, have you already tried using the Common Crawl embeddings instead of the Wikipedia ones?

The Japanese Wikipedia embedding representation is not really meaningful.
See : facebookresearch/fastText#710

Also try decreasing the epoch size to 250k/500k.
If none of the above works, please check this paper, which reports improving EN-JP alignment precision by 30%.

Hope this helps. Please correct me if I'm wrong.
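
One quick way to judge whether an embedding file is meaningful before attempting alignment is to look at the nearest neighbours of a few common words. The following is a rough sketch under stated assumptions: it assumes the fastText text `.vec` layout (a header line with word count and dimension, then one word and its values per line), and `load_vec` and `nearest_neighbours` are hypothetical helper names, not part of any library.

```python
import heapq
import math


def load_vec(lines, max_words=50000):
    """Parse text-format word vectors from an iterable of lines."""
    it = iter(lines)
    _n_words, dim = map(int, next(it).split())  # header: "<count> <dim>"
    vectors = {}
    for line in it:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip rows that do not match the declared dimension
        vectors[parts[0]] = [float(x) for x in parts[1:]]
        if len(vectors) >= max_words:
            break
    return vectors


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_neighbours(vectors, word, k=5):
    """Return the k most similar words to `word` as (word, similarity) pairs."""
    query = vectors[word]
    candidates = ((w, cosine(query, v)) for w, v in vectors.items() if w != word)
    return heapq.nlargest(k, candidates, key=lambda pair: pair[1])


# Tiny in-memory example; for a real file, pass an open file handle instead:
#   with open("wiki.ja.vec", encoding="utf-8") as f:
#       ja = load_vec(f)
toy = load_vec(["3 2", "a 1.0 0.0 ", "b 0.9 0.1", "c 0.0 1.0"])
top = nearest_neighbours(toy, "a", k=1)
```

If the neighbours of frequent Japanese words look unrelated, the problem likely lies in the embeddings themselves (for example, how the text was tokenized) rather than in the alignment settings.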
