AE affects APC #26

Open
yassmine-lam opened this issue May 4, 2021 · 14 comments

Comments

@yassmine-lam

Hi,

Thank you for sharing your code with us. As I understand it, the APC results are affected by those of AE, aren't they? You use the extracted aspect terms to identify the sentiment polarities instead of using gold terms, but what if the AE results are very low and they drag down APC performance?

Thank you

@yangheng95
Owner

Yes, but the impact on APC should be limited. This is an empirical conclusion; you can conduct experiments to verify it if you want.
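To illustrate why the impact on APC can be limited, here is a minimal hypothetical sketch (not code from this repo): APC accuracy is typically computed over the aspects the extractor actually found, so low AE recall shrinks the evaluation set rather than directly adding wrong polarity predictions. The scoring function and data below are illustrative only.

```python
def apc_accuracy(gold, extracted, polarity_pred):
    """Score polarity only on aspects the AE step actually found.

    gold:          dict mapping aspect term -> gold polarity
    extracted:     set of aspect terms returned by the AE step
    polarity_pred: dict mapping aspect term -> predicted polarity
    """
    found = [a for a in gold if a in extracted]
    if not found:
        return 0.0  # AE found nothing, so there is nothing for APC to score
    correct = sum(1 for a in found if polarity_pred.get(a) == gold[a])
    return correct / len(found)


gold = {"battery": "positive", "screen": "negative"}
preds = {"battery": "positive", "screen": "negative"}

# Even though AE missed "screen", APC accuracy over the *found* aspects
# is still perfect; the AE miss shows up in ATE recall, not APC accuracy.
print(apc_accuracy(gold, {"battery"}, preds))  # -> 1.0
print(apc_accuracy(gold, set(), preds))        # -> 0.0
```

Of course, if AE recall is near zero (as reported later in this thread), almost no aspects reach the APC stage at all, which is a different failure mode from APC misclassifying aspects.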

@yassmine-lam
Author

Thank you for your reply.

I tested this model on a dataset in a language other than English and Chinese. When I used the multilingual BERT model I achieved high results, but when I used a monolingual model, I obtained very low results (F1-score = 0 for the ATE task!), which is very strange. Normally, monolingual models are better than multilingual models, as they have a larger vocabulary for the target language, no?
Do you have any idea, please?

Thank you

@yangheng95
Owner

yangheng95 commented Aug 4, 2021

Which pretrained model do you use, and can you share any visualization of this problem (e.g., a code block)?

@yangheng95
Owner

Note that this repo is hard-coded to use BERTPretrainedModel and its tokenizer; you may need to alter it to use AutoModel and AutoTokenizer instead.
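The suggested swap might look like the following. This is a minimal sketch assuming the Hugging Face transformers library, not the repo's actual loading code; the checkpoint name is the monolingual model mentioned later in this thread.

```python
from transformers import AutoModel, AutoTokenizer

# Instead of hard-coded BERT classes, e.g.:
#   from transformers import BertModel, BertTokenizer
#   tokenizer = BertTokenizer.from_pretrained("bert-base-multilingual-uncased")

MODEL_NAME = "aubmindlab/bert-base-arabertv01"  # monolingual model from this thread

# The Auto* classes dispatch on the architecture declared in the checkpoint's
# config.json, so non-BERT monolingual checkpoints also load correctly.
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME)
```

Note that `from_pretrained` downloads the checkpoint on first use, so this requires network access (or a local cache).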

@yassmine-lam
Author

yassmine-lam commented Aug 6, 2021

Hi,

I replaced the multilingual BERT model with the model aubmindlab/bert-base-arabertv01, and I also used AutoModel and AutoTokenizer in your code.

As I said, it gave me 0 for ATE and a low accuracy for APC.

[Screenshot attached: Screen Shot 2021-08-06 at 8 18 30 AM]

Thank u

@yangheng95
Owner

I don't have the dataset to debug. Did you design the dataset in the provided format? I received a similar report that was caused by mis-annotation and incorrect label usage.
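A quick structural check can catch the kind of format problem described here. The column layout below (one `token IOB-tag polarity` triple per line, blank line between sentences) is my assumption based on common ATEPC dataset conventions; verify it against the sample dataset files shipped with this repo before relying on it.

```python
def check_atepc_lines(lines, tags=frozenset({"B-ASP", "I-ASP", "O"})):
    """Flag lines that don't match the assumed `token TAG polarity` layout.

    Returns a list of (line_number, problem) pairs; empty means no issues found.
    """
    problems = []
    for i, line in enumerate(lines, 1):
        if not line.strip():
            continue  # blank line = sentence boundary
        parts = line.split()
        if len(parts) != 3:
            problems.append((i, "expected 3 columns"))
        elif parts[1] not in tags:
            problems.append((i, f"unknown tag {parts[1]!r}"))
    return problems


sample = ["the O -1", "battery B-ASP 1", "life I-ASP 1", "is O -1", "great O -1"]
print(check_atepc_lines(sample))              # -> []
print(check_atepc_lines(["battery B_ASP 1"]))  # -> [(1, "unknown tag 'B_ASP'")]
```

Running something like this over each dataset file surfaces malformed lines (wrong column count, misspelled tags) with their line numbers, which is usually enough to locate an annotation error.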

@yassmine-lam
Author

Yes, you were right; there was a problem with the data format. I fixed it, but the accuracy is still very low with the monolingual BERT model compared to the multilingual one.

I really cannot understand that, because monolingual models are generally better than multilingual ones.

Do you have any idea, please?
Thank you

@yangheng95
Owner

Hi,
I suggest you share your code on GitHub so I can review it; otherwise I might have no idea where the problem comes from.

@yassmine-lam
Author

Thank you for your effort in helping us fix these errors. I am working in Google Colab, so I shared the notebook and the code folder with you (my email address: [email protected]) to allow you to reproduce the results.

Thank you again for your effort.

@Astudnew

Did you solve the problem?

@yangheng95
Owner

Hi,
Unfortunately, I am working on improving PyABSA, and this repo is somewhat out of maintenance. You can try PyABSA, which solves some dataset-related problems, or you can provide me with a sample of your dataset so I can analyze it.

@yangheng95
Owner

I clicked the close button accidentally, and I look forward to your reply.

@yassmine-lam
Author

@Phd-Student2018 No, not yet. You?

@yangheng95
Owner

There is no known error in your data, so maybe you can debug via PyCharm, etc., to see what happens during tokenization (I suspect the problem lies in tokenization, or in using an incompatible tokenizer and model).
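One quick tokenization sanity check for a mismatched tokenizer is to measure the fraction of unknown tokens: if the vocabulary doesn't cover the dataset's script, most tokens collapse to `[UNK]` and the model sees no usable signal, which matches an F1 of 0 on ATE. The helper below is a minimal sketch; the commented `tokenizer.tokenize(...)` call is standard Hugging Face usage, and the token lists are illustrative, not the output of any actual model.

```python
def unk_ratio(tokens, unk_token="[UNK]"):
    """Fraction of tokens the tokenizer could not represent."""
    if not tokens:
        return 0.0
    return tokens.count(unk_token) / len(tokens)


# In practice you would get the tokens from the real tokenizer, e.g.:
#   tokens = tokenizer.tokenize("جودة الشاشة ممتازة")
# Illustrative outputs for a matched vs. mismatched tokenizer:
matched = ["جودة", "الشاشة", "ممتازة"]
mismatched = ["[UNK]", "[UNK]", "[UNK]"]

print(unk_ratio(matched))     # -> 0.0 : vocabulary covers the text
print(unk_ratio(mismatched))  # -> 1.0 : every token is unknown
```

A ratio near 1.0 on your training sentences would point to a tokenizer/model mismatch (or to text that was mangled before tokenization, e.g. by an encoding problem) rather than to the model architecture itself.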

@yangheng95 yangheng95 reopened this Nov 8, 2021