
Fine tune vit5-base model for text summarization #6

Closed
MinhDang685 opened this issue Sep 4, 2022 · 6 comments
Labels
documentation Improvements or additions to documentation

Comments

@MinhDang685

Hello VietAI team,

Thanks for sharing the pretrained models in your research paper. I am interested in fine-tuning the VietAI/vit5-base language model for the abstractive summarization task. I have some questions:

  1. When I run your example here, unlike @r1ckC139 in Model Checkpoint viT5-base #1, who got random sequences, I always get a fixed-length (= max_length) unchanged array. I have tried modifying the input (with a "vi: " / "vietnews: " prefix, and without any prefix), but the result does not change. Could you take a look?
  2. In the fine-tuning phase, do I need to preprocess the data by adding a prefix?

Thanks a lot

@justinphan3110
Collaborator

Hi @MinhDang685, for MLM pretraining we used mesh-tensorflow. The models on HuggingFace are ready for fine-tuning only.

You don't need to add a prefix when fine-tuning.
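Following the maintainer's note, preparing training pairs for the base checkpoint can be sketched as below. This is a minimal illustration, not the VietAI training script; the function name and field names are assumptions:

```python
def build_finetuning_example(document: str, summary: str) -> dict:
    """Build one seq2seq training example for fine-tuning vit5-base.

    Per the maintainer's guidance, no task prefix (e.g. "vietnews: ")
    is prepended to the source text when fine-tuning this checkpoint.
    """
    return {
        "source": document.strip(),  # model input, used as-is with no prefix
        "target": summary.strip(),   # reference summary (the label)
    }


# Usage: each (article, summary) pair maps directly to a training example.
example = build_finetuning_example(
    "Một bản tin dài về thời tiết.",  # Vietnamese news article (placeholder)
    "Tóm tắt ngắn.",                   # its reference summary (placeholder)
)
```

The tokenized `source`/`target` strings would then feed a standard seq2seq fine-tuning loop (e.g. `Seq2SeqTrainer` in HuggingFace `transformers`).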

@MinhDang685
Author

Hi @justinphan3110, thanks for your quick reply.

  • Since the model was trained with mesh-tensorflow, can I directly fine-tune it in PyTorch without any adaptation?
  • Could you take a look at this issue (the model always generating an unchanged sequence) when you have time?

Thanks

@justinphan3110
Collaborator

@MinhDang685

  • We have just published an example code for fine-tuning with HuggingFace

  • Can you double-check whether the unchanged generated sequence issue still occurs?

@MinhDang685
Author

MinhDang685 commented Sep 14, 2022

Hi @justinphan3110, thanks for your help. I tried generating with the model again and it works now; the output sequences change based on the input.

I noticed that you updated the model's config.json by removing the task-specific prefixes. Was that the cause of the issue (i.e., that I was missing the "summarization" prefix before the input to tell the model to perform the summarization task)?
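For context, T5-style checkpoints can carry task routing in config.json under `task_specific_params`, which `generate` picks up as default settings. A hypothetical fragment of what such an entry looks like (illustrative values, not the actual VietAI config):

```json
{
  "task_specific_params": {
    "summarization": {
      "prefix": "summarization: ",
      "max_length": 256,
      "num_beams": 4
    }
  }
}
```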

@justinphan3110
Collaborator

justinphan3110 commented Sep 14, 2022

@MinhDang685 ,
You need the prefix vietnews: for VietAI/vit5-large-vietnews-summarization.
For VietAI/vit5-base-vietnews-summarization you don't need any prefix.
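The prefix rule above can be captured in a small helper. A sketch covering only the two checkpoints named in this thread; anything else raises, since their conventions are unknown:

```python
# Checkpoint name -> task prefix it expects (empty string = no prefix).
PREFIXES = {
    "VietAI/vit5-large-vietnews-summarization": "vietnews: ",
    "VietAI/vit5-base-vietnews-summarization": "",
}


def prepare_input(checkpoint: str, text: str) -> str:
    """Prepend the task prefix the given summarization checkpoint expects."""
    try:
        return PREFIXES[checkpoint] + text
    except KeyError:
        raise ValueError(f"No known prefix rule for checkpoint {checkpoint!r}")
```

The returned string would then be tokenized and passed to `model.generate` as usual.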

You can have a look at the eval scripts with HuggingFace

@justinphan3110 justinphan3110 added the documentation Improvements or additions to documentation label Sep 14, 2022
@justinphan3110 justinphan3110 pinned this issue Sep 14, 2022
@MinhDang685
Author

Thank you @justinphan3110 for pointing that out.

3 participants