What is the mask token in AraT5-base? #10

Open
HMJW opened this issue Oct 18, 2022 · 2 comments

Comments

@HMJW

HMJW commented Oct 18, 2022

I can't find any token like <extra_id> or <mask> in the vocab. What is the mask token in AraT5-base, or how can I get the mask ID using the Hugging Face library?

@NoraAlt

NoraAlt commented Feb 15, 2023

Same question.. please

@AMR-KELEG

@Nagoudi @elmadany Could you please advise in this regard?
I need to use the AraT5 model in the same way as the code snippet below, but the model is not operating as expected.

from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

input_ids = tokenizer("The <extra_id_0> walks in <extra_id_1> park", return_tensors="pt").input_ids
labels = tokenizer("<extra_id_0> cute dog <extra_id_1> the <extra_id_2>", return_tensors="pt").input_ids

# the forward function automatically creates the correct decoder_input_ids
loss = model(input_ids=input_ids, labels=labels).loss
loss.item()

Am I missing anything?

Thanks 🙏🏽
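
One way to narrow this down might be to check whether the tokenizer's vocabulary contains any T5-style sentinel tokens at all. The sketch below is only a diagnostic under stated assumptions: the helper names `find_sentinel_tokens` and `is_known_token` are hypothetical, it assumes sentinels would follow the `<extra_id_N>` naming used in the t5-small snippet above, and the Hub ID `UBC-NLP/AraT5-base` in the commented usage is an assumption rather than something confirmed in this thread. It relies only on the tokenizer's `get_vocab()`, `convert_tokens_to_ids()`, and `unk_token_id`.

```python
import re

# Assumption: sentinel tokens, if present, use the T5 naming "<extra_id_N>".
SENTINEL_PATTERN = re.compile(r"^<extra_id_(\d+)>$")


def find_sentinel_tokens(tokenizer):
    """Return {token: id} for every "<extra_id_N>" entry in the vocabulary,
    ordered by N. An empty dict means no such tokens were found."""
    vocab = tokenizer.get_vocab()  # maps token string -> integer id
    found = {tok: idx for tok, idx in vocab.items() if SENTINEL_PATTERN.match(tok)}
    return dict(
        sorted(found.items(), key=lambda kv: int(SENTINEL_PATTERN.match(kv[0]).group(1)))
    )


def is_known_token(tokenizer, token):
    """True if `token` maps to an id other than the unknown-token id."""
    token_id = tokenizer.convert_tokens_to_ids(token)
    return token_id is not None and token_id != tokenizer.unk_token_id


# Hypothetical usage (needs the transformers package and network access;
# the model ID below is an assumption):
#
#   from transformers import AutoTokenizer
#   tokenizer = AutoTokenizer.from_pretrained("UBC-NLP/AraT5-base")
#   print(find_sentinel_tokens(tokenizer))
#   print(is_known_token(tokenizer, "<extra_id_0>"))
```

If the first call returns an empty dict, or the second returns `False`, then `<extra_id_0>` in the input string is presumably being split into ordinary subword pieces or mapped to the unknown token, which could explain why the t5-small recipe does not carry over directly.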
