Unable to Replicate Text Classification Results #77

Open · SethPoulsen opened this issue Jul 27, 2023 · 3 comments

Comments

@SethPoulsen

Hi, I am trying to replicate your Text Classification results so that I can then use your models on my own data set, but I am unable to get any of the text models working at all.

The problem I am running into is that GloVe outputs a tensor of floats, but the embedding layer that TextCCT starts with seems to expect a tensor of integers. Is there some configuration option I am missing?

This is a follow-up to #73, which I don't have permissions to re-open.

Also in that issue, @stevenwalton mentioned:

> The insights from our vision work may not be as useful for NLP tasks, where many of these problems don't exist (transformers are quite successful on small datasets without pre-training).

Could you point me to any specific models? I liked your models because they were transformers with low parameter counts that showed good performance on small data sets without pre-training. Any other transformers I can find that perform well on small data sets have huge parameter counts and must have been pre-trained on some huge data set beforehand, which I am trying to avoid if possible (though I am going to try both to compare anyway).

Thanks for your help!

@stevenwalton
Collaborator

I'm not quite sure what's going on without looking too closely, but you can see here that we basically only call torch's embedding, which expects longs. This just looks like a casting issue to me. Are you double-embedding by accident? Or is your input data float instead of long?
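Concretely, the behavior looks like this (a minimal sketch with a plain torch.nn.Embedding standing in for the layer in question, not the actual TextCCT code):

```python
import torch
import torch.nn as nn

# An embedding layer is a lookup table: it maps integer token IDs to vectors,
# so it expects a long tensor of indices, not float vectors.
embedding = nn.Embedding(num_embeddings=10000, embedding_dim=300)

token_ids = torch.tensor([[3, 17, 52, 9]])  # long dtype, shape (batch, seq_len)
out = embedding(token_ids)                  # OK: float tensor of shape (1, 4, 300)

float_vectors = torch.randn(1, 4, 300)      # float vectors, like GloVe output
# embedding(float_vectors)  # raises a RuntimeError: indices must be Long/Int
```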

The code is pretty straightforward and honestly any embedder should work. The call graph is just embedder -> text tokenizer -> MaskedTransformerClassifier. Modifications should be fairly trivial, since all our stuff is in the latter two.

@SethPoulsen
Author

I read in the paper that you used GloVe, so I ran the data set through GloVe on my own, because I didn't see that happening anywhere in the codebase. The output of that was floats, which doesn't match the longs your embedding layer expects, as you say.
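For what it's worth, the standard PyTorch pattern for combining GloVe with an embedding layer seems to be loading the GloVe matrix as the embedding weights (via nn.Embedding.from_pretrained) and still feeding integer token IDs, rather than passing the float vectors to the model directly. A minimal sketch, with the GloVe weights stubbed out by random values:

```python
import torch
import torch.nn as nn

# Assume glove_matrix is a (vocab_size, 300) float tensor built from the GloVe
# file, where row i holds the vector for token ID i. Random stand-in here.
vocab_size, dim = 10000, 300
glove_matrix = torch.randn(vocab_size, dim)

# Load the GloVe rows as the embedding table; freeze=True keeps them fixed.
embedding = nn.Embedding.from_pretrained(glove_matrix, freeze=True)

token_ids = torch.tensor([[3, 17, 52, 9]])  # long token IDs, as the layer expects
vectors = embedding(token_ids)              # (1, 4, 300) float GloVe vectors
```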
