LILT #40

Open
tmpuch opened this issue Sep 7, 2023 · 1 comment
Labels
enhancement (New feature or request), help wanted (Extra attention is needed)

Comments

@tmpuch

tmpuch commented Sep 7, 2023

I wanted to ask whether this solution currently supports the Language-Independent Layout Transformer (LiLT) RoBERTa model.

If not, I would like to request that the inference code be updated to support LiLT models.

athewsey added the enhancement and help wanted labels Sep 24, 2023
@athewsey
Contributor

Hi - thanks for raising this & sorry for the delayed response!

The sample probably doesn't support LiLT out of the box yet, but the LiltForTokenClassification.forward() interface is very similar to LayoutLM and its relatives, so I hope it wouldn't be too difficult to add. I'd certainly be interested in adding it if I have time, or if anybody wants to raise a PR!

I'm not yet sure from the LiLT bbox docs whether the "normalized" (x0, y0, x1, y1) coordinates need to be handled differently from our current handling for LayoutLM, which scales boxes to a 0-1000 range.
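For reference, the LayoutLM-style normalization mentioned above can be sketched roughly as follows. This is an illustrative helper, not the sample's actual code: the function name `normalize_bbox` and its parameters are hypothetical, and whether LiLT expects exactly the same 0-1000 convention is an assumption that should be checked against the LiLT documentation.

```python
def normalize_bbox(bbox, page_width, page_height, scale=1000):
    """Scale an absolute (x0, y0, x1, y1) box to integers in [0, scale].

    Hypothetical helper: LayoutLM-family models expect boxes on a
    0-1000 grid. Assumption: LiLT follows the same convention.
    """
    x0, y0, x1, y1 = bbox

    def clamp(value):
        # Guard against boxes that spill slightly outside the page.
        return max(0, min(scale, value))

    return [
        clamp(int(scale * x0 / page_width)),
        clamp(int(scale * y0 / page_height)),
        clamp(int(scale * x1 / page_width)),
        clamp(int(scale * y1 / page_height)),
    ]


# A word box on a 500 x 1000 pixel page.
box = normalize_bbox((50, 100, 150, 200), page_width=500, page_height=1000)
```

If LiLT turns out to use the same grid, the existing LayoutLM box handling could probably be reused unchanged.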

We mostly use AutoTokenizer/AutoModel/etc. in the training script, but there are some departures from this to deal with LayoutLMv2 vs LayoutXLM (which share a lot of tokenizer/processor logic, but not quite everything). So some tweaks may be needed here to handle LiLT correctly as well.

As a first pass, though, it's probably well worth just setting the model_name_or_path hyperparameter to SCUT-DLVCLab/lilt-roberta-en-base to see how close it is to working already.
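A minimal sketch of that first-pass experiment might look like the following. Only the `model_name_or_path` key comes from this thread: the helper names, the `epochs` key in the usage line, and the standalone loading check are hypothetical. The Hub ID `SCUT-DLVCLab/lilt-roberta-en-base` is assumed to be the name the checkpoint is published under, and loading it requires the `transformers` library plus network access.

```python
# Assumed Hugging Face Hub ID for the LiLT RoBERTa checkpoint.
LILT_MODEL_ID = "SCUT-DLVCLab/lilt-roberta-en-base"


def build_hyperparameters(base=None):
    """Return a copy of `base` with the model swapped for LiLT.

    Hypothetical helper: every key other than model_name_or_path is
    whatever the existing training job already uses.
    """
    hyperparameters = dict(base or {})
    hyperparameters["model_name_or_path"] = LILT_MODEL_ID
    return hyperparameters


def smoke_test_load(model_id=LILT_MODEL_ID, num_labels=2):
    """Check that the Auto* classes can resolve the checkpoint locally."""
    # Imported lazily so the module itself has no hard dependency.
    from transformers import AutoModelForTokenClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForTokenClassification.from_pretrained(
        model_id, num_labels=num_labels
    )
    return tokenizer, model


hyperparameters = build_hyperparameters({"epochs": 1})
```

Running `smoke_test_load()` locally before launching a full training job is one quick way to find out whether the Auto* classes resolve the checkpoint at all.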
