Hi - thanks for raising this & sorry for the delayed response!
It's likely that the sample doesn't quite support LiLT out of the box yet, but the LiltForTokenClassification.forward() interface is very similar to LayoutLM's, so I hope it wouldn't be too difficult to add. I'd certainly be interested in adding it if I find time, or if anybody wants to raise a PR!
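For reference, here's a minimal sketch of that interface (not taken from this repo - the num_labels value and the word boxes are made up for illustration):

```python
# Minimal sketch (not from this repo) of the LiltForTokenClassification
# interface: the inputs mirror LayoutLM - token IDs plus one
# (x0, y0, x1, y1) box per token, normalized to the 0-1000 range.
import torch
from transformers import AutoTokenizer, LiltForTokenClassification

model_id = "SCUT-DLVCLab/lilt-roberta-en-base"
# add_prefix_space=True lets the RoBERTa tokenizer accept pre-split words
tokenizer = AutoTokenizer.from_pretrained(model_id, add_prefix_space=True)
# num_labels is illustrative; the classification head is randomly
# initialized here since the base checkpoint has no token-classification head
model = LiltForTokenClassification.from_pretrained(model_id, num_labels=5)

words = ["Invoice", "Number:", "12345"]
word_boxes = [[84, 40, 190, 60], [196, 40, 280, 60], [290, 40, 352, 60]]  # already 0-1000 normalized

encoding = tokenizer(words, is_split_into_words=True, return_tensors="pt")
# Map word-level boxes onto subword tokens; special tokens get a zero box
token_boxes = [
    word_boxes[wid] if wid is not None else [0, 0, 0, 0]
    for wid in encoding.word_ids(batch_index=0)
]
encoding["bbox"] = torch.tensor([token_boxes])

outputs = model(**encoding)
print(outputs.logits.shape)  # (1, sequence_length, num_labels)
```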
I'm not yet sure from the LiLT bbox documentation whether its "normalized" (x0, y0, x1, y1) coordinates need to be handled differently from our current 0-1000 coordinate-vocabulary handling for LayoutLM.
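For context, the usual LayoutLM-family convention is to bucket pixel coordinates into integers in the 0-1000 range, and as far as I can tell from the docs LiLT expects the same. A sketch:

```python
# Sketch of the standard 0-1000 bucketing used by LayoutLM-family models;
# as far as I can tell from the docs, LiLT expects the same format.
def normalize_box(box, page_width, page_height):
    x0, y0, x1, y1 = box
    return [
        int(1000 * x0 / page_width),
        int(1000 * y0 / page_height),
        int(1000 * x1 / page_width),
        int(1000 * y1 / page_height),
    ]
```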
We mostly use AutoTokenizer/AutoModel/etc. in the training script, but there are some departures from this to deal with LayoutLMv2 vs. LayoutXLM (which share a lot of tokenizer/processor logic, but not quite everything), so some tweaks may be needed here to handle LiLT correctly as well.
But as a first pass, it's probably worth just setting the model_name_or_path hyperparameter to SCUT-DLVCLab/lilt-roberta-en-base to see how close it is to working already.
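Something like the following quick smoke test (run outside the training script; the num_labels value is just illustrative) would show whether the Auto* classes already resolve LiLT cleanly:

```python
# Quick smoke test: if the Auto* classes resolve LiLT cleanly, the remaining
# work is probably limited to input preparation in the training script.
from transformers import AutoConfig, AutoModelForTokenClassification, AutoTokenizer

model_id = "SCUT-DLVCLab/lilt-roberta-en-base"
config = AutoConfig.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id, num_labels=5)
print(config.model_type, type(model).__name__)  # lilt LiltForTokenClassification
```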
I wanted to ask whether this solution currently supports the Language-Independent Layout Transformer - RoBERTa model (LiLT).
If not, I'd like to request that the inference code be updated to support a LiLT model.