Originally posted by dsoft-jvo June 21, 2024
I use this table-transformer code to extract the tables and table structures of invoices. Without the --words_dir argument, the result is very satisfactory. From my understanding, --words_dir is needed to add the contents of the found structures to the result, so I tried adding it. After adding it, however, the result is strange: the detected table shrinks to a small corner of the image and the table structures all overlap each other. At first this looked like a scaling problem, but the problem persists after I fixed the scaling.
Aside from the visual result, the 'tables_structure' output is also strange when --words_dir is added. Without --words_dir, the number of rows and columns seems to be constant. With --words_dir, however, the number of rows and columns varies: sometimes there are more, sometimes fewer. The tokens are formatted as described in the docs/INFERENCE.MD document.
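One way to rule out a coordinate mismatch between the tokens and the image is sketched below. It assumes each token is a dict with a `bbox` of `[xmin, ymin, xmax, ymax]` in the pixel coordinates of the input image and a `text` string; that layout should be checked against docs/INFERENCE.MD. `check_tokens` is a hypothetical helper written for this post, not part of the table-transformer code.

```python
def check_tokens(tokens, image_width, image_height):
    """Return a list of (index, reason) pairs for tokens that look malformed.

    Assumption: every token is a dict with a 'bbox' of [xmin, ymin, xmax, ymax]
    in image pixel coordinates and a 'text' string.
    """
    problems = []
    for i, tok in enumerate(tokens):
        bbox = tok.get("bbox")
        if not isinstance(bbox, (list, tuple)) or len(bbox) != 4:
            problems.append((i, "bbox is not a 4-element sequence"))
            continue
        xmin, ymin, xmax, ymax = bbox
        if xmin >= xmax or ymin >= ymax:
            problems.append((i, "bbox has non-positive width or height"))
        elif xmin < 0 or ymin < 0 or xmax > image_width or ymax > image_height:
            # Typical symptom of tokens measured at a different scale or DPI
            problems.append((i, "bbox lies outside the image"))
        if not isinstance(tok.get("text"), str):
            problems.append((i, "text is missing or not a string"))
    return problems
```

If this reports many boxes outside the image, the tokens were probably produced at a different resolution than the image passed to inference.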
I cannot show any actual data or images, as the data is sensitive, but this is what I found during debugging:
Without --words_dir, i.e. tokens=[]:
With a --words_dir, i.e. tokens=[...data...]:
I feel like the problem lies in a misunderstanding I have about the functions of the --words_dir data. I have read the papers, but I feel like I am missing something about that aspect.
Could someone give some further explanation about the use and function of --words_dir? Are the results I am seeing expected? Why, or why not? And if not, how do I go about fixing them?
Discussed in #182