Replies: 2 comments
-
Hello @mophilly! Let me test this on my side. Your report makes sense, but the behavior doesn't: enabling vision should make classification better, not worse. I'll get back to you on that.
-
Hello @mophilly! I have a lot to tell you. In 0.1.2 I added the classification to the response, so you now get the correct classification back instead of the annoying matching. As for the problem discussed here, I found the cause while refactoring the changes. With some models (most of them, I would guess) the image input was not working. It did work for the default Sonnet model I was testing with, but even there multiple images were not being processed. I changed the approach: it now classifies each image separately and then picks the best one. It should work now!
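For anyone following along, here is a minimal sketch of one way to read "classifies each image separately and then picks the best one". It is not the library's actual code. It assumes each per-image call returns a label plus a confidence score; `ImageResult`, `classify_best`, and `fake_classifier` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class ImageResult:
    label: str         # predicted classification name
    confidence: float  # assumed score in [0, 1]; higher is better


def classify_best(
    images: Iterable[str],
    classify_image: Callable[[str], ImageResult],
) -> ImageResult:
    """Classify each image on its own, then return the highest-confidence result."""
    results = [classify_image(path) for path in images]
    if not results:
        raise ValueError("no images to classify")
    return max(results, key=lambda r: r.confidence)


# Hypothetical stand-in for a real per-image model call.
def fake_classifier(path: str) -> ImageResult:
    canned = {
        "page1.png": ImageResult("revenue", 0.41),
        "page2.png": ImageResult("lease", 0.87),
    }
    return canned[path]


best = classify_best(["page1.png", "page2.png"], fake_classifier)
print(best.label, best.confidence)
```

Handling each image in its own call keeps one image from being silently dropped when a model accepts only a single image per request, which matches the symptom described above.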
-
I made my first attempt at using the option vision=True in the classification routine.
I have three classifications, e.g. "property", "lease", and "revenue", that can be determined simply by the presence of specific data elements, like "check number" or "lease date". This works fine.
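To make the rule concrete, here is a rough sketch of classification by the presence of data elements. It is plain Python, not the library's classification routine. The pairing of "check number" with revenue and "lease date" with lease is an assumption, and "parcel number" is a made-up marker for the property class.

```python
from typing import Dict, List, Optional

# Marker elements per class. The pairings are assumptions for illustration,
# and "parcel number" is a hypothetical marker.
MARKERS: Dict[str, List[str]] = {
    "property": ["parcel number"],
    "lease": ["lease date"],
    "revenue": ["check number"],
}


def classify_by_elements(
    text: str, markers: Dict[str, List[str]] = MARKERS
) -> Optional[str]:
    """Return the first class whose marker elements all appear in the text."""
    lowered = text.lower()
    for label, elements in markers.items():
        if all(element in lowered for element in elements):
            return label
    return None  # no class matched


print(classify_by_elements("Check Number: 1042  Amount: 350.00"))
```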
However, for the "revenue" class of document I have identified five common variants, each with a different number of columns in the layout. The data extracted is the same for every variant, but the order and position of the elements differ. For that reason it seems a unique pydantic contract is needed for each variant.
I thought that enabling the computer vision option might be sufficient to correctly classify each variant. I added an image path for each of the three top-level classifications, using a png file of the first page of each document type. I enabled the vision option and tested.
Although no errors were raised, the test failed to classify the documents properly: it reported all three test files as the "revenue" type of document. The right answer would be that one was revenue, one was lease, and one was property.
Disabling the vision option returned the proper results from the test.
I am not sure what question to ask, but could use some advice.