Why not SQLite instead of YAML templates? #364
Comments
SQLite is a relational database. Those are great for structured data, but not as good for data with less structure. Our templates have many optional fields that would be hard to model in an RDB, so NoSQL of some kind would be more appropriate. Of course, that would be harder to manage on GitHub and to copy around, and the barrier to getting started would be higher. So I wouldn't go for that unless it's for a hosted service that has many, many templates to manage. For that case, I'd pick an RDB with JSON field support to keep the unstructured data.
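To make the "RDB plus JSON field" idea concrete, here is a minimal sketch using only Python's standard library. The table layout and the `issuer` / `keywords` / `fields` names are assumptions chosen for illustration, not this project's actual schema, and the JSON is kept in a plain TEXT column and decoded in Python, so it does not rely on SQLite's JSON functions being available.

```python
import json
import sqlite3


def save_template(conn, issuer, keywords, fields):
    """Store one template; the optional, loosely structured parts go in as JSON text."""
    conn.execute(
        "INSERT INTO templates (issuer, keywords, fields) VALUES (?, ?, ?)",
        (issuer, json.dumps(keywords), json.dumps(fields)),
    )


def load_template(conn, issuer):
    """Return the template for an issuer as a dict, or None if there is none."""
    row = conn.execute(
        "SELECT issuer, keywords, fields FROM templates WHERE issuer = ?",
        (issuer,),
    ).fetchone()
    if row is None:
        return None
    return {
        "issuer": row[0],
        "keywords": json.loads(row[1]),
        "fields": json.loads(row[2]),
    }


# Hypothetical schema: one fixed column plus two JSON-encoded TEXT columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE templates ("
    "issuer TEXT PRIMARY KEY, keywords TEXT NOT NULL, fields TEXT NOT NULL)"
)

save_template(
    conn,
    "Example Vendor",
    ["Example Vendor"],
    {
        "invoice_number": r"Invoice\s+No\.\s+(\d+)",
        "date": r"Date:\s+(\d{4}-\d{2}-\d{2})",
    },
)
template = load_template(conn, "Example Vendor")
```

The trade-off is the one described above: the optional fields stay flexible, but the result is a binary file that does not diff or review well in git.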
With very few exceptions (and SQLite is not one of them), databases are not good at storing big tree-like structures. Also, how would you store that in git? A binary format would be a mess to maintain. Text SQL queries would not be much better. I think YAML files fit this project pretty well.
m3nu:
Yes, of course. In the general case you may have any number of optional fields you wish to extract, so a NoSQL DB like Mongo would naturally be more appropriate than SQLite. My own intended use is clouding my judgement of the general case: my goal is only to identify the issuer, date and invoice number, a very shallow objective. Am I right in thinking that configuring vendor info is a point of, I don't know, fragility for the package? Is getting the templates correct, and having feedback on them, critically important to matching and extracting?
m3nu:
Then a database should not be a replacement for the template system. Maybe some kind of plugin system allowing any number of solutions for providing these critical settings? Flexibility and lower barrier to start... If not an alternative to templates, then what are people's thoughts on validating templates? I was thinking SQLite would bypass the formatting errors of YAML, and simplify validation. Short of a validation script, how do people envision validating templates in general?
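For what it's worth, a validation script could be quite small. Below is a rough sketch, assuming a template has already been loaded from YAML into a plain dict. The required keys and the rule that every field regex needs a capture group are my own assumptions for illustration, not rules taken from this project.

```python
import re

# Assumed minimal schema, purely illustrative.
REQUIRED_KEYS = ("issuer", "keywords", "fields")


def validate_template(template):
    """Return a list of human-readable problems; an empty list means it looks OK."""
    errors = []
    for key in REQUIRED_KEYS:
        if key not in template:
            errors.append(f"missing required key: {key}")

    fields = template.get("fields", {})
    if not isinstance(fields, dict):
        errors.append("fields must be a mapping")
        return errors

    for name, pattern in fields.items():
        if not isinstance(pattern, str):
            errors.append(f"field {name}: pattern must be a string")
            continue
        try:
            compiled = re.compile(pattern)
        except re.error as exc:
            errors.append(f"field {name}: invalid regex ({exc})")
            continue
        # Assumption: each field regex captures the value it extracts.
        if compiled.groups < 1:
            errors.append(f"field {name}: regex has no capture group")
    return errors


good = {
    "issuer": "Example Vendor",
    "keywords": ["Example Vendor"],
    "fields": {"invoice_number": r"Invoice\s+No\.\s+(\d+)"},
}
bad = {
    "keywords": ["Example Vendor"],
    "fields": {"date": r"(\d+", "amount": r"\d+"},
}
good_errors = validate_template(good)
bad_errors = validate_template(bad)
```

Run over every template file in CI, something like this would catch broken regexes and missing keys before they reach users, whichever storage format is used.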
Well. That was my feedback and interest. Thank you for the replies.
I'm circling back to my PDF project after working on a Django project this summer. Funnily enough, that project was similar to this one: I needed to store many configurations for different clients and vendors so I could auto-parse tabular data. I saved all of the regexes inside the Django database, which made them easy to manage through the admin.
Now I'm returning to this project and short of suggesting this would be a great Django package, I'm thinking template configuration would be much easier with an app like DB Browser for SQLite.
Is anyone already thinking along these lines? Or has anyone tried it?