-
Notifications
You must be signed in to change notification settings - Fork 102
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add generic importer source #62
Conversation
that imports transactions from beancount.ingest.importer.ImporterProtocol subclass importers
Pull Request Test Coverage Report for Build 161
💛 - Coveralls |
is_posting_cleared should check for source_desc and use that to determine if a posting is cleared. This avoids clearing manually entered postings and helps reconcile them
I am still facing an issue with manually entered transactions.
and entry from statement imported with this source
These are matched as expected, but 1st line of both the candidates is This seems to work fine for other sources in your manually_entered example. what am I missing? @jbms |
At the moment there is no indication in the transaction itself whether the narration was manually entered or automatically generated. I'm not sure how to represent that in a way that doesn't create other problems. Instead, beancount-import uses a heuristic that if merging an existing transaction in the journal with a new transaction being imported, then the narration from the journal is retained. However, if both transactions are already in the journal (e.g. one manually entered and one imported with the Expenses:FIXME account left uncorrected), then currently the narration of one will be chosen arbitrarily. (See matching.py combine_transactions_using_match_set) Possibly a different heuristic could be used in that case, e.g. preferring the narration from the transaction with the fewest Expenses:FIXME accounts. Maybe a different heuristic could be used, like checking whether the narration is the same/similar as another metadata field. |
Thanks for this contribution! Overall this looks good. Are you able to provide some tests, possibly based on examples in the beancount repository under beancount/ingest/importers? That would be really helpful in validating that it works an preventing it from breaking in the future. The tests are relatively easy to create based on the existing framework --- you can refer to e.g. source/mint_test.py |
I will try this out. and will try to write tests too. Thank you!!! |
I was using Scenario 1:I have 2 very similar transactions in my statement
but only 1 cleared transaction in journal (say I reconciled midday on 01/01/2020 and then this occurred the next time I tried reconciling). Scenario 2:There are 2 exactly same transactions imported from the statements (i.e., same hash)
and one cleared transaction in the journal and optionally 1 uncleared(manually entered) transaction.
I am checking the number of cleared transaction that are similar, and have same Will add these as tests |
source_desc is used for checking and clearing postings. this is the way!!
d542346
to
bea4b47
Compare
Thanks for adding the tests! This is getting very close --- potentially could be simplified quite a bit by re-using more functionality from description_based_source.py, unless you think there is a reason why something different needs to be done here. |
3rd revamp. hashing now replaced with _get_key_from_imported_entry date added to posting.meta tests updated test_invalid added
I was trying to keep things more close to beancount by using hashing instead of |
Looks great, thanks! The only thing remaining is to fix the mypy type annotation errors: https://travis-ci.com/github/jbms/beancount-import/jobs/369680675#L617 |
8dfd2db
to
57bfabd
Compare
57bfabd
to
c3f7559
Compare
mypy errors fixed. But the basic_test is failing in python 3.5 but passing in 3.6 and 3.7 |
It looks like the issue is that the order of the entries happens to differ in Python 3.5 --- in Python 3.5, iterating over a dict returns elements in a random order. In python 3.6 and later, it is guaranteed to be in insertion order. To ensure a deterministic result, you could use OrderedDict instead of defaultdict. Unfortunately you lose the convenience of defaultdict then, but you can just use the setdefault method which is almost as convenience. |
cd3f858
to
7450bce
Compare
Fixed!! |
Thanks! One other thing would be to add this source to the list of sources in the README.md file, and to expand the comment at the top of the file to include for detailed instructions on how to use it, like the other data sources have, but that can be a separate PR. |
that imports transactions from
beancount.ingest.importer.ImporterProtocol
subclass importersperforms deduplication w.r.t journal entries and across multiple importable files
tested on my own importer. will be testing more in the following week.
should work as is with #60
It's my first OSS contribution, so opinions are most welcome @Zburatorul @jbms and others
Closes #18, Closes #23