Question/Discussion: Use of sha value for transactions #37
Comments
Thanks for raising this. And yes, that is confusing... A hash suggests that it is a unique identifier that can reliably identify a transaction, and that no two different transactions have the same hash. But here it is more of an indication and cannot reliably be used for that; it is more of a line identifier, I guess. I am not sure what the best option is. Your first case could be solved by creating the sha not only from the actual line content but also the https://github.com/railslove/cmxl/blob/master/lib/cmxl/fields/transaction.rb#L10 What are your thoughts? On your second case: this is probably a harder issue, as we do not have any state to keep a counter/nonce or something there. Sadly, the MT format does not have reliable unique identifiers.
I think your solution for the first case is fine. This is more or less what I did in my application. The fix should be easy because, like you mentioned If you want, I can do a PR for that issue. Indeed, the second case is hard to solve in cmxl. Maybe it is OK to add a warning to the README that this case has to be handled by the application which uses cmxl. If the application is reading through
Even though I wasn't asked, I would like to contribute some input on this topic, as it's something I've been spending quite some time on in the last couple of days. First, as @TobiTobiM (and I, and I'm sure others...) have learned the hard way, the only unique ID for a transaction is its position within a statement. Everything else can (and therefore eventually will) have a duplicate. IMO, this library should:
the trade-off
My impression is that Cmxl has tried to avoid having But for the primary use-case of the library (parsing statements) it leads to everybody rolling their own transaction identifiers and eventually running into the problems encountered above.
I just want to mention that this is what we were doing, and upgrading
Thanks @grncdr for your input. Very valuable. So I think we should for sure add the details to generate the hash. This should be a simple change to the 61 field parser. Then we should add the line index to the Field and use it as part of the sha. And the third step is to deprecate the wording Any help implementing this would be welcome, as I might be slow currently due to limited time.
Would using the sha of the whole statement + field sha + line index help? Or, as we are trying to somehow generate a unique identifier for the field, we could allow passing in some identifier value to I am super sorry that that method and the confusion caused you problems and wasted your time.
Wow, this is super interesting to hear, since I've been tinkering with a similar issue. There is another thing to consider which makes the issue a lot more difficult. Since we added MT942 (Vormerkposten aka VMK) to the library, I would expect that a statement passed via MT942 would have the same SHA as the matching statement passed via MT940, so you can match them together. Since the order of statements would not necessarily be the same in both formats, this would rule out using the line index. I have not yet figured out how to resolve that issue, since it seems there is no real transaction ID passed with the MT94X format, unlike CAMT, which has a transaction ID matching between VMK and regular statements. @bumi maybe I've overlooked some real transaction ID within the documentation, mind taking a look for yourself, please?
No, there is no real transaction ID in MT9XX. Thus anything we try to do on our side will always be some kind of a hack. For that reason cmxl also does not provide such an ID, though the method We could make an ID generator configurable that gets the statement field object and line index, so everybody can configure it for custom needs with global input from outside. Something like:
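A minimal sketch of what a configurable ID generator could look like. Everything here is an assumption for illustration: `IdConfig` and `SketchField` are hypothetical stand-ins and not part of cmxl, and the default generator simply hashes the field's raw source.

```ruby
require 'digest'

# Hypothetical stand-in for a parsed field; not a cmxl class.
SketchField = Struct.new(:source)

# Hypothetical global configuration holding a replaceable ID generator.
module IdConfig
  class << self
    # A callable taking (field, line_index) and returning an ID string.
    attr_writer :generator

    # Default: hash only the raw field source and ignore the position.
    def generator
      @generator ||= ->(field, _line_index) { Digest::SHA256.hexdigest(field.source) }
    end

    def id_for(field, line_index)
      generator.call(field, line_index)
    end
  end
end

# An application can override the default, e.g. to mix in the line index.
IdConfig.generator = lambda do |field, line_index|
  Digest::SHA256.hexdigest("#{line_index}\n#{field.source}")
end

field = SketchField.new('raw :61: line content')
puts IdConfig.id_for(field, 0) == IdConfig.id_for(field, 1) # prints false
```

With the override in place, two identical fields at different positions get different IDs; with the default they would collide, which is the behaviour discussed in this thread.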
Yes, I meant for In that case you don't actually need the field SHA, since only one thing can be at a given line index. The field SHA might still be useful for reconciling VMK/STA data though. See the last section for that.
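A small sketch of the "statement SHA + line index" idea, assuming the raw statement text is available as a string. `positional_id` is a hypothetical helper, not a cmxl method.

```ruby
require 'digest'

# Hypothetical helper: identify a transaction by the digest of the whole
# statement plus the transaction's position within that statement.
def positional_id(statement_source, line_index)
  statement_sha = Digest::SHA256.hexdigest(statement_source)
  "#{statement_sha}:#{line_index}"
end

statement = 'raw text of one complete statement'
puts positional_id(statement, 0) == positional_id(statement, 1) # prints false
```

Such an ID is tied to one specific statement, so it cannot by itself match the same transaction across MT940 and MT942 data, as noted elsewhere in this thread.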
Not even close to the amount of time the library has saved us! So please don't read me wrong: we ❤️ this library! 😄
I think the combination of statement SHA + line index should be globally unique though! Hence my suggestion to call it The problem with using the statement hash
I can guarantee they're not the same. In fact, transactions aren't even grouped together in the same statements in each format. Unfortunately, I also can't leave out the line index entirely (it's needed to handle the pathological duplicate transaction case) which leads me to...
if hashes aren't working, you aren't using enough of them 😉
If you want to build a system that reliably stores and deduplicates transaction data from both MT942 and MT940 (referred to as VMK and STA below), a transaction needs the following IDs:
The transaction would then have 2 unique composite IDs: ... banks 😅
A further note about
I used the value returned by the sha method of statements and transactions to find them in a database.
This works fine for me, but some weeks ago I thought I had lost some transactions in my database.
In fact they were all there, but the sha hashes were the same.
2 cases
First case:
Debit transfers with the same day, same amount and same receiver account; only the transaction information differs (invoice number).
The sha hash is the same because all fields in :61 are identical. The difference is in :86.
My quick fix is to build my own sha from :61 and information from :86.
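A minimal sketch of that quick fix, assuming the raw :61 and :86 lines are available as plain strings. `transaction_digest` is a hypothetical helper, not part of cmxl, and the sample lines are made-up placeholders rather than real statement data.

```ruby
require 'digest'

# Hypothetical helper: hash the :61 line together with the :86 details, so
# transactions that differ only in their :86 information get distinct digests.
def transaction_digest(line_61, line_86)
  # Joining with a newline keeps ("ab", "c") and ("a", "bc") from
  # producing the same hash input.
  Digest::SHA256.hexdigest([line_61, line_86].join("\n"))
end

line_61 = 'same :61 content for both transfers'
a = transaction_digest(line_61, 'details with invoice number 1001')
b = transaction_digest(line_61, 'details with invoice number 1002')
puts a == b # prints false
```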
Second case:
Credit transfer with all values identical. The sender accidentally made the same transaction twice on the same day. This case is really rare, but it happened in the real world. My fix for this: I also build my own hash and add an increment to the raw transaction data (source).
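One way the increment could work, as a sketch under assumptions: the raw transaction sources of one statement are available as an ordered list of strings, and `ids_with_occurrence` is a hypothetical helper, not part of cmxl.

```ruby
require 'digest'

# Hypothetical helper: returns one ID per raw transaction source, in order.
# Identical sources get an increasing occurrence counter appended before
# hashing, so exact duplicates still receive distinct IDs.
def ids_with_occurrence(sources)
  seen = Hash.new(0)
  sources.map do |source|
    seen[source] += 1
    Digest::SHA256.hexdigest("#{source}\n#{seen[source]}")
  end
end

ids = ids_with_occurrence(['credit A', 'credit A', 'debit B'])
puts ids.uniq.length # prints 3
```

The IDs stay stable only if the same statement is always read as a whole and in the same order, since the counter depends on how many identical entries came before.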
So my questions to discuss:
Are the hashes meant to be used to identify transactions?
If yes, should cmxl handle these rare cases, or should the software that uses cmxl handle them?