-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Update CorrelationId regex to support UUIDv7 #131
base: master
Are you sure you want to change the base?
Conversation
There are also other places where this regex pattern can be seen https://github.com/search?q=repo%3Amojaloop%2Fmojaloop-specification%20%5B1-5%5D%5B0-9a-f%5D%7B3%7D&type=code Also the build fails, which needs some investigation too. |
To avoid breaking changes for existing implementations of FSP Interoperability API, the pattern cannot be changed in version 1.x which this search also includes. It should only be changed for version 2.0. Additionally, as far as I know, the actual pattern to use in 2.0 is still under discussion, so I don't know why this change should be done now.. |
@henrka, the current regex validation is currently blocking the performance workstream from being able to deliver their work in a standard way. |
And why can't they change that locally? I don't want to change this for everyone just because some work stream has some current issue.. |
hi @PaulGregoryBaker and @henrka , my suggestion was to consider this change for v2.0, which will be a breaking change for FX features anyway. And, the PR was raised against the file used for the v2.0 draft as well, so that seems to be alignment. Performance or not, we agree that this cannot happen in 1.x versions, hence will go with v2.0 |
It seems a bit "early" for me to have this in 2.0 now as we haven't decided on which format to use yet. I know that 2.0 is still work in progress, but to me it seems wrong to change this now based on output from a performance work stream without any more analysis, as well as parallel work to investigate the format in #120. |
In my opinion, the CCB as an important body in an open source project, should not be holding up other community contributions unnecessarily. Waiting for a decision on #120 would be doing just that. (Unless that can be resolved in the next few days.) |
In what way is the CCB holding up community contributions? Each change needs to be properly analyzed and approved. The input from the work stream is highly valued, as is any other contribution. How would it help to have a temporary solution in 2.0 that hasn't been properly analyzed by anyone except the work stream? We are at the same time analyzing other solutions in #120, which could mean that another format should be used in 2.0, so I don't fully understand how changing this temporarily just to satisfy the performance work stream contribution for a potentially short time would help. In an API, I value stability more than potentially temporary changes, even if it is for a future version. |
#120 did not take into account the performance implications of CUID2. The readme for CUID2 contains a Note on K-Sortable/Sequential/Monotonically Increasing Ids, which recommends the use of The main claim of CUID2 vs sequential ids is the leak of timestamps, but this is just a generic claim and we should consider that many of the ids we generate are only significant for a short period of time, given the real-time nature of the functionality they are associated with. So this "leak" is not really an issue in our case. This leak is only significant when associated with entities that are not so much real-time related, like account creation, customer creation, etc. Instead of worrying for a leak, maybe better think about improving the logic and restrict any operations that relate to IDs outside of a certain timeframe. The section also recommends the use of cloud solutions and in-memory databases, which is not the inclusivity we are working for and their use is often restricted by regulation. I think the allowed id generators should not be so restrictive, as for example the most important thing we want to restrict is the length, and even this can be probably parametrized during DB creation. Accepting only CUID2 will feel like a win for the cloud providers, not for inclusivity. |
@kalinkrustev There has been a proposal to use CUID2 in #120, but there is no resolution, so to say it did not take into account the performance implications of CUID2 is not really correct (it can't be as there is no final solution yet). Performance has been and is discussed on the SIG meetings, which is why are looking into alternatives of CUID2. Please correct me if I'm wrong, but I don't see any input regarding potential performance implications in #120. You are very welcome to add more background in that issue regarding the performance implications. I fully understand that this has been analyzed from a performance perspective on Mojaloop, but we can't just change formats without any further analysis, as the API has implications on other systems as well. |
@henrka , yes you are right that it is a proposal. I did not express correctly my thoughts, which were meant to say that performance was not considered so far in the comments there. I was wondering where to put my comments, so maybe I need to move them to #120, just I wanted for now to keep the discussion more focused. The thing is that during performance tests, I noticed that UUID regex was only checked in the quoting service, while transfers were able to execute with UUID v7. So whatever we do to fix this situation, there will be a change. Isn't it better to change in the direction of allowing in quotes what was already allowed in transfers and update the docs? Otherwise if we restrict the transfers as per what the docs say, we can break something, that was allowed so far. Are you aware of particular implications on other systems? The change seems to be very small. Allowing more id formats seems to be opening Mojaloop to easier integrations with other systems. The RFC 4122 for UUID v4 is already obsoleted by the RFC 9562, which includes v7. |
@kalinkrustev If I'm not mistaken, it seems like you are referring to the Mojaloop switch implementation when you say "quoting service"? I don't know much about that implementation, I just know of the implementation in one FSP platform (and the API). Unfortunately, just because a change seems small doesn't mean that the implications in surrounding systems is small.. Additionally, we would like to keep the format shorter than 36 characters for it to be able to fit in ISO 20022 (see #120), so UUID 7 would probably not be the best way forward. @PaulGregoryBaker, you had some other suggested ID standard that you mentioned in the last SIG which should be good for performance as well as keeping the ID shorter than 36 characters? I can't seem to remember the acronym currently.. But we should move this discussion to #120 so that it is easier to follow for everyone else. |
OK, let's move the discussion there. I think @PaulGregoryBaker might have mentioned https://github.com/ulid/spec, which is generated in a way similar to UUID v7, but serialized in a better way to take less characters. Just this will be a lot bigger change with more implications. |
Thanks @kalinkrustev, yes that was the one. Changes will be bigger, but as 2.0 is a new major version that should be fine from a version change perspective at least. |
@henrka and @MichaelJBRichards For the question we had: "Is it ok to use this ULID v7: https://www.ietf.org/archive/id/draft-peabody-dispatch-new-uuid-format-01.html#name-uuidv7-layout-and-bit-order ? The format itself is "Simplified BSD License" and the GitHub repo for the specification says that it is GPL-3.0: https://github.com/ulid/spec, but the libraries we intend to use are of MIT (here: https://github.com/ulid/javascript ) which is compatible with our license Apache 2.0, hence the concern." -- We're ok to go ahead and use the libraries (as long we don't modify them). |
This change will relax the regex validation on the CorrelationId to support UUIDv7 Unique Identifier formats. This change is required to enable performance on the current mojaloop implentation in MySQL.
Current performance tests show that there is a 70% drop off in performance in the soak test when using a UUICv4. This is the default Unique Identifier. UUIDv7 does not have this drop off in performance.
New Identifier for ISO20022: There is an existing search for a new unique identifier that is shorter in length that is required for the ISO20022 API implementation. As the identifier has not yet been selected, and that this issue is currently blocking efforts in the performance workstream, the proposal is to make this adjustment as an interum measure.