New feature for update flexibility #2121
Closed · HenrikJannsen started this conversation in General
The problem
Deploying changes to existing network data is challenging because we use the hash of the message in several areas: we use the serialized data from our protobuf object as the input for the hash, so all data included in the protobuf object have an impact on the hash.
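To make that dependency explicit, here is a minimal sketch of the mechanism; the class and method names are illustrative, not the actual Bisq code:

```java
import com.google.protobuf.Message;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public final class HashUtil {

    // Hash of a network payload: the serialized protobuf bytes are the input,
    // so every field contained in the protobuf object affects the result.
    public static byte[] hashOf(Message proto) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            return digest.digest(proto.toByteArray());
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Any change to a field that is part of the protobuf object, for example a MetaData parameter, therefore produces a different hash.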
Concrete problems
We have already hit these restrictions in two cases, which prevented us from changing or fixing existing data.
We use the MetaData class to define the max size of the storage map. This was unfortunately set too small for the mailbox messages (100 entries), which in the first release caused mailbox messages to get dropped once the map was full. We could not simply change the MetaData parameters, as that would have broken the hash. We fixed this by temporarily deactivating the map-size check until a solution to the problem is deployed.
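As a rough illustration of that workaround (the MetaData record and the boolean flag here are simplified placeholders, not the real classes): the max-size parameter lives in the hashed data, so the only short-term fix was to disable the check itself.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class DataStore {

    // Illustrative stand-in for the MetaData parameters; in the real code these
    // values are part of the hashed payload, which is why they could not be changed.
    record MetaData(int maxMapSize) {}

    // Temporary workaround from the first release: the size check is disabled
    // until the annotation-based fix described below is deployed.
    private static final boolean MAP_SIZE_CHECK_ENABLED = false;

    private final Map<String, byte[]> map = new ConcurrentHashMap<>();

    boolean add(String key, byte[] payload, MetaData metaData) {
        if (MAP_SIZE_CHECK_ENABLED && map.size() >= metaData.maxMapSize()) {
            return false; // map full; new entries would get dropped
        }
        map.put(key, payload);
        return true;
    }
}
```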
Another issue is that we use the byte array of the payload as input for the Proof of Work (PoW) and store it with the PoW object. This is unnecessarily large; using the hash of the payload instead would be a better solution and would reduce the size of the PoW data transmitted with each message. To fix that, we could implement a new AuthorizationToken type and use the Feature object to signal which AuthorizationToken types we support. If both peers support the new AuthorizationToken type, we could use it (see the sketch below).
The list of Features is part of the initial connection handshake where nodes exchange their Capabilities. This is itself a tool for update flexibility, but unfortunately it is blocked by the changed hash whenever we change the list of supported Features.
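A rough sketch of how such a capability-based selection could look; the Feature values and method names are assumptions for illustration, not the actual API:

```java
import java.util.EnumSet;
import java.util.Set;

final class AuthorizationTokenNegotiation {

    // Hypothetical feature flags exchanged as part of the Capabilities handshake.
    enum Feature {
        LEGACY_POW_TOKEN,
        HASH_BASED_POW_TOKEN
    }

    // The new AuthorizationToken type is only used if both we and the peer
    // announced support for it; otherwise we fall back to the legacy one.
    static boolean useHashBasedToken(Set<Feature> myFeatures, Set<Feature> peerFeatures) {
        return myFeatures.contains(Feature.HASH_BASED_POW_TOKEN)
                && peerFeatures.contains(Feature.HASH_BASED_POW_TOKEN);
    }

    public static void main(String[] args) {
        Set<Feature> mine = EnumSet.allOf(Feature.class);
        Set<Feature> notUpdatedPeer = EnumSet.of(Feature.LEGACY_POW_TOKEN);
        System.out.println(useHashBasedToken(mine, notUpdatedPeer)); // false -> legacy token
    }
}
```

The point is that the new token type only gets used once both sides have announced support for it during the handshake, so non-updated peers keep working.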
The solution
So we needed a new solution to add more flexibility to the way we create the hash.
This is now done with the @ExcludeForHash annotation, which can be added to any field of a class that has a protobuf representation.
We use the serialize method, which builds the protobuf object and takes its byte array. We added a new method, serializeForHash, which passes a serializeForHash parameter down to the getBuilder and toProto methods. In the toProto method we check if serializeForHash is true and, if so, we clear all fields annotated with @ExcludeForHash in the builder, so the default value for the type of the excluded field is used in the protobuf serialization. We have to apply that recursively to all child objects.
That way we can define which fields should be excluded from hash creation. In principle we could exclude all non-identity-defining data, though the annotation was applied only to the more obvious candidates that might undergo changes (like MetaData).
When we add a new field we can annotate it, and it will then not break backward compatibility.
If the newly added field is relevant for the object's identity, that approach might not be suitable, but usually identity is already defined by the existing fields. A simplified sketch of the annotation and the clearing step follows below.
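To make the idea concrete, here is a simplified sketch; the annotation and the clearing step follow the description above, but the reflection-based mapping from Java fields to protobuf fields is an assumption for illustration, not the actual implementation:

```java
import com.google.protobuf.Descriptors;
import com.google.protobuf.Message;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;

// Marker annotation for fields that must not influence the hash.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface ExcludeForHash {
}

final class HashAwareSerializer {

    // Serialize for hashing: clear every builder field whose Java counterpart
    // is annotated with @ExcludeForHash, so the protobuf default value is used.
    // Assumes the Java field names match the protobuf field names; child objects
    // would need the same treatment recursively.
    static byte[] serializeForHash(Object domainObject, Message.Builder builder) {
        for (Field field : domainObject.getClass().getDeclaredFields()) {
            if (field.isAnnotationPresent(ExcludeForHash.class)) {
                Descriptors.FieldDescriptor descriptor =
                        builder.getDescriptorForType().findFieldByName(field.getName());
                if (descriptor != null) {
                    builder.clearField(descriptor);
                }
            }
        }
        return builder.build().toByteArray();
    }
}
```

Clearing the builder field means the protobuf default value is serialized, so the wire format stays valid while the excluded data no longer influences the hash.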
Deployment
Deploying this is challenging, as we cannot just add the annotation without breaking the nodes that have not updated yet.
We use an activation date from which on the annotation is used for clearing the field (a minimal sketch of such a gate is shown below).
As soon as that date kicks in (set for May 12, 00:00), nodes can no longer communicate with nodes that have not updated yet. Whether that switch works smoothly for running applications cannot be guaranteed; there might be connection drops or exceptions thrown.
To mitigate that risk, we could show a popup in running applications asking users to restart the app, to ensure there are no issues when the activation date enables the new feature.
We need to ensure that the seed nodes and oracle nodes get restarted right after that moment so that newly connecting nodes will get accepted.
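A minimal illustration of such an activation-date gate; the text above only specifies May 12, 00:00, so the year and timezone in this sketch are assumptions:

```java
import java.time.Instant;

final class ActivationGate {

    // From this moment on, fields annotated with @ExcludeForHash are cleared
    // before hashing. Year and timezone are placeholder assumptions.
    private static final Instant ACTIVATION_DATE = Instant.parse("2024-05-12T00:00:00Z");

    static boolean useExcludeForHash() {
        return Instant.now().isAfter(ACTIVATION_DATE);
    }
}
```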
There will likely be issues with removing existing data (like offers or chat messages), as the hash of the AddDataRequest does not match anymore. The Time To Live will clean that up after the expiry date (10 days for offers/chat messages). The timestamps for user profiles will likely become invalid and all get reset to 0.
I have not tested and investigated all the potential issues that might arise. I fear we cannot avoid some minor issues, but at least existing offers should still work and trading should work.
One idea might be to deactivate the PoW check for the transition period to reduce network disruption. I need to test what impact that would have.
Feedback?
If you have any questions, comments, or ideas on how to improve the deployment process, please get in touch!