Control protocol content header #41
Comments
AFAIK, the current proposal is to include
I think we should be able to use fixed-length byte sequences for all of those, which would allow us to decode this header by length/position.

Question 1: If we include a conversation ID, do we also need the message ID, or would it be sufficient to track request/response/response/... patterns/sequence by conversation ID alone?

Question 2: What should we use as conversation/message ID? UUID was proposed, but might be wastefully long? We can consult our goals to get OOM estimates of the (unique) message counts we need to keep track of.

Question 3: Do we want to include a timestamp? If yes, maybe folded into the message ID (e.g. UUID7 or UUID5), in which case we would include the message ID? |
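To illustrate what decoding "by length/position" could look like, here is a minimal sketch. The layout (16-byte conversation ID, 2-byte message ID, 1-byte message type) and the function names are assumptions for illustration only; no field widths have been decided in this thread.

```python
import struct

# Hypothetical layout, NOT decided in this issue:
# 16-byte conversation ID, 2-byte message ID, 1-byte message type.
# "!" selects network byte order without padding, so the size is fixed.
HEADER = struct.Struct("!16sHB")


def pack_header(conversation_id: bytes, message_id: int, message_type: int) -> bytes:
    """Build the fixed-length header frame."""
    if len(conversation_id) != 16:
        raise ValueError("conversation_id must be exactly 16 bytes")
    return HEADER.pack(conversation_id, message_id, message_type)


def unpack_header(frame: bytes) -> tuple:
    """Split the header into its fields purely by position."""
    if len(frame) < HEADER.size:
        raise ValueError("frame too short for header")
    return HEADER.unpack(frame[:HEADER.size])
```

Because every field has a fixed width, no delimiters or length prefixes are needed, and anything after `HEADER.size` bytes could be treated as payload.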
If we have a conversation ID, we do not need a message ID for tracking responses etc. However, we lose the option to identify every message (and the timestamps in the message IDs).
2 bytes should be sufficient, as it is only relevant for the requesting party, which should not open more than 65000 conversations at once. |
Don't we want conversation IDs to be unique for ~a whole session? That would make it ~trivial to filter e.g. all messages belonging to one conversation out of a log file/database. edit: If I computed correctly using https://kevingal.com/apps/collision.html, 16 bits of ID, at 100 concurrent conversations (1 per device, order of magnitude estimate, e.g. 33 new conversations per second at 3 seconds average conversation duration), gives us a 7.3% probability of a collision! Way too high imo. |
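The collision estimate can be reproduced with the exact birthday-problem formula. This is a minimal sketch that assumes the IDs are drawn uniformly at random; the function name is made up for illustration.

```python
import math


def collision_probability(id_bits: int, n_ids: int) -> float:
    """Probability that n_ids uniformly random IDs of id_bits bits
    contain at least one duplicate (exact birthday-problem formula)."""
    space = 2 ** id_bits
    if n_ids > space:
        return 1.0  # pigeonhole: a duplicate is certain
    # P(no collision) = prod_{i=0}^{n-1} (1 - i/space); sum the logs
    # and use expm1 so that tiny probabilities do not round to zero.
    log_no_collision = sum(math.log1p(-i / space) for i in range(n_ids))
    return -math.expm1(log_no_collision)


if __name__ == "__main__":
    for bits in (16, 32, 64, 128):
        print(bits, collision_probability(bits, 100))
```

For 16 bits and 100 concurrent IDs this gives about 0.073, in line with the figure above. Widening the ID reduces the probability quickly, which is the tuning knob being discussed.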
There are two ideas:
COAP follows the second idea:
(https://en.wikipedia.org/wiki/Constrained_Application_Protocol#Token) |
I thought that a Component (let's say a Director) has an internal conversation counter, so it can have about 65000 (256*256 = 65536) conversations until it starts from zero again. |
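A minimal sketch of such a per-Component counter, assuming a 2-byte big-endian encoding; the class name and the encoding are illustrative assumptions, not part of any agreed design.

```python
class ConversationCounter:
    """Hands out 2-byte conversation IDs and wraps around after 65536."""

    MODULUS = 256 * 256  # 65536 distinct values fit into 2 bytes

    def __init__(self) -> None:
        self._value = 0

    def next_id(self) -> bytes:
        # Encode the current counter value, then advance with wrap-around.
        conversation_id = self._value.to_bytes(2, "big")
        self._value = (self._value + 1) % self.MODULUS
        return conversation_id
```

Such IDs are only unique per Component, and only until the counter wraps, so they identify a conversation for the requesting party rather than across a whole log.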
Here is the COAP protocol definition: https://www.rfc-editor.org/rfc/rfc7252 I read their concept:
|
I proposed to include the content formatting type (json, avro, binary), not the message type. However, we can think about the message_type in the header. |
Oh, I see one difficulty with the conversation_id: If both endpoints send each other a message with the same conversation_id (as both chose the same ID due to some circumstance), they will interpret the other's message as a response, not a new request. We could mitigate that if the message types differ (request type is different from response type). |
These points look useful/applicable in our situation. Token is basically our conversation id (in role). |
With "message type" I mean the command verbs (#29). We need to have those somewhere, I thought the content header would be the logical place. Serialisation information will also be important, in case we use more than one scheme. |
Basically, yes. We can't use sequential codes because two Components might be on the same "offset" concurrently. If we use random ones, we have a certain risk of collision, i.e. two nodes accidentally choosing the same ID via RNG. By adjusting the length/complexity of the ID scheme against the expected message ID generation frequency, we can tune the collision risk to acceptable levels. I'm not sure message type helps us with this, because message type is not a "random" variable. |
It helps: If I receive a "response type message", I know it is the response to my request with the same conversation_id. If I receive a "request type message", I know it is a new request and I have to return a response with the same conversation_id. |
Is your argument that by having two kinds of messages, this improves the odds of a collision by a factor of 2? |
I'd say there are no collisions anymore: Yes, several messages with the same conversation ID might arrive, but only those of "response type" are a response to my request. As the original sender sets the conversation ID, it can determine what it gets back (just using "free" IDs). All the other messages with the same ID have to be requests, and therefore this ID is not relevant for the Components under scrutiny. More clearly:
So, we have a combination of two filters: message type has to be "response" and conversation Id has to match. Only then, we got the response to our request. |
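A minimal sketch of this two-filter rule from the requesting endpoint's point of view. The type codes, the class and the method names are hypothetical and only serve to illustrate the logic.

```python
# Hypothetical message type codes, for illustration only.
REQUEST = 0
RESPONSE = 1


class Endpoint:
    """Tracks the conversation IDs of its own open requests."""

    def __init__(self) -> None:
        self.pending = set()

    def register_request(self, conversation_id: bytes) -> None:
        # The requester picks the ID, so it can choose one that is free.
        if conversation_id in self.pending:
            raise ValueError("conversation ID already in use")
        self.pending.add(conversation_id)

    def classify(self, message_type: int, conversation_id: bytes) -> str:
        # Filter 1: type must be RESPONSE. Filter 2: the ID must match.
        if message_type == RESPONSE and conversation_id in self.pending:
            self.pending.discard(conversation_id)
            return "response to own request"
        if message_type == REQUEST:
            return "new request"
        return "unexpected response"
```

An incoming request that happens to reuse one of the endpoint's own IDs is still classified as a new request, which is the mitigation described above.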
So if Co1 sees/routes two "response type" messages from CA->CB and from CE->CF, which could have the same conversation id because they originate from different Components, what happens then? I'm also thinking of e.g. logging streams, it would be practical/simple if the invariant "one conversation id <==> one conversation" would always be fulfilled (without further logic/analysis). |
I see the conversation ID as a help for the endpoints of a conversation (especially the requesting one). The routing Coordinators do not care whether the routed message belongs to one conversation or another. For logging purposes: you need the combination of recipient/sender and conversation ID to get the "full conversation ID". I find it difficult to achieve collision-free conversation IDs without a central authority. I see that the stream logging is very important to you (I did not think about it). Oh: logging has another difficulty: you have to combine the logs of all Coordinators to get a full message log. |
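A minimal sketch of grouping log records by such a "full conversation ID". The record shape (sender, recipient, conversation ID, payload) and the function names are assumptions; the Component names follow the CA/CB/CE/CF example above.

```python
from collections import defaultdict


def full_conversation_key(sender: str, recipient: str, conversation_id: bytes):
    # The pair is unordered, so request and response share the same key.
    return (frozenset((sender, recipient)), conversation_id)


def group_by_conversation(records):
    """records: iterable of (sender, recipient, conversation_id, payload)."""
    groups = defaultdict(list)
    for sender, recipient, conversation_id, payload in records:
        key = full_conversation_key(sender, recipient, conversation_id)
        groups[key].append(payload)
    return dict(groups)
```

With this key, CA->CB and CE->CF stay separate even if both pairs picked the same conversation ID. Two conversations between the same pair that reuse an ID would still be merged.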
I'm thinking of the poor folks that have to troubleshoot future messaging/routing problems. :-D
Yes, just like you can't achieve collisionless git hashes. What you can do is lower the probability of a collision until you're comfortable (number TBD, but much less than 7% :-p) |
In my test system I use the conversation_id more and more, but not the message_id. We could include the timestamp in the conversation_id, and the message_id would consist of the conversation_id and a temporal offset to the beginning of the conversation. So:
Advantages:
Disadvantage:
|
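One possible reading of this idea, as a rough sketch: the conversation_id carries a millisecond timestamp plus a few random bytes, and the message_id appends the offset since the start of the conversation. All field widths (8-byte timestamp, 4 random bytes, 4-byte offset) and function names are assumptions, not the layout proposed in the comment.

```python
import os
import struct
import time


def new_conversation_id(now=None) -> bytes:
    """8-byte millisecond timestamp followed by 4 random bytes (assumed widths)."""
    ms = int((time.time() if now is None else now) * 1000)
    return struct.pack("!Q", ms) + os.urandom(4)


def new_message_id(conversation_id: bytes, now=None) -> bytes:
    """conversation_id followed by a 4-byte millisecond offset to its start."""
    (start_ms,) = struct.unpack("!Q", conversation_id[:8])
    ms = int((time.time() if now is None else now) * 1000)
    offset = ms - start_ms
    if not 0 <= offset < 2 ** 32:
        raise ValueError("offset does not fit into 4 bytes")
    return conversation_id + struct.pack("!I", offset)
```

The absolute time of any message can then be recovered by adding the offset to the timestamp stored in the conversation_id.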
At first glance: sounds reasonable. I haven't thought deeply about this yet. |
With the resurfaced links in #16 my proposal for the content header:
|
Talking about PyMoDAQ in combination with LECO, we considered it good to send binary data, so it is good that we have that byte indicating the serialization scheme. |
Regarding the serialization scheme byte: We could allow a certain range (let's say 127-255) for user-defined applications. |
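A minimal sketch of how such a serialization scheme byte could be interpreted. The predefined code assignments are hypothetical (no values have been fixed in this thread); only the 127-255 user range is taken from the comment above.

```python
# Hypothetical assignments for illustration; no codes are fixed yet.
PREDEFINED_SCHEMES = {0: "json", 1: "avro", 2: "binary"}

# Range suggested above for user-defined applications (127-255 inclusive).
USER_DEFINED = range(127, 256)


def describe_scheme(scheme_byte: int) -> str:
    """Map a serialization scheme byte to a human-readable category."""
    if not 0 <= scheme_byte <= 255:
        raise ValueError("scheme must fit into one byte")
    if scheme_byte in USER_DEFINED:
        return "user-defined"
    # Codes below 127 without an assignment stay reserved for the protocol.
    return PREDEFINED_SCHEMES.get(scheme_byte, "reserved")
```

Keeping the lower range reserved would let the protocol add schemes later without clashing with user-defined codes.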
In #33 we decided upon one frame for header information.
In this issue, we can discuss the content of that header frame.