Component-Coordinator Transport Layer Protocol #32
Comments
Once we set up CI (#13), we can have RTD render the docs for PRs, too (https://docs.readthedocs.io/en/stable/pull-requests.html). I think that should make it possible to at least inspect the results.
Don't you want to abbreviate, e.g. Coordinator Co1, Co2, and Components C1, C2,... (or CA, CB,...) -- saves a lot of typing and space in the diagrams?
The message "R:"C1.Component". S:"C1.Coordinator". Acknowledge.", I think, should not be an ACK, but the reply from C1.Component2. Otherwise, this communication is now over without C1 getting the reply it is actually interested in? Also, is the namespace of a Coordinator not the same as its name? Or do you want to treat the namespaces differently?
I think the Coordinator should at least ask with
I disagree. The first message has to be a |
That one has a funny hole. In the reply, we are using
I already mentioned the problem with implicitly establishing connections.
I'm not sure. I'm OK with sending heartbeats out regularly, but I don't think one should get a reply back. We should check how other protocols handle this. |
You mean the first message in the shown exchange? Or the first after connection? (I assume the former) We already talked about the additional ACKs, and message symmetry elsewhere, but I'm not through with my notifications, yet.
This feels attractive for single Node setups, to not needlessly prefix the coordinator namespace all the time. Have we discarded the notion that a Coordinator strips its name from a namespace when sending locally? If we allow local namespace-less addresses, we should be consistent:
I think this should then be transparent and consistent for single node setups, even ones that grow into multinode later.
In the example communication, I showed only the frames actually sent. But that is not the whole truth:
IMO, no, ACKs should only (primarily?) be for messages that would otherwise not get a reply. The reception of the reply is the acknowledgement. If no reply comes, you know something went wrong, and can retry and/or notify upstream Components. E.g.
sequenceDiagram
CA ->> Coord1: R:"C2.CB". S:"C1.CA". Give me property A.
Coord1 ->> Coord2: R:"C2.CB". S:"C1.CA". Give me property A.
Coord2 ->> CB: R:"C2.CB". S:"C1.CA". Give me property A.
Note over CB: No response/timeout
CB -->> Coord2: <missing message>
Coord2 ->> Coord1: R:"C1.CA". S:"C2.CB". Error: C2.CB did not respond
Coord1 ->> CA: R:"C1.CA". S:"C2.CB". Error: C2.CB did not respond
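The error path in the diagram above can be sketched in a few lines of Python. This is only an illustration, not part of the protocol definition: the `Message` class and the `timeout_error_for` helper are hypothetical names, and the real header layout is still to be defined in #33.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    # Hypothetical stand-in for a protocol message (header format: see #33).
    receiver: str  # e.g. "C2.CB"
    sender: str    # e.g. "C1.CA"
    payload: str


def timeout_error_for(request: Message) -> Message:
    """Build the error a Coordinator sends back when the receiver stayed silent.

    Receiver and sender are swapped, so the error travels the same route
    back to the original requester, as in the diagram.
    """
    return Message(
        receiver=request.sender,
        sender=request.receiver,
        payload=f"Error: {request.receiver} did not respond",
    )


request = Message(receiver="C2.CB", sender="C1.CA", payload="Give me property A.")
error = timeout_error_for(request)
```

Here `error` is addressed to `C1.CA` and names `C2.CB` as sender, matching the last two arrows of the diagram.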
Ah, devil's in the details! All clear! |
I thought that we could name a Coordinator just "Coordinator", as it is unique in its namespace. That way you can always address your own Coordinator by not supplying any namespace, regardless of which namespace you are in.
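To make the idea of namespace-less local addresses concrete, here is a minimal sketch. It assumes addresses of the form `namespace.name` and treats both `CA` and `.CA` as local; the function name `resolve` is hypothetical, and whether short names are allowed at all is still open (see #27).

```python
def resolve(address: str, local_namespace: str) -> tuple:
    """Split an address into (namespace, name), defaulting to the local namespace."""
    namespace, separator, name = address.rpartition(".")
    if not separator or not namespace:
        # "CA" or ".CA": no namespace given, so the address is local.
        return local_namespace, name
    return namespace, name


local = resolve("Coordinator", local_namespace="C1")
remote = resolve("C2.CB", local_namespace="C1")
dotted = resolve(".CA", local_namespace="C1")
```

With this reading, a Component in namespace `C1` reaches its own Coordinator as plain `Coordinator`, while `C2.CB` still names a Component on another node.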
You mean, instead of dropping a name, it sends an "are you still alive?" message, and if no reply arrives, the name is removed from the list? Good idea. That gives Components a chance to respond if they forgot their heartbeat.
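A rough sketch of that "ping before dropping" bookkeeping, assuming two hypothetical time limits (`ping_after` and `drop_after`, neither of which has been fixed in this discussion) and explicit timestamps so the logic is easy to follow:

```python
from __future__ import annotations


class ComponentDirectory:
    """Tracks when each Component was last heard from (illustrative only)."""

    def __init__(self, ping_after: float, drop_after: float) -> None:
        # Both limits are assumptions; the thread does not fix any values.
        self.ping_after = ping_after
        self.drop_after = drop_after
        self.last_seen: dict[str, float] = {}

    def heard_from(self, name: str, now: float) -> None:
        """Record any incoming message, heartbeat or otherwise."""
        self.last_seen[name] = now

    def check(self, now: float) -> tuple[list[str], list[str]]:
        """Return (names to ping, names removed) for the given time."""
        to_ping, removed = [], []
        for name, seen in list(self.last_seen.items()):
            silent_for = now - seen
            if silent_for >= self.drop_after:
                del self.last_seen[name]
                removed.append(name)
            elif silent_for >= self.ping_after:
                to_ping.append(name)
        return to_ping, removed


directory = ComponentDirectory(ping_after=10.0, drop_after=20.0)
directory.heard_from("CA", now=0.0)
directory.heard_from("CB", now=8.0)
first = directory.check(now=12.0)     # CA has been silent long enough to be pinged
directory.heard_from("CA", now=13.0)  # CA answers the ping
second = directory.check(now=19.0)    # now CB is due for a ping
third = directory.check(now=29.0)     # CB never answered and is removed
```

A Component that answers the ping (like `CA` here) stays in the list; only one that stays silent past `drop_after` is removed.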
Because of heartbeats and other incoming messages (which you cannot control), there is always a risk of receiving some other message between sending a request and receiving its reply.
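One way to cope with such interleaving is to pick replies by conversation ID instead of by arrival order. The sketch below assumes messages are dicts with a `conversation_id` key; that representation, the helper name `take_reply`, and the example IDs and payloads are all made up for illustration.

```python
from collections import deque


def take_reply(inbox: deque, conversation_id: str):
    """Pop the first message belonging to the given conversation.

    Messages from other conversations stay in the inbox in their
    original order. Returns None if no matching reply has arrived yet.
    """
    for index, message in enumerate(inbox):
        if message["conversation_id"] == conversation_id:
            del inbox[index]
            return message
    return None


inbox = deque([
    {"conversation_id": "hb-1", "payload": "heartbeat"},
    {"conversation_id": "req-7", "payload": "reply to request 7"},
])
reply = take_reply(inbox, "req-7")
missing = take_reply(inbox, "req-8")
```

The heartbeat that arrived first is left in the inbox for whatever code handles it, and the requester still gets its own reply.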
Yes, but you can already send local messages.
But the user might know the name of the Component they want to connect to. Another question:
With that sentence, I meant that the Component had not sent any heartbeat for some time.
I wanted to give an example of "local" communication without specifying the namespace. |
No. The leading period is not necessary; we could decide to drop it altogether. For the data protocol (topic filtering) we should use the full name.
I did not think about that, as I thought that humans give the names, but that is an idea.
I opened an issue about that in #27. From the considerations given there, I prefer to always use the full name, and I used it in the examples, but that is not yet decided.
I would not put the burden of checking for an answer onto the Coordinator, as it does not know whether an answer is required.
Exactly.
We are trying to specify the protocol, though, with as little as possible relying on user capability. ;-)
Yeah, but then you have different conversation IDs for different "topics", and a
I think we need to decide if we always use the full addresses or not, for this.
I fear I don't understand. If the coordinator is "dead", how can everything continue as usual? Aren't all the connections dead? It did not send heartbeats. How does the CONNECT message from a Component come into play here?
Thanks. Sure humans can do that, but thinking of pymeasure, people also leave their instrument names alone most of the time, and it will be nice if we automatically disambiguate.
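Automatic disambiguation could be as simple as the following sketch. The suffix scheme (`_2`, `_3`, ...) and the function name `unique_name` are purely hypothetical; nothing in this thread decides how a clashing name would actually be changed.

```python
def unique_name(requested: str, taken: set) -> str:
    """Return the requested name, or a suffixed variant if it is already taken."""
    if requested not in taken:
        return requested
    suffix = 2
    # Hypothetical scheme: append the first free numeric suffix.
    while f"{requested}_{suffix}" in taken:
        suffix += 1
    return f"{requested}_{suffix}"


taken = {"Component", "Component_2"}
fresh = unique_name("Sensor", taken)
clash = unique_name("Component", taken)
```

The Coordinator would then have to tell the Component which name it ended up with, for example in its reply to the connect message.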
👍
Elsewhere we talked about that a message always requires a reply (even if it is null) - I thought that to be the original purpose of the |
oh man, multi-parallel processing of discussion points 😓 time for dinner soon 😁 |
If a Coordinator is restarted (due to being an OS service etc.), all the Components reconnect automatically (in zmq) without knowing that they reconnected. If we require a new "connect" message, all Components have to take action.
OK, I think we are maybe talking about two different "connect" events. You are talking (afaict) about the zmq connection, which automatically gets reconnected. If the Component does not even realise that the connection was gone for a while, indeed, why would it need a new |
As mermaid diagrams are not rendered in a PR, I collect the protocol definitions here.
The Message Layer will define how the commands are encoded; here they are in plain English.
How the Header is formatted will be defined in #33.
General notes:
Connection
address is, for example, protocol, host, and port.
Basic communication
basic communication (connect/disconnect, heartbeat)
Successful communication
Notes:
Different unsuccessful communication parts
Components should request a heartbeat (by sending one themselves) before the time expires.
Message exchange
Message exchange in one Coordinator
Notes:
Questions:
Message exchange with two Coordinators.
During the whole exchange, the conversation ID is the same.
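A small sketch of routing across two Coordinators, to illustrate that the conversation ID is simply carried along unchanged. It assumes full `namespace.name` addresses throughout; the `Message` and `Coordinator` classes, the `peers` and `delivered` attributes, and the ID and payload values are hypothetical and not part of the protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    # Hypothetical stand-in for a protocol message (header format: see #33).
    receiver: str
    sender: str
    conversation_id: str
    payload: str


class Coordinator:
    def __init__(self, namespace: str) -> None:
        self.namespace = namespace
        self.peers = {}      # namespace -> other Coordinator
        self.delivered = []  # (Component name, message) pairs handed over locally

    def route(self, message: Message) -> None:
        """Deliver locally or forward to the Coordinator of the receiver's namespace."""
        namespace, _, name = message.receiver.partition(".")
        if namespace == self.namespace:
            self.delivered.append((name, message))
        else:
            self.peers[namespace].route(message)


coord1, coord2 = Coordinator("C1"), Coordinator("C2")
coord1.peers["C2"] = coord2
coord2.peers["C1"] = coord1

request = Message("C2.CB", "C1.CA", "conv-1", "Give me property A.")
coord1.route(request)  # CA -> Coord1 -> Coord2 -> CB

# CB answers with receiver and sender swapped and the same conversation ID.
reply = Message(request.sender, request.receiver, request.conversation_id, "Here is property A.")
coord2.route(reply)    # CB -> Coord2 -> Coord1 -> CA
```

Neither Coordinator touches the conversation ID, so `CA` can match the reply to its request no matter how many hops it took.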