-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Acquire neighboring router MAC dynamically #619
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some documentation related request. Details in the review comment. Thanks !
@@ -7,6 +7,7 @@ | |||
| --pf0 | IFNAME | first physical interface (e.g. eth0) | | | |||
| --pf1 | IFNAME | second physical interface (e.g. eth1) | | | |||
| --pf1-proxy | IFNAME | VF representor to use as a proxy for pf1 packets | | | |||
| --pf1-proxy-vf | IFNAME | VF interface of the pf1-proxy VF representor | | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have some combinations which do not make sense with the newly introduced flags ?
Like, if I set pf1-proxy
then I need to set pf1-proxy-vf
as well ? and If I set multiport-eswitch, I would needpf1-proxy-*
switches and if they are not set then I can not operate meaningfully ?
If so, can we document these dependencies ? and enforce during command line argument parsing with meaningful hints returned ?
Also in the documentation, maybe an example dpservice-bin command including all the command line parameters for a simple mpesw operation ?
Or maybe I overlooked it in the documentation ? Otherwise it is not so clear now, how to make mpesw work and which parameters are needed to make it work. At least to me.
These command line arguments are supposed to be generated by prepare.sh
right ? but still showcasing in an example which parameters needed for dpservice-bin to operate successfully in mpesw mode, would be helpful and how (with which parameters) am I supposed to call prepare.sh if I want to operate in mpesw mode ?
0c48b9c
to
77bb2f1
Compare
Signed-off-by: Guvenc Gulce <[email protected]>
Signed-off-by: Guvenc Gulce <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see comment details.
src/monitoring/dp_event.c
Outdated
@@ -88,6 +85,12 @@ void dp_process_event_link_msg(struct rte_mbuf *m) | |||
} | |||
|
|||
port->link_status = status; | |||
DPS_LOG_INFO("PF link state changed", DP_LOG_LINKSTATE(port->link_status), DP_LOG_PORT(port)); | |||
|
|||
if (status == RTE_ETH_LINK_UP) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should move this part to dp_link_status_change_event_callback
. For me this looks like an unnecessary round trip from worker thread back to main thread by calling these functions here and the worker thread directly calls the netlink call which is not so nice as the worker thread should minimize the work which doesnt include packet processing.
Another problem with this approach is that worker thread now will enqueue to the monitoring queue which is actually supposed to be enqueued only by the main thread.
This approach will worsen the problem we already have in the code. monitoring_queue is being enqueued by main thread and by the interrupt thread (dp_link_status_change_event_callback) at the moment which is already a problem. Now we bring a third thread (worker thread) which will enqueue to the monitoring queue and monitoring queue is created as single producer, single consumer actually.
So in a nutshell:
- During init of the PF ports we can arm the single call timer run on main thread which will call netlink, re-arm if netlink fails, send mac_neigh event with mac address if netlink doesnt fail. This is already done this way.
- And in
dp_link_status_change_event_callback
we can arm the single call timer run on main thread if the link state is reported as up and let the timer callback call netlink, re-arm if netlink fails, send mac_neigh event with mac address if netlink doesnt fail. So that we ensure the enqueue of mac_neigh event happens in main thread context. - And also in
dp_link_status_change_event_callback
stop the the timer, if the link state gets reported as "down".
with this approach, we wouldnt at least worsen the situation in current code. Interrupt thread enqueuing together with main thread to monitoring queue can be addressed separately as it was not caused by this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved the indicated code away from worker thread to the "interrupt" thread (the one where timer fires).
I think this should be enough, but if we only ever want to call the MAC retrieval code from one thread (the interrupt thread) and not the main thread, it would require removing the ability to "get the MAC immediately" in the init phase, thus always waiting at least a second for it. Which I do not believe is needed as the real problem here was the worker doing unnecessary stuff, which is fixed now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also think that the change is sufficient for this PR.
Currently dpservice acquires neighboring router MAC addresses during initialization.
Not only this is a problem if there is a port replaced on a switch (thus the MAC changes), but in multiport-eswitch with pf1-proxy, the MAC is taken from the proxy port. But without dpservice running, there are no neigbors registered (because the proxy interface is effectively dead at that time), so the MAC needs to be loaded dynamically after a while.
Thus I added a mechanism that in case of a missing neighboring router MAC, a timer is started and the retrieval is retried again later.
The timer has an exponential backoff, starting with 1s and doubled each time, until it reaches 60s.
This PR is best read commit-by commit, since that shows the evolution of all the changes.
Connected to #606
There was also a need for the init container to fail when some of the values for
dp_service.conf
were not available. I lumped it together here as there were already changes to the prepare script, but this can be separated if needed.