Flush VM flows after removal #369
Conversation
@PlagueCZ force-pushed from 24023df to 36f7fb7
@PlagueCZ force-pushed from 36f7fb7 to dd58995
@PlagueCZ force-pushed from dd58995 to 8832275
@PlagueCZ: @guvenc I added the condition for VNI+IP in the flow table. It works with only one IP (src or dst) because they are connected in the flow entry, but I am not sure this is true in all situations, and since this is not a hot path, I simply check both.
What I am unsure about now, however (and missed previously), is the fact that we are deleting from the hash table (via the …) while iterating over it. So I think I should create a follow-up PR that changes all these freeup functions to only note the flow pointers down and then free them up at the end? And unless I do two passes to get the number of flows to free, I guess I would need to …
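The collect-then-delete idea could look roughly like the sketch below. This is only an illustration, assuming the flow table is a DPDK rte_hash; the function name, the match placeholder, and the fixed-size pending buffer are made up for the sketch and are not the project's actual code.

#include <stdint.h>
#include <rte_hash.h>

#define PENDING_MAX 256  /* arbitrary bound chosen for this sketch only */

/* Sketch: remember matching flow values while iterating and postpone the
 * actual removal until rte_hash_iterate() has finished, so nothing is
 * deleted from the table mid-iteration. */
static void flush_vm_flows_deferred(struct rte_hash *flow_table)
{
	const void *key;
	void *flow_val;
	uint32_t iter = 0;
	void *pending[PENDING_MAX];
	int pending_cnt = 0;

	/* first pass: only note the flows down, do not touch the table */
	while (rte_hash_iterate(flow_table, &key, &flow_val, &iter) >= 0) {
		/* ... here the real code would check whether the entry
		 *     belongs to the VM being removed ... */
		if (pending_cnt < PENDING_MAX)
			pending[pending_cnt++] = flow_val;
	}

	/* second step: now it is safe to delete keys and free the values */
	for (int i = 0; i < pending_cnt; ++i) {
		/* the real code would call the existing freeup helper here,
		 * which removes both directional keys and the flow value */
		(void)pending[i];
	}
}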
src/dp_flow.c (outdated):

		return;
	}
	if (flow_val->created_port_id == port_id
	    || (next_key->vni == vni && (next_key->ip_src == ipv4 || next_key->ip_dst == ipv4))
@guvenc: The better way to do this is:

(flow_val->created_port_id == port_id || (next_key->vni == vni && flow_val->flow_key[DP_FLOW_DIR_ORG].ip_dst == ipv4))

because you only want to delete the traffic coming from outside (PF) and targeting this VM's IP. The next_key->ip_src == ipv4 check is already covered by flow_val->created_port_id == port_id.
@PlagueCZ: I added both "to be sure". The suggested condition does indeed work in practice, but only because removing a flow automatically removes the other direction, right? Just so we are on the same page.
Because if I do a TCP connection VM1->VM2 and then remove VM2, this freeup function wants to keep IP2->IP1 and only removes it because, while removing IP1->IP2, it removes both directions (confirmed by pytest).
If this is guaranteed to always be true, then I can remove the SRC condition.
@guvenc: Yes, removing one key will also remove the other key. If you follow the call chain of dp_age_out_flow(), you will see this code snippet:

dp_delete_flow_key_no_flush(&cntrack->flow_key[DP_FLOW_DIR_ORG]);
dp_delete_flow_key_no_flush(&cntrack->flow_key[DP_FLOW_DIR_REPLY]);

Our conntrack table is designed in such a way that both keys point to the same memory location; that memory location holds both keys pointing to it, and flow_val also knows that the first key in its flow_key array (DIR_ORG) is the first flow seen, i.e. the one that triggered the creation of this flow.

key (5-tuple hash) ---------------------------->|-----> flow_val
inversed key (inversed 5-tuple hash) -------^

Hopefully the ASCII art is understandable :) So we always need to remove two keys and one flow_val for each "flow" we installed.
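For what it's worth, the layout described above could be modeled roughly like this. The struct and function names are made up for illustration and simplified compared to the real dp_flow code; only rte_hash_del_key() is the actual DPDK call.

#include <stdint.h>
#include <stdlib.h>
#include <rte_hash.h>

/* Illustrative model only: one flow value is reachable through two hash
 * keys (the original 5-tuple and its inverse), so a full teardown always
 * deletes two keys and frees one shared value. */
struct example_flow_key {                 /* stand-in for the real 5-tuple key */
	uint32_t ip_src, ip_dst;
	uint16_t port_src, port_dst;
	uint8_t  proto;
};

struct example_flow_value {
	struct example_flow_key flow_key[2];  /* [0] = DIR_ORG, [1] = DIR_REPLY */
	uint16_t created_port_id;
};

static void example_delete_flow(struct rte_hash *table, struct example_flow_value *flow_val)
{
	rte_hash_del_key(table, &flow_val->flow_key[0]);  /* original direction */
	rte_hash_del_key(table, &flow_val->flow_key[1]);  /* reply direction */
	free(flow_val);                                    /* the shared value itself */
}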
@PlagueCZ: Now thinking about it, I think I actually encountered this in virtual services, where I was previously doing iterate-and-free, and there were situations when it truly did not free up everything. But it only happened for me when the table was populated and I removed all entries.
I found an interesting thread that posts an experiment result on exactly this "delete while iterating" question [1]. It seems that we need to address this issue, as the experiment shows that, due to the use of two buckets, keys can be leaked.
[1] https://www.mail-archive.com/[email protected]/msg163820.html
Force-pushed from 8832275 to 880c568
Related to issue #357, removing a VM now cleans up its tracked flows. The missing cleanup did not produce any problems, but since the mechanism is already in place for other situations, it seems cleaner to do it.
The only drawback is that removing a live VM and adding it back could break some existing connections, but that is indistinguishable from the situation where the newly connected VM is a different one (which does need the cleanup).
There is no good way to add a pytest for this, but I did test it while developing (this is what prompted the improvements to the TCP tester script).