Test for checking race-condition between ARP Learning and L2 Events happening #4280

Open
abdosi opened this issue Sep 16, 2021 · 3 comments · May be fixed by #17267
Comments

@abdosi
Contributor

abdosi commented Sep 16, 2021

Issue
On some SAI implementations we have seen a race condition between processing an ARP reply and processing FDB events, which can leak next-hop resources in the ASIC. One scenario that can happen and needs to be verified:

a) Create neighbor entry N1 with a MAC that is not learnt. This creates a host entry pointing to the drop interface. One way to simulate this is to send an ARP reply with a multicast source MAC.

b) Send generic L2 data traffic that triggers MAC learning, so that the SAI L2 thread is kept busy processing those FDB events.

c) Now create neighbor entry N1 again, this time with a MAC that will be learnt (again by sending an ARP reply).
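
The two ARP replies in (a) and (c) could be built roughly as below. This is only a sketch using the Python standard library: the helper names, MAC addresses and IP addresses are placeholders I made up, and whether a given platform really skips learning for a multicast source MAC still has to be confirmed on the device.

```python
import socket
import struct


def mac_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' to 6 raw bytes."""
    return bytes(int(part, 16) for part in mac.split(":"))


def is_multicast(mac: str) -> bool:
    # The least-significant bit of the first octet is the group (multicast) bit.
    return bool(mac_bytes(mac)[0] & 0x01)


def build_arp_reply(src_mac: str, src_ip: str, dst_mac: str, dst_ip: str) -> bytes:
    """Build a raw Ethernet frame carrying an ARP reply (opcode 2)."""
    eth = mac_bytes(dst_mac) + mac_bytes(src_mac) + struct.pack("!H", 0x0806)
    # htype=Ethernet(1), ptype=IPv4(0x0800), hlen=6, plen=4, op=reply(2)
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 2)
    arp += mac_bytes(src_mac) + socket.inet_aton(src_ip)
    arp += mac_bytes(dst_mac) + socket.inet_aton(dst_ip)
    return eth + arp


# Placeholder addresses, not taken from any real testbed.
DUT_MAC, DUT_IP = "00:aa:bb:cc:dd:ee", "10.0.0.1"
N1_IP = "10.0.0.2"

# Step (a): source MAC has the multicast bit set, so it should not be learnt.
reply_unlearnt = build_arp_reply("01:00:5e:00:00:01", N1_IP, DUT_MAC, DUT_IP)
# Step (c): same neighbor IP, now with an ordinary unicast source MAC.
reply_learnt = build_arp_reply("00:11:22:33:44:55", N1_IP, DUT_MAC, DUT_IP)
```

The frames are plain bytes, so they can be handed to whatever packet-injection mechanism the testbed already uses.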

After (c), the neighbor set is processed first and, because of the artificial delay introduced in (b), the MAC event is processed later. This results in an extra Nexthop/Neighbor resource being created in the ASIC, which is leaked.

Another way to keep the L2 thread busy is to have a lot of aging events happening.

Ideally, for an ARP packet, the MAC event should be processed first and the neighbor event second, but because of the race condition between the two threads the order can change.
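
To make the ordering problem concrete, here is a toy model of the two event orders. It is purely illustrative and does not describe how any real SAI implementation works: the `ToyAsic` class, its `update_in_place` switch and the bug it models are all assumptions made for this sketch.

```python
class ToyAsic:
    """Toy model only; not any real SAI implementation."""

    def __init__(self, update_in_place: bool):
        self.update_in_place = update_in_place
        self.fdb = {}        # MAC -> port
        self.neighbors = {}  # IP -> MAC
        self.host = {}       # IP -> next-hop id currently in use
        self.nexthops = {}   # next-hop id -> forwarding target
        self._ids = 0

    def _alloc(self, target):
        self._ids += 1
        self.nexthops[self._ids] = target
        return self._ids

    def neighbor_set(self, ip, mac):
        # Replace any existing next hop for this IP, freeing the old one.
        old = self.host.pop(ip, None)
        if old is not None:
            del self.nexthops[old]
        self.neighbors[ip] = mac
        self.host[ip] = self._alloc(self.fdb.get(mac, "drop"))

    def fdb_learn(self, mac, port):
        self.fdb[mac] = port
        for ip, known_mac in self.neighbors.items():
            if known_mac != mac:
                continue
            if self.update_in_place:
                # Order-independent handling: repoint the existing next hop.
                self.nexthops[self.host[ip]] = port
            else:
                # Hypothetical bug: a new next hop is allocated and the
                # drop-pointing one is never freed.
                self.host[ip] = self._alloc(port)

    def leaked(self):
        return len(self.nexthops) - len(self.host)


def run(order, update_in_place):
    asic = ToyAsic(update_in_place)
    asic.neighbor_set("10.0.0.2", "01:00:5e:00:00:01")  # step (a)
    events = {
        "mac": lambda: asic.fdb_learn("00:11:22:33:44:55", "Ethernet0"),
        "neighbor": lambda: asic.neighbor_set("10.0.0.2", "00:11:22:33:44:55"),
    }
    for name in order:  # step (c), in the given processing order
        events[name]()
    return asic.leaked()
```

In this model `run(("mac", "neighbor"), False)` leaks nothing, `run(("neighbor", "mac"), False)` leaks one next hop, and with `update_in_place=True` neither order leaks.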

The test case is expected to try to simulate this race condition and make sure things are fine irrespective of the order in which the events are processed.
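
Since the race is timing dependent, one possible shape for the check is to repeat the scenario several times and verify that next-hop usage always returns to its baseline. The helper below is a minimal sketch under that assumption: `check_no_leak` and its three callables are hypothetical names, and how the next-hop usage is actually read (for example from a resource counter, if the platform exposes one) is left to the test.

```python
from typing import Callable


def check_no_leak(read_used: Callable[[], int],
                  run_scenario: Callable[[], None],
                  cleanup: Callable[[], None],
                  iterations: int = 10) -> None:
    """Repeat scenario + cleanup and fail if next-hop usage drifts from baseline.

    read_used    -- hypothetical hook returning the number of next hops in use
    run_scenario -- hypothetical hook performing steps (a), (b) and (c)
    cleanup      -- hypothetical hook removing neighbor N1 and flushing the FDB
    """
    baseline = read_used()
    for i in range(iterations):
        run_scenario()
        cleanup()
        used = read_used()
        if used != baseline:
            raise AssertionError(
                f"iteration {i}: {used} next hops in use, expected {baseline}"
            )
```

Keeping the hardware-specific parts behind callables is one way to keep the check itself platform independent.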

@yxieca yxieca added the P1 label Oct 20, 2021
@yxieca
Collaborator

yxieca commented Oct 20, 2021

Challenge: how can we make this generic?

@StormLiangMS
Collaborator

Hi @prsunny, this one seems to be a dataplane-related test gap; assigned to you for further triage.

@theasianpianist theasianpianist linked a pull request Mar 1, 2025 that will close this issue
11 tasks
@theasianpianist
Contributor

Ref CS00011581499


5 participants