Degraded node network throughput when retina is installed #655

Open
grzesuav opened this issue Aug 26, 2024 · 4 comments

@grzesuav

Describe the bug
Along with the AKS 1.29 upgrade, the Retina agent was installed on our nodes, which resulted in degraded network throughput. Details are in Azure/AKS#4508.

To Reproduce
See related issue - Azure/AKS#4508
Expected behavior
Network throughput is not impacted by Retina.

Screenshots
See related issue - Azure/AKS#4508

Platform (please complete the following information):

  • OS: [AKSUbuntu]
  • Kubernetes Version: [1.29/1.30]
  • Host: AKS
  • Retina Version:

Additional context
Add any other context about the problem here.

@nddq nddq added the priority/0 P0 label Aug 26, 2024
@anubhabMajumdar anubhabMajumdar self-assigned this Aug 26, 2024
@vakalapa
Contributor

@grzesuav thanks for reporting the issue. Can you give me some more information on whether the perf degradation is within the same node or in traffic between different nodes? We know that intra-node traffic is somewhat affected by eBPF programs: since there is no noise, even a small eBPF program can affect the line rate.

If the perf degradation you saw was in inter-node communication, we can run some tests based on criteria you provide. We can then narrow down which eBPF program or programs are causing the issue.

@grzesuav
Author

@vakalapa as far as I can tell, it was between inter-node and Azure (blob storage); however, I cannot say for 100%. We have an internal S3-like app which uses Azure blob as its backend. The graphs in Azure/AKS#4508 show the network throughput of this app. I am not sure what more I can provide; as you can see there, the overall throughput is very high.

@grzesuav
Author

Hi, is there any update on this?

@vakalapa
Contributor

vakalapa commented Oct 3, 2024

We are working on the performance pipeline for public test results. We are still unable to repro the issue. @ritwikranjan, can you please tie this issue to your performance pipeline work?

Status: Accepted