
Add E2E test for Locality Load Balancing #1277

Open
wants to merge 8 commits into
base: main
Contributor

@ravjot07 ravjot07 commented Mar 14, 2025

Overview

This PR adds an end-to-end integration test (TestLocalityLoadBalancing) for the Locality Load Balancing feature in Kmesh. The goal is to validate the PreferClose traffic distribution policy by simulating a real-world scenario with services deployed in different zones. The test ensures that:

  • Traffic from a local client is served by the closest available service instance (based on node locality).

  • When the closest instance becomes unavailable, traffic is gracefully routed to a remote instance.


Test Flow

Namespace Creation

A new namespace sample is created for the test resources.

Service Deployment

The helloworld service is deployed with:

trafficDistribution: PreferClose
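
For orientation, a Service manifest carrying this setting could be rendered from Go roughly as follows. This is only a sketch, not the manifest from this PR: the helper name `helloworldService`, the `app: helloworld` selector, and the port name are assumptions made for illustration; port 5000 matches the `containerPort` visible in the diff under review.

```go
package main

import "fmt"

// helloworldService returns a Service manifest for the given namespace.
// Hypothetical helper: the selector and port name are illustrative assumptions.
func helloworldService(ns string) string {
	return fmt.Sprintf(`apiVersion: v1
kind: Service
metadata:
  name: helloworld
  namespace: %s
spec:
  # Prefer endpoints that are topologically close to the client.
  trafficDistribution: PreferClose
  selector:
    app: helloworld
  ports:
  - name: http
    port: 5000
    targetPort: 5000
`, ns)
}
```

Calling `helloworldService("sample")` yields a manifest that could be handed to whatever apply helper the test uses.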

Backend Deployments

Two deployments of the helloworld app are launched:

  • Local Instance: On node kmesh-testing-worker, labeled as region.zone1.subzone1.

  • Remote Instance: On node kmesh-testing-control-plane, labeled as region.zone1.subzone2.

Each instance responds with its own SERVICE_VERSION.

Sleep Client Deployment

A curl-based client (sleep) is deployed on kmesh-testing-worker to simulate client-side traffic origin.


Verification

  1. Resolution Check: Uses nslookup to resolve the FQDN of the service from within the client pod.

  2. Initial Routing: Makes curl requests to helloworld.sample.svc.cluster.local expecting responses from the local instance (region.zone1.subzone1).

  3. Failover Simulation: Deletes the local deployment and retries the curl until a response is returned from the remote instance (region.zone1.subzone2).
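
The failover step above amounts to polling until a response from the remote instance shows up. A minimal sketch of that loop, assuming the request is abstracted behind a function value (the names `fetchFunc` and `waitForLocality` are hypothetical and do not come from this PR; in the real test the fetch would wrap a curl issued from the sleep pod):

```go
package main

import (
	"errors"
	"strings"
	"time"
)

// fetchFunc issues one request and returns the response body.
// Assumption: the real test would run curl inside the sleep pod here.
type fetchFunc func() (string, error)

// waitForLocality retries fetch until a response contains want,
// and gives up after the given number of attempts.
func waitForLocality(fetch fetchFunc, want string, attempts int, delay time.Duration) error {
	for i := 0; i < attempts; i++ {
		out, err := fetch()
		if err == nil && strings.Contains(out, want) {
			return nil
		}
		// Request errors are expected while the local pod is terminating.
		time.Sleep(delay)
	}
	return errors.New("no response from " + want)
}
```

Passing the fetch as a function keeps the retry logic testable without a cluster: a fake that returns `region.zone1.subzone1` a few times and then `region.zone1.subzone2` exercises the same path.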


Cleanup

The test framework handles namespace and deployment cleanup automatically after test execution.

Contributes toward #1146

@kmesh-bot
Collaborator

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign lec-bit for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Signed-off-by: ravjot07 <[email protected]>
Signed-off-by: ravjot07 <[email protected]>

codecov bot commented Mar 14, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 45.48%. Comparing base (e5ead63) to head (7035aaa).
Report is 6 commits behind head on main.

see 2 files with indirect coverage changes


Powered by Codecov. Last update 7fc5794...7035aaa. Read the comment docs.


        ports:
        - containerPort: 5000
      nodeSelector:
        kubernetes.io/hostname: ambient-worker
Member

We only have two nodes in E2E cluster: kmesh-testing-control-plane and kmesh-testing-worker.

Member

Can we add two worker nodes? Otherwise the pods may need tolerations to be scheduled on the master node.

Contributor Author

We only have two nodes in E2E cluster: kmesh-testing-control-plane and kmesh-testing-worker.

Yup, I have updated it accordingly. I missed it initially.

@ravjot07 ravjot07 changed the title from "Add E2E test for Locality Load Balancing (PreferClose mode)" to "Add E2E test for Locality Load Balancing" on Mar 28, 2025
@hzxuzhonghu
Member

/retest


// applyManifest writes the provided manifest into a temporary file and applies it using kubectl.
func applyManifest(ns, manifest string) error {
	tmpFile, err := os.CreateTemp("", "manifest-*.yaml")
Member

Can you change manifest-*.yaml?

		tmpFile.Close()
		return err
	}
	tmpFile.Close()
Member

Move this before L38. Actually, we can call both Close and Remove within a single deferred function.
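
A minimal sketch of that suggestion, assuming the apply step is passed in as a callback (the name `withTempManifest` and the callback shape are hypothetical, not code from this PR; the file pattern is the one currently in the diff):

```go
package main

import "os"

// withTempManifest writes manifest to a temp file, calls apply with its path,
// and always closes and removes the file afterwards.
// Hypothetical helper: it illustrates the deferred cleanup only.
func withTempManifest(manifest string, apply func(path string) error) error {
	tmpFile, err := os.CreateTemp("", "manifest-*.yaml")
	if err != nil {
		return err
	}
	// One deferred function covers both cleanup steps on every return path.
	defer func() {
		tmpFile.Close() // a second Close only returns an error, ignored here
		os.Remove(tmpFile.Name())
	}()

	if _, err := tmpFile.WriteString(manifest); err != nil {
		return err
	}
	// Close before handing the path to an external tool so the data is flushed.
	if err := tmpFile.Close(); err != nil {
		return err
	}
	return apply(tmpFile.Name())
}
```

With this shape, the early-return branches no longer need their own `tmpFile.Close()` calls, and the temporary file does not outlive the function.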

	}
	tmpFile.Close()

	cmd := "kubectl apply -n " + ns + " -f " + tmpFile.Name()
Member

@YaoZengzeng, please point out which package to use instead of shelling out to kubectl.

	if err != nil || sleepPod == "" {
		t.Fatalf("Failed to get sleep pod: %v", err)
	}
	nslookup, _ := shell.Execute(true, "kubectl exec -n "+ns+" "+sleepPod+" -- nslookup "+fqdn)
Member

You do not need to nslookup the IP address.

			return false
		}
		t.Logf("Curl output: %s", out)
		if strings.Contains(out, "region.zone1.subzone1") {
Member

This test is not right: here you return after checking only for a region.zone1.subzone1 response.
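
One possible reading of this comment is that the check should collect several responses and require every one of them to come from the local instance, instead of succeeding on the first match. A small sketch under that assumption (the helper name `allFromLocality` is hypothetical and this is an interpretation, not necessarily the fix the reviewer has in mind):

```go
package main

import "strings"

// allFromLocality reports whether every response mentions want.
// An empty slice counts as a failure so the check cannot pass with no traffic.
func allFromLocality(responses []string, want string) bool {
	if len(responses) == 0 {
		return false
	}
	for _, r := range responses {
		if !strings.Contains(r, want) {
			return false
		}
	}
	return true
}
```

The test could then issue a fixed number of curl requests, gather the outputs, and fail if `allFromLocality(outputs, "region.zone1.subzone1")` is false, which would catch traffic leaking to `region.zone1.subzone2` while the local instance is healthy.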

4 participants