Middleware for Kubernetes Node Scaling and Stress Testing

Overview

This middleware is designed to facilitate dynamic scaling of nodes in a Kubernetes cluster and stress testing individual nodes, when instructed by the controller. The primary functionalities include:

Cluster Scaling:
- Start Node: Add a new node to the cluster.
- Delete Node: Remove an existing node from the cluster.
- Node Count: Get the number of currently running nodes.
Node Scaling:
- Spin Up Pods: Launch stress-ng pods on specific nodes with customizable stress parameters.
- Retrieve CPU Usage: Obtain CPU usage percentage for each node.
- Pod count: Return running pod count for a particular node.
- Delete pods: Deletes completed or failed pods.

Middleware Implementation and brief function description

Middleware is implemented as a REST API to bridge the gap between the controller and the cluster. Some of the functions that middleware implements are scaling up and scaling down via the spin_up_pod and delete_pod method. In order to run the middleware just type the following command Python3 middleware_api.py After this you can call the various endpoints using python requests -

The brief description of all the task performed by the Middleware is provided below

parse_input(input_str):
- Parses the input string containing stress-ng arguments
- Extracts argument name-value pairs using regex
- Returns a dict with the extracted arguments
get_node_capacity(node_name):
- Gets the CPU capacity in nanocores for the given node
- Runs kubectl command to get node info in JSON format
- Parses CPU capacity from the node info
- Returns CPU capacity in nanocores
spin_up_pod(args, pod_name, node_name):
- Creates a Kubernetes Pod manifest
- Configures the manifest with stress-ng container and arguments
- Assigns a unique name and specifies node to schedule pod on
- Calls Kubernetes API to create the Pod on the cluster
- Returns a message with the pod creation status
delete_pods():
- Lists all pods in "default" namespace
- Deletes any completed pods (Succeeded or Failed phase)
- Returns a list with names of deleted pods
@app.route("/cpu") / get_cpu():
- Utilizes Metric server API endpoint to get CPU usage on nodes
- Calls Kubernetes metrics API to get node usage stats
- Loops through each node
- Gets CPU usage in nanoseconds
- Calculates CPU capacity of node in nanocores
- Computes CPU usage percentage
- Adds node and usage to dictionary
- Returns JSON with dict mapping node name to CPU usage percentage
@app.route("/pod-num") / get_pod_num():
- API endpoint that when called will:
- Delete any completed pods
- List remaining pods
- Return total pod count and deleted pods
@app.route('/pod') / handle_post():
- Endpoint to create a new stress pod
- Parses input parameters
- Calls spin_up_pod() to create pod
- Returns JSON message with result
@app.route("/nodes") / get_nodes():
- Returns a list of the names of nodes currently running in the cluster (excluding the master node)
- Calls the Kubernetes API to get all nodes
- Loops through nodes and appends the name to a list if not the master node
- Returns success=True and list of node names
- Handles any errors and returns success=False with error message
evict_pods(node_name, api_instance):
- Helper function to evict all pods on a given node
- Gets all pods and filters to only those scheduled on the node
- Calls Kubernetes API to delete/evict each pod
- Logs and handles errors for each eviction
@app.route('/delete-node') / delete_node():
- Endpoint to delete a Kubernetes node for cluster scale down
- Gets node name from request payload
- Calls evict_pods() to evict all pods first
- Calls Kubernetes API to delete the node
- Returns success or error message
@app.route('/start-node') / start_node():
- Endpoint to add/start a new Kubernetes node to scale up cluster
- Gets node name from request payload
- Creates Node API object with node name
- Calls Kubernetes API to create the Node resource
- Returns success or error message

API endpoints and call to functions -

Get CPU Usage
- Endpoint: /cpu
- Method: GET
- Description: Retrieve CPU usage percentage for each node in the cluster.
- Example API Call (Python Requests):

import requests

response = requests.get('http://localhost:5001/cpu')
cpu_data = response.json()
print(cpu_data)

Spin Up Pods
- Endpoint: /pod
- Method: POST
- Description: Launch stress-ng pods on a specific node with given stress parameters.
- Example API Call (Python Requests):

import requests

payload = {
    "job": "stress-ng --io 2 --vm 8 --vm-bytes 4G --timeout 2m",
    "name": "example-pod",
    "node": "node1.group-3-project.ufl-eel6871-fa23-pg0.utah.cloudlab.us"
}

response = requests.post('http://localhost:5001/pod', json=payload)
result = response.json()
print(result)

Delete Pods
- Endpoint: /delete-pods
- Method: GET
- Description: Initiate deletion of completed/failed pods.
- Example API Call (Python Requests):

import requests

response = requests.get('http://localhost:5001/delete-pods')
deleted_pods = response.json()
print(deleted_pods)

Get Pod Count on a Node
- Endpoint: /pod-num
- Method: POST
- Description: Get the number of pods running on a specific node.
- Example API Call (Python Requests):

import requests

payload = {"node": "node1.group-3-project.ufl-eel6871-fa23-pg0.utah.cloudlab.us"}

response = requests.post('http://localhost:5001/pod-num', json=payload)
pod_info = response.json()
print(pod_info)

Get Nodes
- Endpoint: /nodes
- Method: GET
- Description: Retrieve the list of nodes currently running in the cluster.
- Example API Call (Python Requests):

import requests

response = requests.get('http://localhost:5001/nodes')
nodes_info = response.json()
print(nodes_info)

Delete Node
- Endpoint: /delete-node
- Method: POST
- Description: Remove a node from the cluster for scaling down.
- Example API Call (Python Requests):

import requests

payload = {"node": "node1.group-3-project.ufl-eel6871-fa23-pg0.utah.cloudlab.us"}

response = requests.post('http://localhost:5001/delete-node', json=payload)
result = response.json()
print(result)

Start Node
- Endpoint: /start-node
- Method: POST
- Description: Add a new node to the cluster for scaling up.
- Example API Call (Python Requests):

import requests

payload = {"node": "node1.group-3-project.ufl-eel6871-fa23-pg0.utah.cloudlab.us"}

response = requests.post('http://localhost:5001/start-node', json=payload)
result = response.json()
print(result)

RUN

python3 middleware_api.py

This will make api go live on http://128.110.217.71:5001 and http://127.0.0.1:5001

Name		Name	Last commit message	Last commit date
Latest commit History 33 Commits
.gitignore		.gitignore
Cpu-metrics-server.py		Cpu-metrics-server.py
README.md		README.md
api_test.py		api_test.py
middleware_api.py		middleware_api.py
mock-api.py		mock-api.py
out.log		out.log
pod-num-error.out.log		pod-num-error.out.log
pod-starter.py		pod-starter.py
readme.md		readme.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Middleware for Kubernetes Node Scaling and Stress Testing

Overview

Middleware Implementation and brief function description

The brief description of all the task performed by the Middleware is provided below

API endpoints and call to functions -

RUN

About

Releases

Packages

Contributors 5

Languages

EEL6871-Group3/middleware

Folders and files

Latest commit

History

Repository files navigation

Middleware for Kubernetes Node Scaling and Stress Testing

Overview

Middleware Implementation and brief function description

The brief description of all the task performed by the Middleware is provided below

API endpoints and call to functions -

RUN

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 5

Languages

Packages