Add manager for simple Kubernetes job creation #47

Open
wants to merge 87 commits into
base: master
87 commits
de44dea
add manager for simple kubernetes job creation
iripiri Sep 23, 2021
8959757
suppress state changes from event repetitions
iripiri Sep 24, 2021
d69a26f
try create of superclass if parameters are missing
iripiri Sep 24, 2021
298df49
delete configmap when job is deleted
iripiri Sep 28, 2021
f13ea00
fix api_url for villas-node manager
stv0g Oct 1, 2021
df28149
simplify config handling
stv0g Oct 8, 2021
e0d569c
move global status to Controller mixin class
stv0g Oct 8, 2021
a0759cf
make work dir configurable
stv0g Oct 8, 2021
a9bde8d
pass payload instead of message to action handlers
stv0g Oct 8, 2021
1f30e93
fix mixed up state vs status properties in some components
stv0g Oct 8, 2021
4e23053
fix mixed up state vs status properties in some components
stv0g Oct 8, 2021
426cbf2
fix mixed up state vs status properties in some components
stv0g Oct 8, 2021
22ba436
added first version of an HTTP REST API
stv0g Oct 8, 2021
4360d29
creating AMQP queue with durable, exclusive and auto_delete flags
stv0g Oct 8, 2021
124f55f
add first draft of OpenAPI spec
stv0g Oct 8, 2021
5800d0b
fix simple manager
iripiri Oct 14, 2021
1f695f5
use 'villas' namespace as default
iripiri Nov 17, 2021
17080eb
use villas-controller namespace as default
iripiri Nov 17, 2021
d705699
move schemas to YAML files and load them from there for all components
stv0g Oct 12, 2021
208f8e8
config: show a useful error message when started without broker param…
stv0g Oct 12, 2021
71a4872
fix typo
stv0g Oct 12, 2021
075fbfc
config: fix bug when started without config file
stv0g Oct 12, 2021
aa6432c
validate action parameters against schema
stv0g Oct 12, 2021
b79c286
fix linting errors
stv0g Oct 13, 2021
bac3246
fix some errors in the API spec and schema
stv0g Oct 13, 2021
38c698c
validate schemas and openapi doc against meta jsonschemas
stv0g Oct 14, 2021
a27be13
raise a SimulationException if an IC with an existing UUID should be …
stv0g Oct 22, 2021
f87608c
provide version number within status field of status update
stv0g Oct 22, 2021
e0f7829
relay: do not remove vanished sessions by setting state to gone
stv0g Oct 22, 2021
f789439
improve error reporting
stv0g Oct 22, 2021
c0dcdad
add labels and annotations to created job resources
stv0g Oct 22, 2021
37634bd
k8s: add owner references
stv0g Oct 22, 2021
250c97b
make all manually configured components to be managed by the default …
stv0g Oct 22, 2021
b867d02
use a api/v1 prefix for the API handlers
stv0g Oct 22, 2021
061d629
fixes for schema & pod_uid
iripiri Oct 28, 2021
7a7f273
Change update interval
iripiri Nov 2, 2021
3093451
generic: move return code to status section
stv0g Nov 16, 2021
d298eba
fix exception in schema code
stv0g Nov 16, 2021
328249d
kubernetes fixes
iripiri Nov 16, 2021
31699d1
move schema to separate file
iripiri Nov 17, 2021
dcc4a65
debugging
iripiri Nov 18, 2021
5ed46b7
catch timeouterror
iripiri Nov 18, 2021
5c615cb
show event outputs
iripiri Nov 18, 2021
f30366c
debugging
iripiri Nov 18, 2021
b7de56f
add init file for kubernetes-simple schema
iripiri Nov 19, 2021
8837d2b
get kubernetes job running
iripiri Nov 19, 2021
5e925f0
fixes
iripiri Nov 22, 2021
4ff0014
allow UUID of default generic manager to be configured via configurat…
stv0g Dec 7, 2021
f82f994
fix villas-node manager
stv0g Dec 7, 2021
ff5ca02
api: make main request handler also available without trailing slash
stv0g Dec 7, 2021
e2c6631
more fixes for relay and node managers
stv0g Dec 7, 2021
ea5497d
cleanup
iripiri Dec 8, 2021
791af64
send status update while resetting to improve user experience
iripiri Dec 17, 2021
1c962d8
publish first status update immediately after creating component
iripiri Dec 17, 2021
7f0eac0
read namespace for kubernetes jobs from ENV
iripiri Dec 17, 2021
21af994
fix formatting
iripiri Jan 12, 2022
63c3b22
cleanup
iripiri Feb 11, 2022
d3779d1
fix 'invalid UUID length: 0' error
iripiri Mar 3, 2022
2007bf8
fix error handling relay/node
iripiri Jan 12, 2022
85d83e0
relay fix
iripiri Mar 4, 2022
3d7791b
container settings
iripiri Mar 9, 2022
723f052
update flake repo
iripiri Aug 14, 2023
6c6368f
formatting
iripiri Aug 14, 2023
76ea53c
remove namespace checking to get rid of ClusterRole necessity
iripiri Mar 17, 2025
dbde680
Merge branch 'master' into simple-kubernetes-manager
iripiri Mar 17, 2025
b39fb19
fix pipeline
iripiri Mar 19, 2025
e47a5e1
make securitycontext configurable through schema
iripiri Mar 20, 2025
fe1580b
Sync before executing setup.py so that controller module will be found
iripiri Mar 20, 2025
1174fde
handle ModuleNotFoundError
iripiri Mar 20, 2025
6bcc818
use pip to run pipeline
iripiri Mar 25, 2025
67a8761
changed python docker image due to pip3 command not found in precommi…
iripiri Mar 26, 2025
7f5242d
changed back image version and changed CI tag
iripiri Mar 26, 2025
e06754c
update schema checking
iripiri Mar 26, 2025
d0c70d1
commented schema checking to run CI, needs to get revised
iripiri Mar 26, 2025
2ccadc5
debug test fail
iripiri Mar 26, 2025
e4dd2b3
add current working directory to PATH
iripiri Mar 28, 2025
555fe20
added init file to villas folder
iripiri Mar 28, 2025
5ac9fa1
remove changes made during pipeline debug/fix
iripiri Mar 28, 2025
52e7338
get version (fix ModuleNotFound error), remove tags
iripiri Mar 31, 2025
fd0d5c8
add long description which got lost in last commit
iripiri Mar 31, 2025
5fdb890
updated json schema checking
iripiri Apr 1, 2025
44e8da4
added blank lines (flake8)
iripiri Apr 1, 2025
05fdaa4
add init.py for modules to be found in tests
iripiri Apr 1, 2025
dd13225
merged with pipeline fixes
iripiri Apr 1, 2025
64f2143
watch the resources inside the controller namespace
iripiri Apr 2, 2025
79a56d6
start watcher thread AFTER reading environment variables
iripiri Apr 2, 2025
3df9f3d
corrected setting the namespace
iripiri Apr 2, 2025
4 changes: 0 additions & 4 deletions .gitlab-ci.yml
@@ -20,8 +20,6 @@ build:precommit:
- pip3 install -r requirements-dev.txt
script:
- pre-commit run --all-files
tags:
- docker

build:test:
stage: build
@@ -33,8 +31,6 @@ build:test:
- pip3 install -r requirements-dev.txt
script:
- pytest -v
tags:
- docker

build:dist:
stage: build
6 changes: 3 additions & 3 deletions .pre-commit-config.yaml
@@ -18,8 +18,8 @@ repos:
rev: 6.0.0
hooks:
- id: flake8
- repo: https://github.com/sirosen/check-jsonschema
rev: 0.22.0
- repo: https://github.com/python-jsonschema/check-jsonschema
rev: 0.32.1
hooks:
- id: check-jsonschema
name: "Check schemas"
@@ -32,4 +32,4 @@ repos:
language: python
files: ^doc/openapi.yaml$
types: [yaml]
args: ["--schemafile", "https://raw.githubusercontent.com/OAI/OpenAPI-Specification/main/schemas/v3.1/schema.json"]
args: ["--schemafile", "https://spec.openapis.org/oas/3.1/schema/2025-02-13"]
14 changes: 14 additions & 0 deletions etc/config_simplekub.yaml
@@ -0,0 +1,14 @@
---
broker:
url: amqp://villas:Haegiethu0rohtee@kubernetes-master-1.os-cloud.eonerc.rwth-aachen.de:30809/%2F

components:
- type: generic
category: manager
uuid: ebbbbba0-557b-4848-ac7a-faa3e7c51fa3

- category: manager
type: kubernetes-simple
uuid: 4bbbb73e-7e74-11eb-8f63-f3a5b3ab82f6

namespace: villas-controller
4 changes: 2 additions & 2 deletions etc/params_k8s_dpsim.yaml
@@ -13,9 +13,9 @@ properties:
name: dpsim
spec:
suspend: true
activeDeadlineSeconds: 120 # kill the Job after 1h
activeDeadlineSeconds: 3600 # kill the Job after 1h
backoffLimit: 0 # only try to run pod once, no retries
ttlSecondsAfterFinished: 120 # delete the Job resources 1h after completion
ttlSecondsAfterFinished: 3600 # delete the Job resources 1h after completion
template:
spec:
restartPolicy: Never
22 changes: 19 additions & 3 deletions setup.py
@@ -1,14 +1,30 @@
from setuptools import setup, find_namespace_packages
from glob import glob

from villas.controller import __version__ as version
import os
import re


def get_version():
    here = os.path.abspath(os.path.dirname(__file__))
    init_file = os.path.join(here, "villas", "controller", "__init__.py")

    with open(init_file, "r") as f:
        content = f.read()

    match = re.search(r"^__version__ = ['\"]([^'\"]*)['\"]", content, re.M)
    if match:
        return match.group(1)

    raise RuntimeError("Version not found")


with open('README.md') as f:
long_description = f.read()

setup(
name='villas-controller',
version=version,
version=get_version(),
description='A controller/orchestration API for real-time '
'power system simulators',
long_description=long_description,
@@ -20,7 +36,7 @@
keywords='simulation controller villas',
classifiers=[
'Development Status :: 3 - Alpha',
'License :: OSI Approved :: Apache Software License'
'License :: OSI Approved :: Apache Software License',
'Programming Language :: Python :: 3'
],
packages=find_namespace_packages(include=['villas.*']),
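The get_version() helper in this hunk reads __version__ with a regular expression instead of importing villas.controller, so setup.py does not need the package to be importable at build time. The parsing step can be sketched in isolation as below. parse_version and the sample string are hypothetical names used only for illustration, and the pattern only matches an assignment written as `__version__ = '...'` at the start of a line.

```python
import re

# Same pattern as in get_version(): a module-level __version__ assignment
VERSION_RE = re.compile(r"^__version__ = ['\"]([^'\"]*)['\"]", re.M)


def parse_version(source):
    """Return the string assigned to __version__ in the given source text."""
    match = VERSION_RE.search(source)
    if match:
        return match.group(1)
    raise RuntimeError("Version not found")


# Hypothetical file content, not taken from the repository
sample = '"""Package docstring."""\n\n__version__ = \'0.3.2\'\n'
version = parse_version(sample)
```

A file that computes its version dynamically, or formats the assignment differently, would not match and ends in the RuntimeError branch.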
Empty file added villas/__init__.py
Empty file.
11 changes: 9 additions & 2 deletions villas/controller/component.py
@@ -107,7 +107,11 @@ def load_schema(self):
fo = resources.open_text(pkg, res)
loadedschema = yaml.load(fo, yaml.SafeLoader)

schema[name] = Draft202012Validator(loadedschema)
try:
Draft202012Validator.check_schema(loadedschema)
schema[name] = loadedschema
except jsonschema.exceptions.SchemaError:
self.logger.warning("Schema is invalid!")

return schema

Expand Down Expand Up @@ -277,12 +281,15 @@ def from_dict(dict):

def publish_status(self):
if not self.mixin:
self.logger.warning('No mixin!')
return

self.mixin.publish(self.status, headers=self.headers)

def publish_status_periodically(self):
self.logger.info('Start state publish thread')
self.logger.info('Start state publish thread, initial status: %s',
self.status)
self.publish_status() # publish the first update immediately

while not self.publish_status_thread_stop.wait(
self.publish_status_interval):
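In this hunk publish_status_periodically() publishes one status update before it enters the wait loop, so a new component becomes visible without waiting for a full interval. A minimal sketch of that pattern, assuming a plain callable in place of the mixin; PeriodicPublisher and its attributes are hypothetical and not part of the PR:

```python
import threading


class PeriodicPublisher:
    """Publish once immediately, then every `interval` seconds until stopped."""

    def __init__(self, publish, interval):
        self.publish = publish    # callable standing in for mixin.publish
        self.interval = interval  # seconds between periodic updates
        self.stop_event = threading.Event()
        self.thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        self.publish()  # first update right away, without waiting
        # Event.wait() returns False on timeout and True once stop() is called
        while not self.stop_event.wait(self.interval):
            self.publish()

    def start(self):
        self.thread.start()

    def stop(self):
        self.stop_event.set()
        self.thread.join()
```

Using Event.wait() as the sleep means stop() interrupts the wait at once instead of blocking for the rest of the interval.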
4 changes: 3 additions & 1 deletion villas/controller/components/manager.py
@@ -28,6 +28,9 @@ def from_dict(dict):
if type == 'kubernetes':
from villas.controller.components.managers import kubernetes
return kubernetes.KubernetesManager(**dict)
if type == 'kubernetes-simple':
from villas.controller.components.managers import kubernetes_simple
return kubernetes_simple.KubernetesManagerSimple(**dict)
if type == 'villas-node':
from villas.controller.components.managers import villas_node # noqa E501
return villas_node.VILLASnodeManager(**dict)
@@ -43,7 +46,6 @@ def from_dict(dict):
def add_component(self, comp):
if comp.uuid in self.mixin.components:
existing_comp = self.mixin.components[comp.uuid]

raise SimulationException(self, 'Component with same UUID ' +
'already exists!',
component=existing_comp)
44 changes: 21 additions & 23 deletions villas/controller/components/managers/kubernetes.py
@@ -22,42 +22,36 @@ class KubernetesManager(Manager):
def __init__(self, **args):
super().__init__(**args)

self.thread_stop = threading.Event()

self.pod_watcher_thread = threading.Thread(
target=self._run_pod_watcher)
self.job_watcher_thread = threading.Thread(
target=self._run_job_watcher)
self.event_watcher_thread = threading.Thread(
target=self._run_event_watcher)

if os.environ.get('KUBECONFIG'):
k8s.config.load_kube_config()
else:
k8s.config.load_incluster_config()

self.namespace = args.get('namespace', 'default')
# the namespace in which to create the jobs
# and to watch for events
self.namespace = os.environ.get('NAMESPACE')
self.namespace = ''.join([self.namespace, '-controller'])

self.my_namespace = os.environ.get('NAMESPACE')
# name and UID of the pod in which this controller is running
# used in kubernetes simulator to set the owner reference
self.my_pod_name = os.environ.get('POD_NAME')
self.my_pod_uid = os.environ.get('POD_UID')

self._check_namespace(self.namespace)
self.thread_stop = threading.Event()

self.pod_watcher_thread = threading.Thread(
target=self._run_pod_watcher)
self.job_watcher_thread = threading.Thread(
target=self._run_job_watcher)
self.event_watcher_thread = threading.Thread(
target=self._run_event_watcher)

# self.pod_watcher_thread.start()
# self.job_watcher_thread.start()
self.event_watcher_thread.setDaemon(True)
self.event_watcher_thread.start()

def _check_namespace(self, ns):
c = k8s.client.CoreV1Api()

namespaces = c.list_namespace()
for namespace in namespaces.items:
if namespace.metadata.name == ns:
return

raise RuntimeError(f'Namespace {ns} does not exist')
# Not used yet, can support more complex logic
# self.pod_watcher_thread.start()
# self.job_watcher_thread.start()

def _run_pod_watcher(self):
w = k8s.watch.Watch()
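The constructor in this hunk derives the job namespace from the NAMESPACE environment variable and appends '-controller'. When the variable is unset, os.environ.get() returns None and the join raises a TypeError. A hedged sketch of a guarded variant follows; job_namespace is a hypothetical helper, not part of the PR, and raising a RuntimeError is only one possible way to report the missing variable.

```python
import os


def job_namespace(environ=os.environ, suffix='-controller'):
    """Build the namespace for jobs from the NAMESPACE variable."""
    base = environ.get('NAMESPACE')
    if not base:
        # Fail with a clear message instead of a TypeError from str.join
        raise RuntimeError('Environment variable NAMESPACE is not set')
    return base + suffix


# Example with an explicit mapping instead of the real environment
namespace = job_namespace({'NAMESPACE': 'villas'})
```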
@@ -107,6 +101,10 @@ def _run_event_watcher(self):

if _match(comp.job.metadata.name,
eo.involved_object.name):
if comp._state == 'stopping':
# incoming events are old repetitions
continue

if eo.reason == 'Completed':
comp.change_state('stopping', True)
elif eo.reason == 'Started':
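The added check skips events for a component that is already in the 'stopping' state, on the grounds that such events are repetitions of older ones. The effect can be sketched as a small pure function. next_state is hypothetical, and the mapping of 'Started' to 'running' is an assumption, since the hunk does not show what that branch does.

```python
def next_state(current, reason):
    """Return the component state after an event with the given reason."""
    if current == 'stopping':
        # Events arriving now are treated as stale repetitions and ignored
        return current
    if reason == 'Completed':
        return 'stopping'
    if reason == 'Started':
        return 'running'  # assumption: this branch is not shown in the diff
    return current


# Replay a sequence in which 'Started' is repeated after completion
state = 'starting'
for reason in ['Started', 'Completed', 'Started']:
    state = next_state(state, reason)
```

Without the first check, the repeated 'Started' event would move the component out of 'stopping' again.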
82 changes: 82 additions & 0 deletions villas/controller/components/managers/kubernetes_simple.py
@@ -0,0 +1,82 @@
import copy
from villas.controller.components.managers.kubernetes import KubernetesManager
from villas.controller.components.simulators.kubernetes import KubernetesJob

parameters_simple = {
    'type': 'kubernetes',
    'category': 'simulator',
    'uuid': None,
    'name': '',
    'properties': {
        'job': {
            'apiVersion': 'batch/v1',
            'kind': 'Job',
            'metadata': {
                'name': ''
            },
            'spec': {
                'activeDeadlineSeconds': 3600,
                'backoffLimit': 2,
                'template': {
                    'spec': {
                        'restartPolicy': 'Never',
                        'containers': [
                            {
                                'image': '',
                                'imagePullPolicy': 'Always',
                                'name': 'jobcontainer',
                                'securityContext': {
                                    'privileged': False
                                }
                            }
                        ]
                    }
                }
            }
        }
    }
}


class KubernetesManagerSimple(KubernetesManager):

    def __init__(self, **args):
        super().__init__(**args)

    def create(self, payload):
        params = payload.get('parameters', {})
        sim_name = payload.get('name', 'Kubernetes Simulator')
        jobname = params.get('jobname', 'noname')
        adls = params.get('activeDeadlineSeconds', 3600)
        if isinstance(adls, str):
            adls = int(adls)
        image = params.get('image')
        name = params.get('name')
        privileged = params.get('privileged', False)
        uuid = params.get('uuid')
        self.logger.info('uuid: %s', uuid)

        if image is None:
            self.logger.error('No image given, will try super.create')
            super().create(payload)
            return

        parameters = copy.deepcopy(parameters_simple)  # fresh copy per call
        parameters['name'] = sim_name
        job = parameters['properties']['job']
        job['metadata']['name'] = jobname
        job['spec']['activeDeadlineSeconds'] = adls
        job_container = job['spec']['template']['spec']['containers'][0]
        job_container['image'] = image
        job_container['securityContext']['privileged'] = privileged

        parameters['job'] = job

        if name:
            parameters['name'] = name

        if uuid:
            parameters['uuid'] = uuid

        comp = KubernetesJob(self, **parameters)
        self.add_component(comp)
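create() fills in a module-level template dict for every new component. Because the template is nested, each call needs its own deep copy. Otherwise all components would share, and overwrite, the same job dict. A minimal sketch of the difference, using a shortened hypothetical template (TEMPLATE and build_parameters are illustration names, not part of the PR):

```python
import copy

# Shortened stand-in for parameters_simple
TEMPLATE = {
    'name': '',
    'properties': {
        'job': {
            'metadata': {'name': ''},
            'spec': {'template': {'spec': {'containers': [{'image': ''}]}}},
        }
    },
}


def build_parameters(jobname, image, template=TEMPLATE):
    """Return a filled-in copy of the template and leave the template as is."""
    params = copy.deepcopy(template)  # nested dicts and lists are copied too
    job = params['properties']['job']
    job['metadata']['name'] = jobname
    job['spec']['template']['spec']['containers'][0]['image'] = image
    return params


first = build_parameters('job-a', 'image-a')
second = build_parameters('job-b', 'image-b')
```

A shallow copy such as dict(template) would not be enough here, since the nested 'properties' dict would still be shared between calls.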
12 changes: 10 additions & 2 deletions villas/controller/components/simulators/kubernetes.py
@@ -2,6 +2,7 @@
from typing import TYPE_CHECKING
import json
import signal
import time

import kubernetes as k8s

@@ -28,6 +29,7 @@ def __init__(self, manager: KubernetesManager, **args):

self.job = None
self.pods = set()
self.cm_name = ''

self.custom_schema = props.get('schema', {})

@@ -65,6 +67,7 @@ def _owner(self):
def _prepare_job(self, job, payload):
# Create config map
cm = self._create_config_map(payload)
self.cm_name = cm.metadata.name

# Create volumes
v = k8s.client.V1Volume(
@@ -173,6 +176,9 @@ def _delete_job(self):
self.job = None
self.properties['job_name'] = None
self.properties['pod_names'] = []
# job isn't immediately deleted
# let the user see that something is happening
time.sleep(7)

def start(self, payload):
# Delete prior job
@@ -194,9 +200,9 @@ def start(self, payload):
self.properties['job_name'] = self.job.metadata.name
self.properties['namespace'] = self.manager.namespace

def stop(self, payload):
def stop(self, message):
self.change_state('stopping', True)
self._delete_job()

self.change_state('idle')

def _send_signal(self, sig):
@@ -227,6 +233,8 @@ def resume(self, payload):
self.change_state('running')

def reset(self, payload):
self.change_state('resetting', True)
self.mixin.drain_publish_queue()
self._delete_job()
super().reset(payload)

6 changes: 4 additions & 2 deletions villas/controller/controller.py
@@ -72,7 +72,7 @@ def add_managers(self):
def publish(self, body, **kwargs):
self.publish_queue.put((body, kwargs))

def _drain_publish_queue(self):
def drain_publish_queue(self):
try:
while msg := self.publish_queue.get(False):
body = msg[0]
@@ -84,10 +84,12 @@ def _drain_publish_queue(self):
self.producer.publish(body, **kwargs)
except queue.Empty:
pass
except TimeoutError:
LOGGER.warning('TimeoutError, let kombu reconnect..')

def on_iteration(self):
# Drain publish queue
self._drain_publish_queue()
self.drain_publish_queue()

# Update components
added = self.components.keys() - self.active_components.keys()
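drain_publish_queue() is now public so that a component can flush pending status updates itself, as reset() does in the simulator. The non-blocking drain loop can be sketched with the standard library alone. The function below is a hypothetical stand-alone version: it takes the queue and a publish callable as arguments instead of using the controller's attributes, and it uses a plain `while True` loop rather than the assignment expression in the diff.

```python
import queue


def drain_publish_queue(publish_queue, publish):
    """Publish all queued (body, kwargs) items without blocking."""
    count = 0
    try:
        while True:
            # block=False raises queue.Empty as soon as nothing is left
            body, kwargs = publish_queue.get(block=False)
            publish(body, **kwargs)
            count += 1
    except queue.Empty:
        pass
    return count


# Example: two queued updates are handed to a recording callable
pending = queue.Queue()
pending.put(({'state': 'resetting'}, {'headers': {'x': 1}}))
pending.put(({'state': 'idle'}, {}))
sent = []
drained = drain_publish_queue(pending, lambda body, **kw: sent.append(body))
```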