Commit

Merge pull request #59 from MinaFoundation/pm-1958-add-spread
PM-1958 - Implement pod spread constraint
piotr-iohk authored Sep 6, 2024
2 parents e50754c + 664dd7a commit 91606a7
Showing 2 changed files with 23 additions and 2 deletions.
6 changes: 6 additions & 0 deletions README.md
@@ -58,6 +58,12 @@ These environment variables control the program's runtime:
- `RETRY_COUNT` - Number of times a batch should be retried before giving up. Default: `3`.
- `SUBMISSION_STORAGE` - Storage where submissions are kept. Valid options: `POSTGRES` or `CASSANDRA`. Default: `POSTGRES`.

+ ### DevOps Configuration
+
+ Configuration related to infra/operations.
+
+ - `SPREAD_MAX_SKEW` - The degree of the spread of Stateless Verification workers among the nodes, see: [`maxSkew`](https://kubernetes.io/docs/concepts/scheduling-eviction/topology-spread-constraints/#spread-constraint-definition). Default: `1`.
+
### Stateless Verification Tool Configuration

The Coordinator program runs the `stateless-verification-tool` for validation against submissions. Set the following environment variables for this purpose:
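
A note on the `SPREAD_MAX_SKEW` variable added in this hunk: the commit reads it with `int(os.environ.get("SPREAD_MAX_SKEW", "1"))`, so a non-numeric value raises `ValueError` at the point where the job spec is built. Below is a minimal sketch of a stricter reader, assuming one wants to reject values below 1 (Kubernetes requires `maxSkew` to be greater than zero) and to treat an empty value as unset. The helper name `read_spread_max_skew` is hypothetical and not part of the commit.

```python
import os


def read_spread_max_skew(environ=os.environ, default=1):
    """Hypothetical helper (not in the commit): parse SPREAD_MAX_SKEW."""
    raw = environ.get("SPREAD_MAX_SKEW")
    # Assumption: an unset or blank variable falls back to the default.
    if raw is None or raw.strip() == "":
        return default
    try:
        value = int(raw)
    except ValueError:
        raise ValueError(
            f"SPREAD_MAX_SKEW must be an integer, got {raw!r}"
        ) from None
    # Kubernetes rejects maxSkew values that are not greater than zero.
    if value < 1:
        raise ValueError(f"SPREAD_MAX_SKEW must be >= 1, got {value}")
    return value
```

Failing early with a descriptive message may be easier to diagnose than an API rejection of the Job, though whether that trade-off suits the Coordinator is a judgment call.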
19 changes: 17 additions & 2 deletions uptime_service_validation/coordinator/server.py
@@ -74,7 +74,8 @@ def setUpValidatorPods(time_intervals, logging, worker_image, worker_tag):
for index, mini_batch in enumerate(time_intervals):

# Job name
- job_name = f"delegation-verify-{datetime.now(timezone.utc).strftime('%y-%m-%d-%H-%M')}-{index}"
+ job_group_name = f"delegation-verify-{datetime.now(timezone.utc).strftime('%y-%m-%d-%H-%M')}"
+ job_name = f"{job_group_name}-{index}"

# Define the environment variables
env_vars = [
@@ -232,6 +233,9 @@ def setUpValidatorPods(time_intervals, logging, worker_image, worker_tag):
volume_mounts=[auth_volume_mount],
)

+ pod_annotations = {"karpenter.sh/do-not-evict": "true"}
+ pod_labels = {"job-group-name": job_group_name}
+
# Create the job
job = client.V1Job(
api_version="batch/v1",
@@ -241,9 +245,20 @@ def setUpValidatorPods(time_intervals, logging, worker_image, worker_tag):
ttl_seconds_after_finished=ttl_seconds,
template=client.V1PodTemplateSpec(
metadata=client.V1ObjectMeta(
- annotations={"karpenter.sh/do-not-evict": "true"}
+ annotations=pod_annotations,
+ labels=pod_labels
),
spec=client.V1PodSpec(
+ topology_spread_constraints=[
+ client.V1TopologySpreadConstraint(
+ max_skew=int(os.environ.get("SPREAD_MAX_SKEW", "1")),
+ topology_key="kubernetes.io/hostname",
+ when_unsatisfiable="DoNotSchedule",
+ label_selector=client.V1LabelSelector(
+ match_labels=pod_labels
+ ),
+ )
+ ],
init_containers=[init_container],
containers=[container],
restart_policy="Never",
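
To illustrate what the constraint in this hunk asks of the scheduler: for pods matching the `job-group-name` label, the skew of a node is the number of matching pods on it minus the minimum count across eligible nodes. With `when_unsatisfiable="DoNotSchedule"`, a new pod may only land on a node where the resulting skew stays within `maxSkew`. The sketch below is a simplified model, assuming every node is eligible and ignoring all other scheduling rules (resources, taints, affinity). The function `allowed_nodes` is hypothetical and is not a Kubernetes API.

```python
def allowed_nodes(pods_per_node, max_skew=1):
    """Return the nodes where one more matching pod keeps skew <= max_skew.

    pods_per_node maps a node name to its current count of matching pods.
    Simplified model: every node is assumed eligible for the constraint.
    """
    # The global minimum is taken over the current counts, before placement.
    global_min = min(pods_per_node.values())
    return sorted(
        node
        for node, count in pods_per_node.items()
        if (count + 1) - global_min <= max_skew
    )


# Three nodes already running workers from the same job group.
counts = {"node-a": 2, "node-b": 1, "node-c": 1}
print(allowed_nodes(counts, max_skew=1))  # ['node-b', 'node-c']
print(allowed_nodes(counts, max_skew=2))  # ['node-a', 'node-b', 'node-c']
```

Under this model the default `SPREAD_MAX_SKEW` of `1` keeps the workers of one batch spread nearly evenly across hosts, while larger values allow more packing onto a single node.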
