Commit a06b8b5

Merge pull request #3169 from yuvipanda/k3s
Add a 2i2c federation member on Hetzner
2 parents 3d8a630 + c4402bc commit a06b8b5

19 files changed: +439 −23 lines

.github/workflows/cd.yml

+5
@@ -222,6 +222,11 @@ jobs:
             helm_version: ""
             experimental: false

+          - federation_member: hetzner-2i2c
+            chartpress_args: ""
+            helm_version: ""
+            experimental: false
+
           # OVH deployment paused
           # - federation_member: ovh2
           #   helm_version: ""
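Each entry in this matrix `include` list fans out into one deploy job per federation member. A minimal sketch of that fan-out in Python — the `deploy.py` command shape is an assumption for illustration; only the `hetzner-2i2c` entry is taken from the workflow:

```python
# Sketch: how a GitHub Actions matrix include list maps to per-member
# deploy invocations. The hetzner-2i2c entry mirrors the workflow above;
# the deploy.py command shape is a hypothetical.
MATRIX_INCLUDE = [
    {"federation_member": "hetzner-2i2c", "chartpress_args": "",
     "helm_version": "", "experimental": False},
]

def deploy_commands(matrix):
    """One deploy command per matrix entry, as CD would run them."""
    return [f"python deploy.py {m['federation_member']}" for m in matrix]
```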

WISDOM.md

+1
@@ -2,3 +2,4 @@

 - When you are in an outage, focus only on fixing the outage - do not try to do anything else.
 - Prefer minor annoyances happening infrequently but at regular intervals, rather than major annoyances happening rarely but at unpredictable intervals.
+- Sometimes, surviving is winning.

config/hetzner-2i2c.yaml

+137
@@ -0,0 +1,137 @@

projectName: hetzner-2i2c

registry:
  enabled: true
  config:
    storage:
      # Uncomment this and comment out the s3 config to use filesystem
      # filesystem:
      #   rootdirectory: /var/lib/registry
      s3:
        regionendpoint: https://fsn1.your-objectstorage.com
        bucket: mybinder-2i2c-registry-hetzner
        region: does-not-matter
  storage:
    filesystem:
      storageClassName: "local-path"
  ingress:
    hosts:
      - registry.2i2c.mybinder.org

cryptnono:
  detectors:
    monero:
      enabled: false

binderhub:
  config:
    BinderHub:
      hub_url: https://hub.2i2c.mybinder.org
      badge_base_url: https://mybinder.org
      sticky_builds: true
      image_prefix: registry.2i2c.mybinder.org/i-
      # image_prefix: quay.io/mybinder-hetzner-2i2c/image-
      # build_docker_host: /var/run/dind/docker.sock
    # TODO: we should have CPU requests, too
    # use this to limit the number of builds per node
    # complicated: dind memory request + KubernetesBuildExecutor.memory_request * builds_per_node ~= node memory
    KubernetesBuildExecutor:
      memory_request: "2G"
      docker_host: /var/run/dind/docker.sock

    LaunchQuota:
      total_quota: 300

    # DockerRegistry:
    #   token_url: "https://2lmrrh8f.gra7.container-registry.ovh.net/service/token?service=harbor-registry"

  replicas: 1

  extraVolumes:
    - name: secrets
      secret:
        secretName: events-archiver-secrets
  extraVolumeMounts:
    - name: secrets
      mountPath: /secrets
      readOnly: true
  extraEnv:
    GOOGLE_APPLICATION_CREDENTIALS: /secrets/service-account.json

  dind: {}

  ingress:
    hosts:
      - 2i2c.mybinder.org

  jupyterhub:
    # proxy:
    #   chp:
    #     resources:
    #       requests:
    #         cpu: "1"
    #       limits:
    #         cpu: "1"
    ingress:
      hosts:
        - hub.2i2c.mybinder.org
      tls:
        - secretName: kubelego-tls-hub
          hosts:
            - hub.2i2c.mybinder.org

imageCleaner:
  # Use 40GB as upper limit, size is given in bytes
  imageGCThresholdHigh: 40e9
  imageGCThresholdLow: 30e9
  imageGCThresholdType: "absolute"

grafana:
  ingress:
    hosts:
      - grafana.2i2c.mybinder.org
    tls:
      - hosts:
          - grafana.2i2c.mybinder.org
        secretName: kubelego-tls-grafana
  datasources:
    datasources.yaml:
      apiVersion: 1
      datasources:
        - name: prometheus
          orgId: 1
          type: prometheus
          url: https://prometheus.2i2c.mybinder.org
          access: direct
          isDefault: true
          editable: false
  # persistence:
  #   storageClassName: csi-cinder-high-speed

prometheus:
  server:
    persistentVolume:
      size: 50Gi
    retention: 30d
    ingress:
      hosts:
        - prometheus.2i2c.mybinder.org
      tls:
        - hosts:
            - prometheus.2i2c.mybinder.org
          secretName: kubelego-tls-prometheus

ingress-nginx:
  controller:
    replicas: 1
    scope:
      enabled: true
    service:
      loadBalancerIP: 116.203.245.43

static:
  ingress:
    hosts:
      - static.2i2c.mybinder.org
    tls:
      secretName: kubelego-tls-static
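The capacity comment in the config above ("dind memory request + KubernetesBuildExecutor.memory_request * builds_per_node ~= node memory") rearranges into a quick estimator. A sketch, where the example node and dind sizes are ours and not from the config; only the 2G build request matches the file:

```python
def builds_per_node(node_mem_gb, dind_request_gb, build_request_gb=2):
    """Estimate concurrent builds per node from the memory budget:
    dind request + build_request * builds_per_node ~= node memory."""
    return int((node_mem_gb - dind_request_gb) / build_request_gb)

# e.g. a 64 GB node with a hypothetical 4 GB dind request and
# the config's 2G KubernetesBuildExecutor.memory_request:
print(builds_per_node(64, 4))  # → 30
```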

config/prod.yaml

+8-2
@@ -228,10 +228,16 @@ federationRedirect:
      weight: 0
      health: https://gke.mybinder.org/health
      versions: https://gke.mybinder.org/versions
-    gesis:
+    hetzner-2i2c:
      prime: true
-      url: https://notebooks.gesis.org/binder
+      url: https://2i2c.mybinder.org
      weight: 60
+      health: https://2i2c.mybinder.org/health
+      versions: https://2i2c.mybinder.org/versions
+    gesis:
+      prime: false
+      url: https://notebooks.gesis.org/binder
+      weight: 40
      health: https://notebooks.gesis.org/binder/health
      versions: https://notebooks.gesis.org/binder/versions
    ovh2:
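With these weights, launches split roughly 60/40 between hetzner-2i2c and gesis, while gke (weight 0) receives none. A hedged sketch of a weighted draw — the real redirector also consults each member's health and versions endpoints, which this omits:

```python
import random

# Weights as in config/prod.yaml after this change; the selection
# logic here is illustrative, not the redirector's implementation.
MEMBERS = {"hetzner-2i2c": 60, "gesis": 40, "gke": 0}

def pick_member(members, rng=random):
    """Weighted random choice among members with nonzero weight."""
    names = [name for name, weight in members.items() if weight > 0]
    return rng.choices(names, weights=[members[n] for n in names], k=1)[0]
```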

deploy.py

+12-17
@@ -30,6 +30,9 @@
     "prod": "us-central1",
 }

+# Projects using raw KUBECONFIG files
+KUBECONFIG_CLUSTERS = {"ovh2", "hetzner-2i2c"}
+
 # Mapping of config name to cluster name for AWS EKS deployments
 AWS_DEPLOYMENTS = {"curvenote": "binderhub"}

@@ -100,17 +103,15 @@ def setup_auth_azure(cluster, dry_run=False):
     print(stdout)


-def setup_auth_ovh(release, cluster, dry_run=False):
+def setup_auth_kubeconfig(release, cluster, dry_run=False):
     """
-    Set up authentication with 'ovh' K8S from the ovh-kubeconfig.yml
+    Setup authentication with a pure kubeconfig file
     """
-    print(f"Setup the OVH authentication for namespace {release}")
+    print(f"Setup authentication for namespace {release} with kubeconfig")

-    ovh_kubeconfig = os.path.join(ABSOLUTE_HERE, "secrets", f"{release}-kubeconfig.yml")
-    os.environ["KUBECONFIG"] = ovh_kubeconfig
-    print(f"Current KUBECONFIG='{ovh_kubeconfig}'")
-    stdout = check_output(["kubectl", "config", "use-context", cluster], dry_run)
-    print(stdout)
+    kubeconfig = os.path.join(ABSOLUTE_HERE, "secrets", f"{release}-kubeconfig.yml")
+    os.environ["KUBECONFIG"] = kubeconfig
+    print(f"Current KUBECONFIG='{kubeconfig}'")


 def setup_auth_gcloud(release, cluster=None, dry_run=False):

@@ -436,13 +437,7 @@ def main():
     argparser.add_argument(
         "release",
         help="Release to deploy",
-        choices=[
-            "staging",
-            "prod",
-            "ovh",
-            "ovh2",
-            "curvenote",
-        ],
+        choices=["staging", "prod", "ovh", "ovh2", "curvenote", "hetzner-2i2c"],
     )
     argparser.add_argument(
         "--name",

@@ -511,8 +506,8 @@ def main():
     # script is running on CI, proceed with auth and helm setup

     if args.stage in ("all", "auth"):
-        if cluster.startswith("ovh"):
-            setup_auth_ovh(args.release, cluster, args.dry_run)
+        if cluster in KUBECONFIG_CLUSTERS:
+            setup_auth_kubeconfig(args.release, cluster, args.dry_run)
             patch_coredns(args.dry_run, args.diff)
         elif cluster in AZURE_RGs:
             setup_auth_azure(cluster, args.dry_run)
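The change generalizes the old OVH-only helper: any cluster listed in `KUBECONFIG_CLUSTERS` authenticates by pointing `KUBECONFIG` at a per-release secret file. A standalone sketch of that path — `secrets_dir` is parameterized here for illustration, where deploy.py uses its own `ABSOLUTE_HERE`:

```python
import os

KUBECONFIG_CLUSTERS = {"ovh2", "hetzner-2i2c"}

def setup_auth_kubeconfig(release, secrets_dir="secrets"):
    """Authenticate by pointing KUBECONFIG at <secrets>/<release>-kubeconfig.yml,
    as deploy.py now does for any cluster in KUBECONFIG_CLUSTERS."""
    kubeconfig = os.path.join(secrets_dir, f"{release}-kubeconfig.yml")
    os.environ["KUBECONFIG"] = kubeconfig
    return kubeconfig

def uses_kubeconfig(cluster):
    """The dispatch test in main(), replacing cluster.startswith('ovh')."""
    return cluster in KUBECONFIG_CLUSTERS
```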

docs/source/deployment/index.rst

+1
@@ -8,3 +8,4 @@ Deployment and Operation
    prereqs
    how
    what
+   k3s

docs/source/deployment/k3s.md

+85
@@ -0,0 +1,85 @@

# Deploy a new mybinder.org federation member on a bare VM with `k3s`

[k3s](https://k3s.io/) is a popular Kubernetes distribution that we can use
to build _single node_ Kubernetes installations that satisfy the needs of the
mybinder project. By focusing on the simplest possible Kubernetes installation,
we get all the benefits of Kubernetes (simplified deployment, cloud agnosticism,
unified tooling, etc.) **except** autoscaling, and can deploy **anywhere we can get a VM
with root access**. This is vastly simpler than managing an autoscaling Kubernetes
cluster, and allows expansion of the mybinder federation in ways that would otherwise
be more difficult.

## VM requirements

The k3s project publishes [their requirements](https://docs.k3s.io/installation/requirements),
but we have a slightly more opinionated list:

1. We must have full `root` access.
2. The VM runs the latest Ubuntu LTS (currently 24.04). Debian is acceptable.
3. Direct internet access, both inbound (public IP) and outbound.
4. "As big as possible", as we will be using all the capacity of this one VM.
5. The ability to grant the same access to the VM to all operators of the mybinder federation.

## Installing `k3s`

We can follow the [quickstart](https://docs.k3s.io/quick-start) on the `k3s` website, with the added
config of _disabling the built-in traefik_. We deploy nginx as part of our deployment, so we
do not need traefik.

1. Create a kubelet config file in `/etc/kubelet.yaml` so we can
   tweak various kubelet options, including the maximum number of pods on a single
   node:

   ```yaml
   apiVersion: kubelet.config.k8s.io/v1beta1
   kind: KubeletConfiguration
   maxPods: 300
   ```

   We still need to develop better intuition for how many pods fit on a node, but given that we offer about
   450M of RAM per user, and RAM is the limiting factor (not CPU), let's roughly start with the
   following formula:

       maxPods = 1.75 * amount of RAM in GB

   This adds a good amount of margin. We can tweak it later.

2. Install `k3s`!

   ```bash
   curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="server --kubelet-arg=config=/etc/kubelet.yaml" sh -s - --disable=traefik
   ```

   This runs for a minute, and should set up the latest `k3s` on that node. You can verify that by running
   `kubectl get node` and `kubectl version`.
## Extracting authentication information via a `KUBECONFIG` file

Follow <https://docs.k3s.io/cluster-access#accessing-the-cluster-from-outside-with-kubectl>.

## Set up DNS entries

There is only one IP to set DNS entries for: the public IP of the VM. No load balancers or similar here.

mybinder.org's DNS is managed via Cloudflare. You should have access, or ask someone on the mybinder team who does!

Add the following entries:

- An `A` record for `X.mybinder.org` pointing towards the public IP. `X` should be an organizational identifier that identifies and thanks whoever is donating this VM.
- Another `A` record for `*.X.mybinder.org` pointing to the same public IP.

Give this a few minutes, because it may take a while to propagate.

## Make a config copy for this new member

TODO

## Make a secret config for this new member

TODO

## Deploy binder!

## Test and validate

## Add to the redirector

mybinder/templates/netpol.yaml

+2-2
@@ -73,7 +73,7 @@ spec:
         to:
           - podSelector:
               matchLabels:
-                app: nginx-ingress
-                component: controller
+                app.kubernetes.io/component: controller
+                app.kubernetes.io/name: ingress-nginx

 {{- end }}
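This label swap matters because a NetworkPolicy `podSelector` only matches a pod if every label in the selector is present on it, and newer ingress-nginx releases ship `app.kubernetes.io/*` labels instead of the old `app`/`component` pair. A sketch of the subset-match rule, with abbreviated pod labels:

```python
def selector_matches(selector, pod_labels):
    """A podSelector matches when every selector label equals the pod's."""
    return all(pod_labels.get(k) == v for k, v in selector.items())

# Labels on a current ingress-nginx controller pod (abbreviated).
pod = {
    "app.kubernetes.io/name": "ingress-nginx",
    "app.kubernetes.io/component": "controller",
}
old_selector = {"app": "nginx-ingress", "component": "controller"}
new_selector = {
    "app.kubernetes.io/name": "ingress-nginx",
    "app.kubernetes.io/component": "controller",
}
```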
+13
@@ -0,0 +1,13 @@

{{- if .Values.registry.enabled }}
apiVersion: v1
kind: ConfigMap
metadata:
  name: registry-config
  labels:
    app: registry
    heritage: {{ .Release.Service }}
    release: {{ .Release.Name }}
data:
  config.yml: |
    {{ .Values.registry.config | toJson }}
{{- end }}
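The template serializes `.Values.registry.config` with `toJson`; since JSON is a subset of YAML, the one-line result is a valid registry `config.yml` without any template indentation headaches. A sketch using Python's `json` to stand in for Helm's `toJson`, with values taken from `config/hetzner-2i2c.yaml`:

```python
import json

# The registry.config value from config/hetzner-2i2c.yaml (s3 branch only).
registry_config = {
    "storage": {
        "s3": {
            "regionendpoint": "https://fsn1.your-objectstorage.com",
            "bucket": "mybinder-2i2c-registry-hetzner",
            "region": "does-not-matter",
        }
    }
}

# One-line JSON is what toJson emits, and is parseable as YAML.
config_yml = json.dumps(registry_config)
print(config_yml)
```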
