Merge pull request #278 from ricsanfre/node6

Adding new node to architecture and moving services

ricsanfre authored Feb 3, 2024
2 parents 9ed53a1 + 1f84417 commit 9e5396f
Showing 21 changed files with 234 additions and 186 deletions.
4 changes: 2 additions & 2 deletions ansible/external_services.yml
@@ -136,7 +136,7 @@
 ## Install Hashicorp Vault Server

 - name: Install Vault Server
-  hosts: gateway
+  hosts: vault
   gather_facts: true
   tags: [vault]
   become: true
@@ -223,7 +223,7 @@

 ## Load all credentials into Hashicorp Vault Server
 - name: Load Vault Credentials
-  hosts: gateway
+  hosts: vault
   gather_facts: true
   tags: [vault, credentials]
   become: false
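With the plays retargeted from `gateway` to the new `vault` inventory group, the Vault server now lands on `node1`. A minimal sketch of re-running just these plays (the exact command line is an assumption; the repo normally drives playbooks through `make` targets):

```shell
# Re-run only the Vault install/load plays against the new `vault` group (node1)
ansible-playbook -i inventory.yml external_services.yml --tags vault
```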
12 changes: 6 additions & 6 deletions ansible/host_vars/gateway.yml
@@ -28,7 +28,7 @@ dnsmasq_additional_dns_hosts:
   s3_server:
     desc: "S3 Server"
     hostname: s3
-    ip: 10.0.0.11
+    ip: 10.0.0.100
   elasticsearch:
     desc: "Elasticsearch server"
     hostname: elasticsearch
@@ -52,7 +52,7 @@ dnsmasq_additional_dns_hosts:
   vault_server:
     desc: "Vault server"
     hostname: vault
-    ip: 10.0.0.1
+    ip: 10.0.0.11
 dnsmasq_enable_tftp: true
 dnsmasq_tftp_root: /srv/tftp
 dnsmasq_additional_conf: |-
@@ -78,10 +78,8 @@ ntp_allow_hosts: [10.0.0.0/24]
 #########################

 # tcp 9100 Prometheus (fluent-bit)
-# tcp 8200, 8201 Vault server
 # udp 69, TFTP server
-# TCP 6443 load balancer K3S API
-in_tcp_port: '{ ssh, https, http, iscsi-target, 9100, 8200, 8201, 6443 }'
+in_tcp_port: '{ ssh, https, http, iscsi-target, 9100 }'
 in_udp_port: '{ snmp, domain, ntp, bootps, 69 }'
 # tcp 9091 minio server
 forward_tcp_port: '{ http, https, ssh, 9091 }'
@@ -141,8 +139,10 @@ nft_forward_host_rules:
     - iifname $wan_interface oifname $lan_interface ip daddr $lan_network tcp dport ssh ct state new accept
   230 http from wan:
     - iifname $wan_interface oifname $lan_interface ip daddr $lan_network tcp dport {http, https} ct state new accept
+  240 haproxy from wan:
+    - iifname $wan_interface oifname $lan_interface ip daddr 10.0.0.11 tcp dport 6443 ct state new accept
   250 port-forwarding from wan:
-    - iifname $wan_interface oifname $lan_interface ip daddr 10.0.0.11 tcp dport 8080 ct state new accept
+    - iifname $wan_interface oifname $lan_interface ip daddr 10.0.0.12 tcp dport 8080 ct state new accept
 # NAT Post-routing
 nft_nat_host_postrouting_rules:
   005 masquerade lan to wan:
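The new `240 haproxy from wan` entry opens the path from the home network to the Kubernetes API load balancer on `node1`, while `in_tcp_port` drops the Vault and K3S API ports that `gateway` itself no longer serves. A sketch of the nft rule the nftables role should render (interface names taken from the gateway docs further down; the exact rendered output is an assumption):

```
# Hypothetical rendering in the forward chain (wlan0 = WAN, eth0 = LAN)
iifname "wlan0" oifname "eth0" ip daddr 10.0.0.11 tcp dport 6443 ct state new accept
```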
20 changes: 17 additions & 3 deletions ansible/inventory.yml
@@ -44,6 +44,11 @@ all:
       ansible_host: 10.0.0.15
       ip: 10.0.0.15
       mac: e4:5f:01:d9:ec:5c
+    node6:
+      hostname: node6
+      ansible_host: 10.0.0.16
+      ip: 10.0.0.16
+      mac: d8:3a:dd:0d:be:c8
     node-hp-1:
       hostname: node-hp-1
       ansible-host: 10.0.0.20
@@ -61,7 +66,7 @@ all:
       mac: 10:e7:c6:0a:de:8a
   raspberrypi:
     hosts:
-      node[1:5]:
+      node[1:6]:
       gateway:
   x86:
     hosts:
@@ -70,8 +75,17 @@
   children:
     k3s_master:
       hosts:
-        node[1:3]:
+        node[2:4]:
     k3s_worker:
       hosts:
-        node[4:5]:
+        node[5:6]:
         node-hp-[1:3]:
+    bootstrap:
+      hosts:
+        node2:
+    vault:
+      hosts:
+        node1:
+    haproxy:
+      hosts:
+        node1:
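Flattening the nesting for readability, the inventory now carries three new single-purpose groups alongside the reshuffled K3S roles (comments are editorial, not in the file):

```yml
k3s_master:          # masters moved from node1-3 to node2-4
  hosts:
    node[2:4]:
k3s_worker:
  hosts:
    node[5:6]:
    node-hp-[1:3]:
bootstrap:           # node used by k3s_bootstrap.yml
  hosts:
    node2:
vault:               # Vault server, per external_services.yml
  hosts:
    node1:
haproxy:             # K8s API load balancer, per k3s_install.yml
  hosts:
    node1:
```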
2 changes: 1 addition & 1 deletion ansible/k3s_bootstrap.yml
@@ -1,7 +1,7 @@
 ---

 - name: Bootstrap Cluster
-  hosts: node1
+  hosts: bootstrap
   gather_facts: false
   become: false

2 changes: 1 addition & 1 deletion ansible/k3s_install.yml
@@ -1,7 +1,7 @@
 ---

 - name: Install load balancer
-  hosts: gateway
+  hosts: haproxy
   gather_facts: true
   tags: [install]
   become: true
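The load-balancer play now targets the `haproxy` group (`node1`) instead of `gateway`. For orientation, a minimal haproxy.cfg sketch of the TCP load balancing it sets up, with master addresses taken from the DHCP reservations further down (the real config is templated by the Ansible role, so treat this as illustrative):

```
# Sketch only: TCP passthrough of the K3S API to the three masters
frontend k3s_api
    bind 10.0.0.11:6443
    mode tcp
    default_backend k3s_masters

backend k3s_masters
    mode tcp
    option tcp-check
    balance roundrobin
    server node2 10.0.0.12:6443 check
    server node3 10.0.0.13:6443 check
    server node4 10.0.0.14:6443 check
```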
2 changes: 1 addition & 1 deletion ansible/reset_external_services.yml
@@ -28,7 +28,7 @@
     daemon_reload: true

 - name: Clean Vault Installation
-  hosts: gateway
+  hosts: vault
   become: true
   gather_facts: false
   tags: [vault]
2 changes: 1 addition & 1 deletion ansible/tasks/vault_kubernetes_auth_method_config.yml
@@ -18,7 +18,7 @@
   become: false
   register: vault_login
   changed_when: false
-  delegate_to: gateway
+  delegate_to: node1

 - name: Get vault token
   set_fact:
2 changes: 1 addition & 1 deletion ansible/vars/picluster.yml
@@ -9,7 +9,7 @@
 k3s_version: v1.28.2+k3s1

 # k3s master node VIP (loadbalancer)
-k3s_api_vip: 10.0.0.1
+k3s_api_vip: 10.0.0.11

 # k3s shared token
 k3s_token: "{{ vault.cluster.k3s.token }}"
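Since the API VIP moves from `gateway` (10.0.0.1) to `node1` (10.0.0.11), every client that reaches the API through the load balancer follows it, e.g. (assuming a kubeconfig with valid credentials):

```shell
# API traffic now terminates on HAProxy at node1 and is spread across node2-node4
kubectl --server=https://10.0.0.11:6443 get nodes
```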
249 changes: 135 additions & 114 deletions design/picluster-architecture.drawio

Large diffs are not rendered by default.

12 changes: 6 additions & 6 deletions docs/_docs/ansible-instructions.md
@@ -2,7 +2,7 @@
 title: Quick Start Instructions
 permalink: /docs/ansible/
 description: Quick Start guide to deploy our Raspberry Pi Kuberentes Cluster using cloud-init, ansible playbooks and ArgoCD
-last_modified_at: "24-06-2023"
+last_modified_at: "06-11-2023"
 ---

 This are the instructions to quickly deploy Kuberentes Pi-cluster using the following tools:
@@ -115,9 +115,9 @@ Ansible Playbook used for doing the basic OS configuration (`setup_picluster.yml
 <br>
 LVM configuration is done by `setup_picluster.yml` Ansible's playbook and the variables used in the configuration can be found in `vars/centralized_san/centralized_san_target.yml`: `storage_volumegroups` and `storage_volumes` variables. Sizes of the different LUNs can be tweaked to fit the size of the SSD Disk used. I used a 480GB disk so, I was able to create LUNs of 100GB for each of the nodes.

-- **Dedicated disks** setup assumes that all cluster nodes (`node1-5`) have a SSD disk attached that has been partitioned during server first boot (part of the cloud-init configuration) reserving 30Gb for the root partition and the rest of available disk for creating a Linux partition mounted as `/storage`
+- **Dedicated disks** setup assumes that all cluster nodes (`node1-6`) have a SSD disk attached that has been partitioned during server first boot (part of the cloud-init configuration) reserving 30Gb for the root partition and the rest of available disk for creating a Linux partition mounted as `/storage`

-Final `node1-5` disk configuration is:
+Final `node1-6` disk configuration is:

 - /dev/sda1: Boot partition
 - /dev/sda2: Root filesystem
@@ -219,7 +219,7 @@ Once `gateway` is up and running the rest of the nodes can be installed and conn

 #### Install Raspberry PI nodes

-Install Operating System on Raspberry Pi nodes `node1-5`
+Install Operating System on Raspberry Pi nodes `node1-6`

 Follow the installation procedure indicated in ["Ubuntu OS Installation"](/docs/ubuntu/rpi/) using the corresponding cloud-init configuration files (`user-data` and `network-config`) depending on the storage setup selected. Since DHCP is used there is no need to change default `/boot/network-config` file located in the ubuntu image.

@@ -230,7 +230,7 @@ Follow the installation procedure indicated in ["Ubuntu OS Installation"](/docs/
 {: .table .table-white .border-dark }


-In above user-data files, `hostname` field need to be changed for each node (node1-node5).
+In above user-data files, `hostname` field need to be changed for each node (node1-node6).


 {{site.data.alerts.warning}}**About SSH keys**
@@ -285,7 +285,7 @@ All Ansible vault credentials (vault.yml) are also stored in Hashicorp Vault

 ## Configuring OS level backup (restic)

-Automate backup tasks at OS level with restic in all nodes (`node1-node5` and `gateway`) running the command:
+Automate backup tasks at OS level with restic in all nodes (`node1-node6` and `gateway`) running the command:

 ```shell
 make configure-os-backup
 ```
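For the centralized SAN option mentioned above (one ~100GB LUN per node carved from a 480GB disk), the shape of `vars/centralized_san/centralized_san_target.yml` is roughly as follows; field names here are illustrative assumptions, so check the `ricsanfre.storage` role docs for the real schema:

```yml
# Sketch only: one volume group on the gateway SSD, one LUN-backing LV per node
storage_volumegroups:
  - name: vg_iscsi
    disks: /dev/sda3       # illustrative device/partition name
storage_volumes:
  - name: vol_node1        # repeated for node2..node6
    vg: vg_iscsi
    size: 100G
```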
31 changes: 17 additions & 14 deletions docs/_docs/architecture.md
@@ -2,7 +2,7 @@
 title: Lab Architecture
 permalink: /docs/architecture/
 description: Homelab architecture of our Pi Kuberentes cluster. Cluster nodes, firewall, and Ansible control node. Networking and cluster storage design.
-last_modified_at: "02-07-2023"
+last_modified_at: "03-02-2024"
 ---

@@ -12,14 +12,12 @@ The home lab I am building is shown in the following picture


 A K3S cluster is composed of the following **cluster nodes**:
-- 3 master nodes (`node1`, `node2` and `node3`), running on Raspberry Pi 4B (4GB)
+- 3 master nodes (`node2`, `node3` and `node4`), running on Raspberry Pi 4B (4GB)
 - 5 worker nodes:
-  - `node4` running on Raspberry Pi 4B (4GB)
-  - `node5` running on Raspberry Pi 4B (8GB)
+  - `node5` and `node6` running on Raspberry Pi 4B (8GB)
   - `node-hp-1`, `node-hp-2` and `node-hp-3` running on HP Elitedesk 800 G3 (16GB)

 A couple of **LAN switches** (8 Gigabit ports + 5 Gigabit ports) used to provide L2 connectivity to the cluster nodes. L3 connectivity and internet access is provided by a router/firewall (`gateway`) running on Raspberry Pi 4B (2GB).

 `gateway`, **cluster firewall/router**, is connected to LAN Switch using its Gigabit Ethernet port. It is also connected to my home network using its WIFI interface, so it can route and filter traffic comming in/out the cluster. With this architecture my lab network can be isolated from my home network.

@@ -29,7 +27,12 @@ A K3S cluster is composed of the following **cluster nodes**:
 - NTP
 - DHCP

-A load balancer is needed to provide high availability to the Kubernetes API. In this case a network load balancer, [HAProxy](https://www.haproxy.org/), will be deployed on the `gateway` server.
+`node1`, running on Raspberry Pi 4B (4GB), provides **kubernetes external services**:
+- Secret Management (Vault)
+- Kubernetes API Load Balancer
+- Backup server
+
+A load balancer is needed to provide high availability to the Kubernetes API. In this case a network load balancer, [HAProxy](https://www.haproxy.org/), will be deployed on the `node1` server.

 For automating the OS installation of x86 nodes, a **PXE server** will be deployed in `gateway` node.

@@ -56,17 +59,17 @@ For building the cluster, using bare metal servers instead of virtual machines,
 I have used the following hardware components to assemble Raspberry PI components of the cluster.

 - [4 x Raspberry Pi 4 - Model B (4 GB)](https://www.tiendatec.es/raspberry-pi/gama-raspberry-pi/1100-raspberry-pi-4-modelo-b-4gb-765756931182.html) and [1 x Raspberry Pi 4 - Model B (8 GB)](https://www.tiendatec.es/raspberry-pi/gama-raspberry-pi/1231-raspberry-pi-4-modelo-b-8gb-765756931199.html) as ARM-based cluster nodes (1 master node and 5 worker nodes).
-- [1 x Raspberry Pi 4 - Model B (2 GB)](https://www.tiendatec.es/raspberry-pi/gama-raspberry-pi/1099-raspberry-pi-4-modelo-b-2gb-765756931175.html) as router/firewall for the lab environment connected via wifi to my home network and securing the access to my lab network.
+- [2 x Raspberry Pi 4 - Model B (2 GB)](https://www.tiendatec.es/raspberry-pi/gama-raspberry-pi/1099-raspberry-pi-4-modelo-b-2gb-765756931175.html) as router/firewall for the lab environment connected via wifi to my home network and securing the access to my lab network.
 - [4 x SanDisk Ultra 32 GB microSDHC Memory Cards](https://www.amazon.es/SanDisk-SDSQUA4-064G-GN6MA-microSDXC-Adaptador-Rendimiento-dp-B08GY9NYRM/dp/B08GY9NYRM) (Class 10) for installing Raspberry Pi OS for enabling booting from USB (update Raspberry PI firmware and modify USB partition)
 - [4 x Samsung USB 3.1 32 GB Fit Plus Flash Disk](https://www.amazon.es/Samsung-FIT-Plus-Memoria-MUF-32AB/dp/B07HPWKS3C)
 - [1 x Kingston A400 SSD Disk 480GB](https://www.amazon.es/Kingston-SSD-A400-Disco-s%C3%B3lido/dp/B01N0TQPQB)
-- [4 x Kingston A400 SSD Disk 240GB](https://www.amazon.es/Kingston-SSD-A400-Disco-s%C3%B3lido/dp/B01N5IB20Q)
-- [5 x Startech USB 3.0 to SATA III Adapter](https://www.amazon.es/Startech-USB3S2SAT3CB-Adaptador-3-0-2-5-negro/dp/B00HJZJI84) for connecting SSD disk to USB 3.0 ports.
+- [5 x Kingston A400 SSD Disk 240GB](https://www.amazon.es/Kingston-SSD-A400-Disco-s%C3%B3lido/dp/B01N5IB20Q)
+- [6 x Startech USB 3.0 to SATA III Adapter](https://www.amazon.es/Startech-USB3S2SAT3CB-Adaptador-3-0-2-5-negro/dp/B00HJZJI84) for connecting SSD disk to USB 3.0 ports.
 - [1 x GeeekPi Pi Rack Case](https://www.amazon.es/GeeekPi-Raspberry-Ventilador-refrigeraci%C3%B3n-disipador/dp/B07Z4GRQGH/ref=sr_1_11). It comes with a stack for 4 x Raspberry Pi’s, plus heatsinks and fans)
 - [1 x SSD Rack Case](https://www.aliexpress.com/i/33008511822.html)
 - [1 x ANIDEES AI CHARGER 6+](https://www.tiendatec.es/raspberry-pi/raspberry-pi-alimentacion/796-anidees-ai-charger-6-cargador-usb-6-puertos-5v-60w-12a-raspberry-pi-4712909320214.html). 6 port USB power supply (60 W and max 12 A)
 - [1 x ANKER USB Charging Hub](https://www.amazon.es/Anker-Cargador-USB-6-Puertos/dp/B00PTLSH9G/). 6 port USB power supply (60 w and max 12 A)
-- [6 x USB-C charging cable with ON/OFF switch](https://www.aliexpress.com/item/33049198504.html).
+- [7 x USB-C charging cable with ON/OFF switch](https://www.aliexpress.com/item/33049198504.html).


 #### x86-based old refurbished mini PC
#### x86-based old refurbished mini PC
@@ -127,7 +130,7 @@ x86 mini PCs has their own integrated disk (SSD disk or NVME). For Raspberry PIs

 `gateway` uses local storage attached directly to USB 3.0 port (Flash Disk) for hosting the OS, avoiding the use of less reliable SDCards.

-For having better cluster performance `node1-node5` will use SSDs attached to USB 3.0 port. SSD disk will be used to host OS (boot from USB) and to provide the additional storage required per node for deploying the Kubernetes distributed storage solution (Ceph or Longhorn).
+For having better cluster performance `node1-node6` will use SSDs attached to USB 3.0 port. SSD disk will be used to host OS (boot from USB) and to provide the additional storage required per node for deploying the Kubernetes distributed storage solution (Ceph or Longhorn).

 ![pi-cluster-HW-2.0](/assets/img/pi-cluster-2.0.png)

@@ -136,11 +139,11 @@ For having better cluster performance `node1-node6` will use SSDs attached to US

 A cheaper alternative architecture, instead of using dedicated SSD disks for each cluster node, one single SSD disk can be used for configuring a SAN service.

-Each cluster node `node1-node5` can use local storage attached directly to USB 3.0 port (USB Flash Disk) for hosting the OS, avoiding the use of less reliable SDCards.
+Each cluster node `node1-node6` can use local storage attached directly to USB 3.0 port (USB Flash Disk) for hosting the OS, avoiding the use of less reliable SDCards.

 As additional storage (required by distributed storage solution), iSCSI SAN can be deployed instead of attaching an additional USB Flash Disks to each of the nodes.

-A SAN (Storage Access Network) can be configured using `gateway` as iSCSI Storage Server, providing additional storage (LUNs) to `node1-node5`.
+A SAN (Storage Access Network) can be configured using `gateway` as iSCSI Storage Server, providing additional storage (LUNs) to `node1-node6`.

 As storage device, a SSD disk was attached to `gateway` node. This SSD disk was used as well to host the OS.

4 changes: 2 additions & 2 deletions docs/_docs/backup.md
@@ -465,7 +465,7 @@ Velero CLI need to be installed joinly with kubectl. `velero` uses kubectl confi
 {{site.data.alerts.important}} k3s config file is located in `/etc/rancher/k3s/k3s.yaml` and it need to be copied into `$HOME/kube/config` in the server where `kubectl` and `velero` is going to be executed.
 {{site.data.alerts.end}}

-This will be installed in `node1`
+This will be installed in `pimaster`

 - Step 1: Download latest stable velero release from https://github.com/vmware-tanzu/velero/releases

@@ -696,7 +696,7 @@ Installation using `Helm` (Release 3):

 #### GitOps installation (ArgoCD)

-As alternative, for GitOps deployment (ArgoCD), instead of putting minio credentiasl into helm values in plain text, a Secret can be used to store the credentials.
+As an alternative, for GitOps deployment (ArgoCD), instead of putting minio credentials into helm values in plain text, a Secret can be used to store the credentials.

 ```yml
 apiVersion: v1
 ```
2 changes: 1 addition & 1 deletion docs/_docs/basic-os-configuration.md
@@ -52,7 +52,7 @@ Raspberry PI does not have by default a RTC (real-time clock) keeping the time w
 Even when NTP is used to synchronize the time and date, when it boots takes as current-time the time of the first-installation and it could cause problems in boot time when the OS detect that a mount point was created in the future and ask for manual execution of fscsk

 {{site.data.alerts.note}}
-I have detected this behaviour with my Raspberry PIs when mounting the iSCSI LUNs in `node1-node5` and after rebooting the server, the server never comes up.
+I have detected this behaviour with my Raspberry PIs when mounting the iSCSI LUNs in `node1-node6` and after rebooting the server, the server never comes up.
 {{site.data.alerts.end}}

 As a side effect the NTP synchronizatio will also take longer since NTP adjust the time in small steps.
13 changes: 9 additions & 4 deletions docs/_docs/gateway.md
@@ -2,15 +2,15 @@
 title: Cluster Gateway
 permalink: /docs/gateway/
 description: How to configure a Raspberry Pi as router/firewall of our Kubernetes Cluster providing connectivity and basic services (DNS, DHCP, NTP, SAN).
-last_modified_at: "18-06-2023"
+last_modified_at: "03-02-2024"
 ---

 One of the Raspeberry Pi (2GB), **gateway**, is used as Router and Firewall for the home lab, isolating the raspberry pi cluster from my home network.
 It will also provide DNS, NTP and DHCP services to my lab network. In case of deployment using centralized SAN storage architectural option, `gateway` is providing SAN services also.

 This Raspberry Pi (gateway), is connected to my home network using its WIFI interface (wlan0) and to the LAN Switch using the eth interface (eth0).

-In order to ease the automation with Ansible, OS installed on **gateway** is the same as the one installed in the nodes of the cluster (**node1-node5**): Ubuntu 22.04 64 bits.
+In order to ease the automation with Ansible, OS installed on **gateway** is the same as the one installed in the nodes of the cluster: Ubuntu 22.04 64 bits.


 ## Storage Configuration
@@ -529,6 +529,7 @@ For automating configuration tasks, ansible role [**ricsanfre.dnsmasq**](https:/
   dhcp-host=e4:5f:01:2f:49:05,10.0.0.13
   dhcp-host=e4:5f:01:2f:54:82,10.0.0.14
   dhcp-host=e4:5f:01:d9:ec:5c,10.0.0.15
+  dhcp-host=d8:3a:dd:0d:be:c8,10.0.0.16

   # Adding additional DHCP hosts
   # Ethernet Switch
@@ -542,6 +543,7 @@
   host-record=node3.picluster.ricsanfre.com,10.0.0.13
   host-record=node4.picluster.ricsanfre.com,10.0.0.14
   host-record=node5.picluster.ricsanfre.com,10.0.0.15
+  host-record=node6.picluster.ricsanfre.com,10.0.0.16

   # Adding additional DNS
   # NTP Server
@@ -554,16 +556,19 @@

 Additional DNS records can be added for the different services exposed by the cluster. For example:

-- S3 service DNS name pointing to `node1`
+- S3/Vault service DNS name pointing to `node1`
   ```
   # S3 Server
   host-record=s3.picluster.ricsanfre.com,10.0.0.11
+  # Vault server
+  host-record=vault.picluster.ricsanfre.com,10.0.0.11
   ```
 - Monitoring DNS service pointing to Ingress Controller IP address (from MetaLB pool)
   ```
   # Monitoring
   host-record=monitoring.picluster.ricsanfre.com,10.0.0.100
   ```

 {{site.data.alerts.end}}

- Step 3. Restart dnsmasq service
@@ -743,7 +748,7 @@ Check time synchronization with Chronyc

 ## iSCSI configuration. Centralized SAN

-`gateway` has to be configured as iSCSI Target to export LUNs mounted by `node1-node5`
+`gateway` has to be configured as iSCSI Target to export LUNs mounted by `node1-node6`

 iSCSI configuration in `gateway` has been automated developing a couple of ansible roles: **ricsanfre.storage** for managing LVM and **ricsanfre.iscsi_target** for configuring a iSCSI target.
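Once the target is configured, the exported LUNs (one per node, `node1-node6`) can be inspected on `gateway`; assuming the role uses LIO/targetcli, as is usual on Ubuntu:

```shell
# List the iSCSI target tree: backstores, LUNs and per-initiator ACLs
sudo targetcli ls
```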
