# K8S node
Installing a K8S node using [scripts from the k3s-host](k3s-host) directory.
## Imaging
Using `installimage` from the rescue instance.
- `wipefs -fa /dev/nvme*n1`
- `installimage -r no -n hetzner0?`
- Debian bookworm
- `PART / ext4 100G`
- `PART /srv ext4 all`
- ESC 0 + yes
- reboot
Partitioning.
- First disk:
  - the OS
  - non-precious data such as the LXC containers with runners
- Second disk:
  - a partition configured with DRBD
Debian user.
- `ssh root@hetzner0?.forgejo.org`
- `useradd --shell /bin/bash --create-home --groups sudo debian`
- `mkdir -p /home/debian/.ssh ; cp -a .ssh/authorized_keys /home/debian/.ssh ; chown -R debian /home/debian/.ssh`
- in `/etc/sudoers`, change the `%sudo` line to `%sudo ALL=(ALL:ALL) NOPASSWD:ALL`
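A quick sanity check (not part of the scripts) that the `debian` user has passwordless sudo:
```sh
# should print "root" without asking for a password
ssh debian@hetzner0?.forgejo.org sudo -n whoami
```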
## Install helpers
Each node is identified by the last digit of its hostname.
```sh
sudo apt-get install git etckeeper
git clone https://code.forgejo.org/infrastructure/documentation
cd documentation/k3s-host
cp variables.sh.example variables.sh
cp secrets.sh.example secrets.sh
```
Variables that must be set depending on the role of the node.
- first server node
- secrets.sh: node_drbd_shared_secret
- other server node
- secrets.sh: node_drbd_shared_secret
- secrets.sh: node_k8s_token: content of /var/lib/rancher/k3s/server/token on the first node
- variables.sh: node_k8s_existing: identifier of the first node (e.g. 5)
- etcd node
- secrets.sh: node_k8s_token: content of /var/lib/rancher/k3s/server/token on the first node
- variables.sh: node_k8s_existing: identifier of the first node (e.g. 5)
- variables.sh: node_k8s_etcd: identifier of the node whose role is just etcd (e.g. 3)
The other variables depend on the setup.
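For instance, a server node joining an existing cluster might end up with the following (hypothetical values; assuming plain shell assignments as in the `.example` files):
```sh
# secrets.sh
node_drbd_shared_secret='long-random-string-shared-by-all-drbd-nodes'
node_k8s_token='contents of /var/lib/rancher/k3s/server/token on the first node'
# variables.sh
node_k8s_existing=5   # join the cluster via the existing node hetzner05
```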
## Firewall
`./setup.sh setup_ufw`
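To confirm the rules are active (a sanity check, assuming `ufw` is the frontend used by the script):
```sh
# list the active ruleset
sudo ufw status verbose
```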
## DRBD
DRBD is [configured](https://linbit.com/drbd-user-guide/drbd-guide-9_0-en/#p-work) with:
`./setup.sh setup_drbd`
Once two nodes have DRBD set up for the first time, it can be initialized by [pretending all is in sync](https://linbit.com/drbd-user-guide/drbd-guide-9_0-en/#s-skip-initial-resync) to skip the initial bitmap sync, since there is no data yet.
```sh
sudo drbdadm primary r1
sudo drbdadm new-current-uuid --clear-bitmap r1/0
sudo mount /precious
```
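A sanity check that the resource is healthy on both nodes (assuming the resource is named `r1` as above):
```sh
# both nodes should report the disk as UpToDate
sudo drbdadm status r1
```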
## NFS
`./setup.sh setup_nfs`
On the node that has the DRBD volume `/precious` mounted, set the IP of the NFS server to be used by k8s:
```sh
sudo ip addr add 10.53.101.100/24 dev enp5s0.4001
```
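From any node, the export can be verified (assuming `showmount` from the `nfs-common` package is available):
```sh
# list the exports offered by the NFS server IP
showmount -e 10.53.101.100
```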
## K8S
For the first node, run `./setup.sh setup_k8s`. For nodes joining the cluster, run `./setup.sh setup_k8s 6`, where `6` is the identifier of an existing node (`hetzner06`).
- [metallb](https://metallb.universe.tf) instead of the default load balancer, which does not allow a public IP different from the `k8s` node IP.
`./setup.sh setup_k8s_metallb`
- [traefik](https://traefik.io/) requests specific IPs from `metallb` via [annotations](https://github.com/traefik/traefik-helm-chart/blob/7a13fc8a61a6ad30fcec32eec497dab9d8aea686/traefik/values.yaml#L736).
`./setup.sh setup_k8s_traefik`
- [cert-manager](https://cert-manager.io/).
`./setup.sh setup_k8s_certmanager`
- NFS storage class
`./setup.sh setup_k8s_nfs`
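Once these components are installed, a quick sanity check (assuming `kubectl` is configured on the node):
```sh
# all nodes should be Ready and all pods Running or Completed
kubectl get nodes -o wide
kubectl get pods --all-namespaces
```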
## Forgejo
The [forgejo](https://code.forgejo.org/forgejo-helm/forgejo-helm) chart is configured in [ingress](https://code.forgejo.org/forgejo-helm/forgejo-helm#ingress) for the reverse proxy (`traefik`) to route the domain and for the ACME issuer (`cert-manager`) to obtain a certificate, and in [service](https://code.forgejo.org/forgejo-helm/forgejo-helm#service) for the `ssh` port to be bound to the desired IPs of the load balancer (`metallb`).
```yaml
ingress:
  enabled: true
  annotations:
    # https://cert-manager.io/docs/usage/ingress/#supported-annotations
    # https://github.com/cert-manager/cert-manager/issues/2239
    cert-manager.io/cluster-issuer: letsencrypt-http
    cert-manager.io/private-key-algorithm: ECDSA
    cert-manager.io/private-key-size: "384"
    kubernetes.io/ingress.class: traefik
    traefik.ingress.kubernetes.io/router.entrypoints: websecure
  tls:
    - hosts:
        - t1.forgejo.org
      secretName: tls-forgejo-t1-ingress-http
  hosts:
    - host: t1.forgejo.org
      paths:
        - path: /
          pathType: Prefix
service:
  http:
    type: ClusterIP
    ipFamilyPolicy: PreferDualStack
    port: 3000
  ssh:
    type: LoadBalancer
    annotations:
      metallb.universe.tf/loadBalancerIPs: 188.40.16.47,2a01:4f8:fff2:48::2
      metallb.universe.tf/allow-shared-ip: "key-to-share-failover"
    ipFamilyPolicy: PreferDualStack
    port: 2222
```
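After deploying the chart, the certificate and the load balancer IPs can be verified (assuming the release is in the current namespace):
```sh
# the certificate should reach READY=True once the ACME challenge completes
kubectl get certificate
# the ssh service should expose the metallb IPs under EXTERNAL-IP
kubectl get svc
```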
# K8S NFS storage creation
Define the 20GiB `forgejo-data` PVC owned by user id 1000.
```sh
./setup.sh setup_k8s_pvc forgejo-data 20Gi 1000
```
[Instruct the forgejo pod](https://code.forgejo.org/forgejo-helm/forgejo-helm#persistence) to use the `forgejo-data` PVC.
```yaml
persistence:
enabled: true
create: false
claimName: forgejo-data
```
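The claim should be `Bound` before the pod starts:
```sh
# STATUS must be Bound
kubectl get pvc forgejo-data
```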
# Disaster recovery and maintenance
## When a machine or disk is scheduled for replacement
* `kubectl drain hetzner05` # evacuate all the pods out of the node to be shutdown
* `kubectl taint nodes hetzner05 key1=value1:NoSchedule` # prevent any pod from being created there (metallb speaker won't be drained, for instance)
* `kubectl delete node hetzner05` # let the cluster know it no longer exists so a new one by the same name can replace it
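Note that `kubectl drain` refuses to evict DaemonSet pods and pods using emptyDir volumes by default, so in practice extra flags are usually needed (an assumption, adjust to the workloads on the node):
```sh
# evacuate the node even when DaemonSet pods and emptyDir volumes are present
kubectl drain hetzner05 --ignore-daemonsets --delete-emptydir-data
```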
## Routing the failover IP
When the machine to which the failover IP (failover.forgejo.org) is routed is unavailable or about to be shut down, go to the [Hetzner server panel](https://robot.hetzner.com/server), open the IPs tab and change the route of the failover IP to another node. All nodes are configured with the failover IP, so there is nothing else to do.
## Manual boot operations
### On the machine that runs the NFS server
* `sudo drbdadm primary r1` # Switch the DRBD to primary
* `sudo mount /precious` # DRBD volume shared via NFS
* `sudo ip addr add 10.53.101.100/24 dev enp5s0.4001` # add NFS server IP
### On the other machines
* `sudo ip addr del 10.53.101.100/24 dev enp5s0.4001` # remove NFS server IP