This repository is a collection of tooling, docs and configuration that defines my homelab and specific purpose nodes. Currently this boils down to:
- homeserver cluster, serving general-purpose applications
- homeserver-backup cluster, keeping backups from the above
- printserver, which enables wireless access to a USB-only printer
Key takeaways:
- ingress-nginx on entry, with cert-manager + Let's Encrypt (DNS based challenges in Route53) backed SSL, oauth2-proxy for non-OIDC-native services
- vault for centralized identity and secrets management
- longhorn used for storage with daily backups of "important" volumes in a separate cluster
Clusters live in a separate cluster VLAN defined in the network_layout repository.
That repository also defines the VPN setup and IP assignment.
All traffic to *.<DOMAIN> and *.backup.<DOMAIN> is redirected to specific cluster LBs on a router/VPN level.
Three Dell Optiplex nodes, totaling 18 cores, 192G of RAM and 6T of storage (1x2T NVMe on each node).
Nodes are mounted in a 10″ rack using 3D-printed frames with minor modifications (TODO: upstream model changes).
Three RPis 4B, totaling 12 cores, 24G of RAM and 3T of storage (1x1T M.2 SATA attached over USB on each node).
Pis are mounted in a 10″ rack using 3D-printed frames. Power is provided via official PoE+ HATs.
RPi Zero 2 W, with OTG splitter for a USB-A type port.
In other words, what needs to be done when you lay your hands on a new machine. As a rule of thumb this only has to be done once.
- Update the bootloader and make it boot from USB first (RPi Imager > Bootloader > USB Boot).
- Flash the official Raspberry Pi OS (64-bit) image and make sure that it works fine. You can use this step to run raspi-config and set the WLAN country.
- Run extended diagnostic suite
- Update BIOS
- Run extended diagnostic suite again
We want to use a bootstrap image that is ready to be provisioned with ansible
without requiring any user interaction first.
To achieve it, we need to:
- create the bootstrap user ansible_bootstrap with passwordless sudo privileges
- provide a public SSH key to be added to authorized_keys
- set up the minimal required SSH hardening (deny password authentication, deny root login, only allow public key based logins)
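As a rough illustration of what these three requirements translate to on a Debian-like system, the sketch below prints the sudoers entry and an sshd_config fragment. It is not taken from the repository's build scripts; the function names are hypothetical and the exact options used there may differ.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical and not part of the repository.

# Print a sudoers entry granting passwordless sudo to the bootstrap user.
bootstrap_sudoers_line() {
    printf '%s\n' 'ansible_bootstrap ALL=(ALL) NOPASSWD:ALL'
}

# Print an sshd_config fragment covering the three hardening rules:
# no password logins, no root logins, public keys only.
sshd_hardening_fragment() {
    cat <<'EOF'
PasswordAuthentication no
PermitRootLogin no
PubkeyAuthentication yes
AuthenticationMethods publickey
EOF
}
```

On a system whose sshd_config includes a drop-in directory, the fragment could be written to a file such as /etc/ssh/sshd_config.d/10-bootstrap.conf (a hypothetical name) and checked with sshd -t before restarting the daemon.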
Scripts are provided to prepare such an image.
Currently Raspbian is used for RPi nodes (because of the OOTB support for PoE+ hat fans), while Dell nodes use Debian.
Take a look at the corresponding build_ scripts in the image_build directory for more details.
A few useful variables:
- HOST_SSH_PUB_KEYS_FILE points to a pubkey that should be added to authorized_keys on the target
- LUKS_PASSWORD (build_debian specific), if provided, will be used for full disk encryption. Defaults to obtaining the password from the password manager
Required packages on host for the build to succeed:
- vagrant (builds are performed in VMs for better interoperability)
- ssh
Built images can be found in the image_build/output directory.
If you were to use an official image you would have to perform the user, SSH and (optionally) LUKS setup manually.
In later steps Ansible will make sure that the SSH config is properly hardened and the ansible_bootstrap user is removed.
When you have the image on hand you can flash it to the drive using the tool of your choice, e.g. with dd
# dd if=<path to the image> of=<path to your SSD> bs=64k oflag=dsync status=progress
or using a tool like rufus or etcher.
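Since dd overwrites whatever it is pointed at, a small guard around the command above can catch a mistyped path. The following is a minimal sketch, assuming GNU dd; flash_image is a hypothetical helper and not something shipped in this repository.

```shell
#!/bin/sh
# Sketch only: flash_image is a hypothetical wrapper around the dd call above.
# Usage: flash_image <path to the image> <path to your SSD>
flash_image() {
    image=$1
    target=$2
    # Refuse to continue when the image is missing.
    [ -f "$image" ] || { echo "no such image: $image" >&2; return 1; }
    # Refuse to continue when the target is not a block device.
    [ -b "$target" ] || { echo "not a block device: $target" >&2; return 1; }
    dd if="$image" of="$target" bs=64k oflag=dsync status=progress
}
```

The check does not verify that the block device is the intended one, so the target path still deserves a second look before running it as root.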
Create a bootable USB drive or upload the file to a TFTP server to perform netboot. Afterwards, install the system as usual. Beware, you have to use the Install (not graphical) option for the preseed file to be taken into account.
The preseed file responsible for the initial setup is burned into the image itself.
This part is responsible for most of the software provisioning.
The idea is to ensure that core blocks are in place, for example:
- users
- firewall
- access restriction, e.g. via SSH
- required dependencies
- container runtime
This step also removes the ansible_bootstrap user and initializes Kubernetes clusters.
To provision the nodes:
- Enter the ansible directory
- Set up the workspace with poetry install
- Get dependencies via poetry run ansible-galaxy install -r requirements.yml
- Run poetry run ansible-playbook site.yml
Take a look at inventory.yml and site.yml for supported options.
Most notably, the passwords set for the newly created users are obtained from the password manager by default.
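The provisioning steps can be chained so that a failure stops the sequence early. This is a minimal sketch, assuming it is started from the repository root; run, provision_nodes and the DRY_RUN switch are hypothetical names introduced for the example.

```shell
#!/bin/sh
# Sketch only: run, provision_nodes and DRY_RUN are hypothetical.

# Execute a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Run the four provisioning steps in order, stopping at the first failure.
provision_nodes() {
    run cd ansible &&
    run poetry install &&
    run poetry run ansible-galaxy install -r requirements.yml &&
    run poetry run ansible-playbook site.yml
}
```

Calling it as (provision_nodes) in a subshell keeps the caller's working directory unchanged, and DRY_RUN=1 provision_nodes only prints the commands.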
Start by obtaining the kubeconfig with scp server@<node>:/etc/rancher/{rke2/k3s}/{rke2/k3s}.yaml kubeconfig.yaml (the path depends on whether the node runs rke2 or k3s).
You will have to modify the server field in the kubeconfig so it points to the remote node and not 127.0.0.1 (which is the default).
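The server field can be rewritten by hand or with a one-line substitution. The sketch below assumes the copied file contains a line of the form server: https://127.0.0.1:<port>; point_kubeconfig_at is a hypothetical helper, not part of the repository.

```shell
#!/bin/sh
# Sketch only: point_kubeconfig_at is a hypothetical helper.
# Usage: point_kubeconfig_at <node address> <kubeconfig file>
point_kubeconfig_at() {
    node=$1
    file=$2
    tmp=$(mktemp) || return 1
    # Replace the loopback address and keep the port that follows it.
    sed "s#https://127\.0\.0\.1:#https://${node}:#" "$file" > "$tmp" &&
        mv "$tmp" "$file"
}
```

For example, point_kubeconfig_at <node> kubeconfig.yaml. If the context in the copied file has a generic name (it is often default), kubectl --kubeconfig kubeconfig.yaml config rename-context default homeserver renames it.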
It's assumed that homeserver and homeserver_backup have corresponding contexts created under the names homeserver and homeserver-backup respectively.
Required tools:
- kubectl
- helm
- helmfile
- terragrunt
- terraform
A few important charts that will be deployed in this step:
- cert-manager for certificate generation (Route53 DNS solver under the hood)
- ingress-nginx for reverse proxying
- victoria-metrics-k8s-stack for monitoring, configured with PagerDuty and Dead Man's Snitch
- vault for secrets and identity management
- oauth2-proxy for OIDC support for applications that do not support it natively
- longhorn for distributed storage
All the cluster-related configuration is stored under the helmfile directory.
Different directories are to be used depending on the cluster.
The instructions below describe how to perform a full (from scratch) deployment:
- cd to helmfile/core
- run DOMAIN=<your domain> helmfile sync
- cd to helmfile/vault-terraform
- run terragrunt apply
- cd to helmfile/services
- run DOMAIN=<your domain> helmfile sync
While the steps above cover the deployment, there's some special treatment needed to initialize vault from scratch.
Please follow helmfile/vault-terraform/vault-setup.md.
Make sure that you have provided the required values for helmfile/vault-terraform/terraform.tfvars.
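Once vault has been initialized and terraform.tfvars is filled in, the homeserver deployment order can be scripted. This is a minimal sketch, assuming it is started from the repository root; run_in, deploy_homeserver and the DRY_RUN switch are hypothetical names introduced for the example.

```shell
#!/bin/sh
# Sketch only: run_in, deploy_homeserver and DRY_RUN are hypothetical.

# Run a command inside a directory without changing the caller's directory.
run_in() {
    dir=$1
    shift
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '(cd %s && %s)\n' "$dir" "$*"
    else
        (cd "$dir" && "$@")
    fi
}

# Deploy core charts, then the vault configuration, then the services.
deploy_homeserver() {
    if [ -z "${DOMAIN:-}" ]; then
        echo "DOMAIN is not set" >&2
        return 1
    fi
    export DOMAIN  # helmfile reads DOMAIN from the environment
    run_in helmfile/core helmfile sync &&
    run_in helmfile/vault-terraform terragrunt apply &&
    run_in helmfile/services helmfile sync
}
```

Running DOMAIN=<your domain> DRY_RUN=1 deploy_homeserver prints the three steps without executing them, which is a cheap way to confirm the order before a real run.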
This cluster largely depends on the homeserver setup, e.g. for auth.
Make sure that the homeserver cluster is deployed and ready first.
- cd to helmfile/backup
- run DOMAIN=<your domain> helmfile sync
Currently the ansible playbook takes care of:
- setting up the CUPS server
- installing (proprietary) drivers for the HP LaserJet Pro P1102 printer
It requires the printer to be connected to the device when the playbook is being applied.