A year of Kubernetes

About a year ago I thought it would be a good idea to learn a bit more about Kubernetes. We use Kubernetes as part of our server orchestration at work, and while most of it is abstracted away, it rarely hurts to know what the foundational layers actually look like.

At the time, I tried to set up a three-node cluster (two computers at home, one in the cloud), connected together by Tailscale (i.e. via Wireguard). This… kind of worked, but the cluster was super chatty over the network, and when we moved to Seattle we no longer had an unmetered internet connection.

The original setup used microk8s because it was the first option that came up, so to resolve the bandwidth issue, I just ran a separate microk8s node for the cloud machine (battery) and another for the home machine (potato). I’d gotten cert-manager to automatically provision LetsEncrypt certificates, and it was easy to deploy new containers; I thought things were great.

Over the course of the year, I learned a bunch of things that I kind of wish I didn’t need to learn:

  • Random bits about Calico and Kubernetes’ internal networking abstractions
  • The fact that Kubernetes runs as a bunch of eventually consistent control loops, so if you Really Need something to just start, it’s actually annoyingly hard.
  • There aren’t clean logs, anywhere.
  • Sometimes the cluster just reschedules a pod, even though there’s only one node to schedule the pod on. So your maximum availability isn’t necessarily high.
  • The storage provisioning system (PersistentVolumes and PersistentVolumeClaims) is really hard to reason about. I assume the idea here is to let your cloud vendor deal with this for you using their network-backed storage, but it was really common for this to be the thing that kept a pod from starting.
  • Kubernetes is full of certificates, and sometimes they expire. So there’s a whole song and dance to get them refreshed so things work again.

Some bits of it were pretty cool, though. I liked that I could just define a Dockerfile and it would get deployed onto the internet without me needing to handwrite configuration files. The Honeycomb agent system is pretty cool. Automatic SSL configuration (and in general syncing the configuration between the service and the frontend proxy) was very convenient.

What can I use instead?

I had a few requirements for the next service orchestration thing:

  1. It needs to use containers, because encapsulation makes things easier
  2. It needs to integrate cleanly with a proxy that can route requests to the right place (e.g. nginx or traefik)
  3. It needs to automatically manage SSL for me
  4. It needs to support putting a given path/virtualhost behind an OAuth barrier
  5. It needs to keep working if I don’t look at it for a few months…
  6. It should consume as little CPU and RAM as possible

Notably, I’m not running any critical infrastructure on these boxes, so there’s no real need for high availability. Requirement (6) suggested that I avoid looking at any distributed orchestration systems.

docker compose seemed to fit the bill: it’s just a fancy script for Docker configuration, and traefik supports service discovery via Docker labels, which meets the first few requirements.

Doing the migration

I don’t run any particularly stateful applications, so the actual migration was a process of figuring out how to write the appropriate compose file for the applications I cared about.
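
The compose snippets below share boilerplate via YAML anchors (*common-keys-core, *common-keys-apps, *default-tz-puid-pgid). I won’t reproduce my exact definitions, but they look roughly like this:

x-common-keys-core: &common-keys-core
  networks:
    - traefik_proxy
  security_opt:
    - no-new-privileges:true
  restart: always

x-common-keys-apps: &common-keys-apps
  networks:
    - traefik_proxy
  security_opt:
    - no-new-privileges:true
  restart: unless-stopped

# merged into `environment:`, so these are plain env vars
x-default-tz-puid-pgid: &default-tz-puid-pgid
  TZ: $TZ
  PUID: $PUID
  PGID: $PGID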

The overall architecture is pretty simple: there’s a bridge network traefik_proxy, which most of the apps run on, and then traefik itself is on that network and additionally has exposed ports 80 and 443 for HTTP and HTTPS.

  traefik_proxy:
    name: traefik_proxy
    driver: bridge
    ipam:
      config:
        - subnet: 192.168.90.0/24
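
traefik’s own service definition, trimmed down to the parts that matter here, looks something like this (the entrypoint names line up with the https entrypoint that the router labels below reference; the acme.json host path is a guess based on the storage flag shown later):

  traefik:
    <<: *common-keys-core
    container_name: traefik
    image: traefik:latest
    ports:
      # the only host ports exposed by the whole stack
      - "80:80"
      - "443:443"
    command:
      # entrypoint names referenced by the router labels below
      - --entryPoints.http.address=:80
      - --entryPoints.https.address=:443
    volumes:
      # assumed host path; must match the acme storage flag below
      - $DOCKERDIR/appdata/traefik/acme.json:/acme.json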

traefik itself is configured to use my Cloudflare Zone key for LetsEncrypt DNS verification by setting two environment variables

CF_API_EMAIL=$CLOUDFLARE_EMAIL
CF_DNS_API_TOKEN=$CLOUDFLARE_API_KEY

and passing in the appropriate command-line arguments. We also tell traefik to use Docker to find the services, though we need to specify the ports manually:

--certificatesResolvers.default.acme.email=$CLOUDFLARE_EMAIL \
--certificatesResolvers.default.acme.storage=/acme.json \
--certificatesResolvers.default.acme.dnsChallenge.provider=cloudflare \
--certificatesResolvers.default.acme.dnsChallenge.resolvers=1.1.1.1:53,1.0.0.1:53 \

--providers.docker=true \
--providers.docker.endpoint=tcp://socket-proxy:2375 \
--providers.docker.exposedByDefault=false \
--providers.docker.network=traefik_proxy \
--providers.docker.swarmMode=false
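
Note that the Docker endpoint above isn’t the raw Docker socket: it points at a socket-proxy container, so traefik never gets direct access to /var/run/docker.sock. That service is roughly the following, assuming the tecnativa/docker-socket-proxy image:

  socket-proxy:
    container_name: socket-proxy
    image: tecnativa/docker-socket-proxy:latest
    restart: unless-stopped
    networks:
      - traefik_proxy
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
    environment:
      # expose only the read-only containers API that discovery needs
      - CONTAINERS=1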

We disable automatic exposure of new Docker services for safety. I configured traefik-forward-auth as the OAuth middleware:

  traefik-forward-auth:
    <<: *common-keys-core
    container_name: traefik-forward-auth
    image: thomseddon/traefik-forward-auth:latest
    command: --whitelist=/* redacted */
    environment:
      - CONFIG=/config
      - COOKIE_DOMAIN=$FQDN
      - INSECURE_COOKIE=false
      - AUTH_HOST=oauth.$FQDN
      - URL_PATH=/_oauth
      - LOG_LEVEL=warn
      - LOG_FORMAT=text
      - LIFETIME=86400
      - SECRET=$OAUTH_SECRET
      - CLIENT_ID=$GOOGLE_CLIENT_ID
      - CLIENT_SECRET=$GOOGLE_CLIENT_SECRET
    labels:
      - "traefik.enable=true"
      ## HTTP Routers
      - "traefik.http.routers.oauth-rtr.tls=true"
      - "traefik.http.routers.oauth-rtr.entrypoints=https"
      - "traefik.http.routers.oauth-rtr.rule=Host(`oauth.$FQDN`)"
      ## Middlewares
      - "traefik.http.routers.oauth-rtr.middlewares=traefik-forward-auth"
      - "traefik.http.middlewares.traefik-forward-auth.forwardauth.address=http://traefik-forward-auth:4181"
      - "traefik.http.middlewares.traefik-forward-auth.forwardauth.authResponseHeaders=X-Forwarded-User"
      - "traefik.http.middlewares.traefik-forward-auth.forwardauth.trustForwardHeader=true"
      ## HTTP Services
      - "traefik.http.routers.oauth-rtr.service=oauth-svc"
      - "traefik.http.services.oauth-svc.loadbalancer.server.port=4181"

As long as the traefik-forward-auth middleware is included, all requests will need a valid cookie, which you can get by using Google’s OAuth support.
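
Putting any other app behind the OAuth barrier is then a one-label change: attach the middleware to that app’s router. For a hypothetical app with a router named myapp-rtr:

      - "traefik.http.routers.myapp-rtr.middlewares=traefik-forward-auth"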

Deploying normal, no-oauth-required apps is easy: just specify the container image, and include some traefik configuration to expose the route externally and connect it to the port internally.

  healthcheck:
    <<: *common-keys-apps
    image: ghcr.io/rbtying/minimal-http-responder:v0.1.2
    container_name: healthcheck
    environment:
      TEXT: potato
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.healthcheck-rtr.tls.certResolver=default"
      - "traefik.http.routers.healthcheck-rtr.entrypoints=https"
      - "traefik.http.routers.healthcheck-rtr.rule=Host(`healthcheck.$FQDN`)"
      - "traefik.http.routers.healthcheck-rtr.service=healthcheck-svc"
      - "traefik.http.services.healthcheck-svc.loadbalancer.server.port=2020"

I did run into an issue when deploying vaultwarden: the Docker image for vaultwarden specifies a healthcheck, and traefik doesn’t instantiate the route for containers that haven’t passed their healthcheck yet. This is pretty reasonable, but vaultwarden’s healthcheck interval is set to once per minute – which means the route doesn’t show up for up to a minute after the container starts. Changing the interval to 10s makes things come up near-immediately.

  bitwarden:
    <<: *common-keys-apps
    image: vaultwarden/server:latest
    container_name: bitwarden
    volumes:
      - $DOCKERDIR/appdata/bitwarden/:/data
      - $DOCKERDIR/logs/bitwarden:/logs
    environment:
      - WEBSOCKET_ENABLED=true
      - SIGNUPS_ALLOWED=false
      - LOG_FILE=/logs/vaultwarden.log
    healthcheck:
      interval: 10s
    labels:
      - "traefik.enable=true"
      ## HTTP Routers
      - "traefik.http.routers.bitwarden-rtr.entrypoints=https"
      - "traefik.http.routers.bitwarden-rtr.tls.certResolver=default"
      - "traefik.http.routers.bitwarden-rtr.rule=Host(`bitwarden.$FQDN`) || Host(`bitwarden.aeturnalus.com`)"
      - "traefik.http.routers.bitwarden-ws-rtr.entrypoints=https"
      - "traefik.http.routers.bitwarden-ws-rtr.tls.certResolver=default"
      - "traefik.http.routers.bitwarden-ws-rtr.rule=(Host(`bitwarden.$FQDN`) || Host(`bitwarden.aeturnalus.com`)) && Path(`/notifications/hub`)"
      ## HTTP Services
      - "traefik.http.routers.bitwarden-rtr.service=bitwarden-svc"
      - "traefik.http.services.bitwarden-svc.loadbalancer.server.port=80"
      - "traefik.http.routers.bitwarden-ws-rtr.service=bitwarden-ws-svc"
      - "traefik.http.services.bitwarden-ws-svc.loadbalancer.server.port=3012"

Home Assistant has a slightly different flavor of issue. In order for local connected device discovery to work, the Home Assistant container needs to be on the host network. But, if it’s on the host network, it’s not on the Docker bridge networks, so the default Docker service discovery doesn’t quite work.

What we can do instead is expose the Home Assistant port on the host, and then configure traefik to use the appropriate port. traefik reaches host-network services at host.docker.internal, so I also had to add that as an extra_host on the traefik container, mapped to Docker’s special host-gateway address.
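
That mapping is one extra key on the traefik service, using Docker’s host-gateway value:

    extra_hosts:
      - "host.docker.internal:host-gateway"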

  homeassistant:
    container_name: homeassistant
    image: "ghcr.io/home-assistant/home-assistant:stable"
    volumes:
      - $DOCKERDIR/appdata/homeassistant/:/config
      - $DOCKERDIR/appdata/homeassistant/docker/run:/etc/services.d/home-assistant/run
      - /etc/localtime:/etc/localtime:ro
    restart: unless-stopped
    network_mode: host
    environment:
      <<: *default-tz-puid-pgid
      PACKAGES: iputils
    labels:
      - "traefik.enable=true"
      ## HTTP Routers
      - "traefik.http.routers.home-assistant-rtr.tls.certResolver=default"
      - "traefik.http.routers.home-assistant-rtr.entrypoints=https"
      - "traefik.http.routers.home-assistant-rtr.rule=Host(`home-assistant.$FQDN`)"
      ## HTTP Services
      - "traefik.http.routers.home-assistant-rtr.service=home-assistant-svc"
      - "traefik.http.services.home-assistant-svc.loadbalancer.server.port=8124"

In order to test all of this, I first configured traefik to run on different ports (i.e. not 80 and 443) so it wouldn’t conflict with the running Kubernetes ingress. Then, I shut down Kubernetes and re-deployed the docker compose stack with traefik on the actual HTTP/HTTPS ports, and things Just Worked.
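
The temporary test configuration was just a matter of changing the published ports on the traefik container, e.g. something like:

    ports:
      - "8080:80"
      - "8443:443"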

Pretty cool how you can set up a bunch of services in a couple of hours – containers really do drastically simplify running things in a home lab.