Case Study · 002

Kubernetes GitOps Projects: building a multi-tenant control plane

A walkthrough of how I designed and shipped a production-grade GitOps platform on Kubernetes — ArgoCD for continuous delivery, Helm for packaging, Terraform for the substrate, and policy-as-code keeping tenants inside their lane. Useful as a reference architecture, a portfolio piece, or a DevOps project idea you can adapt for a resume.

01 · Problem

Click-ops doesn't scale across teams

A single shared Kubernetes cluster is easier to run than ten snowflake ones, until ten teams start pushing changes through the console. Drift between staging and production, mystery RBAC, and silent Helm upgrades turn the cluster into a haunted house. The goal: one Git repository as the source of truth, automated reconciliation, and per-tenant guardrails.

02 · Architecture

The GitOps control plane

  • Substrate: Terraform provisions EKS / GKE, IAM, VPC, and the bootstrap namespace.
  • Reconciler: ArgoCD runs the app-of-apps pattern; each tenant owns an Application that points at their folder.
  • Packaging: Helm charts with values overlays per environment (dev / staging / prod).
  • Policy: OPA Gatekeeper + Kyverno enforce namespace quotas, image registries, and required labels.
  • Progressive delivery: Argo Rollouts for canary and blue-green with Prometheus analysis.
  • Secrets: External Secrets Operator pulls from AWS Secrets Manager / GCP Secret Manager.
  • Observability: Prometheus + Grafana + Loki; ArgoCD notifications post deploy events to Slack.
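
To make the app-of-apps layout concrete, here is a minimal sketch that renders one ArgoCD Application per tenant and environment. The repository URL, the `tenants/<name>` folder layout, the `values-<env>.yaml` file names, and the one-AppProject-per-tenant convention are all assumptions for illustration, not details taken from the real repo.

```python
import json

# Hypothetical GitOps repository URL; replace with your own.
GITOPS_REPO = "https://example.com/platform/gitops.git"


def tenant_application(tenant: str, env: str) -> dict:
    """Build one ArgoCD Application manifest for a tenant in one environment."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        "metadata": {"name": f"{tenant}-{env}", "namespace": "argocd"},
        "spec": {
            # Assumes an AppProject named after the tenant already exists.
            "project": tenant,
            "source": {
                "repoURL": GITOPS_REPO,
                "targetRevision": "main",
                # Assumed layout: each tenant owns tenants/<tenant>/ in the repo.
                "path": f"tenants/{tenant}",
                # Base values first, then the per-environment overlay.
                "helm": {"valueFiles": ["values.yaml", f"values-{env}.yaml"]},
            },
            "destination": {
                "server": "https://kubernetes.default.svc",
                "namespace": f"{tenant}-{env}",
            },
            # Automated sync with pruning and self-healing enabled.
            "syncPolicy": {"automated": {"prune": True, "selfHeal": True}},
        },
    }


def render_all(tenants, envs):
    """Return one Application manifest per (tenant, environment) pair."""
    return [tenant_application(t, e) for t in tenants for e in envs]


if __name__ == "__main__":
    for app in render_all(["team-a", "team-b"], ["dev", "staging", "prod"]):
        print(json.dumps(app, indent=2))
```

The output is JSON, which `kubectl apply -f` accepts alongside YAML. In practice the parent "root" Application would point at a folder containing these child manifests.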

03 · Pipeline

From a pull request to a live pod

  1. Developer opens a PR against the tenant's app repo.
  2. CI builds the image, signs it with cosign, and pushes to the registry.
  3. A bot opens a follow-up PR in the GitOps repo bumping the image tag.
  4. Reviewers approve; merge triggers ArgoCD to sync the new manifest.
  5. Argo Rollouts shifts traffic 10% → 50% → 100% while Prometheus watches the SLO.
  6. If the error rate breaches the threshold, the rollout auto-aborts and traffic returns to the stable version.
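
Steps 5 and 6 boil down to a small decision loop. The sketch below simulates that loop in plain Python; it is not Argo Rollouts' API, and the 1% error-rate threshold and the `error_rate_at` callback (a stand-in for a Prometheus query) are assumptions made for the example.

```python
from typing import Callable, List, Tuple

# Traffic weights from the pipeline above: 10% -> 50% -> 100%.
STEPS = [10, 50, 100]


def run_canary(
    error_rate_at: Callable[[int], float],
    max_error_rate: float,
    steps: List[int] = STEPS,
) -> Tuple[str, List[int]]:
    """Walk through the traffic weights and abort on the first SLO breach.

    error_rate_at is a hypothetical stand-in for a metrics query: given the
    current canary weight, it returns the observed error rate (0.0 to 1.0).
    Returns the outcome and the list of weights that were reached.
    """
    reached: List[int] = []
    for weight in steps:
        reached.append(weight)
        if error_rate_at(weight) > max_error_rate:
            # Breach: stop promoting; traffic would go back to the stable version.
            return "aborted", reached
    return "promoted", reached


if __name__ == "__main__":
    healthy = lambda weight: 0.002
    # Regression that only shows up once the canary takes real load.
    regressed = lambda weight: 0.002 if weight < 50 else 0.08

    print(run_canary(healthy, max_error_rate=0.01))    # ('promoted', [10, 50, 100])
    print(run_canary(regressed, max_error_rate=0.01))  # ('aborted', [10, 50])
```

The second case is the reason for stepping gradually: the regression is invisible at 10% and only trips the threshold at 50%, before it reaches all users.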

04 · Lessons

What I'd tell someone copying this for their resume

  • Start with one app, one tenant, one environment. The app-of-apps pattern is what scales it later.
  • Pin every chart to an exact version and every image by digest. A floating "latest" tag is the fastest way to recreate the haunted cluster.
  • Treat the GitOps repo like production code — required reviews, CODEOWNERS, branch protection.
  • Prometheus-driven rollouts catch real regressions; manual canaries only catch what someone happens to be watching.
  • Document the tenant onboarding in the repo's README. Future-you is a tenant.
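
The digest-pinning rule is easy to check mechanically. As a minimal sketch, assuming image references are available as plain strings (for example, collected from rendered manifests in CI), the check below flags anything not pinned to a `sha256` digest. The function names and sample image references are illustrative only.

```python
import re

# An image reference is digest-pinned when it ends in @sha256:<64 hex chars>.
DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")


def is_digest_pinned(image: str) -> bool:
    """Return True if the image reference ends with a sha256 digest."""
    return DIGEST_RE.search(image) is not None


def unpinned_images(images):
    """Return the image references that rely on a tag (or no tag) alone."""
    return [image for image in images if not is_digest_pinned(image)]


if __name__ == "__main__":
    # Hypothetical references; the digest here is a placeholder, not a real image.
    pinned = "registry.example.com/team-a/api@sha256:" + "a" * 64
    candidates = [pinned, "nginx:latest", "registry.example.com/team-a/api:1.4.2"]
    for image in unpinned_images(candidates):
        print(f"not pinned by digest: {image}")
```

A check like this could run in CI on the GitOps repo, with an admission policy in the cluster as a second line of defence.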

05 · Next

Want the full source?

The reference implementations live on GitHub — browse the repositories or jump back to the selected works archive for adjacent projects (service mesh, eBPF security, FinOps).