GitOps: Infrastructure as Code Done Right
"We store our Terraform in Git" is not GitOps. It's version control. GitOps is an operational model where:
- The entire system is described declaratively
- The desired state lives in Git
- Changes are made through pull requests
- A controller continuously reconciles actual state with desired state
That last point is what separates GitOps from "configs in a repo."
Why GitOps?
We manage infrastructure for dozens of clients. Before GitOps, deployments looked like this:
Developer → SSH into server → Run commands → Hope it works → Forget what they changed
With GitOps:
Developer → Open PR → Review → Merge → Automated reconciliation → Drift detection
The benefits compound:
- Audit trail — Every change is a Git commit with author, timestamp, and review
- Rollback — `git revert` is your "undo" button
- Reproducibility — Spin up an identical environment from the same repo
- Drift detection — The controller alerts when reality diverges from Git
- Self-service — Developers can make infrastructure changes via PRs without SSH access
The Two Patterns: Push vs. Pull
Push-based (Traditional CI/CD)
Git Push → CI Pipeline → kubectl apply / terraform apply → Cluster
The pipeline has credentials to modify infrastructure. This works but has drawbacks:
- CI system needs broad access to production
- No continuous reconciliation — drift goes undetected
- Pipeline failures can leave state partially applied
Pull-based (GitOps)
Git Push → Controller (in-cluster) → Detects diff → Reconciles → Cluster matches Git
The controller runs inside the cluster and pulls desired state from Git. This is the GitOps model.
Our GitOps Stack
ArgoCD for Kubernetes
ArgoCD is our primary GitOps controller for Kubernetes workloads. It watches Git repos and ensures the cluster matches.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: production-api
  namespace: argocd
spec:
  project: production
  source:
    repoURL: https://github.com/kicked-ro/infrastructure
    targetRevision: main
    path: apps/production/api
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true      # Delete resources removed from Git
      selfHeal: true   # Revert manual changes
    syncOptions:
      - CreateNamespace=true
Key configuration choices:
- `selfHeal: true` — If someone manually edits a resource, ArgoCD reverts it. Git is the source of truth, always.
- `prune: true` — Resources deleted from Git are deleted from the cluster. No zombie resources.
- Automated sync — Merging to `main` triggers deployment. No manual button clicks.
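The Application above refers to a project named production, which the manifest itself does not define. A minimal sketch of what such an AppProject might look like follows. The description text and the decision to allow only the production namespace are assumptions for illustration, not the actual project definition.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: production
  namespace: argocd
spec:
  description: Production workloads (illustrative)
  # Only manifests from this repository may be deployed through the project.
  sourceRepos:
    - https://github.com/kicked-ro/infrastructure
  # Applications in the project may only target this cluster and namespace.
  destinations:
    - server: https://kubernetes.default.svc
      namespace: production
  # Cluster-scoped kinds are denied unless listed here. Namespace is listed
  # on the assumption that CreateNamespace=true needs it.
  clusterResourceWhitelist:
    - group: ""
      kind: Namespace
```

Scoping the project this way limits what a merged PR can touch, which complements selfHeal: Git stays the source of truth, and the project bounds what that source is allowed to change.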
Terraform + Atlantis for Cloud Resources
For cloud infrastructure (VMs, DNS, networking), we use Terraform with Atlantis for PR-based workflow:
Developer opens PR with Terraform change
→ Atlantis runs `terraform plan`
→ Plan output posted as PR comment
→ Reviewer approves
→ Atlantis runs `terraform apply`
→ State updated, PR merged
No one runs `terraform apply` locally. Ever.
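The workflow above could be expressed as a repo-level atlantis.yaml along these lines. This is a sketch, not the configuration in use: the project names are hypothetical, the directories come from the repository layout described in this post, and setting apply_requirements from the repo file only takes effect if the Atlantis server-side config allows that override.

```yaml
version: 3
projects:
  - name: networking          # hypothetical project name
    dir: terraform/networking
    autoplan:
      enabled: true
      # Re-plan when Terraform files in this directory change.
      when_modified: ["*.tf", "*.tfvars"]
    # Block `atlantis apply` until the PR is approved and mergeable.
    apply_requirements: [approved, mergeable]
  - name: dns                 # hypothetical project name
    dir: terraform/dns
    autoplan:
      enabled: true
      when_modified: ["*.tf", "*.tfvars"]
    apply_requirements: [approved, mergeable]
```

Requiring approval in the tool, rather than by team convention, is what makes the "reviewer approves" step enforceable.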
Renovate for Dependency Updates
Keeping base images, Helm chart versions, and Terraform providers up to date is a full-time job. Renovate automates it:
{
  "$schema": "https://docs.renovatebot.com/renovate-schema.json",
  "extends": ["config:recommended"],
  "kubernetes": {
    "fileMatch": ["apps/.+\\.yaml$"]
  },
  "regexManagers": [
    {
      "fileMatch": ["apps/.+\\.yaml$"],
      "matchStrings": ["image: (?<depName>.*?):(?<currentValue>.*?)\\s"],
      "datasourceTemplate": "docker"
    }
  ]
}
Renovate opens PRs for every update. ArgoCD deploys them when merged. Fully automated supply chain.
Repository Structure
We've converged on this structure after much iteration:
infrastructure/
├── apps/
│ ├── production/
│ │ ├── api/
│ │ │ ├── deployment.yaml
│ │ │ ├── service.yaml
│ │ │ └── kustomization.yaml
│ │ ├── web/
│ │ └── workers/
│ └── staging/
│ └── ... (mirrors production)
├── platform/
│ ├── cert-manager/
│ ├── ingress-nginx/
│ ├── monitoring/
│ └── external-secrets/
├── terraform/
│ ├── networking/
│ ├── dns/
│ └── compute/
└── renovate.json
Apps are the workloads. Platform is the shared infrastructure. Terraform is the cloud-layer resources. Each directory under apps/ and platform/ is an ArgoCD Application; the terraform/ directories are applied through Atlantis.
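One way to get a directory-to-Application mapping is an ApplicationSet with a Git directory generator. The sketch below is an assumption about how that could be wired up for platform/, not a description of the actual setup. The ApplicationSet name, the platform project, and the use of the directory name as the namespace are all hypothetical.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: platform            # hypothetical name
  namespace: argocd
spec:
  generators:
    # Emit one set of parameters per directory matching the path glob.
    - git:
        repoURL: https://github.com/kicked-ro/infrastructure
        revision: main
        directories:
          - path: platform/*
  template:
    metadata:
      # path.basename is the last path segment, e.g. cert-manager
      name: 'platform-{{path.basename}}'
    spec:
      project: platform     # hypothetical project
      source:
        repoURL: https://github.com/kicked-ro/infrastructure
        targetRevision: main
        path: '{{path}}'
      destination:
        server: https://kubernetes.default.svc
        namespace: '{{path.basename}}'   # assumption: one namespace per component
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
        syncOptions:
          - CreateNamespace=true
```

With a generator like this, adding a new directory under platform/ and merging the PR would be enough to create the corresponding Application.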
Secrets in GitOps
The one thing you can't put in Git: plaintext secrets. Our approach:
- Sealed Secrets — Encrypt secrets with a cluster-specific key. The encrypted version lives in Git. Only the cluster can decrypt.
- External Secrets Operator — Reference secrets in Vault/AWS SSM. Git stores the reference, not the value.
We prefer External Secrets Operator for production because a secret rotated in Vault or SSM is synced into the cluster automatically, without a new commit.
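With External Secrets Operator, the object committed to Git might look like the sketch below. The store name, secret names, remote key, and refresh interval are hypothetical. The apiVersion depends on the operator release installed, so check which versions the installed CRDs serve.

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: api-credentials          # hypothetical
  namespace: production
spec:
  # How often the operator re-reads the backing store.
  refreshInterval: 1h
  secretStoreRef:
    kind: ClusterSecretStore
    name: vault-backend          # hypothetical store, defined separately
  target:
    # Name of the Kubernetes Secret the operator creates and keeps in sync.
    name: api-credentials
  data:
    - secretKey: DATABASE_URL    # key inside the Kubernetes Secret
      remoteRef:
        key: production/api      # hypothetical path in the secret store
        property: database_url   # hypothetical field at that path
```

Nothing in this manifest is sensitive: it names where the value lives, and the operator fetches it at runtime.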
Lessons Learned
- Start with one app — Don't try to GitOps your entire infrastructure in week one
- Enforce selfHeal — If people can bypass Git, they will, and your source of truth becomes a lie
- PR reviews are mandatory — Even for the senior engineer. Especially for the senior engineer
- Test in staging first — ArgoCD ApplicationSets let one template generate the same app for staging and production, which keeps promotion between environments simple
- Monitor sync status — An app stuck in "OutOfSync" is a ticking time bomb
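Sync status can be alerted on rather than watched by hand. The rule below is a sketch that assumes the Prometheus Operator is installed and is scraping ArgoCD's application metrics. The rule name, namespace, 15-minute threshold, and severity label are illustrative choices.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: argocd-sync-status       # hypothetical
  namespace: monitoring          # hypothetical
spec:
  groups:
    - name: argocd
      rules:
        - alert: ArgoCDAppOutOfSync
          # argocd_app_info carries a sync_status label per Application.
          expr: argocd_app_info{sync_status="OutOfSync"} == 1
          for: 15m               # illustrative threshold
          labels:
            severity: warning
          annotations:
            summary: "{{ $labels.name }} has been OutOfSync for 15 minutes"
```

The `for` clause keeps brief OutOfSync windows during a normal rollout from paging anyone.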
The Result
Since adopting GitOps across our infrastructure:
- Zero unauthorized changes to production
- 4-minute average deployment time (merge to running)
- 100% audit trail for every infrastructure change
- One-click disaster recovery (point ArgoCD at the repo, done)
GitOps isn't just a deployment strategy — it's an operational philosophy. If you're still SSH-ing into servers to make changes, let's talk.