ADR 018: Cloudflare Infrastructure as Code
Context
ADR 016 established Cloudflare Tunnel and Cloudflare Access for internal operator surfaces such as argocd.northlift.net.
That phase still left three manual steps when onboarding a new internal service:
- Add the ingress hostname to the GitOps values.
- Run cloudflared tunnel route dns manually.
- Create the Access application and policy in the Zero Trust dashboard.
This manual split creates operational risk:
- Drift between cluster intent, Cloudflare DNS, and Access policy state.
- Lockout risk if policy edits are made ad hoc and not peer-reviewed.
- Non-repeatable recovery after an incident or account migration.
- No single Git history for tunnel routing and identity controls.
We evaluated the following options:
- Continue with manual dashboard and CLI operations (ClickOps).
- Use the Cloudflare Kubernetes Operator for tunnel and Access automation.
- Add a dedicated OpenTofu module for Cloudflare and apply via GitHub Actions.
Decision
We adopt the third option: terraform/cloudflare becomes a first-class OpenTofu module, applied via GitHub Actions.
This module manages:
- Tunnel object lifecycle.
- Tunnel DNS host routing for internal hostnames.
- Access applications.
- Access policies.
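The following is a minimal sketch of how such a module might derive the DNS and Access objects from a single list of hostnames. It is not the module's actual contents. It assumes the 4.x Cloudflare provider, whose resource and attribute names differ from newer provider releases. The variable names, the 8h session duration, and the northlift.net email-domain rule are hypothetical, and the tunnel resource itself is omitted for brevity.

```hcl
terraform {
  required_providers {
    cloudflare = {
      source  = "cloudflare/cloudflare"
      version = "~> 4.0" # assumption: 4.x resource names are used below
    }
  }
}

variable "account_id" { type = string }
variable "zone_id" { type = string }
variable "tunnel_id" { type = string } # hypothetical: ID of the existing tunnel

# Hypothetical input: onboarding a service means adding one entry here.
variable "internal_hostnames" {
  type    = set(string)
  default = ["argocd.northlift.net"]
}

# One proxied CNAME per hostname, pointing at the tunnel's cfargotunnel.com name.
resource "cloudflare_record" "internal" {
  for_each = var.internal_hostnames

  zone_id = var.zone_id
  name    = each.value
  type    = "CNAME"
  value   = "${var.tunnel_id}.cfargotunnel.com"
  proxied = true
}

# One self-hosted Access application per hostname.
resource "cloudflare_access_application" "internal" {
  for_each = var.internal_hostnames

  account_id       = var.account_id
  name             = each.value
  domain           = each.value
  type             = "self_hosted"
  session_duration = "8h" # hypothetical value
}

# One allow policy per application; the include rule is a placeholder.
resource "cloudflare_access_policy" "operators" {
  for_each = var.internal_hostnames

  account_id     = var.account_id
  application_id = cloudflare_access_application.internal[each.key].id
  name           = "operators"
  precedence     = 1
  decision       = "allow"

  include {
    email_domain = ["northlift.net"] # hypothetical identity rule
  }
}
```

With a layout like this, a new hostname in internal_hostnames produces the DNS record, Access application, and policy in one reviewed change, which is the property the decision is after.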
Consequences
Positive
- Cloudflare tunnel and Access controls become auditable, reviewable, and reproducible.
- Service onboarding is reduced to a reviewed Git change and a CI apply.
- Importing existing tunnel and Access objects into state before the first apply prevents their accidental replacement.
- Remote state locking prevents concurrent applies from corrupting state.
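The import-first and state-locking points can be illustrated with a short sketch. It assumes that the AWS backend is S3 with a DynamoDB lock table, that the 4.x Cloudflare provider is in use, and that the OpenTofu release supports import blocks. The bucket, table, and region names are hypothetical, and the import ID is a placeholder to be filled with real identifiers.

```hcl
terraform {
  required_providers {
    cloudflare = {
      source  = "cloudflare/cloudflare"
      version = "~> 4.0" # assumption: 4.x resource names
    }
  }

  # Assumption: S3 state with DynamoDB locking, so two applies cannot run at once.
  backend "s3" {
    bucket         = "example-tofu-state" # hypothetical
    key            = "cloudflare/terraform.tfstate"
    region         = "eu-west-1" # hypothetical
    dynamodb_table = "example-tofu-locks" # hypothetical
    encrypt        = true
  }
}

variable "zone_id" { type = string }
variable "tunnel_id" { type = string } # hypothetical: ID of the existing tunnel

# Declaration matching the record that already exists in Cloudflare.
resource "cloudflare_record" "argocd" {
  zone_id = var.zone_id
  name    = "argocd.northlift.net"
  type    = "CNAME"
  value   = "${var.tunnel_id}.cfargotunnel.com"
  proxied = true
}

# Adopt the existing record into state instead of creating a new one.
import {
  to = cloudflare_record.argocd
  id = "<zone_id>/<record_id>" # placeholder, not a real ID
}
```

Running a plan after adding the import block should show the object being imported with no replacement. A planned destroy-and-create signals that the declaration does not yet match the live object.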
Negative
- Adds a dependency on the availability of the AWS-hosted state backend for Cloudflare applies.
- Requires OIDC role setup and least-privilege IAM maintenance.
- Incorrect Access policy edits in code can still cause lockout if review is weak.
- The team must keep the provider version compatible as the Cloudflare API changes.
Status
Accepted and implemented in Phase 11.