# DNS Solution Research

## Decision: PowerDNS Authoritative + ExternalDNS

### Why PowerDNS

| Feature | PowerDNS | CoreDNS | BIND9 | Technitium |
|---------|----------|---------|-------|------------|
| REST API | Full | No (needs etcd) | No (nsupdate) | Yes |
| Database backend | PostgreSQL/MySQL/SQLite | etcd | Zone files | Custom |
| Health-aware DNS | Lua records (ifportup, ifurlup) | No | No | No |
| ExternalDNS provider | Yes | Yes (via etcd) | Yes (RFC 2136) | No |
| DNSSEC | Yes | Limited | Best | Yes |
| Split DNS | dnsdist routing | Corefile blocks | Views (best) | APP records |
| Maturity | ISP-grade | K8s-focused | Oldest | Newer |

PowerDNS wins on: REST API (critical for Lab), health-check-aware Lua records, database backend for HA, and ExternalDNS integration.

### Architecture

```
        Lab Server (control plane)
                │
                │ PowerDNS REST API
                ▼
        ┌───────────────┐
        │   PowerDNS    │
        │  Authoritative│──── PostgreSQL/SQLite backend
        │    Server     │
        └───────┬───────┘
                │
    ┌───────────┼───────────┐
    │           │           │
    ▼           ▼           ▼
Internal DNS  ExternalDNS  dnsdist
.lab.internal (k8s syncs   (split DNS
              Services/     routing)
              Ingress)
```

### How Lab Uses DNS

#### Auto-registration on onboard

When `lab onboard` completes, Lab calls the PowerDNS API to create or update:

- A record: `<hostname>.lab.internal → <ip>`
- PTR record: `<reversed-ip>.in-addr.arpa → <hostname>.lab.internal`
- Both created/updated atomically

#### Domain claims via labels

Labels can claim shared domain names:

```yaml
labels:
  mailserver:
    dns:
      records:
        - type: A
          name: "{{server.name}}.lab.internal"
      claims:
        - name: mail.example.com
          type: A
          health_check: { port: 25 }
```

All servers with the label `mailserver` contribute to the `mail.example.com` round-robin. PowerDNS Lua records remove unhealthy servers automatically.

#### IP mobility

When the Lab agent on a machine reports an IP change, the Lab server calls the PowerDNS API to update the A record, the PTR record, and every claimed domain.

#### K8s integration

ExternalDNS runs in k8s and syncs Service/Ingress records to the same PowerDNS instance, so one DNS server serves both bare-metal and k8s records.
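
The auto-registration and IP-mobility flows above both come down to replacing an A RRset and its matching PTR RRset through the PowerDNS HTTP API. The following is a minimal Go sketch of building those payloads, not Lab's actual implementation. The JSON field names follow the RRset body accepted by `PATCH /api/v1/servers/{server_id}/zones/{zone_id}` (authenticated with the `X-API-Key` header); the function names, the host `web01.lab.internal`, and the address `10.0.4.17` are hypothetical, and only IPv4 is handled.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net"
	"strings"
)

// Record and RRSet mirror the JSON shape of a PowerDNS zone PATCH body.
type Record struct {
	Content  string `json:"content"`
	Disabled bool   `json:"disabled"`
}

type RRSet struct {
	Name       string   `json:"name"`
	Type       string   `json:"type"`
	TTL        int      `json:"ttl"`
	ChangeType string   `json:"changetype"`
	Records    []Record `json:"records"`
}

// fqdn appends the trailing dot that PowerDNS expects on record names.
func fqdn(name string) string {
	if strings.HasSuffix(name, ".") {
		return name
	}
	return name + "."
}

// reverseName returns the in-addr.arpa owner name for an IPv4 address.
func reverseName(ip string) (string, error) {
	v4 := net.ParseIP(ip).To4()
	if v4 == nil {
		return "", fmt.Errorf("not an IPv4 address: %q", ip)
	}
	return fmt.Sprintf("%d.%d.%d.%d.in-addr.arpa.", v4[3], v4[2], v4[1], v4[0]), nil
}

// registrationRRSets (hypothetical helper) builds the A and PTR changes for
// one server. REPLACE overwrites any existing RRset, so the same call covers
// both first registration and a later IP change.
func registrationRRSets(host, ip string, ttl int) (a, ptr RRSet, err error) {
	rev, err := reverseName(ip)
	if err != nil {
		return RRSet{}, RRSet{}, err
	}
	name := fqdn(host)
	a = RRSet{Name: name, Type: "A", TTL: ttl, ChangeType: "REPLACE",
		Records: []Record{{Content: ip}}}
	ptr = RRSet{Name: rev, Type: "PTR", TTL: ttl, ChangeType: "REPLACE",
		Records: []Record{{Content: name}}}
	return a, ptr, nil
}

func main() {
	a, ptr, err := registrationRRSets("web01.lab.internal", "10.0.4.17", 300)
	if err != nil {
		panic(err)
	}
	// Body for the forward-zone PATCH request.
	body, _ := json.MarshalIndent(map[string][]RRSet{"rrsets": {a}}, "", "  ")
	fmt.Println(string(body))
	fmt.Println(ptr.Name, "->", ptr.Records[0].Content)
}
```

The forward and reverse names normally live in different zones, so in this sketch each RRset would go out in its own PATCH request; how Lab keeps the pair consistent if one request fails is not specified here.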
#### Groups claiming domains

Groups can claim domains for all member servers:

```yaml
groups:
  production-web:
    match:
      labels: [web-frontend]
      environment: prod
    dns:
      claims:
        - name: www.example.com
          type: A
          health_check: { url: "https://{{server.ip}}/healthz" }
```

### DNS Plugin Interface

```go
type DNSPlugin interface {
	Name() string

	// Record management
	CreateRecord(zone, name, recordType string, targets []string, ttl int) error
	UpdateRecord(zone, name, recordType string, targets []string, ttl int) error
	DeleteRecord(zone, name, recordType string) error
	ListRecords(zone string) ([]Record, error)

	// Health-checked records
	CreateHealthCheckedRecord(zone, name string, targets []string, check HealthCheck) error

	// Zone management
	CreateZone(name string, kind string) error
	DeleteZone(name string) error
}
```

Built-in:

- `dns-powerdns`: PowerDNS REST API (primary)
- `dns-route53`: AWS Route53 (for cloud deployments)
- `dns-rfc2136`: RFC 2136 dynamic updates (BIND/Knot fallback)

### Split DNS Setup

Internal zones (`.lab.internal`) served by PowerDNS authoritatively. External queries forwarded upstream (8.8.8.8, ISP DNS).

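
The routing rule just described (internal names to PowerDNS, everything else upstream) can be sketched as a suffix match on the query name. This is an illustrative Go sketch only: `inZone`, `route`, and the backend labels `powerdns` and `upstream` are hypothetical names, and a real deployment would express the same rule in the resolver or load balancer's own configuration rather than in application code.

```go
package main

import (
	"fmt"
	"strings"
)

// inZone reports whether qname equals zone or is a subdomain of it.
// The comparison is case-insensitive and ignores a trailing dot. Matching on
// "."+zone keeps "notlab.internal" from being treated as part of "lab.internal".
func inZone(qname, zone string) bool {
	q := strings.ToLower(strings.TrimSuffix(qname, "."))
	z := strings.ToLower(strings.TrimSuffix(zone, "."))
	return q == z || strings.HasSuffix(q, "."+z)
}

// route picks a backend label for a query name: the authoritative server for
// internal zones, the upstream forwarders for everything else.
func route(qname string, internalZones []string) string {
	for _, z := range internalZones {
		if inZone(qname, z) {
			return "powerdns"
		}
	}
	return "upstream"
}

func main() {
	zones := []string{"lab.internal"}
	for _, q := range []string{"web01.lab.internal.", "example.com.", "notlab.internal."} {
		fmt.Printf("%s -> %s\n", q, route(q, zones))
	}
}
```
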
Options:

- **dnsdist** (PowerDNS ecosystem): routes by source subnet
- **CoreDNS as resolver**: serves internal from PowerDNS, forwards external
- **BIND views**: if we need view-based split on same zone (unlikely)

### Evaluated and Not Chosen

| Tool | Why Not |
|------|---------|
| CoreDNS | No REST API, needs etcd intermediary, k8s-focused |
| BIND9 | No REST API, nsupdate is cumbersome for automation |
| Technitium | No ExternalDNS provider, newer/smaller community |
| dnsmasq | Not suitable: caching forwarder, no API, ~1000 client limit |
| Knot DNS | No REST API, better as secondary/downstream |

### DNS-as-Code (Optional Layer)

For static DNS infrastructure (SOA, NS, MX, base zone config):

- **octoDNS** (GitHub) or **DNSControl** (Stack Exchange)
- GitOps workflow: PR → review → merge → sync to PowerDNS
- Dynamic records (server A records, claims) managed by Lab directly via API
- Static records managed via DNS-as-code in Git
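
With two writers on the same PowerDNS instance, it helps to make the ownership split explicit so that neither the Git sync nor Lab overwrites the other's records. The Go sketch below is one possible way to encode the split listed above; it is an assumption, not part of the design. The `Owner` type, `ownerFor`, and `mayWrite` are hypothetical names, `LUA` is included on the guess that health-checked claims are stored as PowerDNS Lua records, and partitioning purely by record type is a simplification (a statically managed apex A record, for example, would need a name-based rule).

```go
package main

import (
	"fmt"
	"strings"
)

// Owner identifies which system is allowed to write a record.
type Owner string

const (
	OwnerGit  Owner = "dns-as-code" // static records synced from Git
	OwnerLab  Owner = "lab-api"     // dynamic records written by Lab
	OwnerNone Owner = "unassigned"  // not covered by either list
)

// ownerFor maps a record type to its writer. The mapping is an assumption
// derived from the static/dynamic lists in this section.
func ownerFor(recordType string) Owner {
	switch strings.ToUpper(recordType) {
	case "SOA", "NS", "MX":
		return OwnerGit
	case "A", "PTR", "LUA":
		return OwnerLab
	default:
		return OwnerNone
	}
}

// mayWrite reports whether writer is the assigned owner of recordType.
func mayWrite(writer Owner, recordType string) bool {
	return writer != OwnerNone && ownerFor(recordType) == writer
}

func main() {
	for _, t := range []string{"SOA", "MX", "A", "PTR", "TXT"} {
		fmt.Printf("%-3s owned by %s\n", t, ownerFor(t))
	}
}
```

A check like `mayWrite(OwnerLab, recordType)` could run before each API call, with unassigned types rejected until someone decides which side owns them.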