lab/config-format-research.md

2026-03-15 23:50:43 +00:00
# Configuration Format Research
## Decision: PENDING — exploring alternatives to raw Kubernetes YAML
## The Problem
Kubernetes YAML is verbose, repetitive, lacks type safety, and forces users to specify
every layer of concern (intent, team defaults, org standards, k8s boilerplate) in one file.
Helm "solves" this with Go templating, which produces unreadable template spaghetti.
Docker Compose is the benchmark for UX: roughly 6 lines versus 35 lines of Kubernetes YAML for the same deployment.
The problem was never YAML itself; it was being forced to write too much of it.
## Core Design Principle
Users should only define what they care about. Everything else should be inherited from
expert-defined defaults. YAML (or JSON) can exist underneath as:
- A human-readable, non-binary backup format
- A surface for live editing
- Debugging / inspection output
## Layered Architecture
```
Layer 1: User intent     "I want an api service running myapp"       ← USER WRITES THIS
Layer 2: Team defaults   "Our services get health checks, limits"    ← Team lead defines
Layer 3: Org standards   "All pods need security context, labels"    ← Platform team defines
Layer 4: Output          Full YAML/JSON for kubectl, backup, debug   ← GENERATED
```
Docker Compose feels good because it's only Layer 1 — Docker handles the rest.
Kubernetes forces all 4 layers into one file.
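One way to picture the layering is as a recursive merge of plain dictionaries, rendered in order from the broadest layer to the most specific. The sketch below is only an illustration under assumptions: the key names, the values, and the rule that the more specific layer wins on conflict are all hypothetical, and a real tool might instead let org standards lock certain fields.

```python
import json
from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict; nested dicts merge recursively, `override` wins on conflict."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Layer 3: org standards (hypothetical values, defined by the platform team)
ORG_STANDARDS = {
    "labels": {"managed-by": "platform"},
    "securityContext": {"runAsNonRoot": True},
}

# Layer 2: team defaults (hypothetical values, defined by the team lead)
TEAM_DEFAULTS = {
    "labels": {"team": "payments"},
    "resources": {"limits": {"cpu": "500m", "memory": "256Mi"}},
    "healthCheck": {"path": "/healthz"},
}

# Layer 1: user intent (the only part the user writes)
USER_INTENT = {
    "name": "api",
    "image": "myapp:latest",
    "resources": {"limits": {"memory": "512Mi"}},
}


def render(*layers: dict) -> dict:
    """Merge layers left to right; later (more specific) layers override earlier ones."""
    result: dict = {}
    for layer in layers:
        result = deep_merge(result, layer)
    return result


# Layer 4: generated output, here printed as JSON for inspection
output = render(ORG_STANDARDS, TEAM_DEFAULTS, USER_INTENT)
print(json.dumps(output, indent=2, sort_keys=True))
```

In this sketch the user overrides only the memory limit, while the CPU limit, health check, labels, and security context are inherited from the layers below.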
## Evaluated Alternatives
### Tier 1 — Strong Contenders
**Pkl (Apple)**
- Best syntax for "amend a template" via `amends` keyword
- Strong static typing, clean readable syntax
- Lowest ceremony for simple cases
- Risk: Apple may abandon it; the implementation is JVM-based, although native `pkl` executables are also distributed
- K8s support: `pkl-k8s` package exists
**KCL (CNCF Sandbox)**
- Python-like syntax, lowest learning curve of typed options
- Schema defaults, validation, constraints built in
- CNCF backing gives legitimacy
- Risk: primarily driven by a single vendor, Ant Group (an Alibaba affiliate)
**CUE**
- Most principled — constraint-based unification, not inheritance
- Used by Timoni (Helm replacement), KubeVela, Dagger
- Defaults marked with `*`, types and values on same spectrum
- Risk: steep learning curve, novel paradigm
- Most mature K8s ecosystem of the three
### Tier 2 — Viable But Weaker Fit
**cdk8s (TypeScript)**
- Full IDE support and compile-time type checking
- cdk8s+ (the higher-level library on top of cdk8s) has intent-driven APIs ("I want a web service" → generates Deployment+Service)
- Risk: brings software engineering complexity into config; the project originated at AWS
- Good if team is TypeScript-native
**Jsonnet (via Tanka)**
- Proven at scale (Grafana Labs uses it across hundreds of services)
- Object mixins via `+` operator for composition
- Risk: weak type safety, no compile-time validation of field names
### Tier 3 — Not Recommended
- **Dhall**: very strong type safety, but Haskell-like syntax and a small/stale community
- **Nickel**: elegant contracts system, but tiny K8s ecosystem
- **Starlark**: no type safety, no schema system, just a scripting layer
- **HCL**: great for infra provisioning, wrong fit for k8s manifests
### Dead Projects
- **Winglang** — shut down April 2025
- **Klotho** — archived, pivoted to InfraCopilot
- **Acorn** — pivoted to AI agents (Obot)
## Compose-Like Input Format (Preferred Direction)
The user prefers Docker Compose brevity. The tool we build could use a Compose-inspired
input format at Layer 1, generating full k8s manifests + provider-specific resources underneath:
```yaml
# What the user writes
services:
  api:
    image: myapp:latest
    size: medium
    ports: [8080]
    env:
      DB_HOST: postgres
# System generates: full k8s Deployment, Service, NetworkPolicy,
# resource limits, security context, health checks, etc.
```
YAML is fine for Layer 1 if it's short enough. The problem was never the format —
it was the verbosity. Compose proves short YAML works.
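To make the expansion step concrete, here is a minimal sketch of how a Compose-like service entry might be turned into a Kubernetes Deployment and Service. It works on plain dictionaries (the parsed form of the YAML above) and emits JSON, which `kubectl` also accepts. The `SIZE_PRESETS` table, the `expand_service` helper, and the choice of one replica are assumptions made for illustration; NetworkPolicy, security context, and health checks are left out for brevity.

```python
import json

# Hypothetical size presets; real values would come from team/org defaults.
SIZE_PRESETS = {
    "small": {"cpu": "250m", "memory": "128Mi"},
    "medium": {"cpu": "500m", "memory": "512Mi"},
    "large": {"cpu": "1", "memory": "1Gi"},
}


def expand_service(name: str, spec: dict) -> list[dict]:
    """Expand one Compose-like service entry into Kubernetes manifests."""
    labels = {"app": name}
    ports = spec.get("ports", [])
    container = {
        "name": name,
        "image": spec["image"],
        "ports": [{"containerPort": p} for p in ports],
        "env": [{"name": k, "value": str(v)} for k, v in spec.get("env", {}).items()],
        "resources": {"limits": SIZE_PRESETS[spec.get("size", "small")]},
    }
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": 1,  # assumed default for this sketch
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [container]},
            },
        },
    }
    manifests = [deployment]
    if ports:  # only expose a Service when the user declared ports
        manifests.append({
            "apiVersion": "v1",
            "kind": "Service",
            "metadata": {"name": name, "labels": labels},
            "spec": {
                "selector": labels,
                "ports": [
                    {"name": f"port-{p}", "port": p, "targetPort": p} for p in ports
                ],
            },
        })
    return manifests


# Parsed form of the Layer 1 input shown above.
user_input = {
    "services": {
        "api": {
            "image": "myapp:latest",
            "size": "medium",
            "ports": [8080],
            "env": {"DB_HOST": "postgres"},
        }
    }
}

manifests = [
    manifest
    for name, spec in user_input["services"].items()
    for manifest in expand_service(name, spec)
]
print(json.dumps(manifests, indent=2))
```

Even this reduced version turns seven lines of input into several dozen lines of output, which is the gap the layered design is meant to hide from the user.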
## Open Questions
1. Should Layer 1 input be YAML (Compose-like), or a typed language (Pkl/KCL/CUE)?
2. How do team defaults (Layer 2) and org standards (Layer 3) get defined and distributed?
3. Should the render view show the generated YAML diff when changing Layer 1 input?
4. How does this integrate with the Pulumi multi-cloud abstraction layer?
5. Could the input format support both k8s workloads AND infrastructure resources
(VMs, networks, storage) in the same spec?
## GUI/TUI Space — Underserved Opportunity
No tool has achieved significant adoption for visually *defining* infrastructure.
Existing tools (K9s, Lens, Rancher) are for monitoring/management, not authoring.
The ideal: platform engineers define schemas with constraints/defaults,
developers interact with a form/wizard showing only fields they need,
validated config generated underneath. Nobody has built this well yet.
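The schema-driven form idea can be sketched in a few lines. Everything here is hypothetical: the `FieldSpec` class, the field names, and the constraints are invented for illustration, and a real tool would more likely build on an existing schema language. The point is only the split between fields a developer sees and fields the platform team fixes.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class FieldSpec:
    """One schema field, as a platform engineer might define it (hypothetical)."""
    name: str
    default: Any = None
    required: bool = False  # must be supplied by the developer
    exposed: bool = False   # shown in the developer-facing form
    check: Optional[Callable[[Any], bool]] = None  # constraint on the final value


# Hypothetical schema: developers see two fields, the rest is platform-controlled.
SCHEMA = [
    FieldSpec("image", required=True, exposed=True),
    FieldSpec("size", default="small", exposed=True,
              check=lambda v: v in {"small", "medium", "large"}),
    FieldSpec("replicas", default=2,
              check=lambda v: isinstance(v, int) and 1 <= v <= 10),
    FieldSpec("run_as_non_root", default=True),
]


def form_fields(schema: list[FieldSpec]) -> list[str]:
    """Names of the fields a form or wizard would display."""
    return [f.name for f in schema if f.exposed]


def resolve(schema: list[FieldSpec], answers: dict) -> dict:
    """Combine form answers with defaults and validate every constraint."""
    allowed = set(form_fields(schema))
    unexpected = set(answers) - allowed
    if unexpected:
        raise ValueError(f"fields not editable by developers: {sorted(unexpected)}")
    config = {}
    for f in schema:
        if f.name in answers:
            value = answers[f.name]
        elif f.required:
            raise ValueError(f"missing required field: {f.name}")
        else:
            value = f.default
        if f.check is not None and not f.check(value):
            raise ValueError(f"invalid value for {f.name}: {value!r}")
        config[f.name] = value
    return config


print(form_fields(SCHEMA))
print(resolve(SCHEMA, {"image": "myapp:latest", "size": "medium"}))
```

A GUI or TUI would render only the output of `form_fields` and pass the answers to `resolve`, so the generated config is validated before any manifest is produced.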