# Crossplane Evaluation

## Decision: NOT ADOPTING

Crossplane will not be used in this stack. The lack of a plan/preview mechanism is a dealbreaker for enterprise adoption and safe infrastructure management.

---

## Why We Evaluated It

The core problem: Terraform/OpenTofu requires re-implementing the same infrastructure concepts per platform (AWS, XCP-ng, bare metal). At thousands of nodes across multiple platforms, this is a massive maintenance burden.

Crossplane's XRD/Composition model promised a unified API:

```
XRD: "VirtualMachine" (universal API)
├── Composition: AWS → EC2 instance
├── Composition: XCP-ng → XO VM
└── Composition: bare metal → MAAS / Ansible
```

One API, multiple backends: teams request a "VirtualMachine" and the right composition handles it.

## Strengths

- **CNCF Graduated** (Nov 2025, v2.2): Apache 2.0 license, top-tier maturity
- **Continuous drift detection**: automatically reverts manual changes, unlike Terraform's on-demand plan/apply
- **No state file management**: no remote backends, locking issues, or state corruption
- **Kubernetes-native**: works with ArgoCD, Flux, kubectl, RBAC out of the box
- **XRDs/Compositions**: genuine multi-platform abstraction layer, solves the "re-implement per cloud" problem
- **Eventual consistency**: resources with complex dependencies don't get stuck like Terraform's dependency graph
- **Enterprise adoption**: Deutsche Kreditbank, Elastic, Nike, Apple, NASA, Grafana Labs, 60+ orgs
- **Deutsche Kreditbank** replaced Terraform; deployments went from weeks to under one hour

## Dealbreaker: No Plan/Preview

The single biggest issue. Terraform's `terraform plan` lets operators see exactly what will change before applying. Crossplane applies changes immediately upon resource creation/modification.

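To make concrete what a preview step provides, here is a minimal sketch of the diff a plan computes: the desired spec is compared with the observed state and the result is only reported, never applied. This is not Crossplane or Terraform code; the `plan` function, the `Change` type, and the dict-based state are hypothetical names chosen for illustration.

```python
from typing import Any, NamedTuple


class Change(NamedTuple):
    """One proposed change to a single field (hypothetical type for this sketch)."""
    action: str  # "add", "update", or "remove"
    field: str
    before: Any
    after: Any


def plan(observed: dict[str, Any], desired: dict[str, Any]) -> list[Change]:
    """Return the changes needed to move observed to desired, without applying them."""
    changes = []
    for field in sorted(observed.keys() | desired.keys()):
        if field not in observed:
            changes.append(Change("add", field, None, desired[field]))
        elif field not in desired:
            changes.append(Change("remove", field, observed[field], None))
        elif observed[field] != desired[field]:
            changes.append(Change("update", field, observed[field], desired[field]))
    return changes


# Illustrative data only: a VM whose size is being changed and a tag added.
observed = {"instance_type": "t3.small", "disk_gb": 50}
desired = {"instance_type": "t3.large", "disk_gb": 50, "owner": "sre"}

for change in plan(observed, desired):
    print(f"{change.action:7} {change.field}: {change.before!r} -> {change.after!r}")
```

The point of the sketch is the gap between computing this list and acting on it: a plan workflow lets an operator approve or reject the list first, whereas in the reconciliation model described above there is no such review point.
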
- Discussed in the community for 2+ years with no resolution
- A Kubernetes-native solution would be a `Plan` CRD that shows proposed changes before approval
- ArgoCD `sync --dry-run` is a partial workaround but only shows k8s resource diffs, not what the cloud provider will actually do underneath
- **For regulated environments and SRE teams at scale, change preview is non-negotiable**

Possible reasons it hasn't been implemented:

- The continuous reconciliation architecture may make point-in-time snapshots fundamentally hard
- Upbound (commercial entity) may be reserving it for their paid platform
- Or simply not prioritised

## Other Significant Concerns

### CRD Bloat

- `provider-aws` installs 900+ CRDs, which can make the API server unresponsive for up to an hour (GitHub #2649)
- Exceeds Kubernetes' recommended ~500 CRD limit
- Mitigated by "Provider Families" (install per-service sub-providers) but requires careful planning

### Debugging Difficulty

- Errors propagate through layers: Claim → XR → Composition → Managed Resource → Provider → Cloud API
- Multiple sources report debugging compositions is painful
- Pipeline Inspector (alpha in v2.2) is being introduced but not production-ready

### Chicken-and-Egg Problem

- Crossplane runs inside Kubernetes, so it cannot provision the cluster it runs on
- Requires a "management cluster" bootstrapped by other means (Terraform, Puppet, etc.)
- If the management cluster dies, no drift detection or reconciliation runs
- Recovery: applying YAMLs to a new cluster works if deterministic resource names are used, otherwise risks creating duplicate cloud resources

### Cluster Loss / Immutability Concerns

- State lives in etcd, not a versionable state file
- No independent audit trail or easy way to diff historical states
- On new cluster: resources with explicit external names get adopted; auto-named resources get duplicated
- Need etcd backups as insurance, and deterministic naming everywhere

### Performance at Scale

- ~2000 composites took 6+ minutes to reconcile on k3d (GitHub #2256)
- Reconciliation interval not easily configurable globally (GitHub #5934)

### YAML Limitations

- No native loops, conditionals, or programming constructs
- Complex compositions require changes in multiple locations

## XCP-ng Provider Gap

- No Crossplane provider for XCP-ng exists today
- A mature Terraform provider (`terraform-provider-xenorchestra`) exists, maintained by Vates
- Could be wrapped via Upjet to auto-generate a Crossplane provider, but nobody has done it
- Would be a greenfield open-source project

## Real Issues Reported

- API server unresponsiveness with too many CRDs (GitHub #2649)
- CRD scaling issues beyond ~500 CRDs (GitHub #2895)
- GCP SQL resources randomly marked for deletion, which is dangerous for production databases
- Reconciliation rate limiting at scale (GitHub #2256)

## Conclusion

Crossplane solves a real problem (multi-platform abstraction) that we need, but the lack of plan/preview makes it unsuitable for enterprise-scale production infrastructure management. The operational concerns (CRD bloat, debugging, cluster dependency) add further risk.

We need to find an alternative approach to the multi-platform abstraction problem that Crossplane solves, while retaining plan/preview capabilities.
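
For reference, the deterministic-naming requirement noted under Cluster Loss can be sketched as follows. This is a minimal sketch, assuming external names are derived purely from a resource's logical identity so that the same manifest applied to a rebuilt cluster points at the same cloud resource. The `external_name` scheme and the `with_external_name` helper are hypothetical; the only Crossplane-specific detail is the `crossplane.io/external-name` annotation key.

```python
import hashlib


def external_name(environment: str, team: str, name: str) -> str:
    """Derive a stable external name from logical identity (hypothetical scheme)."""
    identity = f"{environment}/{team}/{name}"
    # Hash of the slash-joined identity disambiguates inputs whose hyphen-joined
    # readable parts would collide, e.g. ("a-b", "c") vs ("a", "b-c").
    suffix = hashlib.sha256(identity.encode("utf-8")).hexdigest()[:8]
    return f"{environment}-{team}-{name}-{suffix}"


def with_external_name(manifest: dict, environment: str, team: str) -> dict:
    """Return a copy of a manifest with the external-name annotation set."""
    metadata = dict(manifest.get("metadata", {}))
    annotations = dict(metadata.get("annotations", {}))
    annotations["crossplane.io/external-name"] = external_name(
        environment, team, metadata["name"]
    )
    metadata["annotations"] = annotations
    return {**manifest, "metadata": metadata}
```

Because the name depends only on its inputs, generating it again on a replacement cluster yields the same value, which is the precondition for adoption rather than duplication.
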