Adds tier-based LLM routing so fast local models (vLLM, Ollama) handle
structured tasks while cloud models (Gemini, Anthropic) are reserved for
heavy reasoning. Single-provider configs continue to work via fallback.
- Tier type + ProviderRegistry with assignTier/getProvider/fallback chain
- Multi-provider config format: { providers: [{ name, type, tier, ... }] }
- NamedProvider wrapper for multiple instances of same provider type
- Setup wizard: Simple (legacy) / Advanced (fast+heavy tiers) modes
- Status display: tiered view with /llm/providers endpoint
- Call sites use getProvider('fast') instead of getActive()
- Full backward compatibility with existing single-provider configs
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace per-call gemini CLI spawning (~10s cold start each time) with
persistent ACP (Agent Client Protocol) subprocess. First call absorbs
the cold start, subsequent calls are near-instant over JSON-RPC stdio.
- Add AcpClient: manages persistent gemini --experimental-acp subprocess
with lazy init, auto-restart on crash/timeout, NDJSON framing
- Add GeminiAcpProvider: LlmProvider wrapper with serial queue for
concurrent calls, same interface as GeminiCliProvider
- Add dispose() to LlmProvider interface + disposeAll() to registry
- Wire provider disposal into mcplocal shutdown handler
- Add status command spinner with progressive output and color-coded
LLM health check results (green checkmark/red cross)
- 25 new tests (17 ACP client + 8 provider)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>