feat: v4 LB pools by shared poolName #69
Summary

v4 adds a load-balanced pool model without introducing a new resource.

`Llm.name` stays globally unique (the apply key); a new optional
`Llm.poolName` declares membership. Multiple Llms sharing a non-null
`poolName` stack into one pool that the chat dispatcher expands at
request time and selects from with random + sequential failover. Solo
Llms (`poolName = null`) work exactly as pre-v4 — the effective pool
key falls back to the row's own name, the pool is size 1, and there is
no failover.
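The fallback rule above can be sketched in a few lines. This is an illustrative model only; `Llm` and `effectivePoolKey` are hypothetical names, not the actual mcpd types.

```typescript
// Illustrative model of the two-field schema described above.
interface Llm {
  name: string;            // globally unique apply key
  poolName: string | null; // optional pool membership
}

// A solo Llm (poolName = null) keys its own pool-of-1, so pre-v4
// behavior is unchanged; a non-null poolName joins the shared pool.
function effectivePoolKey(llm: Llm): string {
  return llm.poolName ?? llm.name;
}
```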
The user direction was explicit: keep `Llm.name` unique, separate
"resource name" from "pool name", and make pools impossible to mistake
in `mcpctl get llm`. That's what landed.

Three stages
Stage 1 (7949e13): `poolName` schema + repo + service + `chat.service`
dispatcher with random selection + transport-failure failover. New
`findByPoolName` returns members where
`poolName = $1 OR (poolName IS NULL AND name = $1)` so solo rows stay
addressable as a pool-of-1. 5 new chat-service tests, 7 new db schema
tests.
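The "random selection + transport-failure failover" semantics can be sketched as shuffle-then-try-in-order. This is a minimal sketch of the described behavior, not the actual dispatcher code; `Member`, `dispatchChat`, and the error handling are assumptions.

```typescript
// Hypothetical pool member: a name plus a chat transport call.
type Member = { name: string; chat: (prompt: string) => Promise<string> };

// Fisher-Yates shuffle for the random pick.
function shuffle<T>(items: T[]): T[] {
  const out = [...items];
  for (let i = out.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [out[i], out[j]] = [out[j], out[i]];
  }
  return out;
}

// Random selection with sequential failover: try members in shuffled
// order, falling through to the next member on a transport failure.
async function dispatchChat(members: Member[], prompt: string): Promise<string> {
  let lastError: unknown;
  for (const member of shuffle(members)) {
    try {
      return await member.chat(prompt);
    } catch (err) {
      lastError = err; // transport failure: try the next member
    }
  }
  throw lastError ?? new Error("pool has no members");
}
```

A solo pool-of-1 degenerates to a single try with no failover, matching the pre-v4 path.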
Stage 2 (e21f960): `GET /api/v1/llms/<name>/members` returns members +
aggregate size/activeCount. CLI gains a POOL column right after NAME, a
`Pool:` block in `describe llm` with member list and "← this row"
indicator, a `--pool-name` flag on `create llm`, yaml round-trip with
`poolName`, and shell completions. mcplocal `LlmProviderFileEntry` +
`RegistrarPublishedProvider` thread `poolName` through the register
payload, validated server-side with the same regex as
`CreateLlmSchema`.

Stage 3 (137711f): live smoke against the deployed mcpd — two
in-process publishers share a `poolName`, an agent pinned to one member
dispatches across both (asserted across 12 calls), and failover is
verified by stopping one publisher and confirming the survivor serves
chat. New "LB pools (v4)" section in docs/virtual-llms.md with
declaration examples for public + virtual, dispatcher semantics, and
the API surface entry.
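The `/members` payload described above (members plus aggregate size/activeCount) can be modeled roughly as follows. Field names here are inferred from this description and may not match the actual mcpd response shape.

```typescript
// Assumed per-member entry: kind/status as reported by /members.
interface PoolMember {
  name: string;
  kind: "public" | "virtual";
  status: "active" | "inactive";
}

// Assumed aggregate shape of GET /api/v1/llms/<name>/members.
interface PoolMembersResponse {
  poolName: string;      // effective pool key
  size: number;          // total members
  activeCount: number;   // members currently able to serve chat
  members: PoolMember[];
}

// Sketch of how the aggregates relate to the member list.
function summarize(poolName: string, members: PoolMember[]): PoolMembersResponse {
  return {
    poolName,
    size: members.length,
    activeCount: members.filter((m) => m.status === "active").length,
    members,
  };
}
```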
Test plan

- `get llm` shows the POOL column; `describe` shows the Pool block + members
- `get -o yaml | apply -f -` round-trips without diff
- solo Llms (`poolName = null`) render — in the POOL column and suppress the Pool block in `describe`
Smoke (tests/smoke/llm-pool.smoke.test.ts): two in-process registrars
publish virtual Llms with distinct names but a shared `poolName`, then:

1. `/api/v1/llms/<name>/members` surfaces both with the correct
   effective pool key, size, activeCount, and per-member kind/status.
2. Chat through an agent pinned to one pool member dispatches across
   the pool — verified by running 12 calls and asserting at least one
   response from each backend (the random-shuffle selection would have
   to hit only-A or only-B in 12 fair coin flips, ~1/2048).
3. Failover: stop one publisher, the surviving member still serves
   chat. `/members` shows the stopped row as inactive immediately
   (unbindSession runs synchronously on SSE close).

docs/virtual-llms.md gets a full "LB pools (v4)" section with the
two-field schema model, dispatcher selection + failover semantics,
public + virtual declaration examples, list/describe rendering, the
"pin to specific instance" escape hatch, and an API surface entry for
`/members`. docs/agents.md cross-link extended.

Tests: full smoke 144/144 (was 141, +3 for the new pool smoke).

Stages 1-3 ship the complete v4 — public and virtual Llms can both join
pools, agents transparently load-balance across them, yaml round-trip
preserves `poolName`, and the existing single-Llm world keeps working
byte-identically when `poolName` is null.
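For the 12-call dispatch assertion, the ~1/2048 flake probability follows directly from uniform selection over two members:

```typescript
// With 2 members and uniform random selection, the chance that all 12
// calls land on the same member is 2 * (1/2)^12 = 1/2048, so asserting
// "at least one response from each backend" is safe at 12 calls.
const calls = 12;
const pSameBackend = 2 * Math.pow(0.5, calls);
console.log(1 / pSameBackend); // 2048
```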