feat(mcpd+db): Llm.poolName + chat dispatcher pool failover (v4 Stage 1)
Adds LB-pool-by-shared-name without introducing a new resource. The existing `Llm.name` stays globally unique; a new optional `poolName` column declares membership in a pool. Multiple Llms sharing a non-null `poolName` stack into one load-balanced pool that the chat dispatcher expands at request time.

Effective pool key = `poolName ?? name`. Solo rows (poolName=null) are addressable as a "pool of 1" via their own name, so existing single-Llm agents and YAMLs keep working unchanged. A solo row whose name happens to match an explicit poolName joins the same pool (by design), so an operator can transparently promote an existing Llm to pool seed.

Dispatcher (chat.service): prepareContext now resolves a randomly-shuffled list of viable pool candidates (status != inactive) once per turn. runOneInference and streamInference iterate the list on transport-level failure (network, virtual publisher disconnect) until one succeeds or the list is exhausted. Streaming failover only covers "failed before first chunk": once we've yielded text, we're committed to that backend. Auth/4xx errors surfaced as result.status are NOT retried; siblings with the same key/model would fail identically.

When the agent's pinned Llm is itself inactive but a sibling pool member is up, dispatch transparently uses the sibling; that's the whole point. When every member is inactive, prepareContext throws a clear "No active Llm in pool '<key>' (pinned: <name>)" error rather than letting the dispatcher's "exhausted" branch surface it.

Tests:
- 5 new chat-service tests for pool dispatch / failover / pinned-down / all-inactive (chat-service.test.ts).
- 7 new db schema tests for the column, the unique-name invariant, the fallback-to-name semantics, and the solo-name-joins-explicit-pool edge case (llm-pool-schema.test.ts).
- mcpd 865/865 (was 860; +5), db pool-schema 7/7, no regressions.

Stage 2 (next): HTTP route /api/v1/llms/<name>/members + aggregate pool stats on the existing single-Llm route, CLI POOL column + describe block + --pool-name flag, yaml round-trip.
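
For reviewers, the pool semantics described in this message can be sketched roughly as follows. This is a minimal illustration, not the code in chat.service: the `Llm` and `InferenceResult` shapes, and the names `effectivePoolKey`, `viableCandidates`, and `dispatchWithFailover`, are hypothetical. It also assumes that any exception thrown by `send` is a transport-level failure, whereas a 4xx comes back as a normal result and is returned without retry.

```typescript
// Hypothetical, trimmed-down shapes; the real models carry more fields.
interface Llm {
  name: string;            // globally unique
  poolName: string | null; // null means "pool of 1" keyed by name
  status: string;          // anything other than "inactive" counts as viable
}

interface InferenceResult {
  status: number; // auth/4xx is reported here, not thrown
  text: string;
}

// Effective pool key = poolName ?? name.
function effectivePoolKey(llm: Llm): string {
  return llm.poolName ?? llm.name;
}

// Resolve the shuffled list of viable candidates for the pinned Llm's pool.
// Throws early when every member is inactive.
function viableCandidates(
  all: Llm[],
  pinned: Llm,
  rng: () => number = Math.random,
): Llm[] {
  const key = effectivePoolKey(pinned);
  const members = all.filter(
    (l) => effectivePoolKey(l) === key && l.status !== "inactive",
  );
  if (members.length === 0) {
    throw new Error(`No active Llm in pool '${key}' (pinned: ${pinned.name})`);
  }
  // Fisher-Yates shuffle so load spreads across members.
  for (let i = members.length - 1; i > 0; i--) {
    const j = Math.floor(rng() * (i + 1));
    [members[i], members[j]] = [members[j], members[i]];
  }
  return members;
}

// Try each candidate in order. A thrown error (assumed transport-level)
// moves on to the next sibling; a returned result, including a 4xx,
// is final and is never retried.
async function dispatchWithFailover(
  candidates: Llm[],
  send: (llm: Llm) => Promise<InferenceResult>,
): Promise<InferenceResult> {
  let lastError: unknown;
  for (const llm of candidates) {
    try {
      return await send(llm);
    } catch (err) {
      lastError = err;
    }
  }
  throw new Error(
    `Pool exhausted after ${candidates.length} attempt(s): ${String(lastError)}`,
  );
}
```

Note that the pinned Llm only supplies the pool key here; it is filtered out like any other member when inactive, which is how a sibling can take over transparently. The streaming case ("failed before first chunk" only) is not modeled in this sketch.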
@@ -0,0 +1,10 @@
-- v4: optional pool key. When NULL, the effective pool key is the row's
-- own `name` (pool of 1, identical to pre-v4 behavior). Multiple Llms
-- sharing a non-null `poolName` stack into one load-balanced pool that
-- the chat dispatcher expands at request time.
ALTER TABLE "Llm" ADD COLUMN "poolName" TEXT;

-- Index covers both the dispatcher's `WHERE poolName = $1` lookup and
-- the v4 admin endpoint `GET /api/v1/llms/<name>/members` (which expands
-- by effective pool key).
CREATE INDEX "Llm_poolName_idx" ON "Llm"("poolName");
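
As an illustration of what "expands by effective pool key" could look like at the query level, here is a small sketch. It assumes PostgreSQL-style `$1` placeholders and a hand-written query; the helper `poolMembersQuery` and the `ParamQuery` shape are hypothetical, and the actual service may well go through an ORM instead.

```typescript
// Hypothetical parameterized-query shape (text plus positional values).
interface ParamQuery {
  text: string;
  values: string[];
}

// Build a lookup for all rows whose effective pool key (poolName ?? name)
// equals `poolKey`: explicit members match on "poolName", and a solo row
// (poolName IS NULL) matches on its own "name".
function poolMembersQuery(poolKey: string): ParamQuery {
  return {
    text:
      'SELECT "name", "poolName" FROM "Llm" ' +
      'WHERE "poolName" = $1 OR ("poolName" IS NULL AND "name" = $1)',
    values: [poolKey],
  };
}
```

The first branch of the `WHERE` clause is the one `Llm_poolName_idx` serves directly. The second branch filters on `name`, which the commit message says stays globally unique; whether that lookup is index-backed depends on how the uniqueness constraint is declared, which this migration does not show.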