feat(docs+smoke): LB pool live smoke + virtual-llms.md pool semantics (v4 Stage 3)
Some checks failed
CI/CD / lint (pull_request) Successful in 53s
CI/CD / test (pull_request) Successful in 1m8s
CI/CD / typecheck (pull_request) Successful in 2m53s
CI/CD / smoke (pull_request) Failing after 1m47s
CI/CD / build (pull_request) Successful in 6m20s
CI/CD / publish (pull_request) Has been skipped
Some checks failed
CI/CD / lint (pull_request) Successful in 53s
CI/CD / test (pull_request) Successful in 1m8s
CI/CD / typecheck (pull_request) Successful in 2m53s
CI/CD / smoke (pull_request) Failing after 1m47s
CI/CD / build (pull_request) Successful in 6m20s
CI/CD / publish (pull_request) Has been skipped
Smoke (tests/smoke/llm-pool.smoke.test.ts): two in-process registrars
publish virtual Llms with distinct names but a shared poolName, then:
1. /api/v1/llms/<name>/members surfaces both with the correct
effective pool key, size, activeCount, and per-member kind/status.
2. Chat through an agent pinned to one pool member dispatches across
the pool — verified by running 12 calls and asserting at least
one response from each backend (the random-shuffle selection
would have to hit only-A or only-B in 12 fair coin flips, ~1/2048).
3. Failover: stop one publisher, the surviving member still serves
chat. /members shows the stopped row as inactive immediately
(unbindSession runs synchronously on SSE close).
docs/virtual-llms.md gets a full "LB pools (v4)" section with the
two-field schema model, dispatcher selection + failover semantics,
public + virtual declaration examples, list/describe rendering, the
"pin to specific instance" escape hatch, and an API surface entry
for /members. docs/agents.md cross-link extended.
Tests: full smoke 144/144 (was 141, +3 for the new pool smoke).
Stages 1-3 ship the complete v4 — public and virtual Llms can both
join pools, agents transparently load-balance across them, yaml
round-trip preserves poolName, and the existing single-Llm world
keeps working byte-identically when poolName is null.
This commit is contained in:
@@ -208,5 +208,8 @@ mcpctl chat reviewer
|
||||
extends the same publishing model to **virtual agents** declared in
|
||||
mcplocal config — they show up in `mcpctl get agent` with
|
||||
`KIND=virtual / STATUS=active` and become chat-able via
|
||||
`mcpctl chat <name>` like any other agent.
|
||||
`mcpctl chat <name>` like any other agent. **v4** adds **pools**:
|
||||
Llms sharing a `poolName` stack into one load-balanced pool that
|
||||
the chat dispatcher transparently widens to at request time, with
|
||||
random selection + sequential failover on transport errors.
|
||||
- [chat.md](./chat.md) — `mcpctl chat` flow and LiteLLM-style flags.
|
||||
|
||||
Reference in New Issue
Block a user