Closes v2 (wake-on-demand). Same shape as v1's stage 6: smoke
exercises the live-cluster path, docs lose the "v2 reserved" caveat
and gain a full wake-recipe section.
Smoke (virtual-llm.smoke.test.ts):
- New "wake-on-demand" describe block runs alongside the v1 tests.
- Spins a tiny in-process HTTP "wake controller"; the published
provider's isAvailable() returns false until the wake POST flips
the bool. Asserts:
1. Provider publishes as kind=virtual / status=hibernating.
2. First inference triggers the wake recipe, the recipe POSTs
to the controller, the provider becomes available, mcpd
relays the inference, and the row settles to active.
- Cleans up the row + wake server in afterAll.
Docs (docs/virtual-llms.md):
- Lifecycle table updates the `hibernating` description from
"reserved for v2" to the actual v2 semantics.
- New "Wake-on-demand (v2)" section: configuration shapes for both
recipe types (http + command), the wake-then-infer flow diagram,
concurrent-infer dedup, failure semantics.
- Roadmap drops v2; v3-v5 still listed.
Workspace: 2050/2050 (smoke runs separately; the new SSE-based wake
test runs only against a live cluster, not under \`pnpm test:run\`).
v2 closes. v3 = virtual agents, v4 = LB pool by model, v5 = queue.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Second half of v2. mcpd now dispatches a \`wake\` task on the SSE
control channel when an inference request hits a row whose
status=hibernating, waits for the publisher to confirm readiness,
then proceeds with the infer task. Concurrent infers for the same
hibernating Llm share a single wake task — \`wakeInFlight\` map
dedupes by Llm name.
State machine in enqueueInferTask:
active → push infer task immediately (existing path).
inactive → 503, publisher offline (existing path).
hibernating → ensureAwake() → push infer task (new in v2).
ensureAwake/runWake (private):
- Allocates a fresh taskId on the existing PendingTask plumbing.
- Pushes \`{ kind: "wake", taskId, llmName }\` on the SSE handle.
- Awaits the publisher's result POST. On 2xx, flips the row to
active + bumps lastHeartbeatAt, so all queued + future infers
hit the active path. On non-2xx or service.failTask, the row
stays hibernating (next request retries).
Tests: 4 new in virtual-llm-service.test.ts cover happy path
(wake → infer in order), concurrent dedup (3 parallel infers, 1
wake task), wake failure surfaces to all queued infers and leaves
the row hibernating, inactive ≠ hibernating (still rejects with 503,
no wake attempt). 22/22 service tests, 2050/2050 workspace.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
First half of v2 — mcplocal can now declare a hibernating backend and
respond to a `wake` task by running a configured recipe. v2 Stage 2
will wire mcpd to dispatch the wake task before relaying inference.
Config (LlmProviderFileEntry):
- New \`wake\` block on a published provider:
wake:
type: http # or: command
url: ... # http only
method: POST # http only, default POST
headers: {...} # http only
body: ... # http only
command: ... # command only
args: [...] # command only
maxWaitSeconds: 60 # how long to poll isAvailable() after wake fires
Registrar (mcplocal):
- At publish time, providers with a wake recipe whose isAvailable()
returns false report initialStatus=hibernating to mcpd. Without a
wake recipe (legacy v1) or when already up, status stays active.
- handleWakeTask: runs the recipe (HTTP request OR child-process
spawn), then polls isAvailable() up to maxWaitSeconds, sending a
heartbeat each loop so mcpd's GC sweep doesn't time us out
mid-boot. Reports { ok, ms } on success or { error } on
timeout/recipe failure via the existing _provider-task/:id/result.
- Replaces the v1 stub that rejected wake tasks with "not implemented".
mcpd VirtualLlmService:
- RegisterProviderInput gains optional initialStatus ('active' |
'hibernating'). The register/upsert path uses it for both new and
reconnecting rows. Defaults to 'active' so v1 publishers still
work unchanged.
- Provider-register route's coercer accepts the new field.
Tests: 3 new in registrar.test.ts cover initialStatus selection
(hibernating when wake configured + unavailable, active otherwise,
active when no wake even if unavailable). 8/8 registrar tests, 833/833
mcpd unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>