Second half of v2. mcpd now dispatches a \`wake\` task on the SSE
control channel when an inference request hits a row whose
status=hibernating, waits for the publisher to confirm readiness,
then proceeds with the infer task. Concurrent infers for the same
hibernating Llm share a single wake task — \`wakeInFlight\` map
dedupes by Llm name.
State machine in enqueueInferTask:
active → push infer task immediately (existing path).
inactive → 503, publisher offline (existing path).
hibernating → ensureAwake() → push infer task (new in v2).
ensureAwake/runWake (private):
- Allocates a fresh taskId on the existing PendingTask plumbing.
- Pushes \`{ kind: "wake", taskId, llmName }\` on the SSE handle.
- Awaits the publisher's result POST. On 2xx, flips the row to
active + bumps lastHeartbeatAt, so all queued + future infers
hit the active path. On non-2xx or service.failTask, the row
stays hibernating (next request retries).
Tests: 4 new in virtual-llm-service.test.ts cover happy path
(wake → infer in order), concurrent dedup (3 parallel infers, 1
wake task), wake failure surfaces to all queued infers and leaves
the row hibernating, inactive ≠ hibernating (still rejects with 503,
no wake attempt). 22/22 service tests, 2050/2050 workspace.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>