Adds load-balanced pools keyed by a shared name, without introducing a
new resource. The
existing `Llm.name` stays globally unique; a new optional `poolName`
column declares membership in a pool. Multiple Llms sharing a non-null
`poolName` stack into one load-balanced pool that the chat dispatcher
expands at request time.
Effective pool key = `poolName ?? name`. Solo rows (poolName=null) are
addressable as a "pool of 1" via their own name, so existing single-Llm
agents and YAMLs keep working unchanged. A solo row whose name happens
to match an explicit poolName joins the same pool — by design — so an
operator can transparently promote an existing Llm to pool seed.
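The key rule above can be sketched in a few lines (illustrative types and helper names, not the actual service code):

```typescript
// Sketch of the effective-pool-key rule: poolName ?? name. Solo rows
// (poolName = null) form a pool of 1 keyed by their own name.
interface LlmRow {
  name: string;             // globally unique
  poolName: string | null;  // null => solo row
  status: 'active' | 'inactive';
}

const poolKey = (row: LlmRow): string => row.poolName ?? row.name;

// All rows whose effective key matches the pinned Llm's key are pool
// candidates -- which is exactly how a solo row whose name matches an
// explicit poolName joins that pool.
function poolMembers(all: LlmRow[], pinned: LlmRow): LlmRow[] {
  return all.filter((r) => poolKey(r) === poolKey(pinned));
}
```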
Dispatcher (chat.service): prepareContext now resolves a randomly
shuffled list of viable pool candidates (status != inactive) once per
turn. runOneInference and streamInference iterate the list on
transport-level failure (network, virtual publisher disconnect) until
one succeeds or the list is exhausted. Streaming failover only covers
"failed before first chunk" — once we've yielded text, we're committed
to that backend. Auth/4xx errors surfaced as result.status are NOT
retried; siblings with the same key/model would fail identically.
When the agent's pinned Llm is itself inactive but a sibling pool
member is up, dispatch transparently uses the sibling — that's the
whole point. When every member is inactive, prepareContext throws a
clear "No active Llm in pool '<key>' (pinned: <name>)" error rather
than letting the dispatcher's "exhausted" branch surface it.
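The retry discipline described above has roughly this shape (a hedged sketch; runWithFailover, TransportError, and Candidate are illustrative names, not the real chat.service internals):

```typescript
// Try each shuffled candidate on transport-level failure, but surface
// result-level errors (auth / 4xx) immediately -- siblings sharing the
// same key and model would fail identically.
type Candidate = { name: string };

class TransportError extends Error {}

async function runWithFailover<T>(
  candidates: Candidate[],
  infer: (c: Candidate) => Promise<T>,
): Promise<T> {
  let lastErr: unknown;
  for (const c of candidates) {
    try {
      return await infer(c);
    } catch (err) {
      if (!(err instanceof TransportError)) throw err; // no retry
      lastErr = err; // network / publisher disconnect: try next sibling
    }
  }
  throw lastErr ?? new Error('no candidates');
}
```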
Tests:
- 5 new chat-service tests for pool dispatch / failover / pinned-down /
all-inactive (chat-service.test.ts).
- 7 new db schema tests for the column, the unique-name invariant, the
fallback-to-name semantics, and the solo-name-joins-explicit-pool
edge case (llm-pool-schema.test.ts).
- mcpd 865/865 (was 860; +5), db pool-schema 7/7, no regressions.
Stage 2 (next): HTTP route /api/v1/llms/<name>/members + aggregate pool
stats on the existing single-Llm route, CLI POOL column + describe
block + --pool-name flag, yaml round-trip.
Two pieces of v3 plumbing — schema + the latent v1 chat.service bug.
Schema (db):
- Agent gains kind/providerSessionId/lastHeartbeatAt/status/inactiveSince
mirroring Llm's v1 lifecycle. Reuses LlmKind / LlmStatus enums; no
new types. Existing rows backfill kind=public/status=active so v1
CRUD is unaffected.
- @@index([kind, status]) for the GC sweep, @@index([providerSessionId])
for disconnect-cascade lookups.
- 4 new prisma-level tests cover defaults, persisting virtual fields,
the (kind, status) GC index, and providerSessionId lookups.
Total agent-schema tests: 20/20.
chat.service (mcpd) — fixes the v1 latent bug:
- LlmView's kind is now plumbed through prepareContext as ctx.llmKind.
- Two new private helpers, runOneInference / streamInference, branch
on ctx.llmKind: 'public' goes through the existing adapter
registry, 'virtual' relays through VirtualLlmService.enqueueInferTask
(mirrors the route-handler branch from v1 Stage 3).
- Streaming bridges VirtualLlmService's onChunk callback API to an
async iterator via a small queue + wake pattern.
- ChatService gains an optional virtualLlms constructor parameter;
main.ts wires it in. Older test wirings without it raise a clear
"virtualLlms dispatcher not wired" error when the row is virtual,
rather than silently falling through to the public path against an
empty URL.
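The queue + wake pattern can be sketched as a standalone helper (illustrative names only; not the actual bridge code):

```typescript
// Bridge a callback-style chunk producer to an async iterator: chunks
// queue up, the consumer drains the queue, and an awaited "wake"
// promise parks the consumer until the next push. done() closes the
// stream.
function chunkBridge() {
  const queue: string[] = [];
  let closed = false;
  let wake: (() => void) | null = null;

  const notify = () => { wake?.(); wake = null; };

  return {
    push: (chunk: string) => { queue.push(chunk); notify(); },
    done: () => { closed = true; notify(); },
    async *iterate(): AsyncGenerator<string> {
      while (true) {
        while (queue.length > 0) yield queue.shift()!;
        if (closed) return;
        await new Promise<void>((resolve) => { wake = resolve; });
      }
    },
  };
}
```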
This unblocks any Agent (public OR future v3-virtual) pinned to a
kind=virtual Llm. Before this stage, those agents 502'd against the
empty url field.
Tests: 4 new chat-service-virtual-llm.test.ts cover the relay path
non-streaming, streaming, missing-dispatcher error, and rejection
surfacing. mcpd suite: 841/841 (was 833, +8 across stages 1+v3-Stage-1).
Workspace: 2054/2054 across 153 files.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
First step of the virtual-LLM feature. A virtual Llm row is one that
gets *registered by an mcplocal client* rather than created via
`mcpctl create llm`. Its inference is relayed back through an SSE
control channel to the publishing session (mcpd routes added in
Stage 3). The lifecycle fields below let mcpd reap stale rows when
the publisher goes away.
Schema additions:
- enum LlmKind (public | virtual). Default public.
- enum LlmStatus (active | inactive | hibernating). Default active.
hibernating is reserved for v2 wake-on-demand.
- Llm.kind, providerSessionId, lastHeartbeatAt, status, inactiveSince.
- @@index([kind, status]) for the GC sweep.
- @@index([providerSessionId]) for the reconnect lookup.
All existing rows backfill with kind=public/status=active so v1 is
purely additive — public LLMs ignore the lifecycle columns entirely.
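In Prisma terms the additions look roughly like this (a sketch of the enums, fields, and indexes named above; the existing v1 columns are elided):

```prisma
enum LlmKind {
  public
  virtual
}

enum LlmStatus {
  active
  inactive
  hibernating // reserved for v2 wake-on-demand
}

model Llm {
  // ...existing v1 fields (name, url, ...) unchanged...
  kind              LlmKind   @default(public)
  status            LlmStatus @default(active)
  providerSessionId String?
  lastHeartbeatAt   DateTime?
  inactiveSince     DateTime?

  @@index([kind, status])      // GC sweep over stale virtual rows
  @@index([providerSessionId]) // reconnect lookup
}
```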
7 new prisma-level assertions in tests/llm-virtual-schema.test.ts
cover: defaults, persisting kind=virtual + lifecycle together, the
active→inactive flip, hibernating value, enum rejection, the
(kind,status) GC index, the providerSessionId reconnect index.
mcpd suite still 801/801 (regenerated client) and typecheck clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A Personality is a named overlay on top of an Agent — same agent,
same LLM, but a different bundle of prompts injected into the system
block at chat time. VLAN-on-Ethernet semantics: Ethernet still works
without a VLAN tag; tagged frames are segmented but remain Ethernet.
Schema additions:
- Prompt.agentId (nullable FK + index, cascade on delete) so prompts
can attach directly to an agent without going through a project.
- Personality { id, name, description, agentId, priority } with
unique (name, agentId).
- PersonalityPrompt join table with per-binding priority override.
- Agent.defaultPersonalityId (SetNull on delete) so an agent can pick
one personality as the default when no --personality flag is passed.
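A rough Prisma shape for the new models (field types, IDs, and delete behavior on Personality.agent are assumptions; only the relations and constraints mirror the list above):

```prisma
model Personality {
  id          String  @id
  name        String
  description String?
  agentId     String
  priority    Int
  agent       Agent   @relation(fields: [agentId], references: [id], onDelete: Cascade)

  @@unique([name, agentId])
}

model PersonalityPrompt {
  personalityId String
  promptId      String
  priority      Int?   // per-binding override of the prompt's own priority

  @@id([personalityId, promptId])
}

// On Agent:
//   defaultPersonalityId String?      (FK with onDelete: SetNull)
```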
Backwards-compatible by construction: every new column is nullable;
existing rows are valid as-is; the chat.service systemBlock changes
land in Stage 3.
8 new prisma-level assertions in agent-schema.test.ts cover unique
constraints, cascade behavior, the SetNull on defaultPersonalityId,
and shared-prompt-across-personalities. All 16 db tests pass; mcpd
typecheck + 777 mcpd unit tests still green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Introduces the persistence layer for the upcoming Agent feature: an LLM
persona pinned to a specific Llm, optionally attached to a Project, with
persisted chat threads/messages so conversations survive REPL exits.
Constraint shape:
- Agent.llm uses ON DELETE RESTRICT — deleting an Llm in active use fails.
- Agent.project uses ON DELETE SET NULL — agents survive project deletion.
- ChatThread → ChatMessage cascade so deleting an agent purges its history.
- ChatMessage @@unique([threadId, turnIndex]) gives append ordering even
under racing writers (services retry on collision).
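The retry-on-collision behavior implied by that unique constraint can be sketched as (illustrative names and store interface, not the actual service code):

```typescript
// Append a message at the next free turnIndex; if a racing writer took
// our index (unique-constraint violation), re-read the max and retry.
class UniqueViolation extends Error {}

interface MessageStore {
  maxTurnIndex(threadId: string): Promise<number>; // -1 if empty
  insert(threadId: string, turnIndex: number, body: string): Promise<void>;
}

async function appendMessage(
  store: MessageStore,
  threadId: string,
  body: string,
  maxRetries = 3,
): Promise<number> {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const turnIndex = (await store.maxTurnIndex(threadId)) + 1;
    try {
      await store.insert(threadId, turnIndex, body);
      return turnIndex; // won the slot
    } catch (err) {
      if (!(err instanceof UniqueViolation) || attempt === maxRetries) throw err;
    }
  }
  throw new Error('unreachable');
}
```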
LiteLLM-style per-call overrides will live in Agent.defaultParams (Json);
the loose extras Json field is reserved for future LoRA/tool-allowlist work.
Pinned vitest fileParallelism=false in @mcpctl/db: all suites share the
same Postgres, and adding a second suite exposed FK contention between a
clearAllTables in one file and a create in another. Per-test isolation
still comes from beforeEach.
Tests: 8/8 green in src/db/tests/agent-schema.test.ts (defaults, name
uniqueness, llm-in-use Restrict, project-delete SetNull, agent-delete
cascade, duplicate (threadId, turnIndex) blocked, tool-call payload
round-trip, lastTurnAt DESC ordering).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
DB is source of truth with git as downstream replica. SSH key generated
on first start, all resource mutations committed as apply-compatible YAML.
Supports manual commit import, conflict resolution (DB wins), disaster
recovery (empty DB restores from git), and timeline branches on restore.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
proxyMode "direct" was a security hole (leaked secrets as plaintext env
vars in .mcp.json) and bypassed all mcplocal features (gating, audit,
RBAC, content pipeline, namespacing). Removed from schema, API, CLI,
and all tests. Old configs with proxyMode are accepted but silently
stripped via Zod .transform() for backward compatibility.
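The strip amounts to this transform (shown here as a plain function; the real code does the equivalent inside a Zod .transform(), and the config shape is illustrative):

```typescript
// Accept legacy configs that still carry proxyMode, but drop the field
// before the config reaches the rest of the system.
interface LegacyServerConfig {
  name: string;
  proxyMode?: string; // legacy field, no longer honored
}

function stripLegacyProxyMode(
  cfg: LegacyServerConfig,
): Omit<LegacyServerConfig, 'proxyMode'> {
  const { proxyMode: _dropped, ...rest } = cfg;
  return rest;
}
```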
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add userName column to AuditEvent schema with index and migration
- Add GET /api/v1/auth/me endpoint returning current user identity
- AuditCollector auto-fills userName from session→user map, resolved
lazily via /auth/me on first session creation
- Support userName and date range (from/to) filtering on audit events
and sessions endpoints
- Audit console sidebar groups sessions by project → user
- Add date filter presets (d key: all/today/1h/24h/7d) to console
- Add scrolling and page up/down to sidebar navigation
- Tests: auth-me (4), audit-username collector (4), route filters (2),
smoke tests (2)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add warmup() to LlmProvider interface for eager subprocess startup
- ManagedVllmProvider.warmup() starts vLLM in background on project load
- ProviderRegistry.warmupAll() triggers all managed providers
- NamedProvider proxies warmup() to inner provider
- paginate stage generates LLM-powered descriptive page titles when
available, caches them by content hash, and falls back to a generic "Page N"
- project-mcp-endpoint calls warmupAll() on router creation so vLLM
is loading while the session initializes
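The warmup plumbing has roughly this shape (names mirror the commit, but the signatures are assumptions):

```typescript
// Providers expose warmup() for eager subprocess startup; the registry
// fires them all, and Promise.allSettled ensures one slow or failing
// provider doesn't block session initialization.
interface LlmProvider {
  warmup(): Promise<void>;
}

class ProviderRegistry {
  constructor(private providers: LlmProvider[]) {}

  async warmupAll(): Promise<void> {
    await Promise.allSettled(this.providers.map((p) => p.warmup()));
  }
}
```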
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implements the full gated session flow and prompt intelligence system:
- Prisma schema: add gated, priority, summary, chapters, linkTarget fields
- Session gate: state machine (gated → begin_session → ungated) with LLM-powered
tool selection based on prompt index
- Tag matcher: intelligent prompt-to-tool matching with project/server/action tags
- LLM selector: tiered provider selection (fast for gating, heavy for complex tasks)
- Link resolver: cross-project MCP resource references (project/server:uri format)
- Prompt summary service: LLM-generated summaries and chapter extraction
- System project bootstrap: ensures default project exists on startup
- Structural link health checks: enrichWithLinkStatus on prompt GET endpoints
- CLI: create prompt --priority/--link, create project --gated/--no-gated,
describe project shows prompts section, get prompts shows PRI/LINK/STATUS
- Apply/edit: priority, linkTarget, gated fields supported
- Shell completions: fish updated with new flags
- 1,253 tests passing across all packages
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Support non-containerized MCP servers via externalUrl field and add
streamable-http session management for HA MCP proof of concept.
- Add externalUrl, command, containerPort fields to McpServer schema
- Skip Docker orchestration for external servers (virtual instances)
- Implement streamable-http proxy with Mcp-Session-Id session management
- Parse SSE-framed responses from streamable-http endpoints
- Add command passthrough to Docker container creation
- Create HA MCP example manifest (examples/ha-mcp.yaml)
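The SSE framing parse can be sketched as (a minimal standalone version of the idea, not the actual proxy code):

```typescript
// Split the body into blank-line-delimited events, collect each
// event's "data:" lines, and JSON-parse the payloads -- the shape
// streamable-http endpoints use for SSE-framed responses.
function parseSseData(body: string): unknown[] {
  const events: unknown[] = [];
  for (const frame of body.split('\n\n')) {
    const data = frame
      .split('\n')
      .filter((line) => line.startsWith('data:'))
      .map((line) => line.slice(5).trim())
      .join('\n');
    if (data) events.push(JSON.parse(data));
  }
  return events;
}
```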
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>