The persistence + signaling layer for v5. No integration with the
existing in-flight inference path yet — that's Stage 2. This commit
just lands the durable queue underneath, with a state machine that
mcpd's HTTP handlers, the worker result-POST route, and the GC sweep
will all build on.
Schema (src/db/prisma/schema.prisma + migration):
- New `InferenceTask` model + `InferenceTaskStatus` enum
(pending|claimed|running|completed|error|cancelled).
- Routing fields stored at enqueue time so a later rename of
`Llm.poolName` doesn't reroute already-queued work: `poolName`
(effective pool key), `llmName` (pinned target), `model`, `tier`.
- Worker tracking: `claimedBy` (providerSessionId) + `claimedAt`,
cleared on revert.
- Bodies as `Json`: requestBody (always set), responseBody (set at
completion). Streaming chunks are NOT persisted — too expensive at
delta granularity. The final assembled body lands once per task.
- Lifecycle timestamps: createdAt, claimedAt, streamStartedAt,
completedAt. Plus ownerId (RBAC + audit) and agentId (null for
direct chat-llm calls).
- Indexes for the hot paths: (status, poolName) for the dispatcher's
drain query, claimedBy for the disconnect revert, completedAt for
the GC retention sweep, owner/agent for the async API listing.
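For orientation, the row as the generated client would expose it: a
hand-written TypeScript sketch, not the schema file itself (the exact
`tier` type is an assumption):

    interface InferenceTask {
      id: string;
      status: 'pending' | 'claimed' | 'running' | 'completed' | 'error' | 'cancelled';
      poolName: string;               // effective pool key, frozen at enqueue
      llmName: string;                // pinned target
      model: string;
      tier: string;                   // assumed string, captured at enqueue
      claimedBy: string | null;       // providerSessionId, cleared on revert
      requestBody: unknown;           // Json, always set
      responseBody: unknown | null;   // Json, set at completion
      ownerId: string;                // RBAC + audit
      agentId: string | null;         // null for direct chat-llm calls
      createdAt: Date;
      claimedAt: Date | null;
      streamStartedAt: Date | null;
      completedAt: Date | null;
    }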
Repository (src/mcpd/src/repositories/inference-task.repository.ts):
- CRUD + state transitions as conditional CAS via `updateMany` (sketch
  after this list). Two workers racing to claim the same row both run
  the UPDATE; whichever the DB serializes first sees affected=1 and
  gets the row, the loser sees 0 and falls through to the next
  candidate. No application-level locking required.
- findPendingForPools(poolNames[]) for the worker drain on bind.
- findHeldBy(claimedBy) for the unbindSession revert.
- findStalePending + findExpiredTerminal for the GC sweep.
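The CAS claim from the first bullet, sketched against the generated
Prisma client (field names as in the schema above; error handling
omitted):

    import { PrismaClient } from '@prisma/client';

    const prisma = new PrismaClient();

    // Compare-and-swap: the WHERE clause carries the expected state, so of
    // two racing workers only the one the DB serializes first sees count=1.
    async function tryClaim(taskId: string, providerSessionId: string): Promise<boolean> {
      const { count } = await prisma.inferenceTask.updateMany({
        where: { id: taskId, status: 'pending' },
        data: { status: 'claimed', claimedBy: providerSessionId, claimedAt: new Date() },
      });
      return count === 1; // 0 = lost the race; fall through to the next candidate
    }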
Service (src/mcpd/src/services/inference-task.service.ts):
- Owns the in-process EventEmitter that wakes blocked HTTP handlers
when a worker POSTs results. The DB row is the source of truth for
*state*; the EventEmitter just signals "go re-read row X" so we
don't have to poll. Single-instance assumption for v5; pg
LISTEN/NOTIFY is the v6 swap when scaling horizontally — no schema
change needed, just replace the emitter wakeup.
- waitFor(taskId, timeoutMs) returns { done, chunks }: the terminal
  promise + an async iterator of streaming deltas (sketch after this
  list). Throws on cancel (clear message), on error (the worker's
  errorMessage propagates), and on timeout. Polls the row once at
  subscribe time so an already-terminal task resolves immediately
  instead of waiting for an event that's never coming.
- gcSweep flips stale pending rows to error (with a clear message
about the timeout) and deletes terminal rows past retention.
Defaults: 1h pending timeout, 7d terminal retention; both
configurable.
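The done half of waitFor, sketched (chunks omitted; the repo lookup and
per-task emitter events are assumptions about shape, not the real
service API):

    import { EventEmitter } from 'node:events';

    type Row = { status: string; responseBody?: unknown; errorMessage?: string | null };

    function waitForDone(
      emitter: EventEmitter,
      findById: (id: string) => Promise<Row | null>,
      taskId: string,
      timeoutMs: number,
    ): Promise<unknown> {
      return new Promise((resolve, reject) => {
        const settle = (fn: () => void) => { clearTimeout(timer); emitter.off(taskId, onWake); fn(); };
        const check = async () => {
          const row = await findById(taskId); // the DB row is the source of truth for state
          if (!row) return;
          if (row.status === 'completed') settle(() => resolve(row.responseBody));
          else if (row.status === 'cancelled') settle(() => reject(new Error('task cancelled')));
          else if (row.status === 'error') settle(() => reject(new Error(row.errorMessage ?? 'task failed')));
        };
        const onWake = () => { void check(); }; // the event carries no state, just "re-read row X"
        const timer = setTimeout(() => settle(() => reject(new Error('timeout'))), timeoutMs);
        emitter.on(taskId, onWake);
        void check(); // poll once up front: an already-terminal task resolves immediately
      });
    }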
Tests:
- 6 db-level schema tests (defaults, json roundtrip, drain query
shape, claimedBy filter, GC predicate, agentId nullable).
- 13 service tests covering enqueue, the CAS race on tryClaim,
complete/fail/cancel, idempotent terminal transitions, revertHeldBy
on disconnect, and the full waitFor signal lifecycle (immediate
resolve, wake on event, chunk streaming, cancel/error/timeout
paths). Plus a gcSweep test with a fixed clock.
mcpd 881/881 (was 868; +13). db: pool-schema 14/14, plus the 6 new
inference-task-schema tests. Pre-existing failures in models.test.ts
(a Secret FK fixture issue that also fails on main HEAD) are unrelated.
Stage 2 (next): VirtualLlmService rewires through this — remove the
in-memory pendingTasks map; enqueue creates a row, dispatch picks an
active session, the result-route updates the row + emits the wakeup.
Worker disconnect reverts; worker bind drains.
Adds LB-pool-by-shared-name without introducing a new resource. The
existing `Llm.name` stays globally unique; a new optional `poolName`
column declares membership in a pool. Multiple Llms sharing a non-null
`poolName` stack into one load-balanced pool that the chat dispatcher
expands at request time.
Effective pool key = `poolName ?? name`. Solo rows (poolName=null) are
addressable as a "pool of 1" via their own name, so existing single-Llm
agents and YAMLs keep working unchanged. A solo row whose name happens
to match an explicit poolName joins the same pool — by design — so an
operator can transparently promote an existing Llm into a pool seed.
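The key resolution in miniature (the LlmView shape here is assumed for
illustration):

    type LlmView = { name: string; poolName: string | null; status: string };

    // Effective pool key: explicit poolName wins; solo rows fall back to
    // their globally-unique name and behave as a pool of 1.
    const poolKey = (llm: Pick<LlmView, 'name' | 'poolName'>): string =>
      llm.poolName ?? llm.name;

    // Expand the pinned Llm into its viable pool siblings at dispatch time.
    function poolCandidates(pinned: LlmView, all: LlmView[]): LlmView[] {
      const key = poolKey(pinned);
      return all.filter(l => poolKey(l) === key && l.status !== 'inactive');
    }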
Dispatcher (chat.service): prepareContext now resolves a randomly-
shuffled list of viable pool candidates (status != inactive) once per
turn. runOneInference and streamInference iterate the list on
transport-level failure (network, virtual publisher disconnect) until
one succeeds or the list is exhausted. Streaming failover only covers
"failed before first chunk" — once we've yielded text, we're committed
to that backend. Auth/4xx errors surfaced as result.status are NOT
retried; siblings with the same key/model would fail identically.
When the agent's pinned Llm is itself inactive but a sibling pool
member is up, dispatch transparently uses the sibling — that's the
whole point. When every member is inactive, prepareContext throws a
clear "No active Llm in pool '<key>' (pinned: <name>)" error rather
than letting the dispatcher's "exhausted" branch surface it.
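The dispatch loop in miniature: a sketch of the shape, not the actual
runOneInference code. It assumes transport failures surface as thrown
errors while auth/4xx come back inside the result:

    async function dispatchWithFailover<L, R>(
      candidates: L[],                      // pre-shuffled by prepareContext
      callOne: (llm: L) => Promise<R>,      // throws only on transport-level failure
    ): Promise<R> {
      let lastErr: unknown = new Error('no viable pool members'); // prepareContext throws first in practice
      for (const llm of candidates) {
        try {
          return await callOne(llm); // auth/4xx arrive as result.status: returned, NOT retried
        } catch (err) {
          lastErr = err;             // network / publisher disconnect: try the next sibling
        }
      }
      throw lastErr;                 // list exhausted
    }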
Tests:
- 5 new chat-service tests for pool dispatch / failover / pinned-down /
all-inactive (chat-service.test.ts).
- 7 new db schema tests for the column, the unique-name invariant, the
fallback-to-name semantics, and the solo-name-joins-explicit-pool
edge case (llm-pool-schema.test.ts).
- mcpd 865/865 (was 860; +5), db pool-schema 7/7, no regressions.
Stage 2 (next): HTTP route /api/v1/llms/<name>/members + aggregate pool
stats on the existing single-Llm route, CLI POOL column + describe
block + --pool-name flag, yaml round-trip.
Two pieces of v3 plumbing — schema + the latent v1 chat.service bug.
Schema (db):
- Agent gains kind/providerSessionId/lastHeartbeatAt/status/inactiveSince
mirroring Llm's v1 lifecycle. Reuses LlmKind / LlmStatus enums; no
new types. Existing rows backfill kind=public/status=active so v1
CRUD is unaffected.
- @@index([kind, status]) for the GC sweep, @@index([providerSessionId])
for disconnect-cascade lookups.
- 4 new prisma-level tests cover defaults, persisting virtual fields,
the (kind, status) GC index, and providerSessionId lookups.
Total agent-schema tests: 20/20.
chat.service (mcpd) — fixes the v1 latent bug:
- LlmView's kind is now plumbed through prepareContext as ctx.llmKind.
- Two new private helpers, runOneInference / streamInference, branch
on ctx.llmKind: 'public' goes through the existing adapter
registry, 'virtual' relays through VirtualLlmService.enqueueInferTask
(mirrors the route-handler branch from v1 Stage 3).
- Streaming bridges VirtualLlmService's onChunk callback API to an
  async iterator via a small queue + wake pattern (sketch after this
  list).
- ChatService gains an optional virtualLlms constructor parameter;
main.ts wires it in. Older test wirings without it raise a clear
"virtualLlms dispatcher not wired" error when the row is virtual,
rather than silently falling through to the public path against an
empty URL.
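The queue + wake bridge, sketched generically (push() stands in for the
onChunk callback; names are illustrative):

    function chunkBridge<T>() {
      const queue: T[] = [];
      let wake: (() => void) | null = null;
      let done = false;
      const signal = () => { wake?.(); wake = null; };
      return {
        push(chunk: T) { queue.push(chunk); signal(); }, // wire this to onChunk
        end() { done = true; signal(); },                // terminal result arrived
        async *iterate(): AsyncGenerator<T> {
          for (;;) {
            while (queue.length > 0) yield queue.shift() as T;
            if (done) return;
            await new Promise<void>(r => { wake = r; }); // sleep until push/end
          }
        },
      };
    }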
This unblocks any Agent (public OR future v3-virtual) pinned to a
kind=virtual Llm. Before this stage, those agents 502'd against the
empty url field.
Tests: 4 new chat-service-virtual-llm.test.ts cover the relay path
non-streaming, streaming, missing-dispatcher error, and rejection
surfacing. mcpd suite: 841/841 (was 833, +8 across stages 1+v3-Stage-1).
Workspace: 2054/2054 across 153 files.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
First step of the virtual-LLM feature. A virtual Llm row is one that
gets *registered by an mcplocal client* rather than created via
`mcpctl create llm`. Its inference is relayed back through an SSE
control channel to the publishing session (mcpd routes added in
Stage 3). The lifecycle fields below let mcpd reap stale rows when
the publisher goes away.
Schema additions:
- enum LlmKind (public | virtual). Default public.
- enum LlmStatus (active | inactive | hibernating). Default active.
hibernating is reserved for v2 wake-on-demand.
- Llm.kind, providerSessionId, lastHeartbeatAt, status, inactiveSince.
- @@index([kind, status]) for the GC sweep.
- @@index([providerSessionId]) for the reconnect lookup.
All existing rows backfill with kind=public/status=active so v1 is
purely additive — public LLMs ignore the lifecycle columns entirely.
7 new prisma-level assertions in tests/llm-virtual-schema.test.ts
cover: defaults, persisting kind=virtual + lifecycle together, the
active→inactive flip, hibernating value, enum rejection, the
(kind,status) GC index, the providerSessionId reconnect index.
mcpd suite still 801/801 (regenerated client) and typecheck clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A Personality is a named overlay on top of an Agent — same agent,
same LLM, but a different bundle of prompts injected into the system
block at chat time. VLAN-on-Ethernet semantics: Ethernet still works
without VLANs; with a VLAN tag, frames are segmented but still Ethernet.
Schema additions:
- Prompt.agentId (nullable FK + index, cascade on delete) so prompts
can attach directly to an agent without going through a project.
- Personality { id, name, description, agentId, priority } with
unique (name, agentId).
- PersonalityPrompt join table with per-binding priority override.
- Agent.defaultPersonalityId (SetNull on delete) so an agent can pick
one personality as the default when no --personality flag is passed.
Backwards-compatible by construction: every new column is nullable;
existing rows are valid as-is; the chat.service systemBlock changes
land in Stage 3.
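A plausible shape for the Stage-3 lookup (nothing below lands in this
commit; the merge rule, per-binding override else personality priority,
is an assumption read off the schema, as is lower-number-first ordering):

    type PersonalityPrompt = { promptId: string; priority: number | null }; // per-binding override
    type Personality = { priority: number; prompts: PersonalityPrompt[] };

    // Order a personality's prompts for system-block injection: a binding's
    // own priority wins when set, otherwise the personality-level priority.
    function orderedPromptIds(p: Personality): string[] {
      return [...p.prompts]
        .map(b => ({ id: b.promptId, prio: b.priority ?? p.priority }))
        .sort((a, b) => a.prio - b.prio)
        .map(x => x.id);
    }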
8 new prisma-level assertions in agent-schema.test.ts cover unique
constraints, cascade behavior, the SetNull on defaultPersonalityId,
and shared-prompt-across-personalities. All 16 db tests pass; mcpd
typecheck + 777 mcpd unit tests still green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Introduces the persistence layer for the upcoming Agent feature: an LLM
persona pinned to a specific Llm, optionally attached to a Project, with
persisted chat threads/messages so conversations survive REPL exits.
Constraint shape:
- Agent.llm uses ON DELETE RESTRICT — deleting an Llm in active use fails.
- Agent.project uses ON DELETE SET NULL — agents survive project deletion.
- ChatThread → ChatMessage cascade so deleting an agent purges its history.
- ChatMessage @@unique([threadId, turnIndex]) gives append ordering even
  under racing writers; services retry on collision (sketch below).
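The retry-on-collision append, sketched with Prisma (columns trimmed to
the ones that matter here; the rest of the message shape is assumed):

    import { PrismaClient, Prisma } from '@prisma/client';

    const prisma = new PrismaClient();

    async function appendMessage(threadId: string, content: string): Promise<void> {
      for (;;) {
        const last = await prisma.chatMessage.findFirst({
          where: { threadId },
          orderBy: { turnIndex: 'desc' },
          select: { turnIndex: true },
        });
        try {
          await prisma.chatMessage.create({
            data: { threadId, turnIndex: (last?.turnIndex ?? -1) + 1, content },
          });
          return;
        } catch (e) {
          // P2002 = unique constraint violation: a racing writer took our slot
          if (e instanceof Prisma.PrismaClientKnownRequestError && e.code === 'P2002') continue;
          throw e;
        }
      }
    }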
LiteLLM-style per-call overrides will live in Agent.defaultParams (Json);
the loose extras Json field is reserved for future LoRA/tool-allowlist work.
Pinned vitest fileParallelism=false in @mcpctl/db: all suites share the
same Postgres, and adding a second suite exposed FK contention between a
clearAllTables in one file and a create in another. Per-test isolation
still comes from beforeEach.
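The pin itself, roughly as it would sit in @mcpctl/db's vitest config:

    import { defineConfig } from 'vitest/config';

    export default defineConfig({
      test: {
        fileParallelism: false, // every suite hits the same Postgres; run files serially
      },
    });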
Tests: 8/8 green in src/db/tests/agent-schema.test.ts (defaults, name
uniqueness, llm-in-use Restrict, project-delete SetNull, agent-delete
cascade, duplicate (threadId, turnIndex) blocked, tool-call payload
round-trip, lastTurnAt DESC ordering).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
proxyMode "direct" was a security hole (leaked secrets as plaintext env
vars in .mcp.json) and bypassed all mcplocal features (gating, audit,
RBAC, content pipeline, namespacing). Removed from schema, API, CLI,
and all tests. Old configs with proxyMode are accepted but silently
stripped via Zod .transform() for backward compatibility.
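The strip in miniature (a Zod sketch with illustrative fields, not the
real config schema):

    import { z } from 'zod';

    const serverConfig = z
      .object({ name: z.string(), proxyMode: z.string().optional() })
      .passthrough()
      .transform(({ proxyMode: _dropped, ...rest }) => rest); // accepted, then silently removed

    serverConfig.parse({ name: 'fs', proxyMode: 'direct' }); // => { name: 'fs' }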
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Exclude src/db/tests from workspace vitest config (needs test DB)
- Make global-setup.ts gracefully skip when test DB unavailable
- Fix exactOptionalPropertyTypes issues in proxymodel-endpoint.ts
- Use proper ProxyModelPlugin type for getPluginHooks function
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add warmup() to LlmProvider interface for eager subprocess startup
- ManagedVllmProvider.warmup() starts vLLM in background on project load
- ProviderRegistry.warmupAll() triggers all managed providers
- NamedProvider proxies warmup() to inner provider
- paginate stage generates LLM-powered descriptive page titles when an
  LLM is available, caches them by content hash, and falls back to a
  generic "Page N" otherwise
- project-mcp-endpoint calls warmupAll() on router creation so vLLM
is loading while the session initializes
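The surface in miniature (only the method names come from this change;
bodies and types are illustrative):

    interface LlmProvider {
      warmup(): Promise<void>; // eager startup; a no-op for providers with nothing to warm
    }

    class ProviderRegistry {
      constructor(private readonly providers: LlmProvider[]) {}
      // Fire-and-forget: project load is not blocked while vLLM spins up.
      warmupAll(): void {
        for (const p of this.providers) {
          void p.warmup().catch(() => { /* failure surfaces on the first real call */ });
        }
      }
    }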
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Comprehensive MCP server management with kubectl-style CLI.
Key features in this release:
- Declarative YAML apply/get round-trip with project cloning support
- Gated sessions with prompt intelligence for Claude
- Interactive MCP console with traffic inspector
- Persistent STDIO connections for containerized servers
- RBAC with name-scoped bindings
- Shell completions (fish + bash) auto-generated
- Rate-limit retry with exponential backoff in apply
- Project-scoped prompt management
- Credential scrubbing from git history
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Replace admin role with granular roles: view, create, delete, edit, run
- Two binding types: resource bindings (role+resource+optional name) and
operation bindings (role:run + action like backup, logs, impersonate)
- Name-scoped resource bindings for per-instance access control
- Remove role from project members (all permissions via RBAC)
- Add users, groups, RBAC CRUD endpoints and CLI commands
- describe user/group shows all RBAC access (direct + inherited)
- create rbac supports --subject, --binding, --operation flags
- Backup/restore handles users, groups, RBAC definitions
- mcplocal project-based MCP endpoint discovery
- Full test coverage for all new functionality
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Introduce a Helm-chart-like template system for MCP servers. Templates are
YAML files in templates/ that get seeded into the DB on startup. Users can
browse them with `mcpctl get templates`, inspect with `mcpctl describe
template`, and instantiate with `mcpctl create server --from-template=`.
Also adds Portainer deployment scripts, mcplocal systemd service,
Streamable HTTP MCP endpoint, and RPM packaging for mcpctl-local.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace the confused Profile abstraction with a dedicated Secret resource
following Kubernetes conventions. Servers now have env entries with inline
values or secretRef references. Env vars are resolved and passed to
containers at startup (fixes existing gap).
- Add Secret CRUD (model, repo, service, routes, CLI commands)
- Server env: {name, value} or {name, valueFrom: {secretRef: {name, key}}}
- Add env-resolver utility shared by instance startup and config generation
- Remove all profile-related code (models, services, routes, CLI, tests)
- Update backup/restore for secrets instead of profiles
- describe secret masks values by default, --show-values to reveal
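A sketch of the resolver's core (the secret lookup is an assumed
callback, not the real repository API):

    type EnvEntry =
      | { name: string; value: string }
      | { name: string; valueFrom: { secretRef: { name: string; key: string } } };

    async function resolveEnv(
      entries: EnvEntry[],
      getSecretValue: (secret: string, key: string) => Promise<string>,
    ): Promise<Record<string, string>> {
      const out: Record<string, string> = {};
      for (const e of entries) {
        out[e.name] = 'value' in e
          ? e.value
          : await getSecretValue(e.valueFrom.secretRef.name, e.valueFrom.secretRef.key);
      }
      return out;
    }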
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add PostgreSQL schema with 8 models (User, Session, McpServer, McpProfile,
Project, ProjectMcpProfile, McpInstance, AuditLog), comprehensive model
tests (31 passing), seed data for default MCP servers, and package exports.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>