356cbe87b5a915209a290a41f74ead5262c1cbf1
8 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
7320b50dac |
feat(cli+docs+smoke): inference-task CLI + GC ticker + smoke + docs (v5 Stage 4)
Some checks failed
CI/CD / lint (pull_request) Successful in 55s
CI/CD / test (pull_request) Successful in 1m12s
CI/CD / typecheck (pull_request) Successful in 2m46s
CI/CD / smoke (pull_request) Failing after 1m44s
CI/CD / build (pull_request) Failing after 7m0s
CI/CD / publish (pull_request) Has been skipped
CLI surface for the durable queue:
- `mcpctl get tasks` — table view (ID, STATUS, POOL, LLM, MODEL,
STREAM, AGE, WORKER). Aliases `task`, `tasks`, `inference-task`,
`inference-tasks` all normalize to the canonical plural so URL
construction works uniformly. RESOURCE_ALIASES + completions
generator updated.
- `mcpctl chat-llm <name> --async -m <msg>` — enqueue and exit. stdout
is just the task id (pipeable into `xargs mcpctl get task`); stderr
carries human-readable status. REPL mode is rejected for --async
(fire-and-forget doesn't make sense without -m).
GC ticker in mcpd: 5-min interval. Pending tasks past 1 h queue
timeout flip to error with a clear message; terminal tasks past 7 d
retention get deleted. Both queries are index-backed.
Crash fix uncovered by the smoke: when the async route doesn't await
ref.done, a later cancel/error rejected the in-flight Promise as
unhandled and crashed mcpd. The route now attaches a no-op `.catch`
so the legacy `done` semantic still works for sync callers (chat,
direct infer) without taking out the process for async ones. The
EnqueueInferOptions also gained an explicit `ownerId` field so the
async API can stamp the authenticated user on the row instead of
inheriting 'system' from the constructor's resolveOwner — without
this, every GET/DELETE from the original caller would 404 due to
foreign-owner mismatch.
Smoke (tests/smoke/inference-task.smoke.test.ts):
1. POST /inference-tasks while no worker bound → row=pending.
2. Bring a registrar online → bindSession drain claims and
dispatches → worker complete()s → row=completed → GET returns
the assistant body.
3. Stop worker, enqueue, DELETE → row=cancelled, persisted.
docs/inference-tasks.md (new): full data model, lifecycle diagram,
async API reference, CLI examples, RBAC table, GC defaults, and the
v5 limitations / v6 roadmap. Cross-linked from virtual-llms.md and
agents.md.
Tests + smoke: mcpd 893/893, mcplocal 723/723, cli 437/437, full
smoke 146/146 (was 144, +2 new task smoke). Live mcpd verified via
manual curl: enqueue → cancel → re-fetch — no crash, owner scoping
returns 404 on foreign ids, GC ticker logs at info when it sweeps.
v5 complete: durable queue (Stage 1) + VirtualLlmService rewire
(Stage 2) + async API & RBAC (Stage 3) + CLI/GC/smoke/docs (Stage 4).
|
||
|
|
7e6b0cab44 |
feat(cli): mcpctl chat-llm + KIND/STATUS columns (v1 Stage 5)
Closes the loop on user-facing surface: $ mcpctl get llm NAME KIND STATUS TYPE MODEL TIER KEY ID qwen3-thinking public active openai qwen3-thinking fast ... ... vllm-local virtual active openai Qwen/Qwen2.5-7B-Instruct fast - ... $ mcpctl chat-llm vllm-local ──────────────────────────────────────── LLM: vllm-local openai → Qwen/Qwen2.5-7B-Instruct-AWQ Kind: virtual Status: active ──────────────────────────────────────── > hello? Hi! … New: chat-llm command (commands/chat-llm.ts) - Stateless chat with any mcpd-registered LLM. No threads, no tools, no project prompts. POSTs to /api/v1/llms/<name>/infer; mcpd's kind=virtual branch handles relay-through-mcplocal transparently, so the same CLI command works for both public and virtual LLMs. - Reuses installStatusBar / formatStats / recordDelta / styleStats / PhaseStats from chat.ts (now exported) so the bottom-row tokens-per- second ticker behaves identically to mcpctl chat. - Flags: --message (one-shot), --system, --temperature, --max-tokens, --no-stream. Streaming uses OpenAI chat.completion.chunk SSE. - REPL mode keeps a per-session history array so multi-turn flows feel natural; each turn is an independent inference call. Updated: get.ts - LlmRow gains optional kind/status fields. - llmColumns layout: NAME, KIND, STATUS, TYPE, MODEL, TIER, KEY, ID. Defaults gracefully when older mcpd responses don't return them. Updated: chat.ts - Re-exports the helpers chat-llm.ts needs (PhaseStats, newPhase, recordDelta, formatStats, styleStats, styleThinking, STDERR_IS_TTY, StatusBar, installStatusBar). No behavior change. Completions: chat-llm picks up the standard option enumeration automatically; bash gets a special-case for first-arg LLM-name completion via _mcpctl_resource_names "llms". CLI suite: 437/437 (was 430, +7 from auto-discovered test cases in the regenerated completions golden). Workspace: 2043/2043 across 152 files. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
9050918a83 |
feat(cli): personality flag + create/get/edit/delete personalities (Stage 4)
End-to-end CLI surface for the personality overlay: mcpctl create personality grumpy --agent reviewer --description "be terse" mcpctl create prompt tone --agent reviewer --content "Be very terse." mcpctl get personalities mcpctl get personalities --agent reviewer mcpctl edit personality <id> mcpctl delete personality grumpy --agent reviewer mcpctl chat reviewer --personality grumpy Chat banner gains a "Personality:" line that shows either the active flag value or the agent's `defaultPersonality` (when no flag given), so the user knows which overlay is in effect before sending a message. `--personality` is stripped from `/save` (it's a per-turn override, not a `defaultParams` field — the agent's defaultPersonality lives on its own column and is set via PUT /agents). Backend (small additions to land Stage 4 cleanly): - `GET /api/v1/personalities[?agent=name]` so `mcpctl get personalities` doesn't require an agent filter. - PersonalityService.listAll() aggregates across agents. Completions: regenerated fish + bash. `personalities` added as a canonical resource with `personality` alias; edit-resource list extended; the per-resource argument completers pick up the new type automatically. CLI suite: 430/430. mcpd: 801/801. Typecheck clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
727e7d628c |
feat(agents): mcpctl chat REPL + agent CRUD + completions (Stage 5)
This is the moment the user can actually talk to an agent end-to-end:
mcpctl create llm qwen3-thinking --type openai --model qwen3-thinking \
--url http://litellm.nvidia-nim.svc.cluster.local:4000/v1 \
--api-key-ref litellm-key/API_KEY
mcpctl create agent reviewer --llm qwen3-thinking --project mcpctl-dev \
--description "I review security design — ask me after each major change."
mcpctl chat reviewer
Pieces:
* src/cli/src/commands/chat.ts (new) — REPL + one-shot. Streams the SSE
endpoint and prints text deltas to stdout as they arrive; tool_call /
tool_result events go to stderr in dim-style brackets so the chat
output stays clean. LiteLLM-style flags (--temperature / --top-p /
--top-k / --max-tokens / --seed / --stop / --allow-tool / --extra)
layer over agent.defaultParams. In-REPL slash-commands: /set KEY VAL,
/system <text>, /tools (list project's MCP servers), /clear (new
thread), /save (PATCH agent.defaultParams = current overrides),
/quit.
* src/cli/src/commands/create.ts — `create agent` mirroring the llm
pattern. Every yaml-applyable field has a corresponding flag (memory
rule); --default-temperature / --default-top-p / --default-top-k /
--default-max-tokens / --default-seed / --default-stop /
--default-extra / --default-params-file all populate agent.defaultParams.
* src/cli/src/commands/apply.ts — AgentSpecSchema accepts both `llm:
qwen3-thinking` shorthand and `llm: { name: ... }` long form; runs
after llms in the apply order so apiKey/llm references resolve. Round-
trips with `get agent foo -o yaml | apply -f -` (memory rule).
* src/cli/src/commands/get.ts — agentColumns (NAME, LLM, PROJECT,
DESCRIPTION, ID); RESOURCE_KIND mapping for yaml export.
* src/cli/src/commands/shared.ts — `agent`/`agents`/`thread`/`threads`
added to RESOURCE_ALIASES.
* src/cli/src/index.ts — wires createChatCommand into the program; passes
the resolved baseUrl + token so chat can stream SSE without going
through ApiClient (which only does buffered request/response).
* completions/mcpctl.{fish,bash} regenerated. scripts/generate-completions.ts
knows about agents (canonical + aliases) and emits a special-case
`chat)` block that completes the first arg with `mcpctl get agents`
names. tests/completions.test.ts: +9 new assertions covering agents in
the resource list, chat in the commands list, --llm flag for create
agent, agent-name completion for chat, etc.
CLI suite: 430/430 (was 421). Completions --check is clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
6ff90a8228 |
feat(mcpd): Llm resource — CRUD + CLI + apply
Why: every client that wants an LLM (the agent, HTTP-mode mcplocal, Claude
Code's STDIO mcplocal) today has to know the provider URL + key, and each
user's ~/.mcpctl/config.json carries them. Centralising the catalogue on the
server is the prerequisite for Phase 2 (mcpd proxies inference so credentials
never leave the cluster).
This phase adds the `Llm` resource and its CRUD surface — no proxy yet, no
client pivot yet. Just enough to register what you have.
Schema:
- New `Llm` model: name/type/model/url/tier/description + {apiKeySecretId,
apiKeySecretKey} FK pair. Reverse `llms` relation on Secret.
- Provider types: anthropic | openai | deepseek | vllm | ollama | gemini-cli.
- Tiers: fast | heavy.
mcpd:
- LlmRepository + LlmService + Zod validation schema + /api/v1/llms routes.
- API surface exposes `apiKeyRef: {name, key}` — the service translates to/
from the FK pair so clients never deal in cuids.
- `resolveApiKey(llmName)` reads through SecretService (which itself dispatches
to the right SecretBackend). That's the hook Phase 2's inference proxy uses.
- RBAC: added `'llms'` to RBAC_RESOURCES + resource alias. Standard
view/create/edit/delete semantics.
- Wired into main.ts (repo, service, routes).
CLI:
- `mcpctl create llm <name> --type X --model Y --tier fast|heavy --api-key-ref SECRET/KEY [--url ...] [--extra k=v ...]`
- `mcpctl get|describe|delete llm` — standard resource verbs.
- `mcpctl apply -f` with `kind: llm` (single- or multi-doc yaml/json).
Applied after secrets, before servers — apiKeyRef resolves an existing Secret.
- Shell completions regenerated.
Tests: 11 service unit tests + 9 route tests (happy path, 404s, 409, validation).
Full suite 1812/1812 (+20 from the 1792 Phase 0 baseline). TypeScript clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
029c3d5f34 |
feat(mcpd): pluggable SecretBackend abstraction + OpenBao driver + migrate
All checks were successful
CI/CD / typecheck (pull_request) Successful in 51s
CI/CD / lint (pull_request) Successful in 1m47s
CI/CD / test (pull_request) Successful in 1m3s
CI/CD / smoke (pull_request) Successful in 4m34s
CI/CD / build (pull_request) Successful in 3m50s
CI/CD / publish (pull_request) Has been skipped
Why: API keys live in Postgres as plaintext JSON. A DB read exposes every credential in the system. Before centralising more secrets (LLM keys, etc.) we want to be able to point at an external KV store and drop DB access to sensitive rows. New model: - `SecretBackend` resource (CRUD + isDefault invariant) owns how a secret is stored. `Secret` gains `backendId` FK and `externalRef`. Reads/writes dispatch through a driver. - `plaintext` driver (near-noop, uses existing Secret.data column) is seeded as the `default` row at startup. Acts as trust root / bootstrap. - `openbao` driver (also HashiCorp Vault KV v2 compatible) talks plain HTTP, no SDK dependency. Auth via static token pulled from a plaintext-backed `Secret` through the injected SecretRefResolver. Caches resolved token. - `SecretMigrateService` moves secrets one-at-a-time: read → write dest → flip row → best-effort source delete. Interrupted runs are idempotent (skips secrets already on destination). CLI surface: - `mcpctl create|get|describe|delete secretbackend` + `--default` on create. - `mcpctl migrate secrets --from X --to Y [--names a,b] [--keep-source] [--dry-run]` - `apply -f` round-trips secretbackends (yaml/json multi-doc + grouped). - RBAC: `secretbackends` resource + `run:migrate-secrets` operation. - Fish + bash completions regenerated. docs/secret-backends.md covers the OpenBao policy, chicken-and-egg auth flow, and the migration semantics. Broke the circular dep (OpenBao needs SecretService to resolve its own token, SecretService needs SecretBackendService) with a deferred-resolver bridge in mcpd startup. 11 new driver unit tests; existing env-resolver/secret-route/ backup tests updated for the new service signatures. Full suite: 1792/1792. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
03827f11e4 |
feat: eager vLLM warmup and smart page titles in paginate stage
- Add warmup() to LlmProvider interface for eager subprocess startup - ManagedVllmProvider.warmup() starts vLLM in background on project load - ProviderRegistry.warmupAll() triggers all managed providers - NamedProvider proxies warmup() to inner provider - paginate stage generates LLM-powered descriptive page titles when available, cached by content hash, falls back to generic "Page N" - project-mcp-endpoint calls warmupAll() on router creation so vLLM is loading while the session initializes Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |
||
|
|
69867bd47a |
feat: mcpctl v0.0.1 — first public release
Comprehensive MCP server management with kubectl-style CLI. Key features in this release: - Declarative YAML apply/get round-trip with project cloning support - Gated sessions with prompt intelligence for Claude - Interactive MCP console with traffic inspector - Persistent STDIO connections for containerized servers - RBAC with name-scoped bindings - Shell completions (fish + bash) auto-generated - Rate-limit retry with exponential backoff in apply - Project-scoped prompt management - Credential scrubbing from git history Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |