feat(cli+docs+smoke): inference-task CLI + GC ticker + smoke + docs (v5 Stage 4)
Some checks failed
CI/CD / lint (pull_request) Successful in 55s
CI/CD / test (pull_request) Successful in 1m12s
CI/CD / typecheck (pull_request) Successful in 2m46s
CI/CD / smoke (pull_request) Failing after 1m44s
CI/CD / build (pull_request) Failing after 7m0s
CI/CD / publish (pull_request) Has been skipped

CLI surface for the durable queue:
- `mcpctl get tasks`: table view (ID, STATUS, POOL, LLM, MODEL,
  STREAM, AGE, WORKER). Aliases `task`, `tasks`, `inference-task`,
  `inference-tasks` all normalize to the canonical plural so URL
  construction works uniformly. RESOURCE_ALIASES + completions
  generator updated.
- `mcpctl chat-llm <name> --async -m <msg>`: enqueue and exit. stdout
  is just the task id (pipeable into `xargs mcpctl get task`); stderr
  carries human-readable status. REPL mode is rejected for --async
  (fire-and-forget doesn't make sense without -m).
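
A minimal sketch of the alias normalization described above. Only the four alias pairs come from this commit; `normalizeResource`, `resourceUrl`, and the URL shape are hypothetical names used for illustration, not the actual CLI code.

```typescript
// Alias table: only these four entries are taken from the commit.
const RESOURCE_ALIASES: Record<string, string> = {
  task: 'inference-tasks',
  tasks: 'inference-tasks',
  'inference-task': 'inference-tasks',
  'inference-tasks': 'inference-tasks',
};

// Hypothetical helper: resolve whatever the operator typed to the
// canonical plural, or fail loudly on an unknown resource.
function normalizeResource(input: string): string {
  const canonical = RESOURCE_ALIASES[input.toLowerCase()];
  if (canonical === undefined) {
    throw new Error(`unknown resource: ${input}`);
  }
  return canonical;
}

// Hypothetical helper: the base URL and path layout are assumptions.
// Because every alias collapses to one name, there is one URL shape.
function resourceUrl(baseUrl: string, input: string, id?: string): string {
  const path = `${baseUrl}/${normalizeResource(input)}`;
  return id === undefined ? path : `${path}/${encodeURIComponent(id)}`;
}
```

With this shape, `mcpctl get task <id>` and `mcpctl get inference-tasks <id>` would resolve to the same request path.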

GC ticker in mcpd: runs every 5 min. Pending tasks older than the 1 h
queue timeout flip to error with a clear message; terminal tasks older
than the 7 d retention window are deleted. Both queries are
index-backed.
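
The sweep rules above can be sketched against an in-memory map. This is an illustration under assumptions, not the mcpd implementation: the row shape, field names, `sweep`, `startGcTicker`, and the error text are all hypothetical, and the real ticker runs index-backed database queries instead of scanning.

```typescript
// Only statuses named in this commit; the real model may have more.
type TaskStatus = 'pending' | 'completed' | 'error' | 'cancelled';

interface TaskRow {
  id: string;
  status: TaskStatus;
  createdAt: number;   // epoch ms
  finishedAt?: number; // epoch ms, set once the task is terminal
  error?: string;
}

const GC_INTERVAL_MS = 5 * 60 * 1000;         // 5-min tick
const QUEUE_TIMEOUT_MS = 60 * 60 * 1000;      // 1 h pending timeout
const RETENTION_MS = 7 * 24 * 60 * 60 * 1000; // 7 d terminal retention

// One GC pass: expire stale pending rows, delete old terminal rows.
function sweep(rows: Map<string, TaskRow>, now: number): { expired: number; deleted: number } {
  let expired = 0;
  let deleted = 0;
  for (const row of Array.from(rows.values())) {
    if (row.status === 'pending') {
      if (now - row.createdAt > QUEUE_TIMEOUT_MS) {
        row.status = 'error';
        row.error = 'queue timeout: no worker claimed the task within 1 h'; // hypothetical text
        row.finishedAt = now;
        expired++;
      }
    } else if (row.finishedAt !== undefined && now - row.finishedAt > RETENTION_MS) {
      rows.delete(row.id);
      deleted++;
    }
  }
  return { expired, deleted };
}

// Hypothetical ticker wiring; the caller keeps the handle to clearInterval on shutdown.
function startGcTicker(rows: Map<string, TaskRow>): ReturnType<typeof setInterval> {
  return setInterval(() => sweep(rows, Date.now()), GC_INTERVAL_MS);
}
```

A row that expires in one pass is stamped with `finishedAt = now`, so it is kept for the full retention window before a later pass deletes it.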

Crash fix uncovered by the smoke: the async route doesn't await
ref.done, so a later cancel/error rejected the in-flight Promise with
no handler attached, and the unhandled rejection crashed mcpd. The
route now attaches a no-op `.catch`, so the legacy `done` semantics
still work for sync callers (chat, direct infer) without taking down
the process for async ones. EnqueueInferOptions also gained an
explicit `ownerId` field so the async API can stamp the authenticated
user on the row instead of inheriting 'system' from the constructor's
resolveOwner. Without this, every GET/DELETE from the original caller
would 404 due to foreign-owner mismatch.
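
A sketch of the no-op `.catch` pattern, assuming a task ref that exposes an `id` and a `done` Promise. `deferred`, `TaskRef`, and `enqueueAsync` are hypothetical names for illustration; only the idea of attaching a no-op handler on the async path comes from this commit.

```typescript
interface Deferred<T> {
  promise: Promise<T>;
  resolve: (value: T) => void;
  reject: (err: Error) => void;
}

// Small helper to settle a Promise from outside its executor.
function deferred<T>(): Deferred<T> {
  let resolve!: (value: T) => void;
  let reject!: (err: Error) => void;
  const promise = new Promise<T>((res, rej) => {
    resolve = res;
    reject = rej;
  });
  return { promise, resolve, reject };
}

// Hypothetical stand-in for the ref the queue hands back on enqueue.
interface TaskRef {
  id: string;
  done: Promise<string>;
}

// Async route: returns the id immediately and never awaits `done`.
// The no-op handler marks the rejection as handled so a later
// cancel/error cannot surface as an unhandled rejection.
function enqueueAsync(ref: TaskRef): string {
  ref.done.catch(() => {
    /* outcome is persisted on the task row; nothing to do here */
  });
  return ref.id;
}
```

`.catch` returns a new Promise and leaves `ref.done` itself untouched, so a sync caller that awaits `ref.done` still sees the rejection and can handle it as before.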

Smoke (tests/smoke/inference-task.smoke.test.ts):
1. POST /inference-tasks while no worker bound → row=pending.
2. Bring a registrar online → bindSession drain claims and
   dispatches → worker complete()s → row=completed → GET returns
   the assistant body.
3. Stop worker, enqueue, DELETE → row=cancelled, persisted.

docs/inference-tasks.md (new): full data model, lifecycle diagram,
async API reference, CLI examples, RBAC table, GC defaults, and the
v5 limitations / v6 roadmap. Cross-linked from virtual-llms.md and
agents.md.

Tests + smoke: mcpd 893/893, mcplocal 723/723, cli 437/437, full
smoke 146/146 (was 144, +2 new task smoke). Live mcpd verified via
manual curl (enqueue → cancel → re-fetch): no crash, owner scoping
returns 404 on foreign ids, GC ticker logs at info when it sweeps.
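
The owner-scoping behaviour checked here can be sketched as follows. This is an in-memory illustration under assumptions: `TaskStore`, `EnqueueOptions`, and the response shape are hypothetical stand-ins, and only the `ownerId` option, the 'system' default, and the 404 on foreign ids come from this commit.

```typescript
interface OwnedTask {
  id: string;
  ownerId: string;
  status: string;
}

// Hypothetical stand-in for EnqueueInferOptions, reduced to the one field that matters here.
interface EnqueueOptions {
  ownerId?: string;
}

type GetResult = { status: 200; body: OwnedTask } | { status: 404 };

class TaskStore {
  private readonly rows = new Map<string, OwnedTask>();
  private readonly defaultOwner: string;
  private nextId = 1;

  constructor(defaultOwner: string = 'system') {
    this.defaultOwner = defaultOwner;
  }

  // An explicit ownerId wins; otherwise the row inherits the default owner.
  enqueue(opts: EnqueueOptions = {}): OwnedTask {
    const row: OwnedTask = {
      id: `task-${this.nextId++}`,
      ownerId: opts.ownerId ?? this.defaultOwner,
      status: 'pending',
    };
    this.rows.set(row.id, row);
    return row;
  }

  // Foreign ids are reported as 404, the same as ids that do not exist.
  get(id: string, callerId: string): GetResult {
    const row = this.rows.get(id);
    if (row === undefined || row.ownerId !== callerId) {
      return { status: 404 };
    }
    return { status: 200, body: row };
  }
}
```

Without the explicit `ownerId`, the row in this sketch is owned by 'system' and the caller who enqueued it gets a 404 on their own task, which is the failure mode the new field avoids.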

v5 complete: durable queue (Stage 1) + VirtualLlmService rewire
(Stage 2) + async API & RBAC (Stage 3) + CLI/GC/smoke/docs (Stage 4).

@@ -186,7 +186,7 @@ async function extractTree(): Promise<CmdInfo> {
 const CANONICAL_RESOURCES = [
   'servers', 'instances', 'secrets', 'secretbackends', 'llms', 'agents', 'personalities', 'templates', 'projects',
   'users', 'groups', 'rbac', 'prompts', 'promptrequests',
-  'serverattachments', 'proxymodels', 'all',
+  'serverattachments', 'proxymodels', 'inference-tasks', 'all',
 ];
 
 const ALIAS_ENTRIES: [string, string][] = [
@@ -206,6 +206,10 @@ const ALIAS_ENTRIES: [string, string][] = [
   ['promptrequest', 'promptrequests'], ['promptrequests', 'promptrequests'], ['pr', 'promptrequests'],
   ['serverattachment', 'serverattachments'], ['serverattachments', 'serverattachments'], ['sa', 'serverattachments'],
   ['proxymodel', 'proxymodels'], ['proxymodels', 'proxymodels'], ['pm', 'proxymodels'],
+  // v5: inference-task queue. Short forms (`task`, `tasks`) are what
+  // operators actually type — `mcpctl get tasks`.
+  ['task', 'inference-tasks'], ['tasks', 'inference-tasks'],
+  ['inference-task', 'inference-tasks'], ['inference-tasks', 'inference-tasks'],
   ['all', 'all'],
 ];
 