Commit Graph

2 Commits

Author SHA1 Message Date
Michal
1f0be8a5c1 fix(agents): close gaps from /gstack-review
P1 — thread reads now enforce ownership
========================================
chat.service.ts / routes/agent-chat.ts
  GET /api/v1/threads/:id/messages was previously RBAC-mapped to
  view:agents (no resourceName scope) with the route comment promising
  "service-level owner check enforces fine-grained access" — but the
  service didn't actually check. Any caller with view:agents could read
  another user's thread by guessing/learning the threadId. CUIDs are
  hard to brute-force but they leak: SSE `final` chunks, agents-plugin
  `_meta.threadId`, and several response bodies surface them. Now
  ChatService.listMessages(threadId, ownerId) loads the thread, returns
  404 (not 403, to avoid id-enumeration via differential status codes)
  if ownerId doesn't match. Regression test in chat-service.test.ts
  covers Alice/Bob isolation + nonexistent-thread same-shape 404.

P2 — AgentChatRequestSchema strict mode
========================================
validation/agent.schema.ts
  `.merge()` does NOT inherit `.strict()` from AgentChatParamsSchema.
  Typo'd fields (e.g. `temprature`) silently fell through and the agent
  silently used the default — debuggable only by reading the LLM call
  payload. Re-applied `.strict()` on the merged schema.

P2 — per-agent maxIterations override + clamp
==============================================
chat.service.ts
  Loop cap was a hard-coded module constant (12), wrong for both
  research-style agents (need higher) and cheap-probe agents (could opt
  lower). Now reads `agent.extras.maxIterations`, clamps 1..50, falls
  back to 12 default. The clamp is the soft-DoS guard: a hostile agent
  definition with `maxIterations:1000000` can't burn unbounded LLM calls
  per request. Both chat() and chatStream() use ctx.maxIterations now.
  Regression test covers low-cap override (rejects with `exceeded 2`)
  and hostile-value clamp (rejects with `exceeded 50`).

P3 — SSE write to closed socket
================================
routes/agent-chat.ts
  When the upstream adapter throws after some chunks were already
  written AND the client disconnected, the catch block tried to flush
  more chunks to a closed socket. Without an `on('error')` handler
  Node emits unhandled error events; once Pino is wired to alerts
  this'd page on every disconnect-mid-stream. writeSseChunk now
  checks `reply.raw.destroyed || writableEnded` before write.

P3 — BACKEND_TOKEN_DEAD preserves original stack
=================================================
services/secret-backend-rotator.service.ts
  When wrapping mintRoleToken/lookupSelf failures as
  BACKEND_TOKEN_DEAD, the new Error() discarded the original throw —
  hard to tell whether the inner failure was a network blip vs an
  OpenBao API mismatch vs DNS. Now uses `new Error(msg, { cause: err })`
  so the inner stack survives.

P3 — .gitignore .claude/scheduled_tasks.lock
=============================================
This persisted state file was leaking into every `git status`.

Tests
=====
mcpd 761/761 (+2 regression tests). mcplocal 715/715. cli 430/430.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 23:53:19 +01:00
Michal
eda8e79712 feat(agents): mcpd repos + Agent/Chat services with tool-use loop (Stage 2)
Layers the persistence-side logic on top of the Stage 1 schema. AgentService
mirrors LlmService's CRUD shape with name-resolved llm/project references and
yaml round-trip support; ChatService is the orchestrator that drives one chat
turn end-to-end: build the merged system block (agent.systemPrompt + project
Prompts ordered by priority desc + per-call systemAppend), persist the user
turn, run the adapter, dispatch any tool_calls through an injected
ChatToolDispatcher, persist tool turns linked back via toolCallId, and loop
until the model returns terminal text.

Per-call params resolve LiteLLM-style: request body → agent.defaultParams →
adapter default. The escape hatch `extra` is forwarded as-is so each adapter
can cherry-pick provider-specific knobs (Anthropic metadata, vLLM
repetition_penalty, etc.) without code changes here.

Persistence is non-transactional across the loop because tool calls can take
minutes; long-held DB transactions would starve other writers. Instead each
in-flight assistant turn is written `pending` and flipped to `complete` only
after its tool results land. On failure or max-iter overrun, every `pending`
row in the thread is flipped to `error` so the trail is auditable.

Tools are namespaced on the wire as `<server>__<tool>`, unmarshalled at
dispatch time; `tools_allowlist` filters before the model sees the list.

Tests:
  agent-service.test.ts (7) — CRUD with name-resolved llm/project, conflict
    on duplicate, llm switch, project detach, listByProject filtering,
    upsertByName branch coverage.
  chat-service.test.ts (9) — plain text turn, full text→tool→text loop with
    toolCallId linkage, max-iter cap leaves zero pending, adapter-throws
    leaves zero pending, body→defaultParams merge, `extra` passthrough,
    project-Prompt priority ordering in the system block, tool-without-
    project rejection, tools_allowlist filtering.

All 16 green; full mcpd suite still 737/737.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 16:38:38 +01:00