Compare commits

..

4 Commits

Author SHA1 Message Date
Michal
c7b1bd8e2c feat(mcpd): AgentService virtual methods + GC cascade (v3 Stage 2)
State machine for kind=virtual Agent rows. Mirrors what
VirtualLlmService did for Llms in v1, then wires both lifecycles
together so disconnect/heartbeat/GC cascade through both at once.

AgentRepository:
- create/update accept the new lifecycle fields (kind, providerSessionId,
  status, lastHeartbeatAt, inactiveSince).
- Adds findBySessionId, findByLlmId, findStaleVirtuals, findExpiredInactives.
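
A hedged sketch of that surface in TypeScript; the row shape and exact
signatures are assumptions for illustration (the string unions below stand
in for the reused LlmKind/LlmStatus enums):

type LlmKind = 'public' | 'virtual';
type LlmStatus = 'active' | 'inactive';

interface AgentRow {
  id: string;
  name: string;
  llmId: string;
  kind: LlmKind;
  providerSessionId: string | null;
  status: LlmStatus;
  lastHeartbeatAt: Date | null;
  inactiveSince: Date | null;
}

interface AgentRepository {
  // create/update now accept the lifecycle fields.
  create(input: Partial<AgentRow> & { name: string; llmId: string }): Promise<AgentRow>;
  update(id: string, patch: Partial<AgentRow>): Promise<AgentRow>;
  // New finders backing the cascade and GC paths.
  findBySessionId(sessionId: string): Promise<AgentRow[]>;
  findByLlmId(llmId: string): Promise<AgentRow[]>;
  findStaleVirtuals(cutoff: Date): Promise<AgentRow[]>;    // active, heartbeat older than cutoff
  findExpiredInactives(cutoff: Date): Promise<AgentRow[]>; // inactive since before cutoff
}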

AgentService — new virtual-agent methods:
- registerVirtualAgents(sessionId, inputs, ownerId) — sticky upsert.
  New names insert as kind=virtual/status=active. Existing virtuals
  owned by the same session reactivate; existing inactive virtuals
  from a foreign session can be adopted (sticky reconnect). Refuses
  to overwrite a public agent or a foreign session's still-active
  virtual (HTTP 409). Pinned LLM is resolved via LlmService — caller
  posts Llms first.
- heartbeatVirtualAgents(sessionId) — bumps owned agents on a session
  heartbeat; revives inactive rows.
- markVirtualAgentsInactiveBySession(sessionId) — disconnect cascade.
- deleteVirtualAgentsForLlm(llmId) — defensive cascade for the GC's
  Llm-delete step (Agent.llmId is Restrict).
- gcSweepVirtualAgents() — same shape as VirtualLlmService.gcSweep
  (90s heartbeat-stale → inactive, 4h inactive → delete).
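
Of these, registerVirtualAgents carries the branching. A hedged sketch of
one name's worth of the sticky upsert, against the AgentRepository sketched
above; findByName is assumed from the existing v1 CRUD surface, and
HttpError stands in for whatever 409-carrying error the service really
throws:

class HttpError extends Error {
  constructor(public readonly status: number, message: string) {
    super(message);
  }
}

type VirtualAgentInput = { name: string; llmId: string };

async function upsertVirtualAgent(
  repo: AgentRepository & { findByName(name: string): Promise<AgentRow | null> },
  sessionId: string,
  input: VirtualAgentInput,
): Promise<AgentRow> {
  const existing = await repo.findByName(input.name);

  // New name: insert as kind=virtual / status=active, owned by this session.
  if (!existing) {
    return repo.create({
      ...input,
      kind: 'virtual',
      status: 'active',
      providerSessionId: sessionId,
      lastHeartbeatAt: new Date(),
    });
  }

  // Never overwrite a public agent.
  if (existing.kind === 'public') {
    throw new HttpError(409, `agent '${input.name}' already exists as public`);
  }

  // Same session reactivates; an inactive foreign virtual is adopted
  // (sticky reconnect).
  if (existing.providerSessionId === sessionId || existing.status === 'inactive') {
    return repo.update(existing.id, {
      llmId: input.llmId,
      status: 'active',
      providerSessionId: sessionId,
      lastHeartbeatAt: new Date(),
      inactiveSince: null,
    });
  }

  // A foreign session's still-active virtual is off limits.
  throw new HttpError(409, `agent '${input.name}' is active under another session`);
}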

VirtualLlmService:
- Optional AgentService dependency. heartbeat() now also bumps owned
  agents; unbindSession() flips them inactive. gcSweep() runs the
  agent sweep FIRST (so any agent that would block an Llm delete via
  Restrict is already gone), and adds a defensive
  deleteVirtualAgentsForLlm step right before each Llm delete in case
  an agent's heartbeat lagged its Llm's just enough to escape this
  round's 4h cutoff.
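
In rough TypeScript, with the Llm-side method names assumed, the ordering
looks like:

interface AgentCascade {
  gcSweepVirtualAgents(): Promise<void>;
  deleteVirtualAgentsForLlm(llmId: string): Promise<void>;
}

interface LlmSweepStore {
  findExpiredVirtualLlms(): Promise<Array<{ id: string }>>;
  deleteLlm(id: string): Promise<void>;
}

// Agents first, so nothing holds the Restrict FK when the Llm deletes run;
// then a defensive per-Llm cascade as the second net.
async function gcSweep(agents: AgentCascade, llms: LlmSweepStore): Promise<void> {
  // 1) Agent sweep FIRST: 90s heartbeat-stale to inactive, 4h inactive to delete.
  await agents.gcSweepVirtualAgents();

  // 2) Llm sweep. An agent whose heartbeat lagged its Llm's just enough to
  //    escape this round's 4h cutoff would still hold the FK, so cascade
  //    right before each delete.
  for (const llm of await llms.findExpiredVirtualLlms()) {
    await agents.deleteVirtualAgentsForLlm(llm.id);
    await llms.deleteLlm(llm.id);
  }
}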

main.ts:
- VirtualLlmService construction moves below AgentService so it can
  receive the cascade dependency.
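
Constructor shapes below are assumptions; only the ordering is the point:

// Ordering sketch only; signatures are illustrative, not the real API.
declare class AgentService { constructor(repo: unknown); }
declare class VirtualLlmService { constructor(repo: unknown, opts?: { agents?: AgentService }); }
declare const agentRepository: unknown, llmRepository: unknown;

const agentService = new AgentService(agentRepository);          // built first...
const virtualLlmService = new VirtualLlmService(llmRepository, { // ...so the cascade
  agents: agentService,                                          // dependency exists.
});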

Tests: 13 new in virtual-agent-service.test.ts cover all the register
variants (insert, sticky reconnect, adopt-inactive-foreign, refuse
public-overwrite, refuse foreign-session-active), heartbeat-revive,
disconnect-cascade, deleteVirtualAgentsForLlm scope, GC sweep flip
+ delete + idempotence, and three VirtualLlmService cascade scenarios
(unbindSession, gcSweep deleting agent before Llm, defensive cascade
when agent's heartbeat lagged).

mcpd suite: 854/854 (was 841 + 13 new). Workspace unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 17:07:23 +01:00
Michal
9afd24a3aa feat(db+mcpd): Agent lifecycle + chat.service kind=virtual branch (v3 Stage 1)
Two pieces of v3 plumbing — schema + the latent v1 chat.service bug.

Schema (db):
- Agent gains kind/providerSessionId/lastHeartbeatAt/status/inactiveSince
  mirroring Llm's v1 lifecycle. Reuses LlmKind / LlmStatus enums; no
  new types. Existing rows backfill kind=public/status=active so v1
  CRUD is unaffected.
- @@index([kind, status]) for the GC sweep, @@index([providerSessionId])
  for disconnect-cascade lookups.
- 4 new prisma-level tests cover defaults, persisting virtual fields,
  the (kind, status) GC index, and providerSessionId lookups.
  Total agent-schema tests: 20/20.
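
For illustration, the two lookups those indexes serve, via the generated
Prisma client (exact enum literals assumed, since the schema reuses the
Llm enums):

import { PrismaClient } from '@prisma/client';

const prisma = new PrismaClient();

async function sweepLookups(sessionId: string) {
  // GC sweep scan: @@index([kind, status]) narrows to virtual+active rows
  // before the heartbeat comparison.
  const stale = await prisma.agent.findMany({
    where: {
      kind: 'virtual',
      status: 'active',
      lastHeartbeatAt: { lt: new Date(Date.now() - 90_000) },
    },
  });

  // Disconnect cascade: @@index([providerSessionId]) makes the
  // session-owned lookup an index scan.
  const owned = await prisma.agent.findMany({
    where: { providerSessionId: sessionId },
  });

  return { stale, owned };
}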

chat.service (mcpd) — fixes the v1 latent bug:
- LlmView's kind is now plumbed through prepareContext as ctx.llmKind.
- Two new private helpers, runOneInference / streamInference, branch
  on ctx.llmKind: 'public' goes through the existing adapter
  registry, 'virtual' relays through VirtualLlmService.enqueueInferTask
  (mirrors the route-handler branch from v1 Stage 3).
- Streaming bridges VirtualLlmService's onChunk callback API to an
  async iterator via a small queue + wake pattern (sketched after this list).
- ChatService gains an optional virtualLlms constructor parameter;
  main.ts wires it in. Older test wirings without it raise a clear
  "virtualLlms dispatcher not wired" error when the row is virtual,
  rather than silently falling through to the public path against an
  empty URL.
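
The queue + wake pattern, as a standalone hedged sketch (the real handler
shape on VirtualLlmService is assumed):

// Bridge a push-style onChunk callback to a pull-style async iterator.
// Chunks queue up while the consumer is busy; when the consumer awaits an
// empty queue, it parks on a promise that the next push resolves ("wake").
function chunksToAsyncIterable(
  start: (handlers: {
    onChunk: (text: string) => void;
    onDone: () => void;
    onError: (err: Error) => void;
  }) => void,
): AsyncIterable<string> {
  const queue: string[] = [];
  let done = false;
  let error: Error | null = null;
  let wake: (() => void) | null = null;

  const notify = () => {
    wake?.();
    wake = null;
  };

  start({
    onChunk: (text) => { queue.push(text); notify(); },
    onDone: () => { done = true; notify(); },
    onError: (err) => { error = err; notify(); },
  });

  return {
    async *[Symbol.asyncIterator]() {
      for (;;) {
        if (queue.length > 0) { yield queue.shift()!; continue; }
        if (error) throw error;
        if (done) return;
        await new Promise<void>((resolve) => { wake = resolve; });
      }
    },
  };
}

A consumer (here, streamInference) then just does
`for await (const chunk of bridge) { ... }`.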

This unblocks any Agent (public OR future v3-virtual) pinned to a
kind=virtual Llm. Before this stage, those agents 502'd against the
empty url field.

Tests: 4 new in chat-service-virtual-llm.test.ts cover the relay path
(non-streaming, streaming, missing-dispatcher error, and rejection
surfacing). mcpd suite: 841/841 (was 833; +8 across stage 1 + v3 Stage 1).
Workspace: 2054/2054 across 153 files.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 17:07:23 +01:00
9374a2652b perf: vitest threads pool + Dockerfile pnpm cache mount (#66)
Some checks failed
CI/CD / lint (push) Successful in 58s
CI/CD / test (push) Successful in 1m11s
CI/CD / typecheck (push) Successful in 2m35s
CI/CD / smoke (push) Failing after 1m43s
CI/CD / build (push) Successful in 2m21s
CI/CD / publish (push) Has been skipped
2026-04-27 16:07:05 +00:00
Michal
18245be0c1 perf: vitest threads pool + Dockerfile pnpm cache mount
Some checks failed
CI/CD / typecheck (pull_request) Successful in 56s
CI/CD / test (pull_request) Successful in 1m9s
CI/CD / lint (pull_request) Successful in 2m40s
CI/CD / smoke (pull_request) Failing after 1m43s
CI/CD / build (pull_request) Failing after 7m6s
CI/CD / publish (pull_request) Has been skipped
Two tuning knobs that were leaving most of the host idle:

1) vitest.config.ts pool=threads with maxThreads ≈ cores/2.
   The default left this 64-core workstation at ~10% CPU during
   `pnpm test:run`. The threads pool uses the box: the same
   152-file/2050-test suite now runs at ~700% CPU instead of ~150%.
   Wall-time gain is modest (the workload is dominated by a handful of
   slow individual files that one thread must run serially), but the
   parallel headroom is there for when the suite grows. Cap =
   max(2, cores/2) keeps laptops reasonable; override with
   `VITEST_MAX_THREADS=N` in the env.

2) Dockerfile.mcpd uses BuildKit cache mounts on both pnpm install
   steps. Adds `# syntax=docker/dockerfile:1.6` and a
   `--mount=type=cache,target=/root/.local/share/pnpm/store` so
   pnpm's content-addressed store survives across image rebuilds.
   Cold rebuilds where the lockfile changed are unaffected; warm
   rebuilds where only source changed drop the install step from
   ~60s to <5s. fulldeploy.sh's mcpd image rebuild gets that time
   back, minus whatever the docker push re-uploads when layer
   hashes change.

Test parity: 2050/2050 across 152 files; per-package mcpd 837/837.
Both unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 17:06:39 +01:00
2 changed files with 30 additions and 4 deletions

Dockerfile.mcpd

@@ -1,3 +1,9 @@
+# syntax=docker/dockerfile:1.6
+# `# syntax=...` enables BuildKit's --mount feature on the builder so we can
+# share the pnpm content-addressed store across image builds. Without it the
+# next two RUN steps fall back to plain mode and the cache mount is ignored
+# (build still works, just slower).
 # Stage 1: Build TypeScript
 FROM node:20-alpine AS builder
@@ -12,8 +18,12 @@ COPY src/db/package.json src/db/tsconfig.json src/db/
 COPY src/shared/package.json src/shared/tsconfig.json src/shared/
 COPY src/web/package.json src/web/tsconfig.json src/web/
-# Install all dependencies
-RUN pnpm install --frozen-lockfile
+# Install all dependencies. The cache mount keeps pnpm's CAS store warm
+# across builds: only newly-changed packages get downloaded; everything
+# else hardlinks from the cache. Drops install from ~60s to <5s on a
+# warm cache. `--frozen-lockfile` still guarantees lockfile fidelity.
+RUN --mount=type=cache,id=pnpm-store-mcpd-builder,target=/root/.local/share/pnpm/store \
+    pnpm install --frozen-lockfile
 # Copy source code
 COPY src/mcpd/src/ src/mcpd/src/
@@ -42,8 +52,11 @@ COPY src/mcpd/package.json src/mcpd/
 COPY src/db/package.json src/db/
 COPY src/shared/package.json src/shared/
-# Install all deps (prisma CLI needed at runtime for db push)
-RUN pnpm install --frozen-lockfile
+# Install all deps (prisma CLI needed at runtime for db push). Same
+# cache-mount trick as the builder stage; separate cache id so the two
+# stages don't compete for the same lock.
+RUN --mount=type=cache,id=pnpm-store-mcpd-runtime,target=/root/.local/share/pnpm/store \
+    pnpm install --frozen-lockfile
 # Copy prisma schema and generate client
 COPY src/db/prisma/ src/db/prisma/

vitest.config.ts

@@ -1,8 +1,21 @@
 import { defineConfig } from 'vitest/config';
+import { availableParallelism } from 'node:os';
+// Default vitest's pool to ~half the CPU threads we have. The previous
+// implicit default left this 64-thread workstation at ~10% utilization
+// during `pnpm test:run`. Half is a soft cap that stays kind to laptops
+// (8-thread → 4 workers) while letting beefy hosts push closer to the
+// box's actual capacity. Override at run time with VITEST_MAX_THREADS.
+const cores = availableParallelism();
+const maxThreads = Number(process.env['VITEST_MAX_THREADS'] ?? Math.max(2, Math.floor(cores / 2)));
+
 export default defineConfig({
   test: {
     globals: true,
+    pool: 'threads',
+    poolOptions: {
+      threads: { maxThreads, minThreads: 1 },
+    },
     coverage: {
       provider: 'v8',
       reporter: ['text', 'json', 'html'],