Files
mcpctl/src/mcplocal/tests/smoke/mcp-client.ts

276 lines
9.0 KiB
TypeScript
Raw Normal View History

/**
* Lightweight MCP HTTP client for smoke tests.
* Sends JSON-RPC messages to mcplocal's HTTP endpoint and parses SSE responses.
*/
import http from 'node:http';
fix(smoke,rotator,auth): repair smoke env + close failure modes that caused 27 post-deploy smoke failures This commit lands the durable side of the post-deploy investigation: genuine bugs that let the upstream OpenBao re-init silently break every secret write for 4 days, plus test-code bugs that masked the same breakage in the smoke output. mcpd — fail loud on dead OpenBao tokens ======================================= secret-backend-rotator.service.ts When `mintRoleToken` or `lookupSelf` returns 403/401, classify it as BACKEND_TOKEN_DEAD (likely cause: upstream OpenBao re-init invalidated every pre-existing token), wrap the thrown error with explicit remediation (mint via root + `mcpctl create secret <name> --data <key>=<token> --force`), persist the same message to tokenMeta.lastRotationError, and emit a structured `level:fatal` console.error so it shows up in `kubectl logs deploy/mcpd` with grep- friendly `kind:BACKEND_TOKEN_DEAD`. Adds a `healthCheck(backendId)` method that runs lookup-self without minting — so the boot-time loop can detect the dead-token state immediately, not 24 hours later. secret-backend-rotator-loop.ts Boot-time health check: in `start()`, for every rotatable backend, call `rotator.healthCheck(b.id)` and on failure log a structured fatal entry. This converts the prior silent failure mode (24h wait until scheduled rotation reveals the dead token, with secret writes failing under it the entire time) into "mcpd boots, immediately sees the dead token, alerts loudly". Existing isOverdue path is unchanged. mcpd — Prisma userId crash on /me ================================= routes/auth.ts GET /api/v1/auth/me used `request.userId!` which lied: an authenticated McpToken bearer satisfies the auth middleware but has no associated User row, so userId stayed undefined and `findUnique({ id: undefined })` threw PrismaClientValidationError. Now returns 401 with a clear "service-account/token-bound principal cannot be queried via /me" message instead of bubbling a 500. mcplocal — token revocation propagation ======================================= http/token-auth.ts Lowered default introspection positiveTtl from 30s → 5s. mcpd's introspection endpoint is a single DB lookup; the cache only protects against burst restart storms, not steady-state load. The 30s window let revoked tokens keep working for the full window after revocation (caught by mcptoken.smoke's negative-cache assertion). Aligns with the existing 5s negativeTtl and the test's `wait 7s after revoke` expectation. smoke tests — read URL the same way the CLI does ================================================ mcp-client.ts Adds `loadMcpdAuth()`: URL from `~/.mcpctl/config.json`, token from `~/.mcpctl/credentials`. Critically, the URL does NOT come from credentials. credentials.mcpdUrl carries a stale field for legacy reasons and goes out of sync (left over from old `mcpctl login --mcpd-url localhost:3xxx` invocations) — tests reading it ended up hitting whatever URL the user last logged into rather than the URL the CLI is actually using right now. audit/security/system-prompts smoke now use loadMcpdAuth(), eliminating ~10 cascade failures. Also: switch httpRequest to https.request when scheme is https (matching audit/security/system-prompts/mcp-client/agent helpers). Bumps default callTool timeout from 30s → 60s; many tools that fetch external resources routinely run 10-30s. agent.smoke.test.ts - readToken read from `credentials.json`; the file is `credentials` (no extension). Caused 401 on POST /threads. - `mcpctl get <resource> <name> -o json` returns an array, not a bare object. Round-trip yaml test now indexes [0] before reading description. secretbackend.smoke.test.ts Two genuine assertion-drift fixes (env was right, test was stale): - "lists at least one secretbackend": stop hard-coding the default backend type as 'plaintext'; the invariant is "exactly one default exists". The seeded plaintext is the bootstrap default but operators routinely promote a remote backend (openbao etc.) once it's healthy. - "refuses to delete the seeded default": widen the regex from /default|in use|cannot delete/ to also accept "referenced" — the exact wording has shifted to "is still referenced by N secret(s); migrate them first". audit.test.ts / system-prompts.test.ts / security.test.ts Switch http.request → https.request when URL is https (each had its own copy of the helper). Drop the now-orphan loadMcpdCredentials in favour of loadMcpdAuth from mcp-client.ts. Tests ===== mcpd 759/759, mcplocal 715/715 unit suites still green. Smoke (live): Run 1 (pre-commit, post bao-token rotation): 27 → 12 failures. Run 2 (after fixes-batch, pre-redeploy): 12 → 2 failures. The remaining 2 (mcptoken cache TTL, proxy-pipeline timeout) are what the durable code changes here address; verify after the next redeploy. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 18:35:13 +01:00
import https from 'node:https';
import { readFileSync, existsSync } from 'node:fs';
import { join } from 'node:path';
import { homedir } from 'node:os';
export interface McpResponse {
status: number;
sessionId?: string;
messages: unknown[];
}
const MCPLOCAL_URL = process.env.MCPLOCAL_URL ?? 'http://localhost:3200';
const MCPD_URL = process.env.MCPD_URL ?? 'http://localhost:3100';
export function getMcplocalUrl(): string {
return MCPLOCAL_URL;
}
export function getMcpdUrl(): string {
return MCPD_URL;
}
fix(smoke,rotator,auth): repair smoke env + close failure modes that caused 27 post-deploy smoke failures This commit lands the durable side of the post-deploy investigation: genuine bugs that let the upstream OpenBao re-init silently break every secret write for 4 days, plus test-code bugs that masked the same breakage in the smoke output. mcpd — fail loud on dead OpenBao tokens ======================================= secret-backend-rotator.service.ts When `mintRoleToken` or `lookupSelf` returns 403/401, classify it as BACKEND_TOKEN_DEAD (likely cause: upstream OpenBao re-init invalidated every pre-existing token), wrap the thrown error with explicit remediation (mint via root + `mcpctl create secret <name> --data <key>=<token> --force`), persist the same message to tokenMeta.lastRotationError, and emit a structured `level:fatal` console.error so it shows up in `kubectl logs deploy/mcpd` with grep- friendly `kind:BACKEND_TOKEN_DEAD`. Adds a `healthCheck(backendId)` method that runs lookup-self without minting — so the boot-time loop can detect the dead-token state immediately, not 24 hours later. secret-backend-rotator-loop.ts Boot-time health check: in `start()`, for every rotatable backend, call `rotator.healthCheck(b.id)` and on failure log a structured fatal entry. This converts the prior silent failure mode (24h wait until scheduled rotation reveals the dead token, with secret writes failing under it the entire time) into "mcpd boots, immediately sees the dead token, alerts loudly". Existing isOverdue path is unchanged. mcpd — Prisma userId crash on /me ================================= routes/auth.ts GET /api/v1/auth/me used `request.userId!` which lied: an authenticated McpToken bearer satisfies the auth middleware but has no associated User row, so userId stayed undefined and `findUnique({ id: undefined })` threw PrismaClientValidationError. Now returns 401 with a clear "service-account/token-bound principal cannot be queried via /me" message instead of bubbling a 500. mcplocal — token revocation propagation ======================================= http/token-auth.ts Lowered default introspection positiveTtl from 30s → 5s. mcpd's introspection endpoint is a single DB lookup; the cache only protects against burst restart storms, not steady-state load. The 30s window let revoked tokens keep working for the full window after revocation (caught by mcptoken.smoke's negative-cache assertion). Aligns with the existing 5s negativeTtl and the test's `wait 7s after revoke` expectation. smoke tests — read URL the same way the CLI does ================================================ mcp-client.ts Adds `loadMcpdAuth()`: URL from `~/.mcpctl/config.json`, token from `~/.mcpctl/credentials`. Critically, the URL does NOT come from credentials. credentials.mcpdUrl carries a stale field for legacy reasons and goes out of sync (left over from old `mcpctl login --mcpd-url localhost:3xxx` invocations) — tests reading it ended up hitting whatever URL the user last logged into rather than the URL the CLI is actually using right now. audit/security/system-prompts smoke now use loadMcpdAuth(), eliminating ~10 cascade failures. Also: switch httpRequest to https.request when scheme is https (matching audit/security/system-prompts/mcp-client/agent helpers). Bumps default callTool timeout from 30s → 60s; many tools that fetch external resources routinely run 10-30s. agent.smoke.test.ts - readToken read from `credentials.json`; the file is `credentials` (no extension). Caused 401 on POST /threads. - `mcpctl get <resource> <name> -o json` returns an array, not a bare object. Round-trip yaml test now indexes [0] before reading description. secretbackend.smoke.test.ts Two genuine assertion-drift fixes (env was right, test was stale): - "lists at least one secretbackend": stop hard-coding the default backend type as 'plaintext'; the invariant is "exactly one default exists". The seeded plaintext is the bootstrap default but operators routinely promote a remote backend (openbao etc.) once it's healthy. - "refuses to delete the seeded default": widen the regex from /default|in use|cannot delete/ to also accept "referenced" — the exact wording has shifted to "is still referenced by N secret(s); migrate them first". audit.test.ts / system-prompts.test.ts / security.test.ts Switch http.request → https.request when URL is https (each had its own copy of the helper). Drop the now-orphan loadMcpdCredentials in favour of loadMcpdAuth from mcp-client.ts. Tests ===== mcpd 759/759, mcplocal 715/715 unit suites still green. Smoke (live): Run 1 (pre-commit, post bao-token rotation): 27 → 12 failures. Run 2 (after fixes-batch, pre-redeploy): 12 → 2 failures. The remaining 2 (mcptoken cache TTL, proxy-pipeline timeout) are what the durable code changes here address; verify after the next redeploy. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 18:35:13 +01:00
/**
* Resolve the live mcpd `{ token, url }` the way the CLI itself does:
* - URL from `~/.mcpctl/config.json`'s `mcpdUrl` (with $MCPD_URL override)
* - token from `~/.mcpctl/credentials`'s `token` field
*
* Critically, **the URL does NOT come from credentials**. credentials carries
* an `mcpdUrl` field for legacy reasons that goes stale (left over from old
* `mcpctl login --mcpd-url localhost:3xxx` invocations). Tests that read the
* URL from credentials end up hitting whatever URL the user last logged into,
* not the URL the CLI is actually using right now.
*/
export function loadMcpdAuth(): { token: string; url: string } {
const url = readConfigMcpdUrl() ?? MCPD_URL;
const token = readCredentialsToken() ?? '';
return { token, url };
}
function readConfigMcpdUrl(): string | null {
const path = join(homedir(), '.mcpctl', 'config.json');
if (!existsSync(path)) return null;
try {
const parsed = JSON.parse(readFileSync(path, 'utf-8')) as { mcpdUrl?: string };
return typeof parsed.mcpdUrl === 'string' && parsed.mcpdUrl.length > 0 ? parsed.mcpdUrl : null;
} catch {
return null;
}
}
function readCredentialsToken(): string | null {
const path = join(homedir(), '.mcpctl', 'credentials');
if (!existsSync(path)) return null;
try {
const parsed = JSON.parse(readFileSync(path, 'utf-8')) as { token?: string };
return typeof parsed.token === 'string' && parsed.token.length > 0 ? parsed.token : null;
} catch {
return null;
}
}
function httpRequest(opts: {
url: string;
method: string;
headers?: Record<string, string>;
body?: string;
timeout?: number;
}): Promise<{ status: number; headers: http.IncomingHttpHeaders; body: string }> {
return new Promise((resolve, reject) => {
const parsed = new URL(opts.url);
fix(smoke,rotator,auth): repair smoke env + close failure modes that caused 27 post-deploy smoke failures This commit lands the durable side of the post-deploy investigation: genuine bugs that let the upstream OpenBao re-init silently break every secret write for 4 days, plus test-code bugs that masked the same breakage in the smoke output. mcpd — fail loud on dead OpenBao tokens ======================================= secret-backend-rotator.service.ts When `mintRoleToken` or `lookupSelf` returns 403/401, classify it as BACKEND_TOKEN_DEAD (likely cause: upstream OpenBao re-init invalidated every pre-existing token), wrap the thrown error with explicit remediation (mint via root + `mcpctl create secret <name> --data <key>=<token> --force`), persist the same message to tokenMeta.lastRotationError, and emit a structured `level:fatal` console.error so it shows up in `kubectl logs deploy/mcpd` with grep- friendly `kind:BACKEND_TOKEN_DEAD`. Adds a `healthCheck(backendId)` method that runs lookup-self without minting — so the boot-time loop can detect the dead-token state immediately, not 24 hours later. secret-backend-rotator-loop.ts Boot-time health check: in `start()`, for every rotatable backend, call `rotator.healthCheck(b.id)` and on failure log a structured fatal entry. This converts the prior silent failure mode (24h wait until scheduled rotation reveals the dead token, with secret writes failing under it the entire time) into "mcpd boots, immediately sees the dead token, alerts loudly". Existing isOverdue path is unchanged. mcpd — Prisma userId crash on /me ================================= routes/auth.ts GET /api/v1/auth/me used `request.userId!` which lied: an authenticated McpToken bearer satisfies the auth middleware but has no associated User row, so userId stayed undefined and `findUnique({ id: undefined })` threw PrismaClientValidationError. Now returns 401 with a clear "service-account/token-bound principal cannot be queried via /me" message instead of bubbling a 500. mcplocal — token revocation propagation ======================================= http/token-auth.ts Lowered default introspection positiveTtl from 30s → 5s. mcpd's introspection endpoint is a single DB lookup; the cache only protects against burst restart storms, not steady-state load. The 30s window let revoked tokens keep working for the full window after revocation (caught by mcptoken.smoke's negative-cache assertion). Aligns with the existing 5s negativeTtl and the test's `wait 7s after revoke` expectation. smoke tests — read URL the same way the CLI does ================================================ mcp-client.ts Adds `loadMcpdAuth()`: URL from `~/.mcpctl/config.json`, token from `~/.mcpctl/credentials`. Critically, the URL does NOT come from credentials. credentials.mcpdUrl carries a stale field for legacy reasons and goes out of sync (left over from old `mcpctl login --mcpd-url localhost:3xxx` invocations) — tests reading it ended up hitting whatever URL the user last logged into rather than the URL the CLI is actually using right now. audit/security/system-prompts smoke now use loadMcpdAuth(), eliminating ~10 cascade failures. Also: switch httpRequest to https.request when scheme is https (matching audit/security/system-prompts/mcp-client/agent helpers). Bumps default callTool timeout from 30s → 60s; many tools that fetch external resources routinely run 10-30s. agent.smoke.test.ts - readToken read from `credentials.json`; the file is `credentials` (no extension). Caused 401 on POST /threads. - `mcpctl get <resource> <name> -o json` returns an array, not a bare object. Round-trip yaml test now indexes [0] before reading description. secretbackend.smoke.test.ts Two genuine assertion-drift fixes (env was right, test was stale): - "lists at least one secretbackend": stop hard-coding the default backend type as 'plaintext'; the invariant is "exactly one default exists". The seeded plaintext is the bootstrap default but operators routinely promote a remote backend (openbao etc.) once it's healthy. - "refuses to delete the seeded default": widen the regex from /default|in use|cannot delete/ to also accept "referenced" — the exact wording has shifted to "is still referenced by N secret(s); migrate them first". audit.test.ts / system-prompts.test.ts / security.test.ts Switch http.request → https.request when URL is https (each had its own copy of the helper). Drop the now-orphan loadMcpdCredentials in favour of loadMcpdAuth from mcp-client.ts. Tests ===== mcpd 759/759, mcplocal 715/715 unit suites still green. Smoke (live): Run 1 (pre-commit, post bao-token rotation): 27 → 12 failures. Run 2 (after fixes-batch, pre-redeploy): 12 → 2 failures. The remaining 2 (mcptoken cache TTL, proxy-pipeline timeout) are what the durable code changes here address; verify after the next redeploy. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 18:35:13 +01:00
const driver = parsed.protocol === 'https:' ? https : http;
const req = driver.request(
{
hostname: parsed.hostname,
fix(smoke,rotator,auth): repair smoke env + close failure modes that caused 27 post-deploy smoke failures This commit lands the durable side of the post-deploy investigation: genuine bugs that let the upstream OpenBao re-init silently break every secret write for 4 days, plus test-code bugs that masked the same breakage in the smoke output. mcpd — fail loud on dead OpenBao tokens ======================================= secret-backend-rotator.service.ts When `mintRoleToken` or `lookupSelf` returns 403/401, classify it as BACKEND_TOKEN_DEAD (likely cause: upstream OpenBao re-init invalidated every pre-existing token), wrap the thrown error with explicit remediation (mint via root + `mcpctl create secret <name> --data <key>=<token> --force`), persist the same message to tokenMeta.lastRotationError, and emit a structured `level:fatal` console.error so it shows up in `kubectl logs deploy/mcpd` with grep- friendly `kind:BACKEND_TOKEN_DEAD`. Adds a `healthCheck(backendId)` method that runs lookup-self without minting — so the boot-time loop can detect the dead-token state immediately, not 24 hours later. secret-backend-rotator-loop.ts Boot-time health check: in `start()`, for every rotatable backend, call `rotator.healthCheck(b.id)` and on failure log a structured fatal entry. This converts the prior silent failure mode (24h wait until scheduled rotation reveals the dead token, with secret writes failing under it the entire time) into "mcpd boots, immediately sees the dead token, alerts loudly". Existing isOverdue path is unchanged. mcpd — Prisma userId crash on /me ================================= routes/auth.ts GET /api/v1/auth/me used `request.userId!` which lied: an authenticated McpToken bearer satisfies the auth middleware but has no associated User row, so userId stayed undefined and `findUnique({ id: undefined })` threw PrismaClientValidationError. Now returns 401 with a clear "service-account/token-bound principal cannot be queried via /me" message instead of bubbling a 500. mcplocal — token revocation propagation ======================================= http/token-auth.ts Lowered default introspection positiveTtl from 30s → 5s. mcpd's introspection endpoint is a single DB lookup; the cache only protects against burst restart storms, not steady-state load. The 30s window let revoked tokens keep working for the full window after revocation (caught by mcptoken.smoke's negative-cache assertion). Aligns with the existing 5s negativeTtl and the test's `wait 7s after revoke` expectation. smoke tests — read URL the same way the CLI does ================================================ mcp-client.ts Adds `loadMcpdAuth()`: URL from `~/.mcpctl/config.json`, token from `~/.mcpctl/credentials`. Critically, the URL does NOT come from credentials. credentials.mcpdUrl carries a stale field for legacy reasons and goes out of sync (left over from old `mcpctl login --mcpd-url localhost:3xxx` invocations) — tests reading it ended up hitting whatever URL the user last logged into rather than the URL the CLI is actually using right now. audit/security/system-prompts smoke now use loadMcpdAuth(), eliminating ~10 cascade failures. Also: switch httpRequest to https.request when scheme is https (matching audit/security/system-prompts/mcp-client/agent helpers). Bumps default callTool timeout from 30s → 60s; many tools that fetch external resources routinely run 10-30s. agent.smoke.test.ts - readToken read from `credentials.json`; the file is `credentials` (no extension). Caused 401 on POST /threads. - `mcpctl get <resource> <name> -o json` returns an array, not a bare object. Round-trip yaml test now indexes [0] before reading description. secretbackend.smoke.test.ts Two genuine assertion-drift fixes (env was right, test was stale): - "lists at least one secretbackend": stop hard-coding the default backend type as 'plaintext'; the invariant is "exactly one default exists". The seeded plaintext is the bootstrap default but operators routinely promote a remote backend (openbao etc.) once it's healthy. - "refuses to delete the seeded default": widen the regex from /default|in use|cannot delete/ to also accept "referenced" — the exact wording has shifted to "is still referenced by N secret(s); migrate them first". audit.test.ts / system-prompts.test.ts / security.test.ts Switch http.request → https.request when URL is https (each had its own copy of the helper). Drop the now-orphan loadMcpdCredentials in favour of loadMcpdAuth from mcp-client.ts. Tests ===== mcpd 759/759, mcplocal 715/715 unit suites still green. Smoke (live): Run 1 (pre-commit, post bao-token rotation): 27 → 12 failures. Run 2 (after fixes-batch, pre-redeploy): 12 → 2 failures. The remaining 2 (mcptoken cache TTL, proxy-pipeline timeout) are what the durable code changes here address; verify after the next redeploy. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 18:35:13 +01:00
port: parsed.port || (parsed.protocol === 'https:' ? 443 : 80),
path: parsed.pathname + parsed.search,
method: opts.method,
headers: opts.headers,
timeout: opts.timeout ?? 30_000,
},
(res) => {
const chunks: Buffer[] = [];
res.on('data', (chunk: Buffer) => chunks.push(chunk));
res.on('end', () => {
resolve({
status: res.statusCode ?? 0,
headers: res.headers,
body: Buffer.concat(chunks).toString('utf-8'),
});
});
},
);
req.on('error', reject);
req.on('timeout', () => {
req.destroy();
reject(new Error('Request timed out'));
});
if (opts.body) req.write(opts.body);
req.end();
});
}
function parseSSE(body: string): unknown[] {
const messages: unknown[] = [];
for (const line of body.split('\n')) {
if (line.startsWith('data: ')) {
try {
messages.push(JSON.parse(line.slice(6)));
} catch {
// skip
}
}
}
return messages;
}
/**
* MCP session for smoke tests.
* Manages session ID and sends JSON-RPC requests.
*/
export class SmokeMcpSession {
private sessionId?: string;
private nextId = 1;
constructor(
private readonly projectName: string,
private readonly token?: string,
) {}
get endpoint(): string {
return `${MCPLOCAL_URL}/projects/${encodeURIComponent(this.projectName)}/mcp`;
}
async send(method: string, params: Record<string, unknown> = {}, timeout?: number): Promise<unknown> {
const id = this.nextId++;
const request = { jsonrpc: '2.0', id, method, params };
const headers: Record<string, string> = {
'Content-Type': 'application/json',
'Accept': 'application/json, text/event-stream',
};
if (this.sessionId) headers['mcp-session-id'] = this.sessionId;
if (this.token) headers['Authorization'] = `Bearer ${this.token}`;
const result = await httpRequest({
url: this.endpoint,
method: 'POST',
headers,
body: JSON.stringify(request),
timeout,
});
// Capture session ID
if (!this.sessionId) {
const sid = result.headers['mcp-session-id'];
if (typeof sid === 'string') this.sessionId = sid;
}
// Handle HTTP-level errors (e.g. 502 for nonexistent project)
if (result.status >= 400) {
let errorMsg = `HTTP ${result.status}`;
try {
const body = JSON.parse(result.body) as { error?: string };
if (body.error) errorMsg = body.error;
} catch {
errorMsg = `HTTP ${result.status}: ${result.body.slice(0, 200)}`;
}
throw new Error(errorMsg);
}
// Parse response — handle SSE with multiple messages (notifications + response)
const messages = result.headers['content-type']?.includes('text/event-stream')
? parseSSE(result.body)
: [JSON.parse(result.body)];
// Find the response matching our request ID (skip notifications)
const response = messages.find((m) => {
const msg = m as { id?: unknown };
return msg.id === id;
}) as { result?: unknown; error?: { code: number; message: string } } | undefined;
// Fall back to first message if no ID match (e.g. error responses)
const parsed = response ?? messages[0] as { result?: unknown; error?: { code: number; message: string } } | undefined;
if (!parsed) throw new Error(`No response for ${method}`);
if (parsed.error) throw new Error(`MCP error ${parsed.error.code}: ${parsed.error.message}`);
return parsed.result;
}
async initialize(): Promise<unknown> {
return this.send('initialize', {
protocolVersion: '2024-11-05',
capabilities: {},
clientInfo: { name: 'mcpctl-smoke-test', version: '1.0.0' },
});
}
async sendNotification(method: string, params: Record<string, unknown> = {}): Promise<void> {
const notification = { jsonrpc: '2.0', method, params };
const headers: Record<string, string> = {
'Content-Type': 'application/json',
'Accept': 'application/json, text/event-stream',
};
if (this.sessionId) headers['mcp-session-id'] = this.sessionId;
if (this.token) headers['Authorization'] = `Bearer ${this.token}`;
await httpRequest({
url: this.endpoint,
method: 'POST',
headers,
body: JSON.stringify(notification),
}).catch(() => {});
}
async listTools(): Promise<Array<{ name: string; description?: string; inputSchema?: unknown }>> {
const result = await this.send('tools/list') as { tools: Array<{ name: string; description?: string; inputSchema?: unknown }> };
return result.tools ?? [];
}
async callTool(name: string, args: Record<string, unknown> = {}, timeout?: number): Promise<{ content: Array<{ type: string; text?: string }>; isError?: boolean }> {
fix(smoke,rotator,auth): repair smoke env + close failure modes that caused 27 post-deploy smoke failures This commit lands the durable side of the post-deploy investigation: genuine bugs that let the upstream OpenBao re-init silently break every secret write for 4 days, plus test-code bugs that masked the same breakage in the smoke output. mcpd — fail loud on dead OpenBao tokens ======================================= secret-backend-rotator.service.ts When `mintRoleToken` or `lookupSelf` returns 403/401, classify it as BACKEND_TOKEN_DEAD (likely cause: upstream OpenBao re-init invalidated every pre-existing token), wrap the thrown error with explicit remediation (mint via root + `mcpctl create secret <name> --data <key>=<token> --force`), persist the same message to tokenMeta.lastRotationError, and emit a structured `level:fatal` console.error so it shows up in `kubectl logs deploy/mcpd` with grep- friendly `kind:BACKEND_TOKEN_DEAD`. Adds a `healthCheck(backendId)` method that runs lookup-self without minting — so the boot-time loop can detect the dead-token state immediately, not 24 hours later. secret-backend-rotator-loop.ts Boot-time health check: in `start()`, for every rotatable backend, call `rotator.healthCheck(b.id)` and on failure log a structured fatal entry. This converts the prior silent failure mode (24h wait until scheduled rotation reveals the dead token, with secret writes failing under it the entire time) into "mcpd boots, immediately sees the dead token, alerts loudly". Existing isOverdue path is unchanged. mcpd — Prisma userId crash on /me ================================= routes/auth.ts GET /api/v1/auth/me used `request.userId!` which lied: an authenticated McpToken bearer satisfies the auth middleware but has no associated User row, so userId stayed undefined and `findUnique({ id: undefined })` threw PrismaClientValidationError. Now returns 401 with a clear "service-account/token-bound principal cannot be queried via /me" message instead of bubbling a 500. mcplocal — token revocation propagation ======================================= http/token-auth.ts Lowered default introspection positiveTtl from 30s → 5s. mcpd's introspection endpoint is a single DB lookup; the cache only protects against burst restart storms, not steady-state load. The 30s window let revoked tokens keep working for the full window after revocation (caught by mcptoken.smoke's negative-cache assertion). Aligns with the existing 5s negativeTtl and the test's `wait 7s after revoke` expectation. smoke tests — read URL the same way the CLI does ================================================ mcp-client.ts Adds `loadMcpdAuth()`: URL from `~/.mcpctl/config.json`, token from `~/.mcpctl/credentials`. Critically, the URL does NOT come from credentials. credentials.mcpdUrl carries a stale field for legacy reasons and goes out of sync (left over from old `mcpctl login --mcpd-url localhost:3xxx` invocations) — tests reading it ended up hitting whatever URL the user last logged into rather than the URL the CLI is actually using right now. audit/security/system-prompts smoke now use loadMcpdAuth(), eliminating ~10 cascade failures. Also: switch httpRequest to https.request when scheme is https (matching audit/security/system-prompts/mcp-client/agent helpers). Bumps default callTool timeout from 30s → 60s; many tools that fetch external resources routinely run 10-30s. agent.smoke.test.ts - readToken read from `credentials.json`; the file is `credentials` (no extension). Caused 401 on POST /threads. - `mcpctl get <resource> <name> -o json` returns an array, not a bare object. Round-trip yaml test now indexes [0] before reading description. secretbackend.smoke.test.ts Two genuine assertion-drift fixes (env was right, test was stale): - "lists at least one secretbackend": stop hard-coding the default backend type as 'plaintext'; the invariant is "exactly one default exists". The seeded plaintext is the bootstrap default but operators routinely promote a remote backend (openbao etc.) once it's healthy. - "refuses to delete the seeded default": widen the regex from /default|in use|cannot delete/ to also accept "referenced" — the exact wording has shifted to "is still referenced by N secret(s); migrate them first". audit.test.ts / system-prompts.test.ts / security.test.ts Switch http.request → https.request when URL is https (each had its own copy of the helper). Drop the now-orphan loadMcpdCredentials in favour of loadMcpdAuth from mcp-client.ts. Tests ===== mcpd 759/759, mcplocal 715/715 unit suites still green. Smoke (live): Run 1 (pre-commit, post bao-token rotation): 27 → 12 failures. Run 2 (after fixes-batch, pre-redeploy): 12 → 2 failures. The remaining 2 (mcptoken cache TTL, proxy-pipeline timeout) are what the durable code changes here address; verify after the next redeploy. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 18:35:13 +01:00
// Default 60s — many real MCP tools (web fetch, doc retrieval, query
// execution) routinely take 10-30s under normal load. The previous 30s
// floor was tight enough that occasional upstream latency tripped the
// proxy-pipeline hot-reload smoke. Tests that need a tighter bound can
// pass an explicit value.
return await this.send('tools/call', { name, arguments: args }, timeout ?? 60_000) as { content: Array<{ type: string; text?: string }>; isError?: boolean };
}
async close(): Promise<void> {
if (this.sessionId) {
const headers: Record<string, string> = { 'mcp-session-id': this.sessionId };
if (this.token) headers['Authorization'] = `Bearer ${this.token}`;
await httpRequest({
url: this.endpoint,
method: 'DELETE',
headers,
timeout: 5_000,
}).catch(() => {});
this.sessionId = undefined;
}
}
}
/**
* Check if mcplocal is reachable.
*/
export async function isMcplocalRunning(): Promise<boolean> {
try {
const result = await httpRequest({
url: `${MCPLOCAL_URL}/health`,
method: 'GET',
timeout: 3_000,
});
return result.status < 500;
} catch {
return false;
}
}
/**
* Run an mcpctl CLI command and return stdout.
*/
export function mcpctl(args: string): Promise<string> {
const { execSync } = require('node:child_process') as typeof import('node:child_process');
try {
return Promise.resolve(execSync(`mcpctl ${args}`, { encoding: 'utf-8', timeout: 30_000 }).trim());
} catch (err) {
const e = err as { stderr?: string; stdout?: string };
return Promise.reject(new Error(e.stderr ?? e.stdout ?? String(err)));
}
}