mcpctl/.taskmaster/tasks/task_013.md
2026-02-21 03:10:39 +00:00


# Task ID: 13
**Title:** Implement MCP Request/Response Filtering Logic
**Status:** pending
**Dependencies:** 11, 12
**Priority:** medium
**Description:** Create the intelligent filtering system that analyzes Claude's intent and filters MCP responses to minimize token usage while maximizing relevance.
**Details:**
Create filtering logic:
```typescript
// services/filter-engine.ts
import type {
  LLMProvider,
  ToolCallRequest,
  IntentAnalysis,
  ToolDefinition,
  FilteredResponse,
} from '../types'; // shared types defined in subtask 13.1

export class FilterEngine {
  constructor(private llm: LLMProvider) {}

  // Analyze Claude's request to understand intent
  async analyzeIntent(request: ToolCallRequest): Promise<IntentAnalysis> {
    const prompt = `Analyze this MCP tool call to understand the user's intent:
Tool: ${request.toolName}
Arguments: ${JSON.stringify(request.arguments)}
Output JSON:
{
  "intent": "description of what user wants",
  "keywords": ["relevant", "keywords"],
  "filters": { "date_range": "...", "categories": [...] },
  "maxResults": number
}`;
    return this.llm.analyze(prompt);
  }

  // Filter response based on intent
  async filterResponse(
    response: any,
    intent: IntentAnalysis,
    tool: ToolDefinition
  ): Promise<FilteredResponse> {
    // Strategy 1: Structural filtering (if response is an array)
    if (Array.isArray(response)) {
      const filtered = await this.filterArray(response, intent);
      const reduction = response.length === 0 ? 0 : 1 - filtered.length / response.length;
      return { data: filtered, reduction };
    }
    // Strategy 2: Field selection (non-null objects only; note typeof null === 'object')
    if (response !== null && typeof response === 'object') {
      const relevant = await this.selectRelevantFields(response, intent);
      return { data: relevant, reduction: this.calculateReduction(response, relevant) };
    }
    // Strategy 3: Text summarization (for large text responses)
    if (typeof response === 'string' && response.length > 5000) {
      const summary = await this.summarize(response, intent);
      return { data: summary, reduction: 1 - summary.length / response.length };
    }
    return { data: response, reduction: 0 };
  }

  private async filterArray(items: any[], intent: IntentAnalysis): Promise<any[]> {
    // Score each item for relevance
    const scored = await Promise.all(
      items.map(async (item) => ({
        item,
        score: await this.scoreRelevance(item, intent),
      }))
    );
    // Return the top N most relevant
    return scored
      .sort((a, b) => b.score - a.score)
      .slice(0, intent.maxResults || 10)
      .map((s) => s.item);
  }

  private async scoreRelevance(item: any, intent: IntentAnalysis): Promise<number> {
    const itemStr = JSON.stringify(item).toLowerCase();
    let score = 0;
    // Keyword matching
    for (const keyword of intent.keywords) {
      if (itemStr.includes(keyword.toLowerCase())) score += 1;
    }
    // Fall back to LLM scoring for deeper analysis when no keyword matches
    if (score === 0) {
      score = await this.llm.scoreRelevance(item, intent.intent);
    }
    return score;
  }

  // selectRelevantFields, summarize, and calculateReduction are specified in
  // subtasks 13.1 and 13.4.
}
```
Example filtering for Slack messages:
```typescript
// User asks: "Get Slack messages about security from my team"
const intent = {
  intent: 'Find security-related team messages',
  keywords: ['security', 'vulnerability', 'patch', 'CVE'],
  filters: { channels: ['team-*', 'security-*'] },
  maxResults: 20
};
// Filter 1000 messages down to the 20 most relevant
```
**Test Strategy:**
Test intent analysis with a variety of queries. Verify that filtering substantially reduces payload size (track the reduction ratio). Benchmark relevance accuracy. Test with real Slack/Jira data samples.
## Subtasks
### 13.1. Create FilterEngine core infrastructure with TDD and MockLLMProvider
**Status:** pending
**Dependencies:** None
Set up the services/filter-engine.ts file structure with TypeScript interfaces, Vitest test infrastructure, and MockLLMProvider for local testing without external API dependencies.
**Details:**
Create src/services/filter-engine.ts with core types and interfaces. Define IntentAnalysis interface: { intent: string, keywords: string[], filters: Record<string, any>, maxResults: number, confidence: number }. Define FilteredResponse interface: { data: any, reduction: number, metadata: FilterMetadata }. Define FilterMetadata for explainability: { originalItemCount: number, filteredItemCount: number, removedItems: RemovedItemExplanation[], filterStrategy: string, scoringLatencyMs: number }. Define RemovedItemExplanation: { item: any, reason: string, score: number, threshold: number }. Create MockLLMProvider in tests/mocks/mock-llm-provider.ts that returns deterministic responses based on input patterns - essential for CI/CD without GPU. Configure Vitest with coverage requirements (>90%). Create test fixtures in tests/fixtures/ with sample MCP requests/responses for Slack, Jira, database queries. Include createMockToolCallRequest(), createMockIntentAnalysis(), createTestFilterEngine() test utilities.
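The interfaces above might be sketched as follows; the type fields mirror the descriptions in this subtask, while the MockLLMProvider's deterministic scoring rule is an illustrative assumption:

```typescript
// Core types for 13.1, matching the fields listed above.

export interface IntentAnalysis {
  intent: string;
  keywords: string[];
  filters: Record<string, unknown>;
  maxResults: number;
  confidence: number; // 0-1, consumed by downstream filtering decisions
}

export interface RemovedItemExplanation {
  item: unknown;
  reason: string;
  score: number;
  threshold: number;
}

export interface FilterMetadata {
  originalItemCount: number;
  filteredItemCount: number;
  removedItems: RemovedItemExplanation[];
  filterStrategy: string;
  scoringLatencyMs: number;
}

export interface FilteredResponse {
  data: unknown;
  reduction: number; // fraction of the payload removed, 0..1
  metadata: FilterMetadata;
}

// Deterministic mock provider for tests: scores by substring match so CI
// never needs a real LLM endpoint. The exact heuristic is illustrative.
export class MockLLMProvider {
  async analyze(_prompt: string): Promise<IntentAnalysis> {
    return { intent: "mock", keywords: [], filters: {}, maxResults: 10, confidence: 1 };
  }
  async scoreRelevance(item: unknown, intent: string): Promise<number> {
    return JSON.stringify(item).toLowerCase().includes(intent.toLowerCase()) ? 1 : 0;
  }
}
```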
### 13.2. Implement analyzeIntent method with keyword extraction and configurable parameters
**Status:** pending
**Dependencies:** 13.1
Create the intent analysis system that interprets Claude's MCP tool calls to extract user intent, relevant keywords, filters, and maximum results using LLM-based analysis with configurable prompts.
**Details:**
Implement FilterEngine.analyzeIntent(request: ToolCallRequest): Promise<IntentAnalysis> method. Create IntentAnalyzer class in src/services/intent-analyzer.ts with configurable prompt templates per MCP tool type. Design prompt engineering for reliable JSON output: include examples, schema definition, and output format instructions. Implement keyword extraction with stemming/normalization for better matching. Add confidence scoring to intent analysis (0-1 scale) for downstream filtering decisions. Support tool-specific intent patterns: Slack (channels, date ranges, users), Jira (project, status, assignee), Database (tables, columns, aggregations). Create IntentAnalysisConfig: { promptTemplate: string, maxKeywords: number, includeNegativeKeywords: boolean, confidenceThreshold: number }. Allow data scientists to configure weights and thresholds per MCP type via JSON config file. Implement caching of intent analysis for identical requests (LRU cache with TTL). Add metrics: intent_analysis_latency_ms histogram.
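The LRU-with-TTL cache for identical requests could look like this sketch; the class name, defaults, and key scheme are assumptions, not part of the spec:

```typescript
// Illustrative LRU cache with TTL for intent-analysis results (13.2).
interface CacheEntry<V> { value: V; expiresAt: number; }

export class IntentCache<V> {
  private map = new Map<string, CacheEntry<V>>();
  constructor(private maxEntries = 500, private ttlMs = 60_000) {}

  get(key: string): V | undefined {
    const entry = this.map.get(key);
    if (!entry) return undefined;
    if (Date.now() > entry.expiresAt) { this.map.delete(key); return undefined; }
    // Map preserves insertion order; re-insert to mark as recently used.
    this.map.delete(key);
    this.map.set(key, entry);
    return entry.value;
  }

  set(key: string, value: V): void {
    if (this.map.size >= this.maxEntries) {
      // Evict the least-recently-used entry (first key in insertion order).
      const oldest = this.map.keys().next().value;
      if (oldest !== undefined) this.map.delete(oldest);
    }
    this.map.set(key, { value, expiresAt: Date.now() + this.ttlMs });
  }
}

// A cache key could combine tool name and arguments deterministically:
export const intentCacheKey = (toolName: string, args: unknown) =>
  `${toolName}:${JSON.stringify(args)}`;
```

Re-inserting on `get` exploits `Map`'s insertion-order iteration to get LRU behavior without an extra linked list.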
### 13.3. Implement array filtering strategy with relevance scoring and explainability
**Status:** pending
**Dependencies:** 13.1, 13.2
Create the structural filtering strategy for array responses with intelligent relevance scoring, keyword matching, LLM-based deep analysis, and detailed explainability for why items were removed.
**Details:**
Implement FilterEngine.filterArray(items: any[], intent: IntentAnalysis): Promise<FilteredArrayResult> in src/services/filter-strategies/array-filter.ts. Create RelevanceScorer class with configurable scoring: (1) Keyword matching score with configurable weights per keyword, (2) Field importance weights (title > description > metadata), (3) LLM-based semantic scoring for items with zero keyword matches, (4) Composite scoring with normalization. Implement explainability: for each removed item, record { item, reason: 'keyword_score_below_threshold' | 'llm_relevance_low' | 'exceeded_max_results', score, threshold }. Return scored items sorted by relevance with top N based on intent.maxResults. Handle nested arrays recursively. Add A/B testing support: FilterArrayConfig.abTestId allows comparing scoring algorithms. Expose metrics: items_before, items_after, reduction_ratio, avg_score, scoring_latency_ms. Implement batch scoring optimization: score multiple items in single LLM call when possible.
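The field-weighted keyword component of the composite score might be sketched as below; the weight values and config shape are illustrative defaults, not mandated by the spec:

```typescript
// Weighted keyword scoring for 13.3: fields like title count more than
// metadata. Weights and names here are assumptions for illustration.
export interface ScoringConfig {
  fieldWeights: Record<string, number>; // e.g. title > description > metadata
  defaultWeight: number;
}

export function keywordScore(
  item: Record<string, unknown>,
  keywords: string[],
  config: ScoringConfig
): number {
  let score = 0;
  for (const [field, value] of Object.entries(item)) {
    const weight = config.fieldWeights[field] ?? config.defaultWeight;
    const text = String(value).toLowerCase();
    for (const kw of keywords) {
      // Each keyword hit contributes the field's weight to the total.
      if (text.includes(kw.toLowerCase())) score += weight;
    }
  }
  return score;
}
```

Items scoring zero here would fall through to the LLM-based semantic scorer, as in the parent task's `scoreRelevance`.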
### 13.4. Implement object field selection and text summarization strategies
**Status:** pending
**Dependencies:** 13.1, 13.2
Create filtering strategies for object responses (field selection based on relevance) and large text responses (intelligent summarization) with configurable thresholds and explainability.
**Details:**
Create src/services/filter-strategies/object-filter.ts with selectRelevantFields(obj: object, intent: IntentAnalysis): Promise<FilteredObjectResult>. Implement field relevance scoring: (1) Field name keyword matching, (2) Field value relevance to intent, (3) Configurable always-include fields per object type (e.g., 'id', 'timestamp'). Create FieldSelectionConfig: { preserveFields: string[], maxDepth: number, maxFields: number }. Track removed fields in explainability metadata. Create src/services/filter-strategies/text-filter.ts with summarize(text: string, intent: IntentAnalysis): Promise<SummarizedTextResult>. Implement intelligent summarization: (1) Detect text type (log file, documentation, code), (2) Apply appropriate summarization strategy, (3) Preserve critical information based on intent keywords. Summarization threshold: 5000 chars (configurable). Calculate reduction ratio: 1 - summary.length / original.length. Add metrics: fields_removed, text_reduction_ratio, summarization_latency_ms.
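A minimal sketch of field selection with explainability tracking, assuming name-based keyword matching only (value relevance and depth limits omitted for brevity):

```typescript
// Illustrative field selector for 13.4: keep always-preserve fields plus
// fields whose names match intent keywords; record removals for metadata.
export interface FieldSelectionConfig {
  preserveFields: string[];
  maxFields: number;
}

export function selectFields(
  obj: Record<string, unknown>,
  keywords: string[],
  config: FieldSelectionConfig
): { kept: Record<string, unknown>; removedFields: string[] } {
  const kept: Record<string, unknown> = {};
  const removedFields: string[] = [];
  const lowered = keywords.map((k) => k.toLowerCase());
  for (const [field, value] of Object.entries(obj)) {
    const relevant =
      config.preserveFields.includes(field) ||
      lowered.some((k) => field.toLowerCase().includes(k));
    if (relevant && Object.keys(kept).length < config.maxFields) {
      kept[field] = value;
    } else {
      removedFields.push(field); // surfaced in explainability metadata
    }
  }
  return { kept, removedFields };
}
```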
### 13.5. Implement streaming-compatible large dataset filtering with memory efficiency
**Status:** pending
**Dependencies:** 13.1, 13.3
Create filtering logic that integrates with Task 11's chunking/streaming system to handle 100K+ item datasets without loading all data into memory, using incremental scoring and progressive filtering.
**Details:**
Create src/services/filter-strategies/streaming-filter.ts integrating with PaginationManager from Task 11.6. Implement StreamingFilterEngine with methods: (1) createFilterStream(dataStream: AsyncIterable<any[]>, intent): AsyncIterable<FilteredChunk>, (2) processChunk(chunk: any[], runningState: FilterState): Promise<FilteredChunk>. Design FilterState to maintain: running top-N items with scores, min score threshold (dynamically adjusted), chunk index, total items processed. Implement progressive threshold adjustment: as more items are seen, raise threshold to maintain O(maxResults) memory. Use heap data structure for efficient top-N maintenance. Create ChunkedFilterResult: { chunk: any[], chunkIndex: number, runningReduction: number, isComplete: boolean }. Memory budget: configurable max memory for filter state (default 50MB). Add backpressure handling for slow downstream consumers. Expose metrics: chunks_processed, peak_memory_bytes, progressive_threshold.
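The heap-backed top-N state at the core of this design can be sketched as follows; the class and method names are illustrative, but the behavior matches the description: memory stays O(maxResults) and the admission threshold rises as items stream through:

```typescript
// Fixed-size min-heap keeping the N best-scoring items seen so far (13.5).
interface Scored<T> { item: T; score: number; }

export class TopN<T> {
  private heap: Scored<T>[] = []; // min-heap on score; root is the weakest keeper
  constructor(private n: number) {}

  /** Minimum score a new item must beat to enter the top-N. */
  get threshold(): number {
    return this.heap.length < this.n ? -Infinity : this.heap[0].score;
  }

  offer(item: T, score: number): void {
    if (this.heap.length < this.n) {
      this.heap.push({ item, score });
      this.bubbleUp(this.heap.length - 1);
    } else if (score > this.heap[0].score) {
      this.heap[0] = { item, score }; // replace the weakest, restore heap order
      this.sinkDown(0);
    }
  }

  /** Snapshot of kept items, best first. */
  items(): T[] {
    return [...this.heap].sort((a, b) => b.score - a.score).map((s) => s.item);
  }

  private bubbleUp(i: number): void {
    while (i > 0) {
      const parent = (i - 1) >> 1;
      if (this.heap[i].score >= this.heap[parent].score) break;
      [this.heap[i], this.heap[parent]] = [this.heap[parent], this.heap[i]];
      i = parent;
    }
  }

  private sinkDown(i: number): void {
    for (;;) {
      const l = 2 * i + 1, r = 2 * i + 2;
      let min = i;
      if (l < this.heap.length && this.heap[l].score < this.heap[min].score) min = l;
      if (r < this.heap.length && this.heap[r].score < this.heap[min].score) min = r;
      if (min === i) break;
      [this.heap[i], this.heap[min]] = [this.heap[min], this.heap[i]];
      i = min;
    }
  }
}
```

`processChunk` would call `offer` per scored item and could skip LLM scoring entirely for items whose keyword score already falls below `threshold`.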
### 13.6. Implement security layer preventing data leakage in filtered responses
**Status:** pending
**Dependencies:** 13.1, 13.3, 13.4
Create security middleware that sanitizes filtered responses to prevent accidental exposure of PII, credentials, or sensitive data, with configurable detection patterns and audit logging.
**Details:**
Create src/services/filter-security.ts with ResponseSanitizer class. Implement sensitive data detection: (1) Regex patterns for API keys, passwords, tokens (AWS, GitHub, Slack, etc.), (2) PII patterns: email, phone, SSN, credit card, IP addresses, (3) Custom patterns configurable per MCP type. Create SanitizationConfig: { redactPatterns: RegExp[], piiDetection: boolean, auditSensitiveAccess: boolean, allowlist: string[] }. Implement redaction strategies: full replacement with [REDACTED], partial masking (show last 4 chars), or removal. Create FilterSecurityAudit log entry when sensitive data detected: { timestamp, toolName, patternMatched, fieldPath, actionTaken }. Integrate with FilterEngine.filterResponse() as final step before returning. Prevent filtered items from 'leaking back' via explainability metadata - sanitize removed item summaries too. Add metrics: sensitive_data_detected_count, redactions_applied, audit_log_entries.
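A minimal redaction pass might look like this; the two regexes are deliberately simplified illustrations, and a real SanitizationConfig would carry vetted, per-provider patterns:

```typescript
// Illustrative full-replacement redaction for 13.6.
export interface RedactionHit { pattern: string; count: number; }

const DEFAULT_PATTERNS: Record<string, RegExp> = {
  // Simplified example detectors, not production-grade:
  email: /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g,
  awsAccessKey: /AKIA[0-9A-Z]{16}/g,
};

export function redact(
  text: string,
  patterns: Record<string, RegExp> = DEFAULT_PATTERNS
): { sanitized: string; hits: RedactionHit[] } {
  let sanitized = text;
  const hits: RedactionHit[] = [];
  for (const [name, re] of Object.entries(patterns)) {
    const matches = sanitized.match(re);
    if (matches) {
      hits.push({ pattern: name, count: matches.length }); // feeds the audit log
      sanitized = sanitized.replace(re, "[REDACTED]");
    }
  }
  return { sanitized, hits };
}
```

Because explainability metadata can echo removed items, the same pass must run over `RemovedItemExplanation` summaries before they are returned.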
### 13.7. Implement A/B testing framework and SRE metrics for filter performance monitoring
**Status:** pending
**Dependencies:** 13.1, 13.2, 13.3, 13.4, 13.5, 13.6
Create comprehensive A/B testing infrastructure for comparing filter strategies, plus Prometheus-compatible metrics exposure for SRE monitoring of filter performance and effectiveness.
**Details:**
Create src/services/filter-metrics.ts with FilterMetricsCollector exposing Prometheus metrics: filter_requests_total (counter by tool, strategy), filter_duration_seconds (histogram), items_before_filter (histogram), items_after_filter (histogram), reduction_ratio (histogram), scoring_latency_seconds (histogram by strategy), sensitive_data_detections_total (counter). Create src/services/ab-testing.ts with ABTestingFramework class: Methods: assignExperiment(requestId): ExperimentAssignment, recordOutcome(requestId, metrics): void, getExperimentResults(experimentId): ABTestResults. ExperimentConfig: { id, strategies: FilterStrategy[], trafficSplit: number[], startDate, endDate }. Persist experiment assignments and outcomes for analysis. Create ABTestResults: { experimentId, strategyResults: { strategy, avgReduction, avgLatency, sampleSize }[], statisticalSignificance }. Integrate with FilterEngine: check experiment assignment, use assigned strategy, record outcome metrics. Add /metrics HTTP endpoint serving Prometheus exposition format. Create Grafana dashboard JSON template for filter monitoring.
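Deterministic bucket assignment for the traffic split could be sketched like this; the hash choice (FNV-1a) and function names are assumptions, but the property that matters is shown: the same request id always lands in the same bucket:

```typescript
// Illustrative experiment assignment for 13.7.
export interface ExperimentConfig {
  id: string;
  strategies: string[];
  trafficSplit: number[]; // fractions summing to 1, parallel to strategies
}

// FNV-1a hash mapped to [0, 1); deterministic so retries of the same
// request id stay in the same experiment arm.
function bucket(requestId: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < requestId.length; i++) {
    h ^= requestId.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return (h >>> 0) / 0x100000000;
}

export function assignExperiment(requestId: string, config: ExperimentConfig): string {
  const b = bucket(requestId);
  let cumulative = 0;
  for (let i = 0; i < config.strategies.length; i++) {
    cumulative += config.trafficSplit[i];
    if (b < cumulative) return config.strategies[i];
  }
  // Guard against floating-point drift in the split.
  return config.strategies[config.strategies.length - 1];
}
```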