mcpctl/.taskmaster/tasks/task_013.md
2026-02-21 03:10:39 +00:00

Task ID: 13

Title: Implement MCP Request/Response Filtering Logic

Status: pending

Dependencies: 11, 12

Priority: medium

Description: Create the intelligent filtering system that analyzes Claude's intent and filters MCP responses to minimize token usage while maximizing relevance.

Details:

Create filtering logic:

// services/filter-engine.ts
export class FilterEngine {
  constructor(private llm: LLMProvider) {}

  // Analyze Claude's request to understand intent
  async analyzeIntent(request: ToolCallRequest): Promise<IntentAnalysis> {
    const prompt = `Analyze this MCP tool call to understand the user's intent:
Tool: ${request.toolName}
Arguments: ${JSON.stringify(request.arguments)}

Output JSON:
{
  "intent": "description of what user wants",
  "keywords": ["relevant", "keywords"],
  "filters": { "date_range": "...", "categories": [...] },
  "maxResults": number
}`;

    return this.llm.analyze(prompt);
  }

  // Filter response based on intent
  async filterResponse(
    response: any,
    intent: IntentAnalysis,
    tool: ToolDefinition
  ): Promise<FilteredResponse> {
    // Strategy 1: Structural filtering (if response is array)
    if (Array.isArray(response)) {
      const filtered = await this.filterArray(response, intent);
      return { data: filtered, reduction: 1 - filtered.length / response.length };
    }

    // Strategy 2: Field selection (for objects)
    if (response !== null && typeof response === 'object') {
      const relevant = await this.selectRelevantFields(response, intent);
      return { data: relevant, reduction: this.calculateReduction(response, relevant) };
    }

    // Strategy 3: Text summarization (for large text responses)
    if (typeof response === 'string' && response.length > 5000) {
      const summary = await this.summarize(response, intent);
      return { data: summary, reduction: 1 - summary.length / response.length };
    }

    return { data: response, reduction: 0 };
  }

  private async filterArray(items: any[], intent: IntentAnalysis): Promise<any[]> {
    // Score each item for relevance
    const scored = await Promise.all(
      items.map(async (item) => ({
        item,
        score: await this.scoreRelevance(item, intent)
      }))
    );

    // Return top N most relevant
    return scored
      .sort((a, b) => b.score - a.score)
      .slice(0, intent.maxResults || 10)
      .map(s => s.item);
  }

  private async scoreRelevance(item: any, intent: IntentAnalysis): Promise<number> {
    const itemStr = JSON.stringify(item).toLowerCase();
    let score = 0;

    // Keyword matching
    for (const keyword of intent.keywords) {
      if (itemStr.includes(keyword.toLowerCase())) score += 1;
    }

    // Use LLM for deeper analysis if needed
    if (score === 0) {
      score = await this.llm.scoreRelevance(item, intent.intent);
    }

    return score;
  }
}

Example filtering for Slack messages:

// User asks: "Get Slack messages about security from my team"
const intent = {
  intent: 'Find security-related team messages',
  keywords: ['security', 'vulnerability', 'patch', 'CVE'],
  filters: { channels: ['team-*', 'security-*'] },
  maxResults: 20
};

// Filter 1000 messages down to 20 most relevant
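The reduction above can be sketched end-to-end with plain keyword scoring; the message contents are invented for illustration:

```typescript
// Score each message by keyword hits, keep the top maxResults.
const intent = {
  keywords: ['security', 'vulnerability', 'patch', 'CVE'],
  maxResults: 2,
};

const messages = [
  { text: 'Lunch at noon?' },
  { text: 'Patch the security vulnerability ASAP' },
  { text: 'New CVE published for our proxy' },
  { text: 'Standup moved to 10am' },
];

const top = messages
  .map((m) => ({
    m,
    // Count how many intent keywords appear in the message text.
    score: intent.keywords.filter((k) =>
      m.text.toLowerCase().includes(k.toLowerCase())
    ).length,
  }))
  .sort((a, b) => b.score - a.score)
  .slice(0, intent.maxResults)
  .map((s) => s.m);
// top now holds the two most keyword-relevant messages
```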

Test Strategy:

Test intent analysis with a variety of queries. Verify that filtering significantly reduces data size. Benchmark relevance accuracy. Test with real Slack/Jira data samples.

Subtasks

13.1. Create FilterEngine core infrastructure with TDD and MockLLMProvider

Status: pending
Dependencies: None

Set up the services/filter-engine.ts file structure with TypeScript interfaces, Vitest test infrastructure, and MockLLMProvider for local testing without external API dependencies.

Details:

Create src/services/filter-engine.ts with core types and interfaces. Define IntentAnalysis interface: { intent: string, keywords: string[], filters: Record<string, any>, maxResults: number, confidence: number }. Define FilteredResponse interface: { data: any, reduction: number, metadata: FilterMetadata }. Define FilterMetadata for explainability: { originalItemCount: number, filteredItemCount: number, removedItems: RemovedItemExplanation[], filterStrategy: string, scoringLatencyMs: number }. Define RemovedItemExplanation: { item: any, reason: string, score: number, threshold: number }. Create MockLLMProvider in tests/mocks/mock-llm-provider.ts that returns deterministic responses based on input patterns - essential for CI/CD without GPU. Configure Vitest with coverage requirements (>90%). Create test fixtures in tests/fixtures/ with sample MCP requests/responses for Slack, Jira, database queries. Include createMockToolCallRequest(), createMockIntentAnalysis(), createTestFilterEngine() test utilities.
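A minimal sketch of the IntentAnalysis shape and a deterministic MockLLMProvider, following the descriptions above (only a subset of the listed interfaces is shown; the pattern table inside the mock is illustrative):

```typescript
interface IntentAnalysis {
  intent: string;
  keywords: string[];
  filters: Record<string, any>;
  maxResults: number;
  confidence: number;
}

class MockLLMProvider {
  // Deterministic: the same input always yields the same analysis,
  // so tests pass in CI without a GPU or network access.
  async analyze(prompt: string): Promise<IntentAnalysis> {
    const wantsSecurity = prompt.toLowerCase().includes('security');
    return {
      intent: wantsSecurity ? 'Find security-related items' : 'Generic lookup',
      keywords: wantsSecurity ? ['security', 'cve'] : [],
      filters: {},
      maxResults: 10,
      confidence: wantsSecurity ? 0.9 : 0.5,
    };
  }

  async scoreRelevance(item: any, intent: string): Promise<number> {
    // Trivial deterministic score: count words shared by item and intent.
    const words = new Set(intent.toLowerCase().split(/\W+/));
    const itemWords = JSON.stringify(item).toLowerCase().split(/\W+/);
    return itemWords.filter((w) => words.has(w)).length;
  }
}
```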

13.2. Implement analyzeIntent method with keyword extraction and configurable parameters

Status: pending
Dependencies: 13.1

Create the intent analysis system that interprets Claude's MCP tool calls to extract user intent, relevant keywords, filters, and maximum results using LLM-based analysis with configurable prompts.

Details:

Implement FilterEngine.analyzeIntent(request: ToolCallRequest): Promise&lt;IntentAnalysis&gt; method. Create IntentAnalyzer class in src/services/intent-analyzer.ts with configurable prompt templates per MCP tool type. Design prompt engineering for reliable JSON output: include examples, schema definition, and output format instructions. Implement keyword extraction with stemming/normalization for better matching. Add confidence scoring to intent analysis (0-1 scale) for downstream filtering decisions. Support tool-specific intent patterns: Slack (channels, date ranges, users), Jira (project, status, assignee), Database (tables, columns, aggregations). Create IntentAnalysisConfig: { promptTemplate: string, maxKeywords: number, includeNegativeKeywords: boolean, confidenceThreshold: number }. Allow data scientists to configure weights and thresholds per MCP type via a JSON config file. Implement caching of intent analysis for identical requests (LRU cache with TTL). Add metrics: intent_analysis_latency_ms histogram.
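The LRU-with-TTL cache mentioned above can be sketched with a plain Map, relying on its insertion-order iteration; the class and parameter names here are illustrative, not from the codebase:

```typescript
class IntentCache {
  private entries = new Map<string, { value: any; expiresAt: number }>();
  constructor(private maxSize = 500, private ttlMs = 60_000) {}

  get(key: string): any | undefined {
    const hit = this.entries.get(key);
    if (!hit) return undefined;
    if (Date.now() > hit.expiresAt) {
      // Entry outlived its TTL; drop it.
      this.entries.delete(key);
      return undefined;
    }
    // Re-insert to mark as most recently used.
    this.entries.delete(key);
    this.entries.set(key, hit);
    return hit.value;
  }

  set(key: string, value: any): void {
    if (this.entries.size >= this.maxSize) {
      // Map preserves insertion order, so the first key is least recent.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, { value, expiresAt: Date.now() + this.ttlMs });
  }
}
```

The cache key would be the serialized tool call (tool name plus arguments), so identical requests skip the LLM round trip.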

13.3. Implement array filtering strategy with relevance scoring and explainability

Status: pending
Dependencies: 13.1, 13.2

Create the structural filtering strategy for array responses with intelligent relevance scoring, keyword matching, LLM-based deep analysis, and detailed explainability for why items were removed.

Details:

Implement FilterEngine.filterArray(items: any[], intent: IntentAnalysis): Promise&lt;any[]&gt; in src/services/filter-strategies/array-filter.ts. Create RelevanceScorer class with configurable scoring: (1) Keyword matching score with configurable weights per keyword, (2) Field importance weights (title > description > metadata), (3) LLM-based semantic scoring for items with zero keyword matches, (4) Composite scoring with normalization. Implement explainability: for each removed item, record { item, reason: 'keyword_score_below_threshold' | 'llm_relevance_low' | 'exceeded_max_results', score, threshold }. Return scored items sorted by relevance with top N based on intent.maxResults. Handle nested arrays recursively. Add A/B testing support: FilterArrayConfig.abTestId allows comparing scoring algorithms. Expose metrics: items_before, items_after, reduction_ratio, avg_score, scoring_latency_ms. Implement batch scoring optimization: score multiple items in single LLM call when possible.
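The weighted keyword portion of the composite score, points (1) and (2) above, might look like the sketch below; the field weights and their defaults are illustrative assumptions:

```typescript
type Weights = Record<string, number>;

// Score an item by counting keyword hits per field, weighted by field
// importance (title > description > everything else).
function keywordScore(
  item: Record<string, any>,
  keywords: string[],
  fieldWeights: Weights = { title: 3, description: 2 }
): number {
  let score = 0;
  for (const [field, value] of Object.entries(item)) {
    const weight = fieldWeights[field] ?? 1; // metadata fields default to 1
    const text = String(value).toLowerCase();
    for (const kw of keywords) {
      if (text.includes(kw.toLowerCase())) score += weight;
    }
  }
  return score;
}
```

Items that score zero here would fall through to the LLM-based semantic scorer, as in the main FilterEngine sketch.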

13.4. Implement object field selection and text summarization strategies

Status: pending
Dependencies: 13.1, 13.2

Create filtering strategies for object responses (field selection based on relevance) and large text responses (intelligent summarization) with configurable thresholds and explainability.

Details:

Create src/services/filter-strategies/object-filter.ts with selectRelevantFields(obj: object, intent: IntentAnalysis): Promise&lt;object&gt;. Implement field relevance scoring: (1) Field name keyword matching, (2) Field value relevance to intent, (3) Configurable always-include fields per object type (e.g., 'id', 'timestamp'). Create FieldSelectionConfig: { preserveFields: string[], maxDepth: number, maxFields: number }. Track removed fields in explainability metadata. Create src/services/filter-strategies/text-filter.ts with summarize(text: string, intent: IntentAnalysis): Promise&lt;string&gt;. Implement intelligent summarization: (1) Detect text type (log file, documentation, code), (2) Apply appropriate summarization strategy, (3) Preserve critical information based on intent keywords. Summarization threshold: 5000 chars (configurable). Calculate reduction ratio: 1 - summary.length / original.length. Add metrics: fields_removed, text_reduction_ratio, summarization_latency_ms.
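Field selection per the strategy above can be sketched as: always keep the configured preserveFields, then keep any field whose name or value matches an intent keyword. The helper and its return shape are hypothetical, mirroring the FieldSelectionConfig description:

```typescript
function selectFields(
  obj: Record<string, any>,
  keywords: string[],
  preserveFields: string[] = ['id', 'timestamp']
): { kept: Record<string, any>; removed: string[] } {
  const kept: Record<string, any> = {};
  const removed: string[] = [];
  for (const [name, value] of Object.entries(obj)) {
    // Match keywords against the field name and its stringified value.
    const haystack = (name + ' ' + String(value)).toLowerCase();
    const relevant =
      preserveFields.includes(name) ||
      keywords.some((k) => haystack.includes(k.toLowerCase()));
    if (relevant) kept[name] = value;
    else removed.push(name); // recorded for explainability metadata
  }
  return { kept, removed };
}
```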

13.5. Implement streaming-compatible large dataset filtering with memory efficiency

Status: pending
Dependencies: 13.1, 13.3

Create filtering logic that integrates with Task 11's chunking/streaming system to handle 100K+ item datasets without loading all data into memory, using incremental scoring and progressive filtering.

Details:

Create src/services/filter-strategies/streaming-filter.ts integrating with PaginationManager from Task 11.6. Implement StreamingFilterEngine with methods: (1) createFilterStream(dataStream: AsyncIterable&lt;any[]&gt;, intent): AsyncIterable&lt;ChunkedFilterResult&gt;, (2) processChunk(chunk: any[], runningState: FilterState): Promise&lt;FilterState&gt;. Design FilterState to maintain: running top-N items with scores, min score threshold (dynamically adjusted), chunk index, total items processed. Implement progressive threshold adjustment: as more items are seen, raise threshold to maintain O(maxResults) memory. Use heap data structure for efficient top-N maintenance. Create ChunkedFilterResult: { chunk: any[], chunkIndex: number, runningReduction: number, isComplete: boolean }. Memory budget: configurable max memory for filter state (default 50MB). Add backpressure handling for slow downstream consumers. Expose metrics: chunks_processed, peak_memory_bytes, progressive_threshold.
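The O(maxResults) memory bound with a rising threshold can be sketched as follows. A production version would use a binary min-heap as described above; this sketch uses a sorted array for brevity, and the class name is illustrative:

```typescript
class TopNState {
  private best: { item: any; score: number }[] = [];
  threshold = -Infinity; // rises as better items are seen

  constructor(private maxResults: number) {}

  offer(item: any, score: number): void {
    // Once N items are held, anything at or below the worst kept score
    // is skipped without allocation.
    if (score <= this.threshold && this.best.length >= this.maxResults) return;
    this.best.push({ item, score });
    this.best.sort((a, b) => b.score - a.score);
    if (this.best.length > this.maxResults) this.best.pop();
    if (this.best.length === this.maxResults) {
      this.threshold = this.best[this.best.length - 1].score;
    }
  }

  items(): any[] {
    return this.best.map((b) => b.item);
  }
}
```

Streaming in 100K+ items chunk by chunk, only maxResults items plus the threshold survive between chunks, which is what keeps the filter state inside the memory budget.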

13.6. Implement security layer preventing data leakage in filtered responses

Status: pending
Dependencies: 13.1, 13.3, 13.4

Create security middleware that sanitizes filtered responses to prevent accidental exposure of PII, credentials, or sensitive data, with configurable detection patterns and audit logging.

Details:

Create src/services/filter-security.ts with ResponseSanitizer class. Implement sensitive data detection: (1) Regex patterns for API keys, passwords, tokens (AWS, GitHub, Slack, etc.), (2) PII patterns: email, phone, SSN, credit card, IP addresses, (3) Custom patterns configurable per MCP type. Create SanitizationConfig: { redactPatterns: RegExp[], piiDetection: boolean, auditSensitiveAccess: boolean, allowlist: string[] }. Implement redaction strategies: full replacement with [REDACTED], partial masking (show last 4 chars), or removal. Create FilterSecurityAudit log entry when sensitive data detected: { timestamp, toolName, patternMatched, fieldPath, actionTaken }. Integrate with FilterEngine.filterResponse() as final step before returning. Prevent filtered items from 'leaking back' via explainability metadata - sanitize removed item summaries too. Add metrics: sensitive_data_detected_count, redactions_applied, audit_log_entries.
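The redaction pass might be sketched as below. The patterns are illustrative examples of the token and PII classes listed above, not a production pattern set:

```typescript
const REDACT_PATTERNS: RegExp[] = [
  /gh[pousr]_[A-Za-z0-9]{36,}/g, // GitHub-style tokens
  /xox[baprs]-[A-Za-z0-9-]+/g,   // Slack-style tokens
  /[\w.+-]+@[\w-]+\.[\w.]+/g,    // email addresses
];

// Replace every match with [REDACTED] and count the redactions,
// so the audit log can record how much was removed.
function sanitize(
  text: string,
  patterns: RegExp[] = REDACT_PATTERNS
): { clean: string; redactions: number } {
  let redactions = 0;
  let clean = text;
  for (const p of patterns) {
    clean = clean.replace(p, () => {
      redactions++;
      return '[REDACTED]';
    });
  }
  return { clean, redactions };
}
```

Running this as the final step of filterResponse(), including over explainability metadata, is what prevents removed items from leaking sensitive values back out.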

13.7. Implement A/B testing framework and SRE metrics for filter performance monitoring

Status: pending
Dependencies: 13.1, 13.2, 13.3, 13.4, 13.5, 13.6

Create comprehensive A/B testing infrastructure for comparing filter strategies, plus Prometheus-compatible metrics exposure for SRE monitoring of filter performance and effectiveness.

Details:

Create src/services/filter-metrics.ts with FilterMetricsCollector exposing Prometheus metrics: filter_requests_total (counter by tool, strategy), filter_duration_seconds (histogram), items_before_filter (histogram), items_after_filter (histogram), reduction_ratio (histogram), scoring_latency_seconds (histogram by strategy), sensitive_data_detections_total (counter). Create src/services/ab-testing.ts with ABTestingFramework class: Methods: assignExperiment(requestId): ExperimentAssignment, recordOutcome(requestId, metrics): void, getExperimentResults(experimentId): ABTestResults. ExperimentConfig: { id, strategies: FilterStrategy[], trafficSplit: number[], startDate, endDate }. Persist experiment assignments and outcomes for analysis. Create ABTestResults: { experimentId, strategyResults: { strategy, avgReduction, avgLatency, sampleSize }[], statisticalSignificance }. Integrate with FilterEngine: check experiment assignment, use assigned strategy, record outcome metrics. Add /metrics HTTP endpoint serving Prometheus exposition format. Create Grafana dashboard JSON template for filter monitoring.
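Deterministic experiment assignment from a trafficSplit can be sketched by hashing the request ID into [0, 1) and picking the cumulative bucket; the FNV-1a hash here is an assumption, not mandated by the task:

```typescript
// Hashing (rather than random draws) keeps the assignment stable for
// retries of the same request, so outcomes attach to one strategy.
function assignStrategy(requestId: string, trafficSplit: number[]): number {
  // FNV-1a over the request ID.
  let h = 2166136261;
  for (let i = 0; i < requestId.length; i++) {
    h ^= requestId.charCodeAt(i);
    h = Math.imul(h, 16777619);
  }
  const u = (h >>> 0) / 4294967296; // uniform-ish in [0, 1)
  let cumulative = 0;
  for (let i = 0; i < trafficSplit.length; i++) {
    cumulative += trafficSplit[i];
    if (u < cumulative) return i;
  }
  return trafficSplit.length - 1;
}
```

With trafficSplit [0.5, 0.5], roughly half of request IDs land in each strategy bucket, and the returned index selects the FilterStrategy to apply.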