first commit — `.taskmaster/tasks/task_013.md`
# Task ID: 13

**Title:** Implement MCP Request/Response Filtering Logic

**Status:** pending

**Dependencies:** 11, 12

**Priority:** medium

**Description:** Create the intelligent filtering system that analyzes Claude's intent and filters MCP responses to minimize token usage while maximizing relevance.

**Details:**

Create filtering logic:
```typescript
// services/filter-engine.ts
// Excerpt: selectRelevantFields, calculateReduction, and summarize are
// implemented by the strategy subtasks below.
export class FilterEngine {
  constructor(private llm: LLMProvider) {}

  // Analyze Claude's request to understand intent
  async analyzeIntent(request: ToolCallRequest): Promise<IntentAnalysis> {
    const prompt = `Analyze this MCP tool call to understand the user's intent:
Tool: ${request.toolName}
Arguments: ${JSON.stringify(request.arguments)}

Output JSON:
{
  "intent": "description of what user wants",
  "keywords": ["relevant", "keywords"],
  "filters": { "date_range": "...", "categories": [...] },
  "maxResults": number
}`;

    return this.llm.analyze(prompt);
  }

  // Filter response based on intent
  async filterResponse(
    response: any,
    intent: IntentAnalysis,
    tool: ToolDefinition
  ): Promise<FilteredResponse> {
    // Strategy 1: Structural filtering (if response is an array)
    if (Array.isArray(response)) {
      const filtered = await this.filterArray(response, intent);
      return {
        data: filtered,
        // Guard against division by zero on an empty response
        reduction: response.length > 0 ? 1 - filtered.length / response.length : 0
      };
    }

    // Strategy 2: Field selection (for objects; typeof null === 'object', so exclude null)
    if (response !== null && typeof response === 'object') {
      const relevant = await this.selectRelevantFields(response, intent);
      return { data: relevant, reduction: this.calculateReduction(response, relevant) };
    }

    // Strategy 3: Text summarization (for large text responses)
    if (typeof response === 'string' && response.length > 5000) {
      const summary = await this.summarize(response, intent);
      return { data: summary, reduction: 1 - summary.length / response.length };
    }

    return { data: response, reduction: 0 };
  }

  private async filterArray(items: any[], intent: IntentAnalysis): Promise<any[]> {
    // Score each item for relevance
    const scored = await Promise.all(
      items.map(async (item) => ({
        item,
        score: await this.scoreRelevance(item, intent)
      }))
    );

    // Return the top N most relevant items
    return scored
      .sort((a, b) => b.score - a.score)
      .slice(0, intent.maxResults || 10)
      .map(s => s.item);
  }

  private async scoreRelevance(item: any, intent: IntentAnalysis): Promise<number> {
    const itemStr = JSON.stringify(item).toLowerCase();
    let score = 0;

    // Keyword matching
    for (const keyword of intent.keywords) {
      if (itemStr.includes(keyword.toLowerCase())) score += 1;
    }

    // Fall back to the LLM for deeper analysis if no keyword matched
    if (score === 0) {
      score = await this.llm.scoreRelevance(item, intent.intent);
    }

    return score;
  }
}
```
Example filtering for Slack messages:

```typescript
// User asks: "Get Slack messages about security from my team"
const intent = {
  intent: 'Find security-related team messages',
  keywords: ['security', 'vulnerability', 'patch', 'CVE'],
  filters: { channels: ['team-*', 'security-*'] },
  maxResults: 20
};

// Filter 1,000 messages down to the 20 most relevant
```
**Test Strategy:**

Test intent analysis with a variety of queries. Verify that filtering significantly reduces data size. Benchmark relevance accuracy. Test with real Slack/Jira data samples.

## Subtasks
### 13.1. Create FilterEngine core infrastructure with TDD and MockLLMProvider

**Status:** pending
**Dependencies:** None

Set up the services/filter-engine.ts file structure with TypeScript interfaces, Vitest test infrastructure, and a MockLLMProvider for local testing without external API dependencies.

**Details:**

Create src/services/filter-engine.ts with core types and interfaces. Define the IntentAnalysis interface: { intent: string, keywords: string[], filters: Record<string, any>, maxResults: number, confidence: number }. Define the FilteredResponse interface: { data: any, reduction: number, metadata: FilterMetadata }. Define FilterMetadata for explainability: { originalItemCount: number, filteredItemCount: number, removedItems: RemovedItemExplanation[], filterStrategy: string, scoringLatencyMs: number }. Define RemovedItemExplanation: { item: any, reason: string, score: number, threshold: number }. Create MockLLMProvider in tests/mocks/mock-llm-provider.ts that returns deterministic responses based on input patterns; this is essential for CI/CD without a GPU. Configure Vitest with coverage requirements (>90%). Create test fixtures in tests/fixtures/ with sample MCP requests/responses for Slack, Jira, and database queries. Include createMockToolCallRequest(), createMockIntentAnalysis(), and createTestFilterEngine() test utilities.
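A minimal sketch of what the deterministic MockLLMProvider could look like. The `LLMProvider` interface shape and the pattern table below are assumptions for illustration; the real interfaces come from Task 13.1's type definitions.

```typescript
// tests/mocks/mock-llm-provider.ts (sketch; interface shapes are assumed)
interface IntentAnalysis {
  intent: string;
  keywords: string[];
  filters: Record<string, any>;
  maxResults: number;
  confidence: number;
}

interface LLMProvider {
  analyze(prompt: string): Promise<IntentAnalysis>;
  scoreRelevance(item: any, intent: string): Promise<number>;
}

// Deterministic provider: responses are keyed off input patterns so tests
// (and CI without a GPU) always see the same output.
export class MockLLMProvider implements LLMProvider {
  async analyze(prompt: string): Promise<IntentAnalysis> {
    // Pattern table: first matching substring wins (illustrative entry only).
    if (prompt.includes('security')) {
      return {
        intent: 'Find security-related messages',
        keywords: ['security', 'vulnerability'],
        filters: {},
        maxResults: 20,
        confidence: 0.9,
      };
    }
    // Fallback for unrecognized prompts.
    return { intent: 'unknown', keywords: [], filters: {}, maxResults: 10, confidence: 0.1 };
  }

  async scoreRelevance(item: any, intent: string): Promise<number> {
    // Cheap deterministic "semantic" score: count words shared by item and intent.
    const words = new Set(JSON.stringify(item).toLowerCase().split(/\W+/));
    return intent.toLowerCase().split(/\W+/).filter(w => w && words.has(w)).length;
  }
}
```

Because every response is a pure function of the input, tests that exercise FilterEngine's LLM fallback paths stay reproducible across runs.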
### 13.2. Implement analyzeIntent method with keyword extraction and configurable parameters

**Status:** pending
**Dependencies:** 13.1

Create the intent analysis system that interprets Claude's MCP tool calls to extract user intent, relevant keywords, filters, and maximum results using LLM-based analysis with configurable prompts.

**Details:**

Implement FilterEngine.analyzeIntent(request: ToolCallRequest): Promise<IntentAnalysis>. Create an IntentAnalyzer class in src/services/intent-analyzer.ts with configurable prompt templates per MCP tool type. Design prompt engineering for reliable JSON output: include examples, a schema definition, and output format instructions. Implement keyword extraction with stemming/normalization for better matching. Add confidence scoring to intent analysis (0-1 scale) for downstream filtering decisions. Support tool-specific intent patterns: Slack (channels, date ranges, users), Jira (project, status, assignee), database (tables, columns, aggregations). Create IntentAnalysisConfig: { promptTemplate: string, maxKeywords: number, includeNegativeKeywords: boolean, confidenceThreshold: number }. Allow data scientists to configure weights and thresholds per MCP type via a JSON config file. Cache intent analysis for identical requests (LRU cache with TTL). Add metrics: intent_analysis_latency_ms histogram.
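The LRU-with-TTL cache mentioned above can be sketched with an insertion-ordered `Map`; the class name, sizes, and key derivation here are illustrative assumptions, not settled API.

```typescript
// Sketch of the request-level intent cache (LRU + TTL).
class IntentCache<V> {
  private map = new Map<string, { value: V; expiresAt: number }>();
  constructor(private maxEntries: number, private ttlMs: number) {}

  get(key: string, now = Date.now()): V | undefined {
    const entry = this.map.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt < now) { this.map.delete(key); return undefined; }
    // Re-insert to mark as most recently used (Map preserves insertion order).
    this.map.delete(key);
    this.map.set(key, entry);
    return entry.value;
  }

  set(key: string, value: V, now = Date.now()): void {
    if (this.map.has(key)) this.map.delete(key);
    // Evict the least recently used entry (the oldest key in the Map).
    if (this.map.size >= this.maxEntries) {
      const oldest = this.map.keys().next().value;
      if (oldest !== undefined) this.map.delete(oldest);
    }
    this.map.set(key, { value, expiresAt: now + this.ttlMs });
  }
}

// Identical tool calls derive the same key, so repeated analysis is free.
const cacheKey = (toolName: string, args: unknown) =>
  `${toolName}:${JSON.stringify(args)}`;
```

Note that `JSON.stringify` keys are order-sensitive; a production key derivation would canonicalize argument order first.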
### 13.3. Implement array filtering strategy with relevance scoring and explainability

**Status:** pending
**Dependencies:** 13.1, 13.2

Create the structural filtering strategy for array responses with intelligent relevance scoring, keyword matching, LLM-based deep analysis, and detailed explainability for why items were removed.

**Details:**

Implement FilterEngine.filterArray(items: any[], intent: IntentAnalysis): Promise<FilteredArrayResult> in src/services/filter-strategies/array-filter.ts. Create a RelevanceScorer class with configurable scoring: (1) keyword-matching score with configurable weights per keyword, (2) field importance weights (title > description > metadata), (3) LLM-based semantic scoring for items with zero keyword matches, (4) composite scoring with normalization. Implement explainability: for each removed item, record { item, reason: 'keyword_score_below_threshold' | 'llm_relevance_low' | 'exceeded_max_results', score, threshold }. Return scored items sorted by relevance, keeping the top N based on intent.maxResults. Handle nested arrays recursively. Add A/B testing support: FilterArrayConfig.abTestId allows comparing scoring algorithms. Expose metrics: items_before, items_after, reduction_ratio, avg_score, scoring_latency_ms. Implement a batch scoring optimization: score multiple items in a single LLM call when possible.
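The weighted keyword component (scoring points 1 and 2) could look like the sketch below. The config shape and default weights are illustrative assumptions; per the subtask, the real values come from configuration.

```typescript
// Sketch of the composite keyword score: per-keyword weights multiplied by
// per-field importance weights, summed over all matching (field, keyword) pairs.
interface ScoringConfig {
  keywordWeights: Record<string, number>;  // per-keyword weight, default 1
  fieldWeights: Record<string, number>;    // e.g. title counts more than metadata
}

function keywordScore(
  item: Record<string, unknown>,
  keywords: string[],
  cfg: ScoringConfig
): number {
  let score = 0;
  for (const [field, value] of Object.entries(item)) {
    const text = String(value).toLowerCase();
    const fieldWeight = cfg.fieldWeights[field] ?? 1;
    for (const kw of keywords) {
      if (text.includes(kw.toLowerCase())) {
        score += fieldWeight * (cfg.keywordWeights[kw] ?? 1);
      }
    }
  }
  return score;
}
```

Items scoring zero here would fall through to the LLM-based semantic scorer, mirroring the fallback in `scoreRelevance` above.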
### 13.4. Implement object field selection and text summarization strategies

**Status:** pending
**Dependencies:** 13.1, 13.2

Create filtering strategies for object responses (field selection based on relevance) and large text responses (intelligent summarization) with configurable thresholds and explainability.

**Details:**

Create src/services/filter-strategies/object-filter.ts with selectRelevantFields(obj: object, intent: IntentAnalysis): Promise<FilteredObjectResult>. Implement field relevance scoring: (1) field-name keyword matching, (2) field-value relevance to intent, (3) configurable always-include fields per object type (e.g., 'id', 'timestamp'). Create FieldSelectionConfig: { preserveFields: string[], maxDepth: number, maxFields: number }. Track removed fields in the explainability metadata. Create src/services/filter-strategies/text-filter.ts with summarize(text: string, intent: IntentAnalysis): Promise<SummarizedTextResult>. Implement intelligent summarization: (1) detect the text type (log file, documentation, code), (2) apply the appropriate summarization strategy, (3) preserve critical information based on intent keywords. Summarization threshold: 5000 chars (configurable). Calculate the reduction ratio as 1 - summary.length / original.length. Add metrics: fields_removed, text_reduction_ratio, summarization_latency_ms.
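A minimal synchronous sketch of the field-selection strategy, assuming the `FieldSelectionConfig` shape from the subtask (with `maxDepth` omitted for brevity). The keyword-based relevance rule is an illustrative stand-in for the real scorer.

```typescript
// Sketch: keep always-preserved fields plus up to maxFields keyword-relevant
// fields; everything else is dropped but recorded for explainability.
interface FieldSelectionConfig {
  preserveFields: string[];  // always kept, e.g. 'id', 'timestamp'
  maxFields: number;
}

function selectFields(
  obj: Record<string, unknown>,
  keywords: string[],
  cfg: FieldSelectionConfig
): { selected: Record<string, unknown>; removed: string[] } {
  const selected: Record<string, unknown> = {};
  const removed: string[] = [];
  let kept = 0;
  for (const [field, value] of Object.entries(obj)) {
    const text = `${field} ${String(value)}`.toLowerCase();
    const relevant = keywords.some(k => text.includes(k.toLowerCase()));
    if (cfg.preserveFields.includes(field)) {
      selected[field] = value;                 // always-include fields
    } else if (relevant && kept < cfg.maxFields) {
      selected[field] = value;
      kept++;
    } else {
      removed.push(field);                     // tracked for explainability
    }
  }
  return { selected, removed };
}
```

The `removed` list is what would feed the `RemovedItemExplanation` entries in the explainability metadata.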
### 13.5. Implement streaming-compatible large dataset filtering with memory efficiency

**Status:** pending
**Dependencies:** 13.1, 13.3

Create filtering logic that integrates with Task 11's chunking/streaming system to handle 100K+ item datasets without loading all data into memory, using incremental scoring and progressive filtering.

**Details:**

Create src/services/filter-strategies/streaming-filter.ts, integrating with the PaginationManager from Task 11.6. Implement StreamingFilterEngine with methods: (1) createFilterStream(dataStream: AsyncIterable<any[]>, intent): AsyncIterable<FilteredChunk>, (2) processChunk(chunk: any[], runningState: FilterState): Promise<FilteredChunk>. Design FilterState to maintain: the running top-N items with scores, the minimum score threshold (dynamically adjusted), the chunk index, and the total items processed. Implement progressive threshold adjustment: as more items are seen, raise the threshold to keep memory at O(maxResults). Use a heap data structure for efficient top-N maintenance. Create ChunkedFilterResult: { chunk: any[], chunkIndex: number, runningReduction: number, isComplete: boolean }. Memory budget: configurable max memory for filter state (default 50MB). Add backpressure handling for slow downstream consumers. Expose metrics: chunks_processed, peak_memory_bytes, progressive_threshold.
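The heap-backed running top-N can be sketched as below: memory stays O(maxResults), and the min-heap root doubles as the progressive score threshold, since anything scoring below it can never make the final cut. Class and method names are illustrative.

```typescript
// Sketch: binary min-heap keyed by score, holding at most n entries.
class TopN<T> {
  private heap: { score: number; item: T }[] = [];
  constructor(private n: number) {}

  get threshold(): number {
    // Until the heap fills, everything qualifies; after that, the heap
    // minimum is the progressive threshold described above.
    return this.heap.length < this.n ? -Infinity : this.heap[0].score;
  }

  offer(item: T, score: number): void {
    if (this.heap.length < this.n) {
      this.heap.push({ score, item });
      this.bubbleUp(this.heap.length - 1);
    } else if (score > this.heap[0].score) {
      this.heap[0] = { score, item };  // replace the current minimum
      this.sinkDown(0);
    }
  }

  items(): T[] {
    return [...this.heap].sort((a, b) => b.score - a.score).map(e => e.item);
  }

  private bubbleUp(i: number): void {
    while (i > 0) {
      const parent = (i - 1) >> 1;
      if (this.heap[parent].score <= this.heap[i].score) break;
      [this.heap[parent], this.heap[i]] = [this.heap[i], this.heap[parent]];
      i = parent;
    }
  }

  private sinkDown(i: number): void {
    for (;;) {
      let min = i;
      const l = 2 * i + 1, r = 2 * i + 2;
      if (l < this.heap.length && this.heap[l].score < this.heap[min].score) min = l;
      if (r < this.heap.length && this.heap[r].score < this.heap[min].score) min = r;
      if (min === i) break;
      [this.heap[min], this.heap[i]] = [this.heap[i], this.heap[min]];
      i = min;
    }
  }
}
```

`processChunk` would call `offer` per scored item and can skip LLM scoring entirely for items whose cheap keyword score already falls below `threshold`.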
### 13.6. Implement security layer preventing data leakage in filtered responses

**Status:** pending
**Dependencies:** 13.1, 13.3, 13.4

Create security middleware that sanitizes filtered responses to prevent accidental exposure of PII, credentials, or sensitive data, with configurable detection patterns and audit logging.

**Details:**

Create src/services/filter-security.ts with a ResponseSanitizer class. Implement sensitive data detection: (1) regex patterns for API keys, passwords, and tokens (AWS, GitHub, Slack, etc.), (2) PII patterns: email, phone, SSN, credit card, IP addresses, (3) custom patterns configurable per MCP type. Create SanitizationConfig: { redactPatterns: RegExp[], piiDetection: boolean, auditSensitiveAccess: boolean, allowlist: string[] }. Implement redaction strategies: full replacement with [REDACTED], partial masking (show last 4 chars), or removal. Create a FilterSecurityAudit log entry when sensitive data is detected: { timestamp, toolName, patternMatched, fieldPath, actionTaken }. Integrate with FilterEngine.filterResponse() as the final step before returning. Prevent filtered items from 'leaking back' via explainability metadata; sanitize removed-item summaries too. Add metrics: sensitive_data_detected_count, redactions_applied, audit_log_entries.
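A sketch of the regex-based redaction core with partial masking. The patterns below are deliberately simplified examples of the token shapes mentioned above; real detectors need much broader coverage, and the config shape is an assumption.

```typescript
// Sketch: run each redaction pattern over the text, replacing matches with a
// partial mask (last 4 chars kept for debuggability) and counting redactions.
interface SanitizationConfig {
  redactPatterns: RegExp[];
  mask: (match: string) => string;
}

const defaultConfig: SanitizationConfig = {
  redactPatterns: [
    /AKIA[0-9A-Z]{16}/g,          // AWS access key id shape (simplified)
    /ghp_[A-Za-z0-9]{36}/g,       // GitHub personal token shape (simplified)
    /[\w.+-]+@[\w-]+\.[\w.]+/g,   // naive email matcher (illustrative only)
  ],
  mask: m => `[REDACTED...${m.slice(-4)}]`,
};

function sanitizeText(
  text: string,
  cfg: SanitizationConfig = defaultConfig
): { text: string; redactions: number } {
  let redactions = 0;
  let out = text;
  for (const pattern of cfg.redactPatterns) {
    out = out.replace(pattern, m => {
      redactions++;               // feeds the redactions_applied metric
      return cfg.mask(m);
    });
  }
  return { text: out, redactions };
}
```

Running the same sanitizer over explainability metadata (removed-item summaries) closes the 'leak back' path the subtask calls out.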
### 13.7. Implement A/B testing framework and SRE metrics for filter performance monitoring

**Status:** pending
**Dependencies:** 13.1, 13.2, 13.3, 13.4, 13.5, 13.6

Create comprehensive A/B testing infrastructure for comparing filter strategies, plus Prometheus-compatible metrics exposure for SRE monitoring of filter performance and effectiveness.

**Details:**

Create src/services/filter-metrics.ts with a FilterMetricsCollector exposing Prometheus metrics: filter_requests_total (counter by tool, strategy), filter_duration_seconds (histogram), items_before_filter (histogram), items_after_filter (histogram), reduction_ratio (histogram), scoring_latency_seconds (histogram by strategy), sensitive_data_detections_total (counter). Create src/services/ab-testing.ts with an ABTestingFramework class. Methods: assignExperiment(requestId): ExperimentAssignment, recordOutcome(requestId, metrics): void, getExperimentResults(experimentId): ABTestResults. ExperimentConfig: { id, strategies: FilterStrategy[], trafficSplit: number[], startDate, endDate }. Persist experiment assignments and outcomes for analysis. Create ABTestResults: { experimentId, strategyResults: { strategy, avgReduction, avgLatency, sampleSize }[], statisticalSignificance }. Integrate with FilterEngine: check the experiment assignment, use the assigned strategy, and record outcome metrics. Add a /metrics HTTP endpoint serving the Prometheus exposition format. Create a Grafana dashboard JSON template for filter monitoring.
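Deterministic experiment assignment can be sketched by hashing the request id into the unit interval and walking the traffic split, so assignment is stable without persisting per-request state. The config shape follows the subtask text; the FNV-1a hash choice and function names are assumptions.

```typescript
// Sketch: stable strategy assignment from a traffic split.
interface ExperimentConfig {
  id: string;
  strategies: string[];
  trafficSplit: number[];  // fractions summing to 1, aligned with strategies
}

// Small deterministic string hash (FNV-1a) mapped to [0, 1).
function unitHash(s: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return (h >>> 0) / 2 ** 32;
}

function assignStrategy(requestId: string, cfg: ExperimentConfig): string {
  // Salting with the experiment id keeps assignments independent across experiments.
  const x = unitHash(`${cfg.id}:${requestId}`);
  let cumulative = 0;
  for (let i = 0; i < cfg.strategies.length; i++) {
    cumulative += cfg.trafficSplit[i];
    if (x < cumulative) return cfg.strategies[i];
  }
  return cfg.strategies[cfg.strategies.length - 1];  // guard against rounding drift
}
```

Because the same requestId always hashes to the same bucket, recordOutcome can later join outcomes to assignments without a lookup table.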