Tools / Context Compression
context_compress
$0.03 per call
Smart compression with BM25 relevance ranking. Keeps the parts relevant to your query, compresses the rest. Perfect for fitting large files into context windows.
Get StartedHow it works:
1. Provide content + query
2. BM25 ranks relevance
3. High-relevance kept intact
4. Low-relevance summarized
Capabilities
What it does
BM25 relevance ranking for intelligent compression
Preserves high-relevance content in full
Summarizes low-relevance sections efficiently
Typically achieves 70%+ compression rates
Why use it
Fit entire projects into context windows
Keep relevant context while staying under token limits
Query-aware compression, not blind truncation
Works on code, documents, and conversation history
Example
Input (5000 tokens)
// authentication.js - 200 lines
// database.js - 300 lines
// routes.js - 400 lines
// utils.js - 100 lines
Query: "How does auth work?"
Output (1500 tokens)
// authentication.js - FULL (200 lines)
// database.js - Summarized to 50 lines
// routes.js - Auth routes kept, rest summarized
// utils.js - Summarized to 20 lines
70% compression, auth context preserved
Use Cases
Large Codebases
Fit entire projects into context by keeping relevant files full and summarizing the rest.
Long Documents
Ask questions about PDFs, docs, or reports without hitting token limits.
Conversation History
Compress old messages while keeping recent context intact.