Tools / Context Compression

context_compress

$0.03 per call

Smart compression with BM25 relevance ranking. Keeps the parts relevant to your query, compresses the rest. Perfect for fitting large files into context windows.

Get Started
How it works:
1. Provide content + query
2. BM25 ranks relevance
3. High-relevance kept intact
4. Low-relevance summarized
Capabilities

What it does

BM25 relevance ranking for intelligent compression
Preserves high-relevance content in full
Summarizes low-relevance sections efficiently
Typically achieves 70%+ compression rates

Why use it

Fit entire projects into context windows
Keep relevant context while staying under token limits
Query-aware compression, not blind truncation
Works on code, documents, and conversation history
Example
Input (5000 tokens)

// authentication.js - 200 lines

// database.js - 300 lines

// routes.js - 400 lines

// utils.js - 100 lines

Query: "How does auth work?"

Output (1500 tokens)

// authentication.js - FULL (200 lines)

// database.js - Summarized to 50 lines

// routes.js - Auth routes kept, rest summarized

// utils.js - Summarized to 20 lines

70% compression, auth context preserved

Use Cases

Large Codebases

Fit entire projects into context by keeping relevant files full and summarizing the rest.

Long Documents

Ask questions about PDFs, docs, or reports without hitting token limits.

Conversation History

Compress old messages while keeping recent context intact.

Ready to start building?

Get your API key and start using context_compress today

Get Started