Multi-hop RAG compression
Multi-hop RAG answers questions that require multiple retrieval and reasoning steps. Each hop adds retrieved context, and therefore tokens, to the prompt. Compressing at each hop keeps the accumulated context, and with it the total cost, from growing with every step.
Multi-hop costs
A 3-hop query might (1) retrieve the user's account info, (2) retrieve the support ticket history, and (3) retrieve the relevant knowledge base article. Without compression, each hop adds 2,000-5,000 tokens to the context, so the third LLM call can carry 6,000-15,000 tokens. With compression, content from previous hops that is irrelevant to the question is removed before the next hop runs.
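To make the accumulation concrete, here is a rough back-of-the-envelope model in Python. It is only a sketch: the 3,000 tokens per hop is an illustrative value inside the 2,000-5,000 range above, and `keep_percent` (the share of carried-over context that survives compression) is a hypothetical parameter, not a measured SuperCompress figure.

```python
def context_sizes(hop_tokens, keep_percent=100):
    """Return the context size (in tokens) sent to the LLM at each hop.

    hop_tokens:   tokens newly retrieved at each hop.
    keep_percent: share of the carried-over context that survives
                  compression before the next hop (100 = no compression).
                  This is an assumed, illustrative parameter.
    """
    sizes = []
    carried = 0
    for new_tokens in hop_tokens:
        # Shrink what earlier hops left behind, then add the new retrieval.
        carried = carried * keep_percent // 100 + new_tokens
        sizes.append(carried)
    return sizes


hops = [3000, 3000, 3000]  # illustrative: three hops of 3,000 tokens each

print(context_sizes(hops))                   # [3000, 6000, 9000]
print(context_sizes(hops, keep_percent=30))  # [3000, 3900, 4170]
```

Under these assumptions the uncompressed context grows linearly with the number of hops, while the compressed context levels off a little above the size of a single retrieval.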
Frequently asked questions
Should I compress at every hop?
Yes. Compress before each LLM call to prevent context accumulation.
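One way to structure "compress before each LLM call" is sketched below in Python. The names `run_multi_hop`, `retrievers`, `compress`, and `call_llm` are all hypothetical placeholders: plug in your own retrieval functions, compressor, and model client. This is not the SuperCompress API.

```python
from typing import Callable, List


def run_multi_hop(
    question: str,
    retrievers: List[Callable[[str], str]],
    compress: Callable[[str, str], str],
    call_llm: Callable[[str], str],
) -> str:
    """Run one retrieval per hop, compressing the context before every LLM call.

    All four arguments are caller-supplied; the callables are assumptions
    made for this sketch, not a real library interface.
    """
    context = ""
    answer = ""
    for retrieve in retrievers:
        # Add this hop's retrieved text to what earlier hops kept.
        context = (context + "\n" + retrieve(question)).strip()
        # Compress the accumulated context before it reaches the model,
        # so irrelevant content is not carried into later hops.
        context = compress(context, question)
        answer = call_llm(f"Context:\n{context}\n\nQuestion: {question}")
    return answer
```

Because the compressed context replaces the accumulated one on every iteration, anything dropped at one hop is never paid for again at later hops.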
Does compression affect multi-hop reasoning?
No. At each hop, only content that is irrelevant to the question is removed. The facts that later hops depend on stay in the context, so reasoning chains are preserved.
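As a toy illustration of question-aware filtering, the Python sketch below keeps only sentences that share a non-trivial word with the question. It is a deliberately naive keyword heuristic with an assumed stopword list, written for this example. It is not how SuperCompress decides relevance.

```python
import re

# Small, assumed stopword list so that words like "the" do not count as overlap.
STOPWORDS = {"a", "an", "the", "is", "was", "on", "at", "of", "to", "why", "what"}


def content_terms(text: str) -> set:
    """Lowercased alphanumeric words in text, minus stopwords."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS


def keep_relevant(context: str, question: str) -> str:
    """Keep only sentences that share at least one content word with the question."""
    question_terms = content_terms(question)
    sentences = re.split(r"(?<=[.!?])\s+", context.strip())
    kept = [s for s in sentences if content_terms(s) & question_terms]
    return " ".join(kept)


context = (
    "The account is on the Pro plan. "
    "The office cafeteria opens at nine. "
    "Ticket 12 reports a billing error on the Pro plan."
)
print(keep_relevant(context, "Why was the Pro plan billing wrong?"))
# The account is on the Pro plan. Ticket 12 reports a billing error on the Pro plan.
```

The account fact from the first hop and the ticket fact from the second both survive, while the unrelated sentence is dropped. That is the property a multi-hop chain needs.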
Try it yourself
Paste your long prompt into the playground, ask a question, and see what SuperCompress keeps and removes. Free, no signup needed.