Dynamic chunk compression for RAG
Not all queries need the same amount of context. A simple question may need only 1-2 chunks, while a complex one can need 10 or more. Dynamic chunk compression adapts the context size to the query.
Adaptive strategy
Classify each query by complexity: simple (1-2 chunks), medium (3-5 chunks), or complex (6-15 chunks). The retriever fetches a number of chunks matching that class, and SuperCompress compresses them against the query before generation. Simple queries pay for fewer tokens; complex queries get the context they need.
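The flow above can be sketched in a few lines of Python. This is a minimal illustration, not SuperCompress's actual API: `build_context`, `CHUNK_BUDGET`, `toy_retrieve`, and `toy_compress` are hypothetical names, and the toy compressor (keep chunks that share a word with the query) only stands in for a real compression step. The budget uses the upper bound of each tier.

```python
from typing import Callable, List

# Upper bound of each tier from the text: simple 1-2, medium 3-5, complex 6-15.
CHUNK_BUDGET = {"simple": 2, "medium": 5, "complex": 15}

# Hypothetical interfaces: any retriever/compressor with these shapes will do.
Retriever = Callable[[str, int], List[str]]
Compressor = Callable[[str, List[str]], List[str]]


def build_context(query: str, tier: str,
                  retrieve: Retriever, compress: Compressor) -> List[str]:
    """Retrieve a tier-sized set of chunks, then compress them against the query."""
    top_k = CHUNK_BUDGET[tier]
    chunks = retrieve(query, top_k)
    return compress(query, chunks)


# Toy stand-ins so the sketch runs end to end (assumptions, not real components).
CORPUS = [
    "Redis stores data in memory.",
    "Postgres is a relational database.",
    "Redis supports pub/sub messaging.",
]


def toy_retrieve(query: str, top_k: int) -> List[str]:
    # A real retriever would rank by similarity; this just truncates the corpus.
    return CORPUS[:top_k]


def toy_compress(query: str, chunks: List[str]) -> List[str]:
    # Keep only chunks that share at least one lowercase word with the query.
    words = set(query.lower().split())
    return [c for c in chunks if words & set(c.lower().split())]


if __name__ == "__main__":
    print(build_context("redis memory", "simple", toy_retrieve, toy_compress))
    print(build_context("redis memory", "complex", toy_retrieve, toy_compress))
```

Passing the retriever and compressor in as callables keeps the tier-to-budget logic independent of whichever retrieval stack is in use.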
Frequently asked questions
How do I classify query complexity?
Use token length, the number of entities mentioned, or a fast classifier model.
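As a rough illustration of the first two signals, the sketch below scores a query by its whitespace token count plus a crude entity count. Everything here is an assumption made for the example: the function name `classify_complexity`, the capitalized-word entity proxy, the weight of 3 per entity, and the thresholds 8 and 20 would all need tuning on real traffic, and a fast classifier model could replace the heuristic entirely.

```python
import re


def classify_complexity(query: str) -> str:
    """Heuristic tiering: longer queries with more entities count as more complex."""
    tokens = query.split()
    # Crude entity proxy (assumption): capitalized words after the first token,
    # so the sentence-initial capital is not counted.
    entities = [t for t in tokens[1:] if re.match(r"[A-Z][\w-]*", t)]
    # Illustrative weighting: each entity counts as much as three plain tokens.
    score = len(tokens) + 3 * len(entities)
    if score <= 8:
        return "simple"
    if score <= 20:
        return "medium"
    return "complex"


if __name__ == "__main__":
    for q in [
        "What is RAG?",
        "How does Redis compare with Memcached for session caching?",
    ]:
        print(q, "->", classify_complexity(q))
```

The returned label can then be mapped to a chunk count in the ranges given earlier (1-2, 3-5, or 6-15).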
Does this work with any retriever?
Yes. The retriever returns chunks, and the compressor selects the relevant ones.
Try it yourself
Paste your long prompt into the playground, ask a question, and see what SuperCompress keeps and removes. Free, no signup needed.