Comparison guide
SuperCompress vs MMR
MMR (Maximal Marginal Relevance) re-ranks retrieved chunks to balance relevance and diversity. SuperCompress selects the lines most relevant to the query. The two are complementary: use MMR for retrieval diversity and SuperCompress for pre-generation focus.
How MMR works
MMR scores each candidate chunk by a weighted combination of its relevance to the query and its dissimilarity to the chunks already selected. This prevents the retriever from returning ten nearly identical chunks about the same topic.
The tradeoff: MMR can sacrifice precision for diversity. A diverse set of chunks means more total tokens, and some diverse chunks may be irrelevant to the specific question.
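The scoring rule described above can be sketched as a greedy selection loop. This is a minimal illustration in plain Python, not the code of any particular retriever: the names `cosine`, `mmr_select`, and the weight `lam` are chosen here for the example, and the vectors are made-up stand-ins for real embeddings.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(
    query_vec: Sequence[float],
    chunk_vecs: List[Sequence[float]],
    k: int,
    lam: float = 0.7,
) -> List[int]:
    """Greedily pick up to k chunk indices by MMR.

    lam weights relevance to the query; (1 - lam) weights the penalty for
    similarity to chunks that were already selected. (Hypothetical names.)
    """
    relevance = [cosine(query_vec, v) for v in chunk_vecs]
    remaining = list(range(len(chunk_vecs)))
    selected: List[int] = []

    while remaining and len(selected) < k:
        def score(i: int) -> float:
            # Redundancy is the highest similarity to anything already chosen.
            redundancy = max(
                (cosine(chunk_vecs[i], chunk_vecs[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance[i] - (1 - lam) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)

    return selected


# Toy vectors: chunks 0 and 1 are near-duplicates, chunk 2 covers another angle.
query = [1.0, 1.0, 0.0]
chunks = [[1.0, 0.8, 0.0], [1.0, 0.78, 0.0], [0.8, 1.0, 0.6]]

diverse = mmr_select(query, chunks, k=2, lam=0.5)
relevance_only = mmr_select(query, chunks, k=2, lam=1.0)
```

With `lam=1.0` the redundancy penalty disappears and the loop reduces to plain relevance ranking, so the two near-duplicate chunks are both returned. Lowering `lam` swaps the duplicate for the less similar chunk, which is the precision-for-diversity tradeoff described above.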
Combining MMR and SuperCompress
from supercompress import Compressor

comp = Compressor()

def mmr_with_compression(query, retriever, llm):
    # retriever and llm are your own retrieval and generation clients
    # Step 1: Retrieve with MMR diversity
    chunks = retriever.retrieve(query, k=15, mmr=True, diversity_beta=0.3)
    # Step 2: Compress the diverse context against the query
    context = "\n\n".join(c.text for c in chunks)
    result = comp.compress(context, query)
    # Step 3: Generate an answer from the compressed context
    return llm.generate(query, result.compressed_text)
Frequently asked questions
Does MMR increase token count?
Yes. MMR diversifies the selection, which typically means more unique chunks and more total tokens.
Does compression remove MMR's diversity benefit?
No. SuperCompress keeps diverse lines that are relevant to the query. Irrelevant diversity is removed; relevant diversity stays.
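To make "irrelevant diversity is removed; relevant diversity stays" concrete, here is a toy line filter based on word overlap with the query. It is only an illustration of the idea and is not how SuperCompress selects lines; the function `keep_relevant_lines`, its `min_overlap` parameter, and the sample text are all invented for this sketch.

```python
import re


def keep_relevant_lines(context: str, query: str, min_overlap: int = 1) -> str:
    """Toy filter: keep lines that share at least min_overlap words with the query."""
    query_terms = set(re.findall(r"[a-z0-9]+", query.lower()))
    kept = []
    for line in context.splitlines():
        line_terms = set(re.findall(r"[a-z0-9]+", line.lower()))
        if len(query_terms & line_terms) >= min_overlap:
            kept.append(line)
    return "\n".join(kept)


# Three "diverse" lines: two cover different aspects of refunds, one is off-topic.
context = "\n".join([
    "Refunds are issued within 14 days of a return.",
    "Refunds for digital goods require a support ticket.",
    "Our office is closed on public holidays.",
])

filtered = keep_relevant_lines(context, "How do refunds work?")
```

Both refund lines survive even though they cover different aspects of the topic, while the off-topic line about office hours is dropped.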
Try it yourself
Paste your long prompt into the playground, ask a question, and see what SuperCompress keeps and removes. Free, no signup needed.