Comparison guide

SuperCompress vs MMR

Maximal Marginal Relevance (MMR) re-ranks retrieved chunks to balance relevance and diversity. SuperCompress selects the lines most relevant to the query. They are complementary: use MMR for retrieval diversity and SuperCompress for pre-generation focus.

By Arjun Shah - Creator of SuperCompress - Updated 2026-07-03

How MMR works

MMR scores each candidate chunk by a weighted combination of its relevance to the query and its dissimilarity to the chunks already selected. This prevents the retriever from returning ten nearly identical chunks about the same topic.
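The scoring described above can be sketched in a few lines. This is a minimal illustration of greedy MMR selection over plain Python lists, not the code of any particular retriever. The names `mmr_select` and `lam` are assumptions made for this example: `lam` weights relevance against redundancy, and it may not map directly onto a given library's diversity parameter.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def mmr_select(query_vec, chunk_vecs, k, lam=0.5):
    """Greedily pick up to k chunk indices by MMR (hypothetical helper).

    lam=1.0 ranks by relevance only; lower values penalize chunks that
    resemble ones already selected.
    """
    selected = []
    candidates = list(range(len(chunk_vecs)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(query_vec, chunk_vecs[i])
            # Redundancy is the similarity to the closest already-selected chunk.
            redundancy = max(
                (cosine(chunk_vecs[i], chunk_vecs[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

# Toy embeddings: chunks 0 and 1 are near-duplicates, chunk 2 covers another angle.
query = [1.0, 0.5]
chunks = [[1.0, 0.4], [1.0, 0.45], [0.5, 1.0]]

relevance_only = mmr_select(query, chunks, k=2, lam=1.0)
with_diversity = mmr_select(query, chunks, k=2, lam=0.5)
```

With relevance only, the two near-duplicate chunks are selected together. With `lam=0.5`, the second pick switches to the less similar chunk, even though it is less relevant to the query on its own.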

The tradeoff: MMR can sacrifice precision for diversity. A diverse set of chunks overlaps less, so it typically carries more distinct content and more total tokens, and some of those chunks may be irrelevant to the specific question.

Combining MMR and SuperCompress

from supercompress import Compressor

comp = Compressor()

def mmr_with_compression(query, retriever, llm):
    # Step 1: Retrieve with MMR diversity
    chunks = retriever.retrieve(query, k=15, mmr=True, diversity_beta=0.3)

    # Step 2: Compress the diverse context against the query
    context = "\n\n".join(c.text for c in chunks)
    result = comp.compress(context, query)

    # Step 3: Generate an answer from the compressed context only
    return llm.generate(query, result.compressed_text)

Frequently asked questions

Does MMR increase token count?

Yes. MMR diversifies the selection, which typically means more unique chunks and more total tokens.

Does compression remove MMR's diversity benefit?

No. SuperCompress keeps diverse lines that are relevant to the query. Irrelevant diversity is removed; relevant diversity stays.
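To make the distinction concrete, here is a toy sketch of query-aware line filtering. It is a keyword-overlap stand-in written for this example and is not how SuperCompress scores lines. The function name `keep_relevant_lines`, the `STOPWORDS` set, and the sample text are all assumptions for illustration.

```python
import re

# Small illustrative stopword list (an assumption, not a standard set).
STOPWORDS = {"how", "do", "i", "the", "a", "an", "to", "is", "of", "on", "with"}

def tokens(text):
    """Lowercased alphanumeric tokens of a string, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def keep_relevant_lines(context, query, min_overlap=1):
    """Keep lines sharing at least min_overlap non-stopword tokens with the query."""
    query_terms = tokens(query) - STOPWORDS
    kept = [
        line
        for line in context.splitlines()
        if len(tokens(line) & query_terms) >= min_overlap
    ]
    return "\n".join(kept)

# Three lines from "diverse" chunks: two address the question from
# different angles, one is unrelated.
context = "\n".join([
    "Rotate API keys from the dashboard settings page.",
    "Keys can also be rotated with the CLI.",
    "Our office is closed on public holidays.",
])
filtered = keep_relevant_lines(context, "How do I rotate API keys?")
```

The dashboard line and the CLI line both survive because each touches the query, while the unrelated line is dropped. A real compressor would use a stronger relevance signal than shared keywords, but the shape of the outcome is the same.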

Try it yourself

Paste your long prompt into the playground, ask a question, and see what SuperCompress keeps and removes. Free, no signup needed.
