Advanced guide

Multi-turn context compression

Long conversations are the most expensive LLM use case. After 20 turns, the full history can exceed 10,000 tokens. Multi-turn compression keeps the signal from all turns while dropping the noise.

By Arjun Shah - Creator of SuperCompress - Updated 2026-07-03

The multi-turn challenge

After 50 turns in a customer support conversation, the LLM prompt includes: greetings, pleasantries, status checks, internal notes, and off-topic discussions — all mixed with the actual problem-solving context. A customer might mention their account type on turn 3, then ask a billing question on turn 45. The turn-3 detail is critical but would be dropped by any recency-based approach.

Solution

from supercompress import Compressor
comp = Compressor()

def compress_conversation(messages, latest_query):
    history = "\n".join(
        f"{m['role']}: {m['content']}" for m in messages[:-1]
    )
    result = comp.compress(history, latest_query)
    # Now use result.compressed_text as your conversation history
    return result.compressed_text

Frequently asked questions

Does this work for 100+ turn conversations?

Yes. SuperCompress handles conversations of any length efficiently.

Will it keep system prompts and instructions?

Yes. System-level messages are preserved when relevant to the current query.

Try it yourself

Paste your long prompt into the playground, ask a question, and see what SuperCompress keeps and removes. Free, no signup needed.

Open the Playground Embed the badge