Advanced guide
Multi-turn context compression
Long conversations are the most expensive LLM use case. After 20 turns, the full history can exceed 10,000 tokens. Multi-turn compression keeps the signal from all turns while dropping the noise.
The multi-turn challenge
After 50 turns in a customer support conversation, the LLM prompt includes: greetings, pleasantries, status checks, internal notes, and off-topic discussions — all mixed with the actual problem-solving context. A customer might mention their account type on turn 3, then ask a billing question on turn 45. The turn-3 detail is critical but would be dropped by any recency-based approach.
Solution
from supercompress import Compressor
comp = Compressor()
def compress_conversation(messages, latest_query):
history = "\n".join(
f"{m['role']}: {m['content']}" for m in messages[:-1]
)
result = comp.compress(history, latest_query)
# Now use result.compressed_text as your conversation history
return result.compressed_text
Frequently asked questions
Does this work for 100+ turn conversations?
Yes. SuperCompress handles conversations of any length efficiently.
Will it keep system prompts and instructions?
Yes. System-level messages are preserved when relevant to the current query.
Try it yourself
Paste your long prompt into the playground, ask a question, and see what SuperCompress keeps and removes. Free, no signup needed.