Python prompt compression guide
Python is the primary language for LLM application development. SuperCompress is a native Python library that compresses prompts before they reach your model, cutting input token costs by about 65% per API call.
Installation
pip install supercompress
That is it. The package includes the compression policy, Python bindings, and CLI tool. No GPU, no model download, no external dependencies.
Basic usage
from supercompress import Compressor
comp = Compressor()
result = comp.compress(
context="""Alice has been a Premium subscriber since 2021.
Her account type is Enterprise.
She last logged in on June 14, 2026.
Her current plan costs $199/month.""",
query="What plan is Alice on?"
)
print(result.compressed_text)
# Output keeps: "Her current plan costs $199/month."
# Drops: login date and subscription start date (irrelevant to the question)
print(f"Saved {result.tokens_removed} tokens")
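To see what a per-call token count means for a bill, the arithmetic can be sketched as below. This is a rough estimate only: the function name, the traffic figures, and the price per million input tokens are all hypothetical and are not part of SuperCompress. Substitute the tokens_removed values you observe and your provider's actual pricing.

```python
def estimate_monthly_savings(tokens_removed_per_call: int,
                             calls_per_month: int,
                             usd_per_million_input_tokens: float) -> float:
    """Return estimated monthly savings in USD from removed input tokens."""
    total_removed = tokens_removed_per_call * calls_per_month
    return total_removed / 1_000_000 * usd_per_million_input_tokens


# Hypothetical numbers: 1,300 tokens removed per call, 100,000 calls a month,
# and an illustrative price of $2.50 per million input tokens.
savings = estimate_monthly_savings(1_300, 100_000, 2.50)
print(f"Estimated savings: ${savings:.2f}/month")
```

Only input (prompt) tokens are counted here, since compression does not change the length of the model's reply.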
Integration with OpenAI
import openai
from supercompress import Compressor
comp = Compressor()
client = openai.OpenAI()
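def ask_with_compression(context, query):
    # Compress the context with respect to the question before calling the model.
    result = comp.compress(context=context, query=query)
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "Answer based on the context."},
            # Send the question together with the compressed context.
            {"role": "user", "content": f"Context:\n{result.compressed_text}\n\nQuestion: {query}"},
        ],
    )
    return response.choices[0].message.content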
Frequently asked questions
Does SuperCompress work with Python 3.9+?
Yes. SuperCompress supports Python 3.9 through 3.13.
Can I use it with async Python?
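Yes. The compressor is synchronous, so a direct call inside a coroutine does hold the event loop for the duration of the call (~60ms). That is short enough for many applications. If your event loop is latency-sensitive, run the call in a worker thread, for example with asyncio.to_thread.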
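One way to keep a synchronous compressor from stalling an event loop is to offload the call to a worker thread. The sketch below uses asyncio.to_thread from the standard library (available from Python 3.9). StubCompressor and compress_in_thread are hypothetical names introduced so the example runs without the package installed. In real code you would pass a Compressor instance instead, assuming its compress method accepts the context and query keyword arguments shown in Basic usage.

```python
import asyncio


class StubCompressor:
    """Hypothetical stand-in so this sketch runs without the package installed."""

    def compress(self, context: str, query: str) -> str:
        # A real compressor would do its CPU-bound work here.
        return context.splitlines()[-1]


async def compress_in_thread(compressor, context: str, query: str):
    # asyncio.to_thread runs the blocking call in a worker thread, so the
    # event loop stays free to serve other tasks in the meantime.
    return await asyncio.to_thread(compressor.compress, context=context, query=query)


async def main() -> str:
    return await compress_in_thread(
        StubCompressor(),
        context="Alice joined in 2021.\nHer current plan costs $199/month.",
        query="What plan is Alice on?",
    )


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Whether the thread hop is worth it depends on your latency budget; for a call of a few tens of milliseconds, calling the compressor directly may be simpler.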
Try it yourself
Paste your long prompt into the playground, ask a question, and see what SuperCompress keeps and removes. Free, no signup needed.