Integration guide
FastAPI compression middleware
FastAPI is a widely used Python web framework for AI backends. Add SuperCompress as middleware to compress requests to your LLM endpoints automatically, before they reach your route handlers.
Middleware implementation
import json

from fastapi import FastAPI, Request
from supercompress import Compressor

app = FastAPI()
comp = Compressor()

@app.middleware("http")
async def compress_prompts(request: Request, call_next):
    # Only rewrite requests bound for the LLM chat endpoint.
    if request.url.path == "/api/llm/chat":
        body = await request.json()
        context = body.get("context", "")
        query = body.get("query", "")
        if context and query:
            # Compress the context with respect to the query.
            result = comp.compress(context, query)
            body["context"] = result.compressed_text
            # _body is a private Starlette attribute; confirm that your
            # installed version passes the replaced body downstream.
            request._body = json.dumps(body).encode()
    return await call_next(request)
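Because `request._body` is a private attribute, one alternative is a pure ASGI middleware that buffers the request body, rewrites it, and replays it to the app through a wrapped `receive` callable. The following is a minimal sketch using only the standard library, not the SuperCompress reference implementation. The class name `CompressBodyMiddleware` and the `compress_fn` parameter are assumptions made for illustration; `compress_fn` stands in for a call such as `comp.compress(context, query).compressed_text`.

```python
import json
from typing import Callable


class CompressBodyMiddleware:
    """Pure ASGI middleware that rewrites the JSON body of one route."""

    def __init__(self, app, compress_fn: Callable[[str, str], str],
                 path: str = "/api/llm/chat"):
        self.app = app
        # Hypothetical hook: (context, query) -> compressed context.
        self.compress_fn = compress_fn
        self.path = path

    async def __call__(self, scope, receive, send):
        # Pass through anything that is not an HTTP request to the target path.
        if scope["type"] != "http" or scope["path"] != self.path:
            await self.app(scope, receive, send)
            return

        # Buffer the whole request body; it may arrive in several chunks.
        chunks = []
        while True:
            message = await receive()
            if message["type"] == "http.disconnect":
                return  # client went away before the body was complete
            chunks.append(message.get("body", b""))
            if not message.get("more_body", False):
                break
        raw = b"".join(chunks)

        try:
            body = json.loads(raw)
        except ValueError:
            body = None  # not JSON: forward the original bytes unchanged

        if isinstance(body, dict) and body.get("context") and body.get("query"):
            body["context"] = self.compress_fn(body["context"], body["query"])
            raw = json.dumps(body).encode()
            # Keep Content-Length consistent with the rewritten body.
            headers = [(k, v) for k, v in scope.get("headers", [])
                       if k != b"content-length"]
            headers.append((b"content-length", str(len(raw)).encode()))
            scope = {**scope, "headers": headers}

        replayed = False

        async def replay():
            # First call returns the buffered body; later calls defer to
            # the original receive (for example, to observe a disconnect).
            nonlocal replayed
            if not replayed:
                replayed = True
                return {"type": "http.request", "body": raw, "more_body": False}
            return await receive()

        await self.app(scope, replay, send)
```

With FastAPI, a class like this could be registered with `app.add_middleware(CompressBodyMiddleware, compress_fn=...)`. Since it only wraps `receive` and leaves `send` untouched, it does not interfere with how the response is produced.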
Frequently asked questions
Does this work with streaming endpoints?
Yes. The middleware compresses the request body before the route handler runs, so compression is finished before the streaming response starts.
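The ordering can be illustrated with a framework-free sketch: the context is compressed once, and only then are response chunks produced. Everything here is hypothetical: `fake_llm_stream` stands in for a streaming LLM client, and `compress_fn` stands in for a SuperCompress call.

```python
import asyncio
from typing import AsyncIterator, Callable


async def fake_llm_stream(prompt: str) -> AsyncIterator[str]:
    """Hypothetical streaming LLM client: yields the prompt one word at a time."""
    for word in prompt.split():
        await asyncio.sleep(0)  # yield control, as a network read would
        yield word


async def chat_stream(body: dict,
                      compress_fn: Callable[[str, str], str]) -> AsyncIterator[str]:
    """Compress the context once, then stream the response chunks unchanged."""
    # Compression runs before the first chunk is produced.
    context = compress_fn(body["context"], body["query"])
    prompt = f"{context} {body['query']}"
    async for chunk in fake_llm_stream(prompt):
        yield chunk


async def collect(body: dict, compress_fn: Callable[[str, str], str]) -> list:
    """Helper for demonstration: gather all streamed chunks into a list."""
    return [chunk async for chunk in chat_stream(body, compress_fn)]
```

In a FastAPI route, an async generator of this shape could be wrapped in a `StreamingResponse`; the compression step adds latency only before the first chunk.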
Can I exclude certain endpoints from compression?
Yes. Check the request path and skip compression for non-LLM endpoints.
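One way to keep that check in a single place is a small helper that the middleware calls before touching the body. This is a sketch under assumed route names: the prefixes and excluded paths below are hypothetical and should be replaced with the routes your API actually serves.

```python
# Hypothetical route lists; adjust to your own API.
COMPRESS_PREFIXES = ("/api/llm/",)
EXCLUDED_PATHS = {"/api/llm/health", "/api/llm/models"}


def should_compress(path: str, method: str) -> bool:
    """Return True only for POST requests to LLM routes that are not excluded."""
    if method.upper() != "POST":
        return False  # only POST requests are expected to carry a prompt body
    if path in EXCLUDED_PATHS:
        return False
    return path.startswith(COMPRESS_PREFIXES)
```

Inside the middleware, the path comparison could then become `if should_compress(request.url.path, request.method):`.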
Try it yourself
Paste your long prompt into the playground, ask a question, and see what SuperCompress keeps and removes. Free, no signup needed.