Serverless guide
Serverless prompt compression
Serverless functions run under tight memory, package-size, and execution-time limits. SuperCompress adds prompt compression in ~60ms with no GPU, no model downloads, and minimal memory, which makes it a good fit for serverless deployments.
AWS Lambda deployment
# Lambda function that compresses before calling an LLM
import json

from supercompress import Compressor

# Created once at module scope so warm invocations reuse the same instance
comp = Compressor()


def lambda_handler(event, context):
    # The request body is a JSON string with "context" and "query" fields
    body = json.loads(event["body"])
    result = comp.compress(body["context"], body["query"])
    # Forward compressed context to your LLM
    return {
        "statusCode": 200,
        "body": json.dumps({
            "compressed": result.compressed_text,
            "savings": result.tokens_removed
        })
    }
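The handler above assumes every request carries a well-formed JSON body. As a rough sketch of how input validation might be layered on top, the wrapper below accepts any `compress(context, query)` callable and returns a 400 response for a missing or malformed body. The names `make_handler` and `fake_compress` are hypothetical and not part of SuperCompress. The stub only truncates text so the example runs without the library installed. The `isBase64Encoded` check assumes an API Gateway proxy-style event.

```python
import base64
import json
from types import SimpleNamespace


def make_handler(compress):
    """Wrap a compress(context, query) callable in a Lambda-style handler.

    The callable is assumed to return an object exposing `compressed_text`
    and `tokens_removed`, mirroring the fields used in the handler above.
    """

    def _response(status, payload):
        return {
            "statusCode": status,
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps(payload),
        }

    def handler(event, context):
        try:
            raw = event.get("body") or ""
            # Proxy-style events may deliver the body base64-encoded.
            if event.get("isBase64Encoded"):
                raw = base64.b64decode(raw).decode("utf-8")
            body = json.loads(raw)
            ctx, query = body["context"], body["query"]
        except (ValueError, KeyError, TypeError):
            # Covers invalid base64, invalid JSON, and missing fields.
            return _response(400, {"error": "expected JSON with 'context' and 'query'"})
        result = compress(ctx, query)
        return _response(200, {
            "compressed": result.compressed_text,
            "savings": result.tokens_removed,
        })

    return handler


def fake_compress(context_text, query):
    # Hypothetical stand-in for the real compressor: keeps the first 20
    # characters and counts characters (not tokens) as removed.
    kept = context_text[:20]
    return SimpleNamespace(
        compressed_text=kept,
        tokens_removed=len(context_text) - len(kept),
    )


handler = make_handler(fake_compress)
ok = handler({"body": json.dumps({"context": "a" * 50, "query": "q"})}, None)
bad = handler({"body": "not json"}, None)
print(ok["statusCode"], bad["statusCode"])
```

In a real deployment the stub would presumably be replaced by the module-level compressor, for example `lambda_handler = make_handler(comp.compress)`.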
Serverless compatibility
| Platform | Cold Start | Memory | Compression Time |
|---|---|---|---|
| AWS Lambda | ~300ms | ~80MB | ~60ms |
| Google Cloud Functions | ~200ms | ~80MB | ~60ms |
| Cloudflare Workers | ~5ms | ~50MB | ~70ms |
| Vercel Edge Functions | ~50ms | ~60MB | ~65ms |
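To put the table in terms of a latency budget, the short calculation below estimates how much time is added before the LLM call on a warm and on a cold invocation. It assumes the cold-start and compression figures are simply additive, which is a simplification, and it uses the approximate values from the table as-is. The helper name `added_latency_ms` is hypothetical.

```python
# Approximate figures copied from the compatibility table, in milliseconds.
PLATFORMS = {
    "AWS Lambda": {"cold_start_ms": 300, "compress_ms": 60},
    "Google Cloud Functions": {"cold_start_ms": 200, "compress_ms": 60},
    "Cloudflare Workers": {"cold_start_ms": 5, "compress_ms": 70},
    "Vercel Edge Functions": {"cold_start_ms": 50, "compress_ms": 65},
}


def added_latency_ms(platform, cold):
    """Estimate latency added before the LLM call.

    Assumes cold start and compression time add up linearly.
    """
    figures = PLATFORMS[platform]
    cold_cost = figures["cold_start_ms"] if cold else 0
    return figures["compress_ms"] + cold_cost


for name in PLATFORMS:
    warm = added_latency_ms(name, cold=False)
    cold = added_latency_ms(name, cold=True)
    print(f"{name}: warm ~{warm} ms, cold ~{cold} ms")
```

Under these assumptions, cold start dominates the first request on AWS Lambda and Google Cloud Functions, while on the edge platforms compression accounts for most of the added time.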
Frequently asked questions
Does SuperCompress fit in Lambda's /tmp space?
Yes. The package is ~200KB, and no model files are needed.
Can I use it in edge runtimes?
Yes. The compressor is pure Python and works in edge environments that support Python.
Try it yourself
Paste your long prompt into the playground, ask a question, and see what SuperCompress keeps and removes. Free, no signup needed.