Advanced guide
Bulk prompt compression
For offline LLM pipelines (batch inference, dataset preparation, bulk content generation), per-request overhead adds up quickly across thousands of prompts. Bulk compression optimizes for throughput at scale.
Throughput benchmarks
| Batch size (prompts) | Total time | Throughput (prompts/sec) | Peak memory |
|---|---|---|---|
| 100 | ~6s | ~16/sec | ~50MB |
| 1,000 | ~60s | ~17/sec | ~80MB |
| 10,000 | ~600s | ~17/sec | ~200MB |
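The figures above suggest throughput stays roughly flat as the batch grows, so a back-of-the-envelope run-time estimate is just batch size divided by throughput. The sketch below assumes the ~17 prompts/sec figure carries over to your hardware and prompt lengths, which may not hold; `estimate_seconds` is a hypothetical helper, not part of any library.

```python
def estimate_seconds(num_prompts: int, prompts_per_sec: float = 17.0) -> float:
    """Rough single-process run time, assuming throughput stays flat.

    The default of 17 prompts/sec is taken from the benchmark table and is
    only an assumption for other machines or workloads.
    """
    if prompts_per_sec <= 0:
        raise ValueError("prompts_per_sec must be positive")
    return num_prompts / prompts_per_sec


# Print a rough estimate for a few batch sizes.
for n in (100, 1_000, 10_000, 50_000):
    print(f"{n:>6} prompts: ~{estimate_seconds(n) / 60:.1f} min")
```

Measure throughput on a small sample of your own data first and pass it as `prompts_per_sec` for a more realistic estimate.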
Frequently asked questions
Can I parallelize across multiple cores?
Yes. Use Python's multiprocessing to compress on all CPU cores simultaneously.
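A minimal sketch of that approach with `multiprocessing.Pool` follows. The `compress_prompt` function here is a stand-in that only collapses whitespace so the example runs on its own; in practice you would replace its body with your actual compressor call, whose exact API is not shown in this guide. The `compress_bulk` helper and its parameters are likewise illustrative names.

```python
import multiprocessing as mp
import re


def compress_prompt(prompt: str) -> str:
    """Placeholder for the real compressor call (hypothetical).

    Collapses runs of whitespace so the sketch runs without the library.
    """
    return re.sub(r"\s+", " ", prompt).strip()


def compress_bulk(prompts, workers=None, chunksize=64):
    """Compress prompts across worker processes, preserving input order."""
    workers = workers or mp.cpu_count()
    if workers == 1 or len(prompts) < 2:
        # Skip process start-up cost for trivial workloads.
        return [compress_prompt(p) for p in prompts]
    with mp.Pool(processes=workers) as pool:
        # chunksize batches tasks to reduce inter-process messaging overhead.
        return pool.map(compress_prompt, prompts, chunksize=chunksize)


if __name__ == "__main__":
    # The guard is required on platforms that start workers by re-importing
    # this module (for example Windows and macOS).
    prompts = [f"Summarize   document {i}.\n\nBe   brief." for i in range(1000)]
    compressed = compress_bulk(prompts)
    print(len(compressed), compressed[0])
```

Each worker process is independent, so if the real compressor loads a model or other heavy state, memory use will likely scale with the number of workers. Passing a smaller `workers` value trades throughput for a lower memory footprint.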
Does bulk compression use the API or run locally?
It runs locally. The Compressor runs on the CPU and makes no external API calls.
Try it yourself
Paste your long prompt into the playground, ask a question, and see what SuperCompress keeps and removes. Free, no signup needed.