Advanced guide

Bulk prompt compression

For offline LLM pipelines — batch inference, dataset preparation, bulk content generation — per-request overhead matters. Bulk compression optimizes for throughput at scale.

By Arjun Shah - Creator of SuperCompress - Updated 2026-07-03

Throughput benchmarks

Batch Size	Total Time	Throughput	Memory
100	~6s	~16/sec	~50MB
1,000	~60s	~17/sec	~80MB
10,000	~600s	~17/sec	~200MB
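
Each row works out to batch size divided by total time, or roughly 17 prompts per second at every batch size. To check this on your own hardware, a small timing harness is enough. The sketch below is an assumption about how such numbers could be measured, not the script behind the table above. `placeholder_compress` is a hypothetical stand-in that only collapses whitespace, because this page does not show the SuperCompress API; swap in the real compression call.

```python
import time


def measure_throughput(compress, prompts):
    """Return (elapsed_seconds, prompts_per_second) for one pass over prompts."""
    start = time.perf_counter()
    for prompt in prompts:
        compress(prompt)
    elapsed = time.perf_counter() - start
    return elapsed, len(prompts) / elapsed


def placeholder_compress(prompt: str) -> str:
    # Hypothetical stand-in, NOT the SuperCompress API: collapses whitespace.
    return " ".join(prompt.split())


if __name__ == "__main__":
    # Same batch sizes as the table; the numbers printed here reflect the
    # placeholder function, not real compression.
    for batch_size in (100, 1_000, 10_000):
        batch = ["some   long   prompt text " * 20] * batch_size
        elapsed, rate = measure_throughput(placeholder_compress, batch)
        print(f"{batch_size:>6} prompts: {elapsed:.3f}s, {rate:,.0f}/sec")
```

Timing the whole batch, rather than each call, keeps the measurement overhead out of the result.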

Frequently asked questions

Can I parallelize across multiple cores?

Yes. Use Python's multiprocessing module to spread compression across all CPU cores.
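
A minimal sketch of that approach, using `multiprocessing.Pool` from the standard library. The function `compress_prompt` is a hypothetical placeholder (it only collapses whitespace) because this page does not show the SuperCompress API; replace its body with the real Compressor call. The names `compress_bulk`, `workers`, and `chunksize` are likewise illustrative.

```python
import multiprocessing as mp


def compress_prompt(prompt: str) -> str:
    """Hypothetical placeholder, NOT the SuperCompress API.

    It only collapses runs of whitespace so the example runs on its own.
    """
    return " ".join(prompt.split())


def compress_bulk(prompts, workers=None, chunksize=64):
    """Compress prompts in parallel; results come back in input order."""
    # workers=None lets Pool start one process per CPU core (os.cpu_count()).
    with mp.Pool(processes=workers) as pool:
        # chunksize batches work items to reduce inter-process overhead.
        return pool.map(compress_prompt, prompts, chunksize=chunksize)


if __name__ == "__main__":
    prompts = [f"  prompt   number {i}  with   extra   spaces " for i in range(1000)]
    compressed = compress_bulk(prompts)
    print(len(compressed), repr(compressed[0]))
```

The worker function must be defined at module level so it can be pickled and sent to the worker processes, and the `if __name__ == "__main__":` guard is required on platforms that start workers by spawning a fresh interpreter.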

Does bulk compression use the API or run locally?

It runs locally. The Compressor runs on CPU and makes no external API calls.

Try it yourself

Paste your long prompt into the playground, ask a question, and see what SuperCompress keeps and removes. Free, no signup needed.
