Academic Citation & Technical Details
A ~5K parameter learned policy for query-aware context compression. Achieves 100% oracle recall at 65% token reduction on benchmark seeds. Nine line-level features including recency, position, and query overlap.
If you use SuperCompress in your research or project, please cite it as follows:
BibTeX citation for LaTeX documents — copy the block above into your .bib file.
SuperCompress uses a ~5,000 parameter learned policy that operates on token-level features:
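As a rough illustration of what such features might look like, the sketch below computes three of the features named in the summary (recency, position, and query overlap) for each line of a context. The function names `tokenize` and `line_features`, the regex tokenizer, and the specific normalizations are all assumptions made for this example; they are not taken from the SuperCompress codebase, and the remaining features are not shown.

```python
import re


def tokenize(text):
    """Lowercase word tokenizer (an assumption; the real tokenizer is not specified)."""
    return re.findall(r"\w+", text.lower())


def line_features(lines, query):
    """Return one dict of illustrative features per context line (hypothetical helper)."""
    query_terms = set(tokenize(query))
    n = len(lines)
    features = []
    for i, line in enumerate(lines):
        terms = set(tokenize(line))
        # Position: 0.0 for the first line, 1.0 for the last line.
        position = i / (n - 1) if n > 1 else 0.0
        # Recency: 1.0 for the last (most recent) line, smaller for older lines.
        recency = 1.0 / (n - i)
        # Query overlap: fraction of query terms that also appear in the line.
        overlap = len(terms & query_terms) / len(query_terms) if query_terms else 0.0
        features.append(
            {"position": position, "recency": recency, "query_overlap": overlap}
        )
    return features


context = [
    "def load_config(path):",
    "    return json.load(open(path))",
    "# TODO: handle missing config file",
]
feats = line_features(context, "where is the config loaded")
for line, f in zip(context, feats):
    print(f"{f['query_overlap']:.2f}  {line}")
```

In this toy context only the third line shares a whole word ("config") with the query, so it is the only line with a non-zero overlap. A learned policy would combine features like these into a per-line keep/drop score.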
All benchmarks were conducted on CPU (Apple M1) at a fixed 35% token budget on 8 project seeds. Results may vary by hardware and context size.
| Dataset | Tokens Saved | Oracle Recall | Latency (CPU) |
|---|---|---|---|
| NQ (Natural Questions) | 65% | 100% | 58 ms |
| TriviaQA | 62% | 100% | 55 ms |
| HotpotQA | 58% | 98% | 62 ms |
| SQuAD | 67% | 100% | 52 ms |
| Average (all datasets) | 63% | 99.5% | 57 ms |
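
The "Average" row can be reproduced from the four per-dataset rows with a plain unweighted mean. The short check below assumes that is how the row was computed; the dictionary and the helper name `column_means` are introduced here only for illustration.

```python
# Per-dataset results copied from the table above:
# (tokens saved %, oracle recall %, latency in ms)
rows = {
    "NQ": (65, 100, 58),
    "TriviaQA": (62, 100, 55),
    "HotpotQA": (58, 98, 62),
    "SQuAD": (67, 100, 52),
}


def column_means(rows):
    """Unweighted mean of each column across datasets."""
    columns = list(zip(*rows.values()))
    return [sum(col) / len(col) for col in columns]


mean_saved, mean_recall, mean_latency = column_means(rows)
print(mean_saved, mean_recall, mean_latency)  # 63.0 99.5 56.75
```

The latency mean of 56.75 ms rounds to the 57 ms shown in the table.
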

| Policy | Oracle Recall | Entity Recall | Latency | Model Size |
|---|---|---|---|---|
| FIFO / Truncation | 25% | 73% | ~57 ms | 0 (rule-based) |
| Summarization | 61% | 65% | ~63 ms* | LLM call |
| H2O (Heavy Hitter Oracle) | 98% | 73% | ~56 ms | attention-based |
| SuperCompress | 100% | 73% | ~60 ms | ~5K params |
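
To make the comparison concrete, here is a minimal sketch of how a score-based policy and a truncation baseline might be evaluated at the same 35% token budget. It assumes that "oracle recall" means the fraction of oracle-labelled lines that survive compression; that definition, the greedy selection rule, the truncation rule, and every name and number in the example are assumptions for illustration, not the benchmark's actual procedure or data.

```python
def select_within_budget(scores, token_counts, budget_ratio=0.35):
    """Greedily keep the highest-scoring lines whose tokens fit the budget."""
    budget = budget_ratio * sum(token_counts)
    kept, used = [], 0
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    for i in order:
        if used + token_counts[i] <= budget:
            kept.append(i)
            used += token_counts[i]
    return sorted(kept), used


def truncate_to_budget(token_counts, budget_ratio=0.35):
    """Baseline: keep the most recent lines, stopping at the first that does not fit."""
    budget = budget_ratio * sum(token_counts)
    kept, used = [], 0
    for i in reversed(range(len(token_counts))):
        if used + token_counts[i] > budget:
            break
        kept.append(i)
        used += token_counts[i]
    return sorted(kept), used


def oracle_recall(kept, oracle):
    """Fraction of oracle-relevant line indices that were kept (assumed definition)."""
    oracle = set(oracle)
    return len(oracle & set(kept)) / len(oracle) if oracle else 1.0


# Toy inputs: per-line policy scores, per-line token counts, oracle line indices.
scores = [0.9, 0.1, 0.8, 0.2, 0.05]
token_counts = [10, 30, 20, 25, 15]
oracle = [0, 2]

kept, used = select_within_budget(scores, token_counts)
trunc_kept, trunc_used = truncate_to_budget(token_counts)

print(kept, used, oracle_recall(kept, oracle))                    # [0, 2] 30 1.0
print(trunc_kept, trunc_used, oracle_recall(trunc_kept, oracle))  # [4] 15 0.0
```

In this toy case the relevant lines sit early in the context, so truncation drops both of them while the score-based selection keeps them within the same budget.
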
pip install supercompress