Comparison guide
SuperCompress vs HyDE
HyDE generates a hypothetical ideal document from the query and uses its embedding for retrieval. SuperCompress works on the other end: it compresses retrieved context before generation. They solve different problems and work well together.
How HyDE works
HyDE asks an LLM to generate a hypothetical document that would perfectly answer the query, then uses that document's embedding for retrieval. This bridges the vocabulary gap between queries and documents, improving recall for difficult queries.
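The flow described above can be sketched in a few lines. This is a minimal illustration, not a production setup: `embed` is a toy bag-of-words stand-in for a real embedding model, and `fake_llm` is a hypothetical stub for the LLM call that writes the hypothetical document.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words embedding (assumption: stand-in for a real embedding model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def fake_llm(query):
    """Hypothetical stub for the extra LLM call HyDE makes per query."""
    return "to reset your password open account settings and choose reset password"

def hyde_retrieve(query, docs, generate=fake_llm, k=1):
    # Step 1: ask the LLM for a document that would answer the query.
    hypothetical = generate(query)
    # Step 2: embed the hypothetical document instead of the raw query.
    q_vec = embed(hypothetical)
    # Step 3: rank real documents by similarity to that embedding.
    ranked = sorted(docs, key=lambda d: cosine(q_vec, embed(d)), reverse=True)
    return ranked[:k]

docs = [
    "Open account settings and choose reset password to set a new one.",
    "Our billing cycle starts on the first day of each month.",
]
query = "I forgot my login credentials"
top = hyde_retrieve(query, docs)
```

In this toy example the raw query shares no words with the password document, so a direct lexical match scores zero, while the hypothetical document overlaps with it heavily. That is the vocabulary gap HyDE is meant to bridge.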
The catch: generating the hypothetical document costs an extra LLM call per query. On a 10,000-query/day pipeline, that is 10,000 additional LLM calls a day, each one adding latency and cost.
Where SuperCompress fits
SuperCompress operates after retrieval, not during it. It takes the chunks returned by any retriever (HyDE-enhanced or not) and compresses them against the query before generation. This means:
- With HyDE: Compress the HyDE-retrieved chunks to remove noise from the broader retrieval
- Without HyDE: Compress standard top-K results for a similar quality boost without the extra LLM call
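To make "compresses them against the query" concrete, here is a minimal sketch of query-aware compression applied after retrieval. It is not SuperCompress's actual algorithm or API: `compress_chunks` is a hypothetical helper that keeps only sentences sharing terms with the query, which is about the simplest possible stand-in for the idea.

```python
import re

def compress_chunks(query, chunks, min_overlap=1):
    """Keep only sentences that share at least `min_overlap` terms with the query.

    Assumption: simple term overlap stands in for whatever relevance
    scoring a real compressor uses.
    """
    q_terms = set(re.findall(r"\w+", query.lower()))
    kept = []
    for chunk in chunks:
        # Split each retrieved chunk into sentences on end punctuation.
        for sentence in re.split(r"(?<=[.!?])\s+", chunk.strip()):
            terms = set(re.findall(r"\w+", sentence.lower()))
            if len(q_terms & terms) >= min_overlap:
                kept.append(sentence)
    return " ".join(kept)

# Chunks as returned by any retriever, HyDE-enhanced or not.
chunks = [
    "Refunds are issued within 14 days. Our office is closed on public holidays.",
    "Contact support to request refunds. The company was founded in 2015.",
]
context = compress_chunks("refunds timeline", chunks)
```

The function only sees the query and the retrieved chunks, so it can sit behind any retriever without changes to the retrieval step.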
Cost comparison
| Approach | Extra Cost | Latency | Quality Boost |
|---|---|---|---|
| HyDE | 1 full LLM call | +500-2000ms | Moderate |
| SuperCompress | ~60ms CPU | +60ms | 12-18% |
| HyDE + SuperCompress | 1 LLM call + 60ms | +560-2060ms | Highest |
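As a rough worked example, the per-query latencies in the table can be scaled to the 10,000-query/day pipeline mentioned earlier. The snippet below only multiplies the table's figures. It assumes every query pays the full added latency and says nothing about monetary cost.

```python
QUERIES_PER_DAY = 10_000  # pipeline size used earlier in this guide

# Added latency per query in milliseconds (low, high), taken from the table.
added_latency_ms = {
    "HyDE": (500, 2000),
    "SuperCompress": (60, 60),
    "HyDE + SuperCompress": (560, 2060),
}

def daily_seconds(per_query_ms, queries=QUERIES_PER_DAY):
    """Total added processing time per day, in seconds, as a (low, high) pair."""
    low, high = per_query_ms
    return (low * queries / 1000, high * queries / 1000)

for name, latency_range in added_latency_ms.items():
    low, high = daily_seconds(latency_range)
    print(f"{name}: {low:.0f}-{high:.0f} s/day")
```

Under these assumptions HyDE adds between 5,000 and 20,000 seconds of cumulative processing time per day, against 600 seconds for SuperCompress alone.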
Frequently asked questions
Can SuperCompress replace HyDE?
For cost-sensitive applications, yes. SuperCompress provides a similar quality boost while adding about 60ms per query instead of 500-2000ms, and it needs no extra LLM call.
Do they work together?
Yes. Use HyDE for retrieval and SuperCompress for pre-generation compression.
Try it yourself
Paste your long prompt into the playground, ask a question, and see what SuperCompress keeps and removes. Free, no signup needed.