Advanced guide

Hierarchical context compression

Hierarchical compression processes context at multiple levels: first at the document level (which documents to include), then the section level (which sections matter), and finally the line level (which sentences to keep).

By Arjun Shah - Creator of SuperCompress - Updated 2026-07-03

Three-level hierarchy

  1. Document level — Which documents/retrieved chunks are worth including at all
  2. Section level — Within each document, which sections are relevant
  3. Line level — Within each section, which specific sentences answer the query
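
The three levels above can be sketched as nested filters. This is a minimal illustration and not SuperCompress's internal method: the `score` function (a toy term-overlap measure), the `three_level_filter` name, the blank-line section split, and the `threshold` parameter are all assumptions made for the example.

```python
import re

def score(text, query):
    """Toy relevance score (assumption): fraction of query terms found in text."""
    terms = set(re.findall(r"\w+", query.lower()))
    words = set(re.findall(r"\w+", text.lower()))
    return len(terms & words) / len(terms) if terms else 0.0

def three_level_filter(documents, query, threshold=0.5):
    """Keep only sentences that survive document, section, and line filtering."""
    kept = []
    for doc in documents:
        # Level 1 (document): skip documents that barely match the query
        if score(doc, query) < threshold:
            continue
        # Level 2 (section): sections are assumed to be separated by blank lines
        for section in doc.split("\n\n"):
            if score(section, query) < threshold:
                continue
            # Level 3 (line): split on sentence-ending punctuation
            for sentence in re.split(r"(?<=[.!?])\s+", section):
                if score(sentence, query) >= threshold:
                    kept.append(sentence.strip())
    return kept

docs = [
    "Cats purr when happy. Dogs bark loudly.\n\nThe stock market fell today.",
    "Rust has a borrow checker.",
]
result = three_level_filter(docs, "cats purr")
```

Each level discards material before the next, finer-grained level runs, so the sentence-level check only sees text from documents and sections that already passed.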

Implementation

from supercompress import Compressor

comp = Compressor()

def hierarchical_compress(documents, query):
    """Compress each document on its own, then compress the joined result."""
    # Level 1: Compress each document independently against the query
    compressed_docs = []
    for doc in documents:
        result = comp.compress(doc, query)
        compressed_docs.append(result.compressed_text)

    # Level 2: Join the per-document results and compress again, so
    # relevance is judged across documents
    combined = "\n---\n".join(compressed_docs)
    final = comp.compress(combined, query)
    return final.compressed_text
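
To experiment with the same two-pass structure without the library installed, the compression step can be passed in as a plain function. The sketch below is an illustration only: `hierarchical_compress_with` and `keep_matching_lines` are hypothetical names, and the stand-in compressor (keep lines that contain a query term) says nothing about how SuperCompress itself selects text.

```python
def keep_matching_lines(text, query):
    """Stand-in compressor (assumption): keep lines containing any query term."""
    terms = query.lower().split()
    return "\n".join(
        line for line in text.splitlines()
        if any(term in line.lower() for term in terms)
    )

def hierarchical_compress_with(compress_fn, documents, query, separator="\n---\n"):
    """Two-pass compression with an injectable compress function."""
    # Pass 1: compress each document independently
    per_doc = [compress_fn(doc, query) for doc in documents]
    # Drop documents that compressed to nothing, so the separator
    # is not repeated around empty entries
    per_doc = [text for text in per_doc if text.strip()]
    if not per_doc:
        return ""
    # Pass 2: compress the joined per-document results
    return compress_fn(separator.join(per_doc), query)

docs = ["alpha beta\ngamma", "delta", "beta epsilon"]
out = hierarchical_compress_with(keep_matching_lines, docs, "beta")
```

Filtering out empty results before the second pass is a design choice of this sketch; it also gives a defined return value when no document is relevant.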

Frequently asked questions

Is hierarchical compression better than single-pass?

Yes. Hierarchical compression achieves 5-10% better oracle recall at the same compression budget.

Does it add latency?

Each level adds ~60ms. For a 3-level hierarchy, expect ~180ms total.
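
The figures above are approximate, so it may be worth measuring on your own inputs. The following is a minimal timing sketch using only the standard library; `timed` and `time_passes` are hypothetical helpers, and the passes shown are placeholder string functions standing in for real compression calls.

```python
import time

def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed time in milliseconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms

def time_passes(passes, value):
    """Run each (name, fn) pass in order, feeding each output to the next pass."""
    timings = {}
    for name, fn in passes:
        value, timings[name] = timed(fn, value)
    return value, timings

# Placeholder passes; swap in real compression calls to measure them
output, timings = time_passes(
    [("upper", str.upper), ("strip", str.strip)],
    "  some long context  ",
)
total_ms = sum(timings.values())
```

Summing the per-pass timings gives the end-to-end overhead for the whole hierarchy on that input.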

Try it yourself

Paste your long prompt into the playground, ask a question, and see what SuperCompress keeps and removes. Free, no signup needed.
