Data extraction compression

LLM-based data extraction sends the model both a source document and an extraction schema. Compression removes source content that is irrelevant to the schema while preserving the lines that contain extraction targets.

By Arjun Shah - Creator of SuperCompress - Updated 2026-07-03

Extraction with compression

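from supercompress import Compressor

comp = Compressor()

def extract_data(source_text, schema, llm):
    # Compress the source against the extraction schema so that only
    # content relevant to the target fields is sent to the model.
    result = comp.compress(source_text, schema)
    # `llm` is whatever model client you use; it must expose generate(prompt).
    return llm.generate(
        f"Extract: {schema}\nFrom: {result.compressed_text}"
    )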

Frequently asked questions

Does compression lose extraction targets?

No. Only non-extraction content is removed. Lines containing target fields are preserved.
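One way to check this on your own data is to confirm that values you know are in the source still appear after compression. The sketch below is a minimal illustration and not part of SuperCompress: `missing_targets` is a hypothetical helper, and the sample string stands in for `result.compressed_text` from a real compression call.

```python
def missing_targets(compressed_text, expected_values):
    """Return the expected values that do not appear in the compressed text."""
    lowered = compressed_text.lower()
    return [v for v in expected_values if v.lower() not in lowered]


# Stand-in for result.compressed_text (assumed sample, not real output).
sample_compressed = "Invoice number: INV-1042\nTotal due: 310.50 EUR"

# Values known to be in the original document that should survive compression.
missing = missing_targets(sample_compressed, ["INV-1042", "310.50", "2026-01-15"])
print(missing)  # ['2026-01-15']
```

An empty list means every known target survived. A non-empty list points to fields worth inspecting before trusting the extraction.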

Can I extract from multiple documents?

Yes. Compress each document independently and combine results.
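The per-document approach can be sketched as follows. This is an illustration under stated assumptions: `compress_documents` and `toy_compress` are hypothetical names, and `toy_compress` is a simple keyword filter standing in for a real compressor so the example runs on its own. With SuperCompress you would presumably pass a callable that wraps `comp.compress(text, schema).compressed_text` instead.

```python
def compress_documents(documents, schema, compress):
    """Compress each document on its own, then join the results with labels."""
    parts = []
    for name, text in documents.items():
        compressed = compress(text, schema)
        # Label each chunk so extracted values can be traced to their source.
        parts.append(f"[{name}]\n{compressed}")
    return "\n\n".join(parts)


def toy_compress(text, schema):
    # Hypothetical stand-in: keep only lines that mention a schema field name.
    fields = [f.strip().lower() for f in schema.split(",")]
    kept = [line for line in text.splitlines()
            if any(f in line.lower() for f in fields)]
    return "\n".join(kept)


docs = {
    "a.txt": "Dear customer,\nTotal: 20 USD\nThanks",
    "b.txt": "Total: 35 USD\nDate: 2026-01-02\nRegards",
}
combined = compress_documents(docs, "total, date", toy_compress)
print(combined)
```

Labeling each compressed chunk with its document name keeps the sources distinguishable once they share a single prompt.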

Try it yourself

Paste your long prompt into the playground, ask a question, and see what SuperCompress keeps and removes. Free, no signup needed.
