Invoice extraction benchmark kit

Invoice extraction benchmarking, packaged.

Compare invoice extraction approaches with a local benchmark kit that includes ground truth, an evaluator, reports, and a clear protocol.

What it is

  • packaged invoice extraction benchmark kits for US English invoices
  • built for local evaluation, regression checks, and buyer-safe demos
  • field and line-item extraction benchmark

Inside the benchmark kit

Representative screenshots from the shipped artifact:

  • Minimal / clean: minimal layout example from the current starter release.
  • Dense / clean: denser, table-oriented render from the shipped pack.
  • Light noisy: harder render variant from the same release line.
  • Report preview: core metrics and coverage summary from the shipped report surface.
  • Benchmark card: compact pack summary used in the buyer review path.

How it works

Three steps, local workflow.

1. Run your extractor

Use the shipped invoice PDFs with your model, OCR pipeline, or parser.

2. Score predictions

Run the local evaluator against the shipped labels and inspect the report.

3. Compare changes

Reuse the same benchmark surface when you update prompts, models, or parsing logic.
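
As a rough illustration of steps 2 and 3, here is a minimal Python sketch of field-level scoring and a regression check between two runs. It does not use the kit's own evaluator or label format, which this page does not describe. The field names, the dictionary layout, and the functions `normalize`, `field_accuracy`, and `regressions` are all assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple

# Hypothetical field names; the kit's actual label schema may differ.
FIELDS = ["invoice_number", "invoice_date", "total"]


def normalize(value: Optional[str]) -> str:
    """Lowercase and collapse whitespace so trivial formatting
    differences are not counted as extraction errors."""
    if value is None:
        return ""
    return " ".join(str(value).lower().split())


def field_accuracy(
    labels: Dict[str, dict], predictions: Dict[str, dict]
) -> Dict[str, float]:
    """Per-field exact-match accuracy over all labeled documents.

    A document missing from `predictions` counts as wrong for every
    field that has a non-empty label.
    """
    scores = {}
    for field in FIELDS:
        correct = 0
        for doc_id, truth in labels.items():
            pred = predictions.get(doc_id, {})
            if normalize(pred.get(field)) == normalize(truth.get(field)):
                correct += 1
        scores[field] = correct / len(labels) if labels else 0.0
    return scores


def regressions(
    baseline: Dict[str, float],
    candidate: Dict[str, float],
    tolerance: float = 0.0,
) -> Dict[str, Tuple[float, float]]:
    """Fields whose accuracy dropped by more than `tolerance`,
    mapped to (baseline score, candidate score)."""
    dropped = {}
    for field, old in baseline.items():
        new = candidate.get(field, 0.0)
        if new < old - tolerance:
            dropped[field] = (old, new)
    return dropped


# Made-up example data: two labeled invoices and two extractor runs.
labels = {
    "inv-001": {"invoice_number": "INV-1001", "invoice_date": "2024-01-15", "total": "120.00"},
    "inv-002": {"invoice_number": "INV-1002", "invoice_date": "2024-02-01", "total": "89.50"},
}
run_a = {
    "inv-001": {"invoice_number": "inv-1001", "invoice_date": "2024-01-15", "total": "120.00"},
    "inv-002": {"invoice_number": "INV-1002", "invoice_date": "2024-02-01", "total": "89.50"},
}
run_b = {
    "inv-001": {"invoice_number": "INV-1001", "invoice_date": "2024-01-15", "total": "120.00"},
    "inv-002": {"invoice_number": "INV-1002", "invoice_date": "2024-02-01", "total": "98.50"},
}

baseline = field_accuracy(labels, run_a)
candidate = field_accuracy(labels, run_b)
print("baseline:", baseline)
print("candidate:", candidate)
print("regressions:", regressions(baseline, candidate))
```

Exact match after light normalization is the simplest scoring rule. Line-item scoring or tolerant comparison of amounts and dates would need additional logic that is not shown here.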

Pricing

Three tiers. One workflow.

Free Sample

$0

Inspect the workflow and pack shape. Direct download.

Full

$990

Broader coverage for repeated evaluation and higher-confidence comparison work.

Limitations

What this is not for.

Next step

Inspect the artifact first.

Start with the free sample. Move to Starter when you want the main paid benchmark for real evaluation work.

Need a custom benchmark or dataset? Get in touch. We build benchmarks and datasets for any document type or domain.