This sample shows how Datalab transforms an invoice PDF into structured, machine-readable formats.
The Markdown output provides a clean, human-readable version of the invoice that preserves layout, tables, and hierarchy. It is well suited to displaying parsed documents directly in dashboards, chat interfaces, or internal tools, where you want to read the data as text while keeping it easy to review or share.
The JSON output contains structured fields such as customer_name, invoice_number, payment_due_terms, and total, each mapped to precise regions of the source document. Every extracted value includes citations: coordinates or bounding boxes pointing back to the exact place in the original PDF where the data was found.

By extracting both versions, you get the best of both worlds: Markdown for interpretability and presentation, JSON for programmatic use and integration.
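
As a rough illustration of the programmatic side, the sketch below reads a JSON payload and pairs each extracted value with the regions it cites. The field names come from the text above, but everything else is an assumption made for this example: the "value", "citations", "page", and "bbox" keys, the four-number bounding box layout, and the sample values themselves are hypothetical and may not match the actual Datalab output.

```python
import json

# Hypothetical payload. The field names appear in the text above; the
# "value"/"citations"/"page"/"bbox" structure and all values are made up
# for illustration and are not taken from real Datalab output.
SAMPLE = """
{
  "customer_name": {
    "value": "Acme Corp",
    "citations": [{"page": 0, "bbox": [72, 96, 210, 112]}]
  },
  "invoice_number": {
    "value": "INV-001",
    "citations": [{"page": 0, "bbox": [400, 96, 470, 112]}]
  },
  "total": {"value": "150.00", "citations": []}
}
"""


def fields_with_citations(raw: str) -> dict:
    """Map each field name to (value, [(page, bbox), ...])."""
    parsed = json.loads(raw)
    result = {}
    for name, entry in parsed.items():
        cites = [(c["page"], tuple(c["bbox"])) for c in entry.get("citations", [])]
        result[name] = (entry["value"], cites)
    return result


def uncited(fields: dict) -> list:
    """Return the names of fields that carry no citation, for manual review."""
    return [name for name, (_, cites) in fields.items() if not cites]


fields = fields_with_citations(SAMPLE)
for name, (value, cites) in fields.items():
    print(name, value, cites)
print("needs review:", uncited(fields))
```

Keeping the citation next to each value lets a reviewing tool highlight the cited region on the original page, and a check like uncited() is one simple way to flag values that cannot be traced back to the PDF.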