Marker

Marker is a PDF-to-Markdown converter that recognizes tables, OCRs equations, and re-OCRs PDFs whose embedded text is poor. Marker has 8,000+ stars on GitHub, benchmarks well against similar tools, and is used by hundreds of organizations.

Here's what Marker can do.

Converts tables

Marker identifies tables and converts them to GitHub-flavored Markdown.
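
To illustrate the target format, here is a minimal sketch that renders rows of cells as a GitHub-flavored Markdown table. It is not Marker's implementation; the function name to_gfm_table and the assumption that the first row is the header are hypothetical and used for illustration only.

```python
def to_gfm_table(rows):
    """Render rows (first row = header) as a GitHub-flavored Markdown table.

    Hypothetical helper for illustration; not part of Marker.
    """
    if not rows:
        return ""
    width = max(len(row) for row in rows)

    def fmt(row):
        # Escape pipes and flatten newlines so each cell stays on one table line.
        cells = [str(c).replace("|", "\\|").replace("\n", " ") for c in row]
        # Pad short rows so every line has the same number of columns.
        cells += [""] * (width - len(cells))
        return "| " + " | ".join(cells) + " |"

    header, *body = rows
    separator = "| " + " | ".join(["---"] * width) + " |"
    return "\n".join([fmt(header), separator] + [fmt(row) for row in body])


print(to_gfm_table([["Name", "Qty"], ["Widget", 3]]))
```

The separator row of dashes after the header is what makes a renderer treat the block as a table rather than plain text.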

OCR

Marker automatically OCRs documents that have no embedded text layer.
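
One way to think about the decision to OCR a page is as a check on the text already extracted from it. The sketch below is an assumed heuristic, not Marker's actual logic: the function name needs_ocr and both thresholds are hypothetical. It flags a page when the text is nearly empty or dominated by replacement and control characters.

```python
def needs_ocr(text, min_chars=50, min_clean_ratio=0.8):
    """Guess whether extracted page text is too sparse or garbled to trust.

    Hypothetical heuristic for illustration; thresholds are arbitrary assumptions.
    """
    stripped = text.strip()
    if len(stripped) < min_chars:
        # Little or no embedded text: treat the page as unscanned.
        return True
    # Count whitespace and printable characters, excluding U+FFFD,
    # the replacement character that signals a failed decode.
    clean = sum(
        1
        for ch in stripped
        if ch.isspace() or (ch.isprintable() and ch != "\ufffd")
    )
    return clean / len(stripped) < min_clean_ratio


print(needs_ocr(""))
print(needs_ocr("The quick brown fox jumps over the lazy dog. " * 3))
```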

Equations

Equations will be identified and OCRed automatically.

Images

Images and figures are identified and saved alongside the Markdown output.

Speed

Marker is 4x faster than Nougat and can be parallelized easily.
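
Because each PDF can be converted independently, one plausible way to parallelize is to hand files to a pool of worker processes. The sketch below uses Python's standard concurrent.futures module. The convert_one callable is a placeholder for whatever per-file conversion you use; it and the convert_many name are assumptions, not Marker's API.

```python
from concurrent.futures import ProcessPoolExecutor, as_completed


def convert_many(paths, convert_one, max_workers=4, executor_cls=ProcessPoolExecutor):
    """Run convert_one(path) for every path in parallel.

    convert_one is a hypothetical per-file converter supplied by the caller.
    Returns a dict mapping each path to its result, or to the exception raised.
    """
    results = {}
    with executor_cls(max_workers=max_workers) as pool:
        futures = {pool.submit(convert_one, path): path for path in paths}
        for future in as_completed(futures):
            path = futures[future]
            try:
                results[path] = future.result()
            except Exception as exc:
                # Keep going: one bad PDF should not abort the whole batch.
                results[path] = exc
    return results
```

With the default process pool, convert_one must be a picklable top-level function, and on platforms that spawn workers the call should sit under an `if __name__ == "__main__":` guard.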

Any language

Marker will work with any language.