Introducing our newest model: Chandra

October 30, 2025

Last week, we released Chandra, an OCR model that tops the independent olmOCR benchmark:

[Figure: benchmark comparison, based on olmOCR's benchmark]

Since then, it’s helped read historical letters, trended on Hugging Face, gone live on our API, and been used by thousands of people.

Now that the big push is behind us, we’d like to cover a few things that make Chandra special (beyond just topping benchmarks!). In this post, we’ll take a deeper look at Chandra and cover how to get started using it.

Why we trained a new model

Last year, we released our open source repos Surya and Marker. These tools have been widely adopted, with about 50k combined GitHub stars.

Marker became popular because it is customizable, very fast, and performs well across languages.

However, Marker and Surya took a pipeline-based approach: segment the page into blocks, run inference on each block separately, then stitch the results back together.

This approach is fast, but it can struggle with certain types of documents and pages. When we talked to our customers, we consistently heard that these capabilities were critical:

  • Handwriting
  • Language support
  • Customizability
  • Form and table extraction
  • Image captioning and extraction
  • Math accuracy

While Marker and Surya do a good job across these cases, as we encountered more and more complex documents, it became clear that the pipeline-based approach wasn’t going to solve them.

Think forms with handwriting in them, like this one:

We decided we needed to move to full-page decoding. That led us to survey the existing open source models. Unfortunately, nothing quite hit the combination of features and accuracy we needed. In particular, no model handled layout especially well, which is critical for things like image extraction.

This led us to train a new model with some unique features.

Key Features

Layout awareness

One of the core decisions we made with Chandra was to make it layout-aware. Layout-aware models can figure out where content sits on the page and what that content is. Being able to identify an image, for example, lets you caption it and cut it out of the page.
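
To make layout awareness concrete, here’s a hedged sketch of how you might consume that kind of output. The schema below is hypothetical (the field names are ours, not necessarily Chandra’s), but layout-aware output boils down to a label plus a bounding box per region, which is enough to crop figures out of the page:

```python
from PIL import Image

# Hypothetical layout records: the exact schema Chandra emits may differ,
# but layout-aware output reduces to (label, bounding box) per region.
page = Image.open("page.png")
regions = [
    {"label": "text",    "bbox": (40, 60, 560, 300)},
    {"label": "image",   "bbox": (40, 320, 560, 700)},
    {"label": "caption", "bbox": (40, 710, 560, 750)},
]

# Because each image region is localized, we can cut it out of the page
# and keep it alongside its caption.
for i, region in enumerate(regions):
    if region["label"] == "image":
        page.crop(region["bbox"]).save(f"figure_{i}.png")
```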

Here, you can see how Chandra both extracts and captions the image, which would not be possible without layout:

We can also do this with figures, including extracting structured data. Here, in our playground, you can see that we pull both the image and the structured data out of the table:

Math support

Math support was also a key priority for us. To get it right, we both labeled real-world data and generated synthetic samples. We’ve spent a lot of time ensuring best-in-class math support, which has led to our adoption by several AI labs.

We have strong performance, even on old fonts and handwritten math pages:

This one has a couple of small mistakes (can you spot them?), but we outperform even Gemini Pro here:

Tables and forms

Tables are the bane of every document parsing tool. Our old models handled them by splitting them into cells and decoding each cell separately. This was fast, but it missed detail on complex tables, including ones with text written across cell boundaries.

We set out to fix this with Chandra (fun fact: this page was thrown into a lake, recovered, and dried out):

Forms are handled similarly, by decoding the full page:

As you can see, Chandra catches and includes the checkbox; this was something we spent time getting right.

Performance

Benchmarks

Here are slightly more detailed benchmarks (as always, from independent third-party sources):

Throughput

For many of our customers, latency and throughput are critical. We have quantized 8B and 2B versions of the model that you can run on-prem; reach out to us at [email protected] to get access. You can get up to 4 pages per second on an H100, or roughly 345,000 pages per day, with minimal accuracy degradation.
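
For context, the daily figure is just the per-second rate scaled out: 4 pages/second × 86,400 seconds/day is 345,600 pages, which rounds down to about 345,000.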

Try Chandra

Self-serve

One of our founding principles has always been to make high-quality models accessible to everyone. Here’s how you can get started:

  • Try Chandra through our open source releases on Hugging Face or GitHub (see the sketch after this list).
  • You can demo it in our free playground environment.
  • If you want a hosted version via API, sign up for an account here. You’ll get $5 of free credits to try the API out.
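
If you’re starting from the open source weights, here’s a minimal sketch of what inference can look like. It assumes the checkpoint loads through the standard transformers vision-language interfaces and that the Hugging Face model id is datalab-to/chandra; the repo may ship its own inference wrapper and prompt format, so treat this as a starting point and check the README.

```python
from PIL import Image
from transformers import AutoProcessor, AutoModelForVision2Seq

MODEL_ID = "datalab-to/chandra"  # assumed model id; verify on Hugging Face

processor = AutoProcessor.from_pretrained(MODEL_ID)
model = AutoModelForVision2Seq.from_pretrained(
    MODEL_ID, device_map="auto", torch_dtype="auto"
)

# One scanned page in, markdown out; the prompt wording is an assumption.
image = Image.open("page.png")
inputs = processor(
    images=image, text="Convert this page to markdown.", return_tensors="pt"
).to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=2048)
print(processor.batch_decode(output_ids, skip_special_tokens=True)[0])
```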

If you’re over the revenue limit for the open source version, or you want deeper support, we can help you get started.  Reach out to us at [email protected] for more information.

As a bonus, we can help you evaluate your documents against other tools, identify issues, and pick the best OCR tool for your use case.

What’s next

We’re training new versions of Chandra now. In particular, we’re working on better support for low-resource languages, lower latency, and further improvements to math.

If you have any suggestions or feedback, please email [email protected], or find me on Twitter!
