Launch Week - Day 1: Chandra 1.1

December 1, 2025

We recently released Chandra, a SoTA OCR model. Over the last few weeks, we’ve been working closely with our customers to identify issues and make improvements. The result is Chandra 1.1, now available through our API, which is a significant improvement on the 1.0 model.

In this post, we’ll cover some of those improvements, and show examples.

If you want to try Chandra 1.1, you can check out our public playground or sign up for our API. We’ll open source future versions of Chandra once we’re ready for a major version bump.

Layout

Chandra 1.0 was good at layout, but it made mistakes, especially on long documents like newspapers.  Chandra 1.1 has been optimized for these edge cases, and can reliably identify layouts on even very complex documents.

Math

Chandra 1.0 has strong math performance, but Chandra 1.1 makes significant improvements, especially on scientific papers. We measured this on the math components of the olmocr benchmark (arxiv math and old scans math, both of which draw heavily on scientific papers).

Tables

We’ve improved how we parse long, complex tables.  We can now extract full results from tables with several hundred cells, including cells with handwriting.

Multilingual performance

We’ve made significant improvements to performance on Arabic, Indic, and other languages.  We now fully support all 80+ languages that surya does.

We also lead KITAB-Bench, an Arabic text benchmark, by a wide margin. Note that many of the other models in this comparison are outdated, since we pulled their results directly from the KITAB-Bench repo.

Next steps

We’re continuing to make updates to Chandra, and will ship 1.2 soon. If you have any specific edge cases we can help with, email us at [email protected].
