Product Updates
December 1, 2025
We recently released Chandra, a SoTA OCR model. Over the last few weeks, we’ve worked closely with our customers to identify issues and make improvements. Chandra 1.1 is now available in our API, and it is a significant improvement over the 1.0 model.
In this post, we’ll cover some of those improvements and show examples.
If you want to try Chandra 1.1, you can check out our public playground or sign up for our API. We’ll open source future versions of Chandra once we’re ready for a major version bump.
Chandra 1.0 was good at layout, but it made mistakes, especially on long documents like newspapers. Chandra 1.1 has been optimized for these edge cases, and can reliably identify layouts on even very complex documents.

Chandra 1.0 had strong math performance, but Chandra 1.1 makes significant improvements, especially on scientific papers. We measured this with the math components of the olmocr benchmark (arxiv math and old scans math, both of which draw heavily on scientific papers).

We’ve improved how we parse long, complex tables. We can now extract full results from tables with several hundred cells, including cells with handwriting.


We’ve made significant improvements to performance on Arabic, Indic, and other languages. We now fully support all 80+ languages that surya does.
We also lead KITAB-Bench, an Arabic text benchmark, by a wide margin. Note that many of the other models in this comparison are outdated, since we pulled their results directly from the KITAB-Bench repo.

We’re continuing to make updates to Chandra, and will ship 1.2 soon. If you have any specific edge cases we can help with, email us at [email protected].