DeepSeek OCR Artificial Intelligence model processes 200,000 pages a day on one Nvidia A100

DeepSeek introduced an open-source OCR context compression model that converts long documents into compact visual tokens for faster model training. The system processes about 200,000 pages per day on a single Nvidia A100 while maintaining up to 97 percent recognition precision at sub-10x compression.

As compute costs surge across Artificial Intelligence data centers, DeepSeek is leaning into model efficiency with a newly announced, open-source OCR context compression system. The DeepSeek-OCR approach uses optical mapping to convert lengthy text documents into images, achieving 97 percent recognition precision at compression ratios below 10x. By pairing advanced encoder and decoder components, the system can convert more than nine text tokens into a single visual token, sharply cutting the number of tokens that downstream models must process and, in turn, the compute required for training and inference.
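The arithmetic behind that claim is straightforward. The sketch below, using illustrative page and token counts of our own choosing rather than DeepSeek's figures, shows how folding roughly ten text tokens into one vision token shrinks the sequence a downstream model must attend over, and why the compute saving is even larger than the raw token ratio.

```python
# Back-of-the-envelope sketch of the token savings described above.
# The 10x ratio and page sizes are illustrative assumptions, not DeepSeek's exact figures.

def token_budget(pages: int, text_tokens_per_page: int, compression_ratio: float) -> dict:
    """Compare text-token and vision-token counts for a document batch."""
    text_tokens = pages * text_tokens_per_page
    vision_tokens = int(text_tokens / compression_ratio)
    return {
        "text_tokens": text_tokens,
        "vision_tokens": vision_tokens,
        # Self-attention cost grows roughly with the square of sequence length,
        # so the effective compute saving exceeds the raw token ratio.
        "approx_attention_speedup": (text_tokens / vision_tokens) ** 2,
    }

if __name__ == "__main__":
    print(token_budget(pages=1_000, text_tokens_per_page=1_500, compression_ratio=10.0))
```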

The efficiency gains translate into notable throughput on widely deployed accelerators. DeepSeek reports that a single Nvidia A100 can process roughly 200,000 document pages per day, while a 20-node A100 cluster can handle about 33 million pages daily. Even at a 20x compression ratio, the system maintains 60 percent optical recognition accuracy. On the OmniDocBench benchmark, DeepSeek-OCR outperforms established alternatives such as GOT-OCR2.0 and MinerU2.0 while using fewer vision tokens per page, underscoring its token efficiency. The company positions this work as part of its broader push to deliver open-source models with lower training costs than offerings like OpenAI’s ChatGPT or Google’s Gemini.
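Those throughput figures line up with simple scaling arithmetic. The snippet below assumes a typical eight-GPU A100 node, a detail not stated in the announcement, and multiplies out the reported per-GPU rate.

```python
# Rough throughput arithmetic behind the reported numbers. The per-GPU rate comes
# from the article; the 8-GPUs-per-node figure is an assumption (a common A100 server layout).

PAGES_PER_GPU_PER_DAY = 200_000   # reported single-A100 throughput
GPUS_PER_NODE = 8                 # assumed A100 node configuration
NODES = 20

cluster_pages_per_day = PAGES_PER_GPU_PER_DAY * GPUS_PER_NODE * NODES
print(f"{cluster_pages_per_day:,} pages/day")   # 32,000,000 - close to the ~33 million reported
```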

Under the hood, DeepEncoder algorithms allow the system to adapt to diverse document sizes and resolutions without sacrificing speed or accuracy. The decoder, named DeepSeek3B-MoE-A570M, employs a mixture-of-experts architecture that distributes knowledge across specialized components for different OCR subtasks. This setup enables parsing of complex, multilingual documents that include graphs, scientific formulas, diagrams and images. To reach its current accuracy and scale, DeepSeek trained on 30 million PDF pages spanning nearly 100 languages and covering categories from newspapers and scientific handwriting to textbooks and PhD dissertations. While the gains in visual tokenization speed and efficiency are clear, it remains uncertain how much these improvements will translate into better reasoning performance compared with today’s text-token paradigms.
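For readers unfamiliar with the pattern, the minimal sketch below shows the general shape of top-k mixture-of-experts routing that such a decoder relies on: a router scores each token, only the best few experts run on it, and their outputs are blended. The expert count, dimensions and top-k value are placeholders and do not reflect DeepSeek3B-MoE-A570M's actual configuration.

```python
# Generic top-k mixture-of-experts layer; all sizes here are illustrative placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, d_model: int = 512, n_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)   # scores each expert per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:   # x: (tokens, d_model)
        scores = self.router(x)                            # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)     # keep the k best experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e in range(len(self.experts)):
                mask = idx[:, slot] == e                   # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * self.experts[e](x[mask])
        return out

tokens = torch.randn(16, 512)
print(TopKMoE()(tokens).shape)   # torch.Size([16, 512])
```

Because only the selected experts run for each token, total parameter count can grow (specializing experts for tables, formulas, or particular scripts) while per-token compute stays close to that of a much smaller dense model.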

