DeepSeek explores image tokens to improve Artificial Intelligence memory

DeepSeek released an optical character recognition model that packs text into images to store more context with fewer tokens, a technique the company says could improve Artificial Intelligence memory and efficiency.

DeepSeek released an optical character recognition model last week that extracts text from images and turns it into machine-readable words. The paper and early reviews report that the model performs on par with top systems on key benchmarks for optical character recognition. DeepSeek presents the model primarily as a test bed for a different approach to storing and retrieving information inside Artificial Intelligence systems.

The paper’s central innovation is replacing large sets of text tokens with image-based tokens and a tiered compression scheme. Instead of representing words as thousands of discrete tokens, the system packs written information into image form, similar to photographing pages of a book, and stores older or less critical content in progressively blurrier representations. The company argues this lets models retain nearly the same information using far fewer tokens, which could cut the computing resources required to run long conversations and reduce the carbon footprint associated with Artificial Intelligence memory. The paper frames the method as a possible remedy for so-called context rot, where long interactions cause models to forget or muddle earlier information.
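To make the token arithmetic concrete, here is a minimal sketch of a tiered budget, in which older pages receive fewer image tokens. It is an illustration under stated assumptions, not DeepSeek's implementation: the `Page` class, the `vision_token_budget` rule, and every number in it (256 tokens for a fresh page, halving every four turns, a floor of 16, 1,000 text tokens per page) are hypothetical values chosen only to show how such a scheme might be costed.

```python
from dataclasses import dataclass


@dataclass
class Page:
    age: int          # how many turns ago the page entered the context
    text_tokens: int  # cost if the page were kept as plain text tokens


def vision_token_budget(age: int, full_budget: int = 256,
                        halve_every: int = 4, floor: int = 16) -> int:
    """Hypothetical tier rule: halve the image-token budget every
    `halve_every` turns, but never drop below `floor` tokens."""
    tier = age // halve_every
    return max(floor, full_budget >> tier)


def context_cost(pages):
    """Return (text-token cost, image-token cost) for the same pages."""
    text_cost = sum(p.text_tokens for p in pages)
    image_cost = sum(vision_token_budget(p.age) for p in pages)
    return text_cost, image_cost


# Twelve pages of history, each assumed to cost 1,000 text tokens.
pages = [Page(age=a, text_tokens=1000) for a in range(12)]
text_cost, image_cost = context_cost(pages)
print(f"text tokens: {text_cost}, image tokens: {image_cost}")
```

In this toy setup the oldest pages are stored at the coarsest tier, which mirrors the "progressively blurrier" idea described above. How much information survives at each tier is the empirical question the paper addresses, and the sketch does not model it.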

Researchers have noted the approach’s promise while cautioning that it remains an early exploration. Andrej Karpathy praised the paper on X, suggesting that images may be better than text as inputs for large language models and criticizing text tokens as wasteful. Manling Li of Northwestern University called the work a new framework for addressing memory challenges, and Zihan Wang highlighted its potential to help continuous conversations remember more effectively. DeepSeek also reports that the system can generate more than 200,000 pages of training data a day on a single GPU. Based in Hangzhou, DeepSeek surprised the industry with DeepSeek-R1 earlier this year. The new paper continues its push into low-resource research directions, while noting that more work is needed to make memory recall more dynamic and importance-aware.


