Adobe Research Empowers Video World Models with State-Space Memory

Adobe Research, in collaboration with Stanford and Princeton, pioneers long-term memory solutions for video world models, boosting Artificial Intelligence scene reasoning and planning.

Researchers from Adobe, Stanford, and Princeton have introduced a novel approach to overcoming the bottleneck of long-term memory in video world models, a core challenge hindering Artificial Intelligence agents' ability to reason and plan in dynamic environments. While previous video diffusion models achieved high-quality frame prediction, their computationally expensive attention mechanisms limited how much of a sequence they could remember, which severely restricted their practical use in complex, real-world tasks.

The proposed solution, detailed in their paper 'Long-Context State-Space Video World Models,' centers on incorporating State-Space Models (SSMs) in a block-wise fashion. By breaking video sequences into manageable blocks and maintaining a compressed state across these blocks, the Long-Context State-Space Video World Model (LSSVWM) significantly extends the model's temporal memory without suffering from the quadratic scaling that plagues attention-based architectures. To retain spatial consistency within and across these blocks, the architecture pairs the block-wise SSM with dense local attention, so that local fidelity and scene coherence are preserved throughout extended generations.
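
The idea of carrying a compressed state from block to block can be illustrated with a toy example. The sketch below is not the paper's architecture: it assumes a scalar linear SSM with hypothetical parameters `a`, `b`, and `c`, and the function names `ssm_scan` and `blockwise_scan` are invented for illustration. It only shows that processing a sequence block by block, while handing the final state of one block to the next, gives the same outputs as one long scan while keeping a fixed-size state.

```python
from typing import List, Tuple


def ssm_scan(xs: List[float], h0: float, a: float, b: float, c: float
             ) -> Tuple[List[float], float]:
    """Run a scalar linear SSM over xs, starting from state h0.

    Recurrence (illustrative assumption, not the paper's parameterization):
        h_t = a * h_{t-1} + b * x_t
        y_t = c * h_t
    Returns the outputs and the final state.
    """
    h = h0
    ys = []
    for x in xs:
        h = a * h + b * x
        ys.append(c * h)
    return ys, h


def blockwise_scan(xs: List[float], block_size: int,
                   a: float = 0.9, b: float = 0.1, c: float = 1.0
                   ) -> Tuple[List[float], float]:
    """Process xs in fixed-size blocks, carrying the state across blocks."""
    h = 0.0  # compressed memory handed from one block to the next
    ys: List[float] = []
    for start in range(0, len(xs), block_size):
        block = xs[start:start + block_size]
        block_ys, h = ssm_scan(block, h, a, b, c)
        ys.extend(block_ys)
    return ys, h


if __name__ == "__main__":
    frames = [float(i % 5) for i in range(23)]
    full, _ = ssm_scan(frames, 0.0, 0.9, 0.1, 1.0)
    blocked, _ = blockwise_scan(frames, block_size=4)
    # The block-wise result matches the single long scan.
    print(all(abs(p - q) < 1e-12 for p, q in zip(full, blocked)))
```

The cost of each step depends only on the size of the state, not on how many frames came before, which is the property that lets an SSM avoid the quadratic cost of attending over the full history.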

To further enhance performance, the research introduces two training strategies: diffusion forcing and frame local attention. Diffusion forcing encourages the model to preserve sequence consistency even from sparse initial contexts, while frame local attention leverages the FlexAttention technique for efficient chunked frame processing and faster training. The model was evaluated on Memory Maze and Minecraft, two environments designed to challenge long-term recall and reasoning capabilities. Experimental results show that LSSVWM substantially outperforms existing baselines, producing coherent, accurate predictions over long horizons without sacrificing inference speed. These results position the architecture as a promising foundation for interactive Artificial Intelligence video planning systems and dynamic scene understanding.
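
Both training strategies can be sketched in a few lines. Everything here is an assumption made for illustration rather than the authors' implementation: the function names, the choice to let each frame attend to its own chunk and the chunk before it, and the uniform sampling of noise levels. The first function reflects the general idea behind diffusion forcing, in which each frame receives its own independently sampled noise level. The second is a mask predicate of the kind that FlexAttention in PyTorch accepts as a `mask_mod` (there with the signature `(b, h, q_idx, kv_idx)`), written in plain Python so it runs without any dependencies.

```python
import random
from typing import List


def sample_frame_noise_levels(num_frames: int, num_steps: int,
                              rng: random.Random) -> List[int]:
    """Draw an independent diffusion timestep for every frame.

    Assumption: uniform sampling over [0, num_steps). Per-frame noise
    levels let some frames act as nearly clean context while others
    are heavily noised during training.
    """
    return [rng.randrange(num_steps) for _ in range(num_frames)]


def allowed(q_idx: int, kv_idx: int,
            tokens_per_frame: int, chunk_size: int) -> bool:
    """Return True if query token q_idx may attend to key token kv_idx.

    Assumption: frames are grouped into chunks of chunk_size frames, and
    a token sees every token in its own chunk and in the previous chunk.
    """
    q_chunk = (q_idx // tokens_per_frame) // chunk_size
    kv_chunk = (kv_idx // tokens_per_frame) // chunk_size
    return kv_chunk in (q_chunk - 1, q_chunk)


def frame_local_mask(num_frames: int, tokens_per_frame: int,
                     chunk_size: int) -> List[List[bool]]:
    """Materialize the full boolean mask (rows are queries, columns are keys)."""
    n = num_frames * tokens_per_frame
    return [[allowed(q, k, tokens_per_frame, chunk_size) for k in range(n)]
            for q in range(n)]


if __name__ == "__main__":
    mask = frame_local_mask(num_frames=6, tokens_per_frame=2, chunk_size=2)
    for row in mask:
        print("".join("#" if m else "." for m in row))
    print(sample_frame_noise_levels(6, 1000, random.Random(0)))
```

Because each query sees at most two chunks of keys, the number of attended positions per token stays constant as the video grows, which is where the training speedup from chunked attention would come from under these assumptions.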
