Companies ranging from the start-up launched by artificial intelligence “godmother” Li Fei-Fei to China’s largest technology firms are moving quickly to release new world models. The field aims to extend AI beyond language processing to learning from and comprehending physical reality, with virtual world generation and interaction emerging as a key area of competition.
Alibaba Group Holding unveiled Happy Oyster on Thursday, describing it as an open-ended world model for real-time and “flowy” virtual world creation and interaction. The launch came from Alibaba Token Hub, or ATH, a newly formed business unit created to consolidate the company’s core AI initiatives. Happy Oyster supported two modes of virtual world creation, according to ATH: a directing mode for building a world from text and image prompts, and a wandering mode for exploring that world.
Unlike conventional AI video tools, which generate one-off clips that top out at anywhere from a dozen seconds to a few minutes, Happy Oyster could generate video clips of up to three minutes showing virtual worlds, the company said. The model could also continuously respond to instructions throughout the generation process, in contrast to the conventional one-shot approach, it added. This meant users could keep developing their imagined worlds with new ideas during generation, with example prompts including “black crows fly past” and commands for characters to “talk to each other”.
The release followed a move by World Labs, the San Francisco-based company co-founded by Li and launched in early 2024. A day earlier, World Labs unveiled Spark 2.0, an open-source 3D Gaussian splatting rendering engine intended to let less powerful devices such as smartphones view large-scale and detailed 3D images. The back-to-back announcements underscored a fast-moving race to define how world models are built, rendered and used across devices.
