Magma: Foundation Model for Multimodal AI Agents

Explore how Magma enables AI systems to navigate both digital and physical tasks, representing a significant leap for Artificial Intelligence.

Microsoft Research has unveiled Magma, a foundation model designed to let artificial intelligence agents operate across both digital and physical environments. Magma integrates vision, language, and action (VLA) capabilities in a single model, allowing AI systems to understand and interact with user interfaces and physical objects alike. It can suggest actions such as button clicks and orchestrate robotic tasks, which could change how AI assistants function in diverse settings.

Magma is built on a large and diverse pretraining dataset, which sets it apart from previous models that were task-specific. Its innovation lies in its capacity to generalize across environments: it outperforms its predecessors on tasks such as user interface navigation and robotic manipulation. One of its standout features is the use of Set-of-Mark (SoM) and Trace-of-Mark (ToM) annotations, which give the model a structured understanding of environments and tasks and enhance its ability to plan and execute actions.
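To make the idea behind these annotations concrete, here is a minimal sketch in Python. It is not Magma's code or data format. It assumes that a Set-of-Mark annotation amounts to numbered labels on candidate regions, so that an action can be expressed as "pick mark N", and that a Trace-of-Mark annotation amounts to following one mark's position across successive frames. All names (`Mark`, `assign_marks`, `resolve_click`, `trace_of_mark`) are hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Mark:
    """A numbered label attached to one candidate region (hypothetical type)."""
    mark_id: int
    box: tuple[int, int, int, int]  # (left, top, right, bottom) in pixels

    def center(self) -> tuple[int, int]:
        left, top, right, bottom = self.box
        return ((left + right) // 2, (top + bottom) // 2)


def assign_marks(boxes: list[tuple[int, int, int, int]]) -> dict[int, Mark]:
    """Set-of-Mark style: number each candidate region, starting at 1."""
    return {i: Mark(i, box) for i, box in enumerate(boxes, start=1)}


def resolve_click(marks: dict[int, Mark], chosen_id: int) -> tuple[int, int]:
    """Turn a chosen mark number into a concrete click coordinate."""
    return marks[chosen_id].center()


def trace_of_mark(frames: list[dict[int, Mark]], mark_id: int) -> list[tuple[int, int]]:
    """Trace-of-Mark style: follow one mark's center across frames.

    Frames in which the mark is missing are skipped.
    """
    return [frame[mark_id].center() for frame in frames if mark_id in frame]


if __name__ == "__main__":
    # Two buttons on a screenshot, given as bounding boxes (made-up values).
    marks = assign_marks([(0, 0, 100, 40), (120, 0, 220, 40)])
    print(resolve_click(marks, 2))  # (170, 20)
```

In this sketch the model would only need to name a mark number rather than raw pixel coordinates, which is one plausible reading of how such annotations give structure to an action space.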

Magma’s introduction is part of a larger strategy by Microsoft Research to enhance the capabilities of agentic AI systems, with potential applications in both developer tools and everyday AI assistants. By enabling AI to reason, explore, and take actions effectively, Magma could pave the way for more capable and robust AI systems in the future. It is currently available for researchers and developers on Azure AI Foundry Labs and Hugging Face, inviting experimentation with this cutting-edge technology.


