Nvidia Nemotron 3 open models target specialized artificial intelligence agents

Nvidia’s Nemotron 3 family is a fully open stack of large language, vision, speech, retrieval, and safety models with open weights, data, and recipes, aimed at building high-throughput, reasoning-focused artificial intelligence agents across edge, cloud, and data center deployments.

Nvidia Nemotron is a family of open models with open weights, training data, and detailed recipes designed to help developers build specialized artificial intelligence agents with high efficiency and accuracy. The models are transparent: weights and datasets are available on Hugging Face, and technical reports document how to recreate the systems end to end. The latest Nemotron 3 generation pairs a hybrid Mamba-Transformer mixture-of-experts architecture with a 1M-token context window to support complex, high-throughput agentic applications. The models can be deployed with open frameworks such as vLLM, SGLang, Ollama, and llama.cpp on Nvidia GPUs across edge, cloud, and data center environments, or consumed as Nvidia NIM microservice endpoints.
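
As a sketch of what local deployment looks like, the snippet below serves a Nemotron checkpoint with vLLM's Python API. The model identifier is a placeholder, not a confirmed Nemotron 3 name; the exact checkpoint IDs should be taken from Nvidia's Hugging Face collection.

```python
# Sketch: serving a Nemotron checkpoint locally with vLLM.
# The model ID below is a placeholder; substitute the actual
# Nemotron 3 checkpoint name published on Hugging Face.
from vllm import LLM, SamplingParams

llm = LLM(model="nvidia/Nemotron-3-Nano")  # hypothetical model ID
params = SamplingParams(temperature=0.6, max_tokens=256)

# Generate a completion for a single agentic-style prompt.
outputs = llm.generate(
    ["Outline a plan for an agent that triages IT support tickets."],
    params,
)
print(outputs[0].outputs[0].text)
```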

The Nemotron 3 lineup is tuned for different reasoning workloads. Nano prioritizes cost efficiency and high accuracy for targeted tasks; Super targets high-accuracy multi-agent reasoning and deep research; and Ultra is built for the highest accuracy in multi-agent enterprise workflows such as customer service automation, supply chain management, and IT security. Additional variants extend Nemotron beyond text: Nemotron Nano VL handles document intelligence and video understanding; Nemotron RAG models cover extraction, embedding, reranking, and multimodal document intelligence, leading benchmarks such as ViDoRe V1, ViDoRe V2, MTEB, and MMTEB; Nemotron Safety models provide jailbreak detection, multilingual content moderation, privacy protection, and topic control; and Nemotron Speech models are optimized for high-throughput, ultra-low-latency automatic speech recognition, text-to-speech, and neural machine translation in agentic applications. These offerings are accessible through Nvidia NIM APIs and third-party inference providers such as Baseten, DeepInfra, Fireworks AI, FriendliAI, and Together AI, allowing teams to scale without managing their own infrastructure.
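
Nvidia's hosted NIM endpoints expose an OpenAI-compatible API, so a hosted Nemotron model can be queried with the standard openai client, as in the sketch below. The model string is illustrative rather than a confirmed Nemotron 3 identifier, and the example assumes an API key from build.nvidia.com.

```python
# Sketch: calling a hosted Nemotron NIM endpoint through its
# OpenAI-compatible API. Requires an API key from build.nvidia.com;
# the model string is illustrative, not a confirmed identifier.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",
    api_key=os.environ["NVIDIA_API_KEY"],
)

response = client.chat.completions.create(
    model="nvidia/nemotron-3-super",  # placeholder model ID
    messages=[
        {"role": "user", "content": "Draft a step-by-step research plan for a supply chain audit."}
    ],
    temperature=0.6,
    max_tokens=512,
)
print(response.choices[0].message.content)
```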

Nvidia pairs the models with one of the broadest commercially usable open collections of synthetic data for agentic artificial intelligence: more than 10 trillion language tokens and 18 million supervised fine-tuning samples spanning pre- and post-training, personas, safety, reinforcement learning, and retrieval-augmented generation datasets. The portfolio covers multilingual reasoning, coding, and safety corpora; fully synthetic personas aligned with real-world demographic and cultural distributions to support sovereign artificial intelligence efforts in regions such as the United States, Japan, and India; high-quality visual question answering and optical character recognition annotations for vision-language models; and curated safety and reinforcement learning data for moderation, threat awareness, and tool-using agents. Developer tools round out the ecosystem, including Nvidia NeMo for lifecycle management, TensorRT-LLM for real-time optimized inference, and cookbooks, notebooks, workshops, and learning paths for building report generators, retrieval-augmented generation systems, and bash computer-use agents. Nvidia emphasizes trustworthy artificial intelligence as a shared responsibility, provides system and model cards plus safety documentation, and notes a collaboration with Google DeepMind to watermark generated videos from its API catalog.
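
Since the datasets are published on Hugging Face, they can be inspected with the datasets library before committing to a full download. The dataset ID below is a placeholder for whichever Nemotron corpus is of interest.

```python
# Sketch: streaming one of the open Nemotron training datasets from
# Hugging Face without downloading it in full. The dataset ID is a
# placeholder; browse Nvidia's Hugging Face organization for exact names.
from datasets import load_dataset

ds = load_dataset(
    "nvidia/Nemotron-Post-Training-Dataset",  # hypothetical dataset ID
    split="train",
    streaming=True,
)

sample = next(iter(ds))  # pull a single supervised fine-tuning record
print(sample.keys())
```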

Siemens debuts Digital Twin Composer for industrial metaverse deployments

Siemens has introduced Digital Twin Composer, a software tool that builds industrial metaverse environments at scale by merging comprehensive digital twins with real-time physical data, enabling faster virtual decision making. Early deployments with PepsiCo report higher throughput, shorter design cycles, and reduced capital expenditure through physics-accurate simulations and artificial intelligence-driven optimization.

Cadence builds chiplet partner ecosystem for physical artificial intelligence and data center designs

Cadence has introduced a Chiplet Spec-to-Packaged Parts ecosystem aimed at simplifying chiplet design for physical artificial intelligence, data center, and high-performance computing workloads, backed by a roster of intellectual property and foundry partners. The program centers on a physical artificial intelligence chiplet platform and framework that integrates prevalidated components to cut risk and speed commercial deployment.

Patch notes detail split compute and IO tiles in Intel Diamond Rapids Xeon 7

Linux kernel patch notes reveal that Intel’s upcoming Diamond Rapids Xeon 7 server processors separate compute and IO tiles and add new performance monitoring capabilities along with PCIe 6.0 support. The changes point to a more modular architecture and a streamlined product stack focused on 16-channel memory configurations.
