Nvidia pushes CUDA Tile for tensor-native programming on Blackwell and future GPUs

Nvidia's CUDA 13.1 release introduces CUDA Tile, a tile-centric programming model that aligns GPU software with the tensor-focused hardware in Blackwell-class processors and beyond.

Nvidia has introduced CUDA Tile as part of the CUDA 13.1 release. It is a major shift in the company's GPU programming stack, moving beyond the traditional single instruction, multiple thread (SIMT) execution model toward a tensor-native approach optimized for Blackwell-class processors and future architectures. Instead of manually managing threads, warps, and low-level scheduling, developers describe work as operations on structured data tiles, such as submatrices. The compiler and runtime then map these tile operations to tensor cores, tensor memory accelerators, and the GPU memory hierarchy. Nvidia positions the change as foundational for upcoming platforms like Rubin and Feynman, and as a better match for modern workloads that rely heavily on dense tensor math rather than scalar operations.
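To make the idea of "operations on structured data tiles" concrete, the following is a minimal conceptual sketch in plain Python. It is not cuTile, CUDA Tile IR, or any Nvidia API; the names `TILE`, `load_tile`, `tile_matmul_accumulate`, and `tiled_matmul` are hypothetical and exist only for illustration. The sketch assumes matrix dimensions that are exact multiples of the tile size.

```python
# Conceptual illustration only: plain Python, not cuTile or any Nvidia API.
# TILE and every function name below are hypothetical.

TILE = 2  # tile edge length; real tile shapes would be chosen for the hardware


def load_tile(m, row, col, size=TILE):
    """Copy the size x size submatrix of m whose top-left corner is (row, col)."""
    return [r[col:col + size] for r in m[row:row + size]]


def tile_matmul_accumulate(acc, a_tile, b_tile):
    """Compute acc += a_tile @ b_tile, the single operation expressed per tile."""
    for i in range(len(a_tile)):
        for j in range(len(b_tile[0])):
            acc[i][j] += sum(a_tile[i][k] * b_tile[k][j] for k in range(len(b_tile)))


def tiled_matmul(a, b, size=TILE):
    """Multiply a (n x k) by b (k x m); all dimensions must be multiples of size."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0] * m for _ in range(n)]
    for ti in range(0, n, size):          # each (ti, tj) output tile is independent
        for tj in range(0, m, size):
            acc = [[0] * size for _ in range(size)]
            for tk in range(0, k, size):  # walk the shared dimension tile by tile
                tile_matmul_accumulate(
                    acc, load_tile(a, ti, tk, size), load_tile(b, tk, tj, size)
                )
            for i in range(size):         # store the finished output tile
                c[ti + i][tj:tj + size] = acc[i]
    return c
```

The point of the sketch is the shape of the program: the author writes "load tile, multiply-accumulate tile, store tile" and says nothing about which thread touches which element. In a tile model, that per-element mapping is what the compiler and runtime are expected to decide.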

The company contrasts the original CUDA model with the new tile-centric abstraction. In the original model, programmers decompose problems into threads and blocks, choose grid and block dimensions, and carefully handle synchronization and memory access patterns. The tile abstraction hides execution order and hardware details. From Turing, where tensor units acted as assisting units executing warp-issued matrix instructions, to Blackwell, where tensor engines became the primary compute engines in tile-native execution pipelines with autonomous memory engines, Nvidia has repeatedly reworked scheduling and data movement, which has made low-level warp and thread tuning increasingly impractical. By elevating CUDA to describe intent at the tile level, Nvidia aims to provide a more uniform abstraction that lets performance tuning carry over across multiple generations without exposing device-level variability, while still allowing developers to fall back to SIMT-based NVVM/LLVM and PTX paths when needed.
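For contrast, the bookkeeping that the original model asks of the programmer can be sketched as follows. This is a plain-Python simulation of SIMT-style indexing, not Nvidia code: the functions `grid_size` and `simt_vector_add` are hypothetical names, and the nested loops stand in for blocks and threads that a GPU would run concurrently. The index formula, block index times block dimension plus thread index, and the bounds guard follow the usual one-dimensional CUDA kernel pattern.

```python
# Plain-Python simulation of SIMT-style indexing; not Nvidia code.
# grid_size and simt_vector_add are hypothetical names for illustration.


def grid_size(n, block_dim):
    """Number of blocks needed to cover n elements (ceiling division)."""
    return (n + block_dim - 1) // block_dim


def simt_vector_add(x, y, block_dim=4):
    """Add two equal-length vectors the way a 1-D SIMT kernel would index them."""
    n = len(x)
    out = [0] * n
    for block_idx in range(grid_size(n, block_dim)):  # on a GPU, blocks run concurrently
        for thread_idx in range(block_dim):           # as do the threads within a block
            i = block_idx * block_dim + thread_idx    # global index owned by this thread
            if i < n:                                 # guard: the last block may overhang
                out[i] = x[i] + y[i]
    return out
```

Here the programmer picks `block_dim`, derives the grid size, and writes the per-thread index and bounds check by hand. These are the details that the tile-level description is meant to leave to the toolchain.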

At the core of this strategy is CUDA Tile IR, a virtual instruction set that plays the role for tile-oriented workloads that Parallel Thread Execution (PTX) plays for SIMT kernels. It defines tile blocks, their relationships, and the allowed transformations, while hiding implementation details that may change from one GPU family to another. CUDA 13.1 also debuts cuTile Python, a domain-specific language for authoring array-based and tile-based kernels directly in Python. It initially focuses on artificial intelligence algorithms, with plans for a C++ implementation and broader use in scientific simulations, signal, image, and video processing, and high-performance computing workloads that decompose problems into block-based computations. In this first release, CUDA Tile support is limited to Blackwell-class GPUs with compute capabilities 10.x and 12.x, with Nvidia promising support for more architectures in future updates. Nvidia presents CUDA 13.1 as a milestone in its long-term effort to abstract hardware complexity while letting performance scale from one GPU generation to the next.
