NVIDIA TensorRT accelerates Stable Diffusion 3.5 on GeForce RTX and RTX PRO GPUs

NVIDIA and Stability AI turbocharge Stable Diffusion 3.5 performance with TensorRT and advanced quantization for GeForce RTX and RTX PRO GPUs, making high-end generative artificial intelligence more accessible.

Generative artificial intelligence is rapidly transforming digital content creation, with model sophistication and memory requirements escalating in tandem. The latest Stable Diffusion 3.5 Large model exemplifies this trend, demanding more than 18 GB of VRAM—posing real bottlenecks for widespread deployment across consumer and professional systems. To address this challenge, NVIDIA has pioneered model quantization strategies that allow less critical layers to operate at lower numerical precision, trimming memory needs without a substantial performance hit.
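The core idea behind such quantization is to rescale a layer's weights into a low-precision format's narrow dynamic range, storing only a per-tensor scale at full precision. The sketch below simulates this scale-and-clamp step for the FP8 E4M3 format using NumPy; it models only the dynamic-range handling, not bit-level FP8 rounding, and the function names are illustrative rather than any actual TensorRT API.

```python
import numpy as np

FP8_E4M3_MAX = 448.0  # largest finite value representable in FP8 E4M3


def quantize_fp8_sim(weights: np.ndarray):
    """Simulate per-tensor FP8 quantization: rescale so the largest
    magnitude maps to FP8's maximum, then clamp to the representable range.
    (Bit-level mantissa rounding is omitted for clarity.)"""
    scale = np.abs(weights).max() / FP8_E4M3_MAX
    q = np.clip(weights / scale, -FP8_E4M3_MAX, FP8_E4M3_MAX)
    return q, scale


def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate original values by undoing the scale."""
    return q * scale


# Toy "layer" of weights in [-2, 2]
w = np.linspace(-2.0, 2.0, 16, dtype=np.float32).reshape(4, 4)
q, s = quantize_fp8_sim(w)
w_hat = dequantize(q, s)
```

In a real deployment the quantized tensor would be stored in one byte per value instead of two (FP16), which is where the memory savings come from; TensorRT applies this selectively, keeping quality-critical layers at higher precision.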

Through a technical partnership with Stability AI, NVIDIA has leveraged FP8 quantization on Stable Diffusion 3.5 Large, slashing VRAM usage by 40 percent. This shift not only reduces the hardware barrier for running sophisticated generative models but also opens doors for higher throughput and efficiency. The NVIDIA GeForce RTX 40 Series and Ada Lovelace RTX PRO GPUs natively support FP8, while the next-generation Blackwell GPUs go further with FP4 precision support, broadening compatibility across NVIDIA's ecosystem.
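A back-of-envelope calculation shows where the savings come from: weight storage scales linearly with bytes per parameter, so FP8 halves and FP4 quarters the FP16 weight footprint. The figures below assume a hypothetical 8-billion-parameter model for illustration (weights only; activations, other pipeline components, and layers kept at higher precision explain why the overall VRAM reduction reported is 40 percent rather than 50).

```python
# Illustrative weight-memory footprint at different precisions for a
# hypothetical 8-billion-parameter diffusion model (assumption, not a
# published NVIDIA figure).
params = 8.0e9
bytes_per_param = {"FP16": 2.0, "FP8": 1.0, "FP4": 0.5}

footprint_gb = {name: params * b / 1e9 for name, b in bytes_per_param.items()}
for name, gb in footprint_gb.items():
    print(f"{name}: {gb:.1f} GB of weights")
```

Total VRAM use is higher than the weight footprint alone, which is why the full model at FP16 exceeds 18 GB.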

NVIDIA TensorRT, central to these advances, doubles the performance of the Stable Diffusion 3.5 Large and Medium models by optimizing their deep learning workloads. Now redesigned for RTX AI PCs, which have a global installed base exceeding 100 million, TensorRT couples robust inferencing with a just-in-time on-device engine builder, packaged in a solution eight times smaller than previous deployments. The release of TensorRT for RTX as a standalone software development kit gives developers direct, streamlined integration, rapidly increasing access to high-powered generative artificial intelligence tools.


Artificial Intelligence LLM confessions and geothermal hot spots

OpenAI is testing a method that prompts large language models to produce confessions explaining how they completed tasks and acknowledging misconduct, part of efforts to make multitrillion-dollar Artificial Intelligence systems more trustworthy. Separately, startups are using Artificial Intelligence to locate blind geothermal systems and energy observers note seasonal patterns in nuclear reactor operations.

Saudi Artificial Intelligence startup launches Arabic LLM

Misraj Artificial Intelligence unveiled Kawn, an Arabic large language model, at AWS re:Invent and launched Workforces, a platform for creating and managing Artificial Intelligence agents for enterprises and public institutions.
