SoftBank and AMD validate GPU partitioning for artificial intelligence workloads

SoftBank and AMD are jointly validating a GPU partitioning system for AMD Instinct accelerators that allows a single chip to run multiple artificial intelligence workloads in parallel, tuned to each model’s resource needs. The work targets more efficient use of next generation artificial intelligence infrastructure amid manufacturing delays for AMD’s next Instinct generation.

SoftBank and AMD have begun joint validation of AMD Instinct GPUs for next generation artificial intelligence infrastructure, centered on a GPU partitioning mechanism that lets a single GPU handle multiple artificial intelligence workloads simultaneously. SoftBank created an Orchestrator system that divides AMD Instinct GPU resources according to workload requirements such as model size, number of concurrent executions, and memory needs. The system splits compute workloads across multiple GPU instances running on individual Accelerator Complex Dies, with configurations ranging from a single instance (SPX mode) up to eight instances (CPX mode), a design intended to align GPU utilization with heterogeneous demand.
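The mode-selection idea described above can be sketched as a small heuristic: pick the fewest-instance partition mode that still gives every workload its own instance with enough memory. This is a minimal illustrative sketch, not SoftBank's actual Orchestrator logic; the `Workload` type, the `choose_mode` heuristic, and the HBM capacity figure are all assumptions, while the mode names (SPX = 1 instance, CPX = 8 instances) come from the article.

```python
from dataclasses import dataclass

TOTAL_HBM_GB = 192  # assumed per-GPU HBM capacity (illustrative)

@dataclass
class Workload:
    name: str
    mem_gb: int  # memory the model needs

def choose_mode(workloads: list[Workload]) -> tuple[str, int]:
    """Pick the fewest-instance mode that gives each workload its own
    partition with enough per-instance memory."""
    for mode, instances in (("SPX", 1), ("CPX", 8)):
        per_instance = TOTAL_HBM_GB // instances
        if len(workloads) <= instances and all(
            w.mem_gb <= per_instance for w in workloads
        ):
            return mode, instances
    raise ValueError("workloads do not fit on a single GPU")

# One large model gets the whole GPU; several small models share it.
print(choose_mode([Workload("llm-70b", 160)]))                    # ('SPX', 1)
print(choose_mode([Workload(f"slm-{i}", 20) for i in range(6)]))  # ('CPX', 8)
```

In practice an orchestrator would weigh compute demand and concurrency as well as memory, but the shape of the decision, matching partition count to heterogeneous workload sizes, is the same.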

The architecture extends partitioning to memory, with HBM memory pools divided into individual regions for each GPU instance to prevent latency spikes and interference between workloads. The goal is to avoid the inefficiency of uniform GPU resource allocation, which can cause either resource shortages or waste when different artificial intelligence tasks share a device. SoftBank states that the enhanced Orchestrator runs multiple artificial intelligence applications on a single GPU with minimal resource strain and highlights improved resource allocation for small and mid size language model workloads, although no performance figures have been disclosed yet.
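The per-instance memory isolation described above amounts to carving one pool into fixed, non-overlapping regions so that no workload can touch another's memory. The sketch below is a hypothetical illustration of that layout; the region scheme and sizes are assumptions, not AMD's or SoftBank's actual implementation.

```python
def partition_hbm(total_gb: int, instances: int) -> list[tuple[int, int]]:
    """Split an HBM pool into equal, non-overlapping per-instance regions.

    Returns (start_gb, size_gb) for each instance's private region.
    """
    size = total_gb // instances
    return [(i * size, size) for i in range(instances)]

# Eight-instance (CPX-style) split of an assumed 192 GB pool:
for start, size in partition_hbm(192, 8):
    print(f"instance region: start={start} GB, size={size} GB")
```

Because each region is disjoint, one instance's allocations cannot evict or contend with another's, which is what prevents the cross-workload latency spikes the article mentions.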

SoftBank plans to explore similar orchestration techniques for other artificial intelligence accelerators beyond AMD hardware, signaling a broader strategy for multi tenant accelerator deployments. A live demonstration is scheduled at the AMD booth during MWC Barcelona 2026, March 2-5, and SoftBank has published technical details on the architecture and Orchestrator management methods on its Research Institute of Advanced Technology blog. In parallel, AMD’s next generation Instinct MI455X accelerators, which are positioned to compete with NVIDIA’s Vera Rubin, are reportedly facing serious manufacturing problems that are pushing back AMD’s roadmap, with only limited production expected this year and mass production delayed to Q2 2027.

Impact Score: 55

Google Vids opens free video generation to all Google users

Google has made Google Vids available to anyone with a Google account, adding free access to video generation with its latest models. The move expands Google’s end-to-end video workflow and increases pressure on rivals that charge for similar tools.

Court warns against chatbot legal advice in Heppner case

A federal court found that chats with a publicly available generative Artificial Intelligence tool were not protected by attorney-client privilege or the work-product doctrine. The ruling highlights litigation risks when executives or employees use chatbots for legal guidance without lawyer supervision.

Newsom orders California to weigh Artificial Intelligence harms in contract rules

Gov. Gavin Newsom has signed an executive order directing California agencies to account for potential Artificial Intelligence harms in state contracting while expanding approved use of generative tools across government. The move follows a dispute involving Anthropic and reflects a broader split between California and the Trump administration on Artificial Intelligence oversight.
