Alibaba Cloud expands elastic GPU instance portfolio for artificial intelligence and graphics workloads

Alibaba Cloud has detailed a broad lineup of elastic GPU instance families across gn, vgn, and sgn series, targeting artificial intelligence training and inference, high performance computing, and professional graphics workloads with options from virtual GPUs to confidential computing.

Alibaba Cloud has outlined an extensive range of Elastic GPU Service instance families across the gn, vgn, and sgn series, designed to combine GPU and CPU resources for artificial intelligence, high performance computing, and graphics workloads. The platform supports both GPU-accelerated and vGPU-accelerated instance types, largely built on the third-generation SHENLONG architecture, with I/O-optimized designs, Non-Volatile Memory Express (NVMe) support, and dual-stack IPv4 and IPv6 networking. Many of the families integrate NVIDIA GRID Virtual Workstation licenses for certified CAD and professional graphics acceleration, and can also serve as relatively lightweight GPU-accelerated compute instances for small-scale artificial intelligence inference.

The sgn8ia and sgn7i-vws families use virtual GPUs and shared CPUs to drive concurrent artificial intelligence inference and 3D graphics workloads such as remote design and cloud gaming, with sgn8ia relying on NVIDIA Lovelace GPUs and AMD Genoa processors that deliver clock speeds of 3.4 GHz to 3.75 GHz. The sgn7i-vws and vgn7i-vws families pair NVIDIA A10 GPUs with 2.9 GHz Intel Xeon Scalable Ice Lake processors and offer fine-grained GPU slicing, where a configuration such as “NVIDIA A10 × 1/12” indicates that a single GPU is partitioned into 12 vGPU segments. The vgn6i-vws family upgrades earlier vgn6i instances to newer GRID drivers on NVIDIA T4 GPUs, supporting 1/4 and 1/2 GPU capacity slices, 4 GB and 8 GB GPU memory options, and a CPU-to-memory ratio of 1:5, with use cases spanning cloud gaming rendering, augmented and virtual reality, and artificial intelligence inference in elastic internet environments.
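The fractional notation above maps directly to per-slice GPU memory. A minimal sketch of that arithmetic, assuming the publicly documented memory sizes of the A10 (24 GB) and T4 (16 GB), which the article itself does not state:

```python
from fractions import Fraction

# Total GPU memory in GB for the accelerators named in the text.
# These totals are public NVIDIA specs, not figures from the article.
GPU_MEMORY_GB = {"NVIDIA A10": 24, "NVIDIA T4": 16}

def vgpu_memory_gb(gpu: str, slice_fraction: Fraction) -> float:
    """Memory visible to one vGPU slice, e.g. 'NVIDIA A10 x 1/12'."""
    return GPU_MEMORY_GB[gpu] * float(slice_fraction)

print(vgpu_memory_gb("NVIDIA A10", Fraction(1, 12)))  # 2.0 GB per slice
print(vgpu_memory_gb("NVIDIA T4", Fraction(1, 4)))    # 4.0 GB
```

The T4 results line up with the 4 GB and 8 GB options the article lists for the 1/4 and 1/2 slices of vgn6i-vws.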

At the high end, gn8v and gn8v-tee represent Alibaba Cloud’s 8th-generation GPU-accelerated compute-optimized instances, aimed at ultra-large language model training and inference, autonomous driving training, and multi-GPU parallel inference on models with more than 70 billion parameters. In the described scenarios, traditional artificial intelligence model training and autonomous driving training use GPUs that deliver up to 39.5 TFLOPS of single-precision FP32 compute, with each GPU equipped with 96 GB of HBM3 memory and up to 4 TB/s of memory bandwidth, interconnected by 900 GB/s NVLink links for multi-GPU efficiency gains. The gn8v-tee variants add end-to-end confidential computing, using Intel Trust Domain Extensions and NVIDIA confidential computing to secure model and inference data. All gn8v-class instances adopt CIPU 1.0 and 4th-generation Intel Xeon Scalable processors with base frequencies of up to 2.8 GHz and all-core turbo frequencies of up to 3.1 GHz, and network performance can reach a packet forwarding rate of up to 30,000,000 pps on 8-GPU configurations.
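A rough capacity argument shows why models above 70 billion parameters call for multi-GPU parallel inference on 96 GB cards. This is a back-of-the-envelope sketch (weights only, 16-bit precision; KV cache and activations ignored), not an Alibaba Cloud sizing rule:

```python
import math

GPU_MEMORY_GB = 96.0  # per-GPU HBM3 capacity cited for gn8v instances

def weights_gb(params_billion: float, bytes_per_param: float = 2.0) -> float:
    """Approximate size of the model weights alone (2 bytes/param for FP16/BF16)."""
    return params_billion * bytes_per_param

def min_gpus_for_weights(params_billion: float) -> int:
    """Lower bound on GPUs needed just to hold the weights."""
    return math.ceil(weights_gb(params_billion) / GPU_MEMORY_GB)

print(min_gpus_for_weights(70))  # 2: ~140 GB of weights exceeds one 96 GB GPU
```

Real deployments need headroom beyond this lower bound, which is consistent with the article positioning 70B-plus models as a multi-GPU workload.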

The gn8is family targets artificial intelligence generated content (AIGC) workloads with NVIDIA L20 GPUs, each providing 48 GB of memory and FP8 support, and is positioned for inference on language models with fewer than 70 billion parameters. Earlier gn7e, gn7i, gn7s, gn7, gn6i, gn6e, gn6v, gn5, and gn5i families cover a wide spectrum of training and inference tasks, including deep learning for image classification and autonomous vehicles, high performance scientific computing, cloud gaming, and multimedia encoding. These lines span NVIDIA A10, A30, T4, V100, P100, and P4 accelerators, offer differing CPU-to-memory ratios, and progressively increase network bandwidth and packet forwarding rates through larger instance sizes. Across the portfolio, Alibaba Cloud emphasizes predictable performance from the SHENLONG architecture, flexible combinations of GPU counts and vCPU resources, and broad block storage options such as ESSDs, ESSD AutoPL disks, Regional ESSDs, and, in some older families, local NVMe SSDs, to match a range of artificial intelligence, graphics, and compute-intensive workloads.
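For the gn8is family, a similar weights-only check suggests which models fit on a single 48 GB L20 and why FP8 support matters for its sub-70B positioning. A hedged sketch (assumes 1 byte per parameter for FP8 and 2 bytes for FP16; KV cache and activations ignored):

```python
L20_MEMORY_GB = 48.0  # per-GPU memory cited for the gn8is family

def fits_single_l20(params_billion: float, bytes_per_param: float) -> bool:
    """True if the raw model weights fit within one L20's 48 GB."""
    return params_billion * bytes_per_param <= L20_MEMORY_GB

print(fits_single_l20(34, 1.0))  # True: ~34 GB of FP8 weights fits
print(fits_single_l20(70, 2.0))  # False: ~140 GB of FP16 weights does not
```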

Impact Score: 55

Europe and US discuss biometric data-sharing framework

European Union and US officials are negotiating a border security arrangement that could enable continuous biometric data exchanges on EU citizens. The UK says the US has also requested access to fingerprint records as part of Visa Waiver Program discussions.

Apple plans Intel 18A-P for M7 and 14A for A21

Apple is expected to use Intel’s 18A-P process for M7 chips in MacBook models and Intel’s 14A process for A21 chips in iPhones. The shift points to a broader supplier strategy as Apple moves beyond TSMC for parts of its future silicon roadmap.

Google and other chatbots surface real phone numbers

Generative Artificial Intelligence chatbots are surfacing real phone numbers and other personal details, sometimes by pulling from obscure public sources and sometimes by inventing plausible but wrong contact information. Privacy experts say users have few reliable ways to find out whether their data is in model training sets or to force its removal.

U.S. and China revisit Artificial Intelligence emergency talks

Washington and Beijing are exploring renewed talks on an emergency communication channel for Artificial Intelligence as fears grow over the capabilities of Anthropic’s Mythos model. The shift reflects rising concern in both capitals that competitive pressure is outpacing safeguards.
