Intel unveils ‘Crescent Island’ data center GPU for Artificial Intelligence inference

Intel introduced a new GPU tailored for Artificial Intelligence inference, code-named Crescent Island, with a focus on performance per watt and cost efficiency. The Xe3P-based part uses LPDDR5X memory and is slated to arrive in the second half of 2026.

Intel has announced a new data center GPU aimed at Artificial Intelligence inference workloads. Debuting at the OCP Global Summit, the part, code-named Crescent Island, is built on Intel’s Xe3P graphics architecture and pairs that design with low-power LPDDR5X memory. Intel positions energy efficiency and performance per watt as defining characteristics, aligning the product with the industry’s pivot from large-scale training toward real-time, agentic Artificial Intelligence inference.

Crescent Island is scheduled to launch in the second half of 2026 and will feature 160 GB of LPDDR5X memory, according to Intel. The company describes Xe3P as a performance-oriented evolution of its Xe3 architecture used in Panther Lake CPUs. Intel CTO Sachin Katti framed the shift succinctly, noting that growing demand for real-time inference requires heterogeneous systems, an open software stack, and silicon matched to specific tasks. Intel says the new data center GPU is designed to provide efficient headroom as token volumes increase in production inference.

The announcement arrives against the backdrop of Intel’s broader Xe initiative, which began in 2018 with details shared in 2019. Intel’s first Xe data center GPU, Ponte Vecchio, used the Xe-HPC microarchitecture and advanced packaging, including EMIB and Foveros, combining tiles built on the Intel 7 process node with compute tiles fabricated on TSMC’s 5 nm. Ponte Vecchio underpins Argonne National Laboratory’s Aurora supercomputer, which deployed 63,744 GPUs across more than 10,000 nodes. Aurora delivered 585 petaflops in November 2023, crossed the exascale threshold in June 2024, and currently ranks third on the Top500 list. Aurora initially targeted Xeon Phi, but the project shifted to Ponte Vecchio after Xeon Phi’s cancellation.

Key details of Crescent Island’s physical configuration remain unclear, including whether it will ship as multiple smaller devices or as a single large GPU. Memory bandwidth will be a metric to watch, given its importance in Artificial Intelligence workloads. Intel’s use of LPDDR5X, typically found in PCs and smartphones, is notable for its potential power savings and affordability. LPDDR5X tops out at 8.533 Gbps per pin in the JEDEC specification, with the fastest announced vendor parts reaching 10.7 Gbps, and with common 32 GB packages Intel would need at least five of them to reach 160 GB. By contrast, AMD and Nvidia are planning substantial HBM4 capacities in next-generation GPUs, with the MI450 reportedly up to 432 GB and Rubin Ultra up to 1 TB. While HBM4 promises superior bandwidth, its rising costs and supply constraints may make LPDDR5X an attractive alternative for efficient, cost-effective Artificial Intelligence inference at scale.
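To make the capacity and bandwidth trade-off concrete, here is a minimal back-of-envelope sketch in Python. Every figure in it is an assumption for illustration only; Intel has not disclosed Crescent Island’s package count, bus width, or data rate.

```python
# Back-of-envelope estimate of LPDDR5X capacity and aggregate bandwidth for a
# hypothetical inference GPU. All parameters below are assumptions for
# illustration; they are not Crescent Island's disclosed specifications.

def lpddr5x_estimate(packages: int, gb_per_package: int,
                     bus_bits_per_package: int, gbps_per_pin: float):
    """Return (total capacity in GB, aggregate bandwidth in GB/s)."""
    capacity_gb = packages * gb_per_package
    total_bus_bits = packages * bus_bits_per_package
    # Bandwidth = total data pins * per-pin rate, converted from Gbit/s to GB/s.
    bandwidth_gbs = total_bus_bits * gbps_per_pin / 8
    return capacity_gb, bandwidth_gbs

if __name__ == "__main__":
    # Assumed configuration: five 32 GB packages, a 64-bit interface per
    # package, and 9.6 Gbps per pin -- plausible LPDDR5X-class figures.
    cap, bw = lpddr5x_estimate(packages=5, gb_per_package=32,
                               bus_bits_per_package=64, gbps_per_pin=9.6)
    print(f"Capacity: {cap} GB, bandwidth: {bw:.0f} GB/s")
```

Under those assumed numbers the card would offer 160 GB of capacity but only a few hundred GB/s of bandwidth, far below HBM-class figures, which is why a real design would likely lean on wider interfaces or more packages; the sketch illustrates the trade-off rather than predicting Intel’s specification.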
