vLLM supports a broad set of hardware platforms for inference and serving. The documentation lists GPU backends for NVIDIA CUDA, AMD ROCm, and Intel XPU. CPU targets include Intel/AMD x86, ARM AArch64, Apple silicon, and IBM Z (S390X). Beyond GPUs and CPUs, the project notes compatibility with Google TPU and AWS Neuron.
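For reference, the support matrix above can be captured as a small lookup table. This is an illustrative sketch only; the category names and groupings simply mirror the documentation's GPU/CPU/accelerator sections:

```python
# Hardware support matrix as summarized from the vLLM docs (illustrative only).
SUPPORTED_PLATFORMS = {
    "GPU": ["NVIDIA CUDA", "AMD ROCm", "Intel XPU"],
    "CPU": ["Intel/AMD x86", "ARM AArch64", "Apple silicon", "IBM Z (S390X)"],
    "Other accelerators": ["Google TPU", "AWS Neuron"],
}

def platforms_for(category: str) -> list[str]:
    """Return the platforms listed under a documentation category."""
    return SUPPORTED_PLATFORMS.get(category, [])
```

A lookup for an unknown category returns an empty list rather than raising, which keeps the helper safe to call speculatively.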
The documentation also describes a hardware plugin model: backends live outside the main vLLM repository and follow the Hardware-Pluggable RFC. A table on the installation page enumerates the available accelerator plugins and their packaging or install status. Ascend NPU is published as the vllm-ascend package with a linked GitHub repository. Intel Gaudi (HPU) and MetaX MACA GPU are marked as not available on PyPI and must be installed from source, with repositories provided. Rebellions ATOM / REBEL NPU is listed with a vllm-rbln package and a repository link, and IBM Spyre AIU appears in the table with a vllm-spyre package and a repository link.
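Because some backends ship as PyPI packages while others are source-only, a setup script might probe for a plugin before importing it. A minimal sketch using only the standard library; the package names come from the table above, while the helper itself and the assumption that import names mirror the distribution names (with hyphens as underscores) are mine:

```python
from importlib.util import find_spec

# Plugin distributions from the installation table. The source-only entries
# (Intel Gaudi HPU, MetaX MACA GPU) have no PyPI package to probe, so they
# are deliberately absent from this mapping.
PYPI_PLUGINS = {
    "Ascend NPU": "vllm_ascend",
    "Rebellions ATOM / REBEL NPU": "vllm_rbln",
    "IBM Spyre AIU": "vllm_spyre",
}

def plugin_status(accelerator: str) -> str:
    """Report whether the plugin module for an accelerator is importable."""
    module = PYPI_PLUGINS.get(accelerator)
    if module is None:
        return "source-only: install from the linked GitHub repository"
    if find_spec(module) is None:
        return f"not installed: try `pip install {module.replace('_', '-')}`"
    return "installed"
```

For example, `plugin_status("Intel Gaudi (HPU)")` reports the backend as source-only, while the PyPI-packaged backends resolve to either "installed" or a pip hint depending on the environment.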
The page groups hardware information into GPU and CPU sections with dedicated subpages for each platform, plus separate pages for Google TPU and AWS Neuron. The installation document centralizes the list of supported accelerators, points users to external GitHub repositories for third-party backends, and clarifies which plugins are packaged on PyPI and which must be installed from source.