Enterprises are generating vast volumes of unstructured data, and Artificial Intelligence (AI) workloads are becoming increasingly data-intensive. The article frames object storage as a cost-effective option that has historically served archives, backups and data lakes but has lacked the performance needed for fast-paced AI training and inference. The need for scalable, portable storage that spans on-premises infrastructure and the cloud is driving exploration of new approaches to object storage performance.
Remote direct memory access (RDMA) for S3-compatible storage is presented as a solution that accelerates the S3 API-based storage protocol. By offloading data transfers from the host CPU onto RDMA-enabled networking hardware, the approach promises higher throughput per terabyte, improved throughput per watt, lower cost per terabyte and much lower latency than traditional TCP-based transports. Nvidia has developed RDMA client and server libraries: storage partners have incorporated the server libraries into their products, while the client libraries run on GPU compute nodes to speed data access for AI workloads and improve GPU utilization. The article notes that the initial libraries are optimized for Nvidia GPUs and networking, but the architecture remains open to other vendors and contributors.
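The link between faster storage transfers and better GPU utilization can be made concrete with a simple back-of-envelope model. The sketch below is illustrative only, not Nvidia code or data from the article: the per-step times are assumed numbers, and the model simply shows how shrinking data-stall time raises the fraction of wall-clock time a GPU spends computing.

```python
# Illustrative back-of-envelope model (not NVIDIA code, not figures
# from the article): how reducing object-storage read stalls can
# affect GPU utilization during training. All numbers are assumptions.

def effective_gpu_utilization(compute_s: float, stall_s: float) -> float:
    """Fraction of wall-clock time the GPU spends computing, given
    per-step compute time and per-step data-stall time (seconds)."""
    return compute_s / (compute_s + stall_s)

# Assumed per-training-step times, in seconds:
compute = 0.90       # GPU busy doing math
stall_tcp = 0.30     # assumed wait on TCP-based S3 reads
stall_rdma = 0.05    # assumed shorter wait with RDMA-offloaded reads

util_tcp = effective_gpu_utilization(compute, stall_tcp)
util_rdma = effective_gpu_utilization(compute, stall_rdma)

print(f"TCP transport : {util_tcp:.0%} GPU utilization")   # → 75%
print(f"RDMA transport: {util_rdma:.0%} GPU utilization")  # → 95%
```

Under these assumed numbers, cutting the per-step stall from 0.30 s to 0.05 s lifts utilization from 75% to roughly 95%, which is the kind of gain the article's "better GPU utilization" claim refers to, independent of the exact figures.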
Several leading object storage vendors are adopting the technology: Cloudian, Dell Technologies and HPE are integrating the RDMA for S3-compatible storage libraries into HyperStore, ObjectScale and Alletra Storage MP X10000, respectively. Executives quoted in the piece emphasize scalability, portability and reduced total cost of ownership for large-scale AI deployments and AI factories. Nvidia's libraries are available to select partners now and are expected to become generally available in the Nvidia CUDA Toolkit in January, alongside details of a new Nvidia object storage certification as part of the Nvidia-Certified Storage program.
