Amazon Web Services (AWS) and NVIDIA have announced an expansion of their strategic collaboration to deliver new supercomputing infrastructure, software, and services for generative artificial intelligence (AI). The partnership combines AWS’s broad range of machine learning (ML) capabilities with NVIDIA’s expertise in accelerated computing to let customers leverage the power of AI in the cloud.
The joint effort brings together NVIDIA and AWS technologies suited to training foundation models and building generative AI applications: NVIDIA’s newest multi-node systems featuring next-generation GPUs, CPUs and AI software, together with AWS Nitro System advanced virtualization and security, Elastic Fabric Adapter (EFA) interconnect, and UltraCluster scalability.
The collaboration aims to advance AI technologies through the following initiatives:
- AWS will be the first cloud provider to bring NVIDIA® GH200 Grace Hopper Superchips with new multi-node NVLink™ technology to the cloud. The new NVIDIA GH200 NVL32 multi-node platform connects 32 Grace Hopper Superchips with NVIDIA NVLink and NVSwitch™ technologies into one instance. The platform will be available on Amazon Elastic Compute Cloud (Amazon EC2) instances connected with Amazon’s powerful networking (EFA), supported by advanced virtualization (AWS Nitro System), and hyper-scale clustering (Amazon EC2 UltraClusters), equipping joint customers to scale to thousands of GH200 Superchips.
- NVIDIA and AWS will collaborate to host NVIDIA DGX™ Cloud, NVIDIA’s AI-training-as-a-service offering, on AWS. It will be the first DGX Cloud featuring GH200 NVL32, providing developers with the largest shared memory in a single instance. DGX Cloud on AWS will accelerate training of cutting-edge generative AI and large language models that exceed 1 trillion parameters.
- NVIDIA and AWS are partnering on Project Ceiba to design the world’s fastest GPU-powered AI supercomputer, an at-scale system with GH200 NVL32 and Amazon EFA interconnect, hosted by AWS for NVIDIA’s own research and development team. This first-of-its-kind supercomputer, featuring 16,384 NVIDIA GH200 Superchips and capable of 65 exaflops of AI processing, will be used by NVIDIA to accelerate its next wave of generative AI innovation.
- AWS will introduce three new Amazon EC2 instances: P5e instances, powered by NVIDIA H200 Tensor Core GPUs, for large-scale and advanced generative AI and HPC workloads; and G6 and G6e instances, powered by NVIDIA L4 GPUs and NVIDIA L40S GPUs, respectively, for a wide range of applications such as AI fine-tuning, inference, and graphics and video workloads. G6e instances are particularly suitable for developing 3D workflows, digital twins and other applications using NVIDIA Omniverse™, a platform for connecting and building generative AI-enabled 3D applications.