NVIDIA and Microsoft are working on a new hyperscale GPU accelerator that will provide hyperscale data centres with a fast, flexible path for artificial intelligence (AI).
The new HGX-1 hyperscale GPU accelerator is an open-source design released in conjunction with Microsoft’s Project Olympus.
HGX-1 does for cloud-based AI workloads what ATX — Advanced Technology eXtended — did for PC motherboards when it was introduced more than two decades ago. It establishes an industry standard that can be rapidly and efficiently embraced to help meet surging market demand.
The new architecture is designed to meet the exploding demand for AI computing in the cloud — in fields such as autonomous driving, personalised healthcare, superhuman voice recognition, data and video analytics, and molecular simulations.
“AI is a new computing model that requires a new architecture. The HGX-1 hyperscale GPU accelerator will do for AI cloud computing what the ATX standard did to make PCs pervasive today. It will enable cloud-service providers to easily adopt NVIDIA GPUs to meet surging demand for AI computing,” said Jen-Hsun Huang, Founder and Chief Executive Officer of NVIDIA.
“The HGX-1 AI accelerator provides extreme performance scalability to meet the demanding requirements of fast-growing machine learning workloads, and its unique design allows it to be easily adopted into existing data centers around the world,” wrote Kushagra Vaid, General Manager and Distinguished Engineer, Azure Hardware Infrastructure, Microsoft, in a blog post.
For the thousands of enterprises and startups worldwide that are investing in AI and adopting AI-based approaches, the HGX-1 architecture provides unprecedented configurability and performance in the cloud.
Powered by eight NVIDIA Tesla P100 GPUs in each chassis, the HGX-1 features an innovative switching design, based on NVIDIA NVLink interconnect technology and the PCIe standard, that enables a CPU to dynamically connect to any number of GPUs. This allows cloud service providers that standardise on the HGX-1 infrastructure to offer customers a range of CPU and GPU machine instance configurations.
Cloud workloads are more diverse and complex than ever. AI training, inferencing and HPC workloads run optimally on different system configurations, with a CPU attached to a varying number of GPUs. The highly modular design of the HGX-1 allows for optimal performance no matter the workload. It provides up to 100x faster deep learning performance compared with legacy CPU-based servers, and is estimated to cost one-fifth as much for AI training and one-tenth as much for AI inferencing.