China’s top technology companies are betting big on the NVIDIA Volta platform.
Alibaba Cloud, Baidu, and Tencent are incorporating NVIDIA Tesla V100 GPU accelerators into their data centres and cloud-service infrastructures to accelerate AI for a broad range of enterprise and consumer applications.
At the heart of the new Volta-based systems is the NVIDIA V100 data centre GPU. Built with 21 billion transistors, it delivers a 5x improvement in deep-learning performance over the preceding Pascal-architecture P100 GPU accelerator, equivalent to the performance of 100 CPUs for deep learning. That gain is four times greater than Moore’s law would have predicted over the same period.
Inspur, Lenovo and Huawei are using the NVIDIA HGX reference architecture to offer Volta-based accelerated systems for hyperscale data centres. Using HGX as a starter “recipe,” original equipment manufacturer (OEM) and original design manufacturer (ODM) partners can work with NVIDIA to design and bring to market a wide range of qualified GPU-accelerated AI systems more quickly, meeting the industry’s growing demand for AI cloud computing.
With GPUs based on the NVIDIA Volta architecture offering three times the performance of their predecessors, manufacturers can meet market demand with new products based on the latest NVIDIA technology.
TensorRT boosts inferencing
Speaking at the GPU Technology Conference in Beijing, NVIDIA founder and CEO Jensen Huang also unveiled the new NVIDIA TensorRT 3 AI inference software that sharply boosts the performance and slashes the cost of inferencing from the cloud to edge devices, including self-driving cars and robots.
The combination of TensorRT 3 with NVIDIA GPUs delivers ultra-fast and efficient inferencing across all frameworks for AI-enabled services — such as image and speech recognition, natural language processing, visual search and personalised recommendations. TensorRT and NVIDIA Tesla® GPU accelerators are up to 40 times faster than CPUs at one-tenth the cost of CPU-based solutions.
“Internet companies are racing to infuse AI into services used by billions of people. As a result, AI inference workloads are growing exponentially,” said Huang. “NVIDIA TensorRT is the world’s first programmable inference accelerator. With CUDA programmability, TensorRT will be able to accelerate the growing diversity and complexity of deep neural networks. And with TensorRT’s dramatic speed-up, service providers can affordably deploy these compute-intensive AI workloads.”
More than 1,200 companies have already begun using NVIDIA’s inference platform across a wide spectrum of industries to discover new insights from data and deploy intelligent services to businesses and consumers. Among them are Amazon, Microsoft, Facebook and Google, as well as leading Chinese companies such as Alibaba, Baidu, JD.com, iFLYTEK, Hikvision, Tencent and WeChat.