Amazon Web Services (AWS) has announced general availability of G4 instances, new NVIDIA T4 GPU-powered Amazon Elastic Compute Cloud (Amazon EC2) instances designed to help accelerate machine learning inference and graphics-intensive workloads.
G4 instances provide cost-effective machine learning inference for applications such as adding metadata to an image, object detection, recommender systems, automated speech recognition, and language translation. They are also a cost-effective platform for building and running graphics-intensive applications, such as remote graphics workstations, video transcoding, photo-realistic design, and game streaming in the cloud.
Machine learning involves two processes that require compute: training and inference.
- Training entails using labeled data to create a model that is capable of making predictions, a compute-intensive task that requires powerful processors and high-speed networking.
- Inference is the process of using a trained machine learning model to make predictions, which typically requires processing many small compute jobs simultaneously, a task handled most cost-effectively by accelerated computing on energy-efficient NVIDIA GPUs.
With the launch of P3 instances in 2017, AWS was the first to introduce instances optimised for machine learning training in the cloud with powerful NVIDIA V100 Tensor Core GPUs, allowing customers to reduce machine learning training from days to hours.
However, inference is what actually accounts for the vast majority of machine learning’s cost. According to customers, machine learning inference can represent up to 90 percent of overall operational costs for running machine learning workloads.
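To see why the inference share matters, consider a rough back-of-the-envelope sketch. The 90 percent share is the upper bound quoted above; the 30 percent reduction in inference cost is a purely hypothetical figure chosen for illustration, not a number from AWS, and the function name is likewise made up for this example.

```python
def overall_savings(inference_share: float, inference_cost_reduction: float) -> float:
    """Fraction of the total ML bill saved when only inference gets cheaper.

    inference_share: fraction of total cost spent on inference (0..1).
    inference_cost_reduction: fractional reduction in inference cost (0..1).
    """
    return inference_share * inference_cost_reduction


# 0.90 is the "up to 90 percent" figure from the article.
# 0.30 is a hypothetical reduction, used only to show the arithmetic.
savings_inference_heavy = overall_savings(0.90, 0.30)

# For contrast: the same hypothetical reduction applied to a workload
# where inference is only 10 percent of the bill.
savings_training_heavy = overall_savings(0.10, 0.30)

print(f"Inference-heavy workload: {savings_inference_heavy:.0%} of total cost saved")
print(f"Training-heavy workload:  {savings_training_heavy:.0%} of total cost saved")
```

Under these assumptions, the same per-inference saving moves the total bill far more when inference dominates spending, which is the reasoning behind targeting inference cost.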
New G4 instances feature the latest-generation NVIDIA T4 GPU, a second-generation Tensor Core GPU that achieves great performance for AI applications while maintaining CUDA programmability.
With up to 130 TOPS of INT8 performance, the T4 provides the mixed-precision tensor processing required to accelerate increasingly diverse and complex AI-based applications such as image classification, object detection, natural language understanding, automated speech recognition and recommender systems.
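For readers unfamiliar with INT8 inference, the sketch below illustrates the general idea of reduced precision: floating-point values are mapped to 8-bit integers with a shared scale factor, trading a small rounding error for cheaper arithmetic. It is a minimal, generic illustration of symmetric per-tensor quantization in plain Python, not a description of how the T4 or any NVIDIA library implements it; the function names and sample weights are hypothetical.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    # One scale factor is shared by the whole tensor; guard against all zeros.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the 8-bit integers."""
    return [q * scale for q in quantized]


# Hypothetical model weights, used only for illustration.
weights = [0.82, -1.27, 0.05, 0.6, -0.33]

q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print("int8 values:", q)
print("scale:", scale)
print("largest rounding error:", max_error)
```

The rounding error per value is bounded by half the scale factor, which is why many inference workloads tolerate 8-bit arithmetic with little loss in accuracy.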
G4 instances also provide an ideal compute engine for graphics-intensive workloads, offering up to a 1.8x increase in graphics performance and up to 2x the video transcoding capability of the previous-generation G3 instances.
These performance enhancements enable customers to use remote workstations in the cloud for running graphics-intensive applications such as Autodesk Maya or 3D Studio Max, as well as efficiently create photo-realistic and high-resolution 3D content for movies and games.
“We focus on solving the toughest challenges that hold our customers back from taking advantage of compute-intensive applications. With new G4 instances, we’re making it more affordable to put machine learning in the hands of every developer. And with support for the latest video decode protocols, customers running graphics applications on G4 instances get superior graphics performance over G3 instances at the same cost,” said Matt Garman, Vice President, Compute Services of AWS.