Meta has just taken AI up another notch with its newly designed and built AI Research SuperCluster (RSC), believed to be among the fastest AI supercomputers running today, with 760 NVIDIA DGX A100 systems packing 6,080 NVIDIA A100 GPUs.
And it gets better. By mid-2022, RSC will be the fastest AI supercomputer when fully built with a mind-boggling 16,000 NVIDIA A100 GPUs.
While it’s already in use, a fully built RSC later this year will be able to train AI models with more than a trillion parameters.
“RSC will help Meta’s AI researchers build new and better AI models that can learn from trillions of examples; work across hundreds of different languages; seamlessly analyze text, images, and video together; develop new augmented reality tools; and much more. Our researchers will be able to train the largest models needed to develop advanced AI for computer vision, NLP (natural language processing), speech recognition, and more,” said Meta in a blog post.
Beyond supplying raw compute power, the A100 GPUs are linked by an NVIDIA Quantum 200Gb/s InfiniBand network, delivering 1,895 petaflops of TF32 performance to accelerate the work of the AI research teams.
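That aggregate figure lines up with a simple back-of-envelope calculation. The per-GPU peak below is an assumption not stated in the article (NVIDIA rates the A100 at roughly 312 TFLOPS of TF32 throughput with structured sparsity enabled); multiplying it out lands close to the cited 1,895 petaflops:

```python
# Back-of-envelope check of RSC's aggregate TF32 throughput.
# Assumption (not from the article): each A100 peaks at ~312 TFLOPS
# in TF32 with structured sparsity (156 TFLOPS dense).

NUM_GPUS = 6_080               # A100 GPUs in the current RSC build
TF32_TFLOPS_PER_GPU = 312      # assumed peak TF32 TFLOPS per A100

total_petaflops = NUM_GPUS * TF32_TFLOPS_PER_GPU / 1_000  # TFLOPS -> PFLOPS
print(f"Aggregate TF32 peak: {total_petaflops:,.0f} petaflops")
# -> Aggregate TF32 peak: 1,897 petaflops
```

The small gap between this estimate and the quoted 1,895 petaflops likely reflects rounding or a slightly different per-GPU rating; the point is that the cluster-level number is simply the per-GPU peak scaled across all 6,080 GPUs.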
In 2017, Meta built the first generation of this AI research infrastructure with 22,000 NVIDIA V100 Tensor Core GPUs; it handles 35,000 AI training jobs a day.
Early benchmarks indicate that RSC can train large NLP models 3x faster and run computer vision jobs 20x faster than the first-generation system.
“Ultimately, the work done with RSC will pave the way toward building technologies for the next major computing platform — the metaverse, where AI-driven applications and products will play an important role,” said Meta.