30 Oct Amazon And NVIDIA Simplify Machine Learning
NVIDIA and Amazon.com have announced, respectively, new Machine Learning software stacks in the NVIDIA GPU Cloud (NGC) and a new EC2 instance with 8 Volta GPUs, available immediately. While this announcement was completely expected, it is an important milestone along the road to simplifying and lowering the costs of Machine Learning development and deployment for AI projects. When NVIDIA announced the NVIDIA GPU Cloud last May at GTC, I explained in this blog that the purpose was to create a registry of compatible and optimized ML software containers which could then, in theory, run on the cloud of the user’s choice. That vision has now become a reality, at least for Amazon.com’s Amazon Web Services (AWS) customers. I expect other Cloud Service Providers to follow soon, given the momentum in the marketplace for the 120 TFLOPS Volta GPUs.
As anyone who has delved into Machine Learning can tell you, there are two hurdles that you must clear to build a useful neural network. Assuming you’ve already prepared a massive trove of tagged data to feed the training process, and have mastered the art of Deep Neural Network design, you’ll need hardware. In fact, you’ll need lots of hardware: expensive hardware you’d have to buy, install, configure, power, and maintain. This is where AWS comes in. Its new P3 GPU instances come with 1, 4, or 8 Volta GPUs connected by a fast (25 Gb/s) NVLink 2 scalable interconnect, delivering up to a stunning 960 trillion operations per second for serious ML work. That means your training runs will finish in hours instead of days or weeks, getting your AI ready much sooner. It is still not real-time training, but we are getting there.
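For readers who want to check the headline number, here is a minimal sketch of the arithmetic. It assumes the 120 TFLOPS per-GPU peak quoted above and perfect linear scaling across GPUs, which real training jobs will not reach; the function and constant names are mine, for illustration only.

```python
# Peak per-GPU figure quoted in the article.
VOLTA_PEAK_TFLOPS = 120

# GPU counts of the P3 instance configurations mentioned in the article.
P3_GPU_COUNTS = (1, 4, 8)


def aggregate_peak_tflops(gpu_count: int, per_gpu: int = VOLTA_PEAK_TFLOPS) -> int:
    """Theoretical peak, ASSUMING perfect linear scaling across GPUs."""
    if gpu_count < 1:
        raise ValueError("gpu_count must be at least 1")
    return gpu_count * per_gpu


for count in P3_GPU_COUNTS:
    print(f"{count} GPU(s): {aggregate_peak_tflops(count)} TFLOPS peak")
```

The 8-GPU case works out to 8 × 120 = 960, which is where the 960 trillion operations per second comes from. Treat it as a ceiling, not a measured result.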
Great, so you figured out you should just rent the hardware. That’s smart. But now you need to select, find, and configure a lot of finicky software, and each software component has to play nice with myriad other pieces. So, start with the right Linux OS, configure the correct drivers, get the software framework from GitHub, and don’t forget to download the DNN libraries. NO! Not that version! It may not be compatible with everything else you just loaded, and it isn’t optimized for the GPU you selected. You DID verify that the entire stack is mutually compatible, right? I mean, each component changes constantly; that’s the beauty and curse of open software!
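To make the compatibility headache concrete, here is a rough sketch of the kind of check you would otherwise be doing by hand: compare what is installed against a list of combinations known to work together. Everything in it is hypothetical. The component names, the placeholder version strings, and the `KNOWN_GOOD_STACKS` table are invented for illustration and do not describe any real release.

```python
from typing import Dict, List

# HYPOTHETICAL known-good combinations; the version strings are placeholders,
# not real release numbers.
KNOWN_GOOD_STACKS: List[Dict[str, str]] = [
    {"driver": "driver-A", "dnn_library": "dnn-1", "framework": "framework-X"},
    {"driver": "driver-B", "dnn_library": "dnn-2", "framework": "framework-Y"},
]


def mismatches(installed: Dict[str, str], reference: Dict[str, str]) -> List[str]:
    """Return the component names whose installed version differs from the reference."""
    return [name for name, version in reference.items() if installed.get(name) != version]


def is_compatible(installed: Dict[str, str]) -> bool:
    """True if the installed stack exactly matches one tested combination."""
    return any(not mismatches(installed, ref) for ref in KNOWN_GOOD_STACKS)


# Usage: one component from the wrong combination breaks the whole stack.
mixed = {"driver": "driver-A", "dnn_library": "dnn-2", "framework": "framework-X"}
print(is_compatible(mixed))                     # False
print(mismatches(mixed, KNOWN_GOOD_STACKS[0]))  # ['dnn_library']
```

A pre-built container sidesteps this check entirely, because the whole combination ships as one tested unit.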
You can see why NVIDIA has invested a lot of time and money to build, configure, optimize, and test all the ML software for each and every major ML framework, ensuring that it is all self-consistent and optimized for each GPU. Just go to the NGC, create a free account, choose the framework you need and where you want to run it, and NGC will give you an ID which tells AWS what container to load on your shiny new P3 instance. Did I say “free”? Yes, use of the NGC services is free to all.