Organizations implementing AI applications must weigh several factors when choosing the right infrastructure. One critical factor is the distinction between the training portion of AI and inferencing.
That is the view of Michael Lang, solutions architecture manager at NVIDIA, who spoke in a panel discussion on implementing AI at the recent NexGen Connectivity Forum, which brought together industry participants and solution providers.
Keep training and inference apart
The training and learning piece of AI, said Lang, is very different and often requires a different infrastructure environment from the one used for inferencing.
“The training and learning piece is about HPC and data-intensive needs,” said Lang. “That means big data centers and infrastructure and big capability.”
The inferencing piece, however, can be completely different. It often requires much less data, but latency becomes the imperative.
The AI needs to respond instantly, so having the infrastructure and connectivity to deliver rapidly often means that “analysis at the edge is the key.”
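The edge argument comes down to simple latency arithmetic. The sketch below (not from the panel; all figures are illustrative assumptions) shows why a distant cloud region can blow an end-to-end response budget that an edge deployment comfortably meets:

```python
# Minimal sketch of the latency budget behind "analysis at the edge".
# The round-trip and inference times below are hypothetical examples,
# not measurements from any real deployment.

def fits_budget(network_rtt_ms: float, inference_ms: float,
                budget_ms: float = 100.0) -> bool:
    """Return True if network round-trip plus model inference time
    stays within the end-to-end response budget."""
    return network_rtt_ms + inference_ms <= budget_ms

# Hypothetical round-trip times: nearby edge node vs. distant cloud region.
edge_rtt_ms, cloud_rtt_ms = 2.0, 120.0
model_inference_ms = 15.0

print(fits_budget(edge_rtt_ms, model_inference_ms))   # edge: True
print(fits_budget(cloud_rtt_ms, model_inference_ms))  # cloud: False
```

Under these assumed numbers, the model itself is fast either way; it is the network hop that decides whether the response feels instant.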
Different geographies also create additional conditions for making these decisions.
Singapore, for example, has a smaller geography with world-class connectivity. In Australia — where Lang is based — distances can be daunting, and connectivity is not generally as good. Here, edge could be the optimal solution.
“Latency is all about a high-quality response, and that may have an implication for public cloud, where latency is higher,” said Lang.
Look at data sensitivity, density, cost, and time-to-value issues
Eric Hui, director of IoT business development for Asia Pacific at Equinix, agreed that where locations were “cloud dense,” inferencing could be effective without edge.
Another issue in making infrastructure decisions, said Hui, was data sensitivity, which could mean in some cases that a private cloud or an on-premises infrastructure should be chosen.
Data density was another issue that tilted the decision in favor of co-location or on-premises infrastructure, said Hui.
Cost is another consideration. While the public cloud could be more cost-effective for smaller workloads, on-premises and co-location are worth considering if the AI models require “a lot of iterations to create a secret sauce.”
NVIDIA’s Michael Lang said another consideration was “time to value,” which was often a business decision.
“People ask, ‘How big an architecture?’ and the answer is, ‘How far do you want the car to go?’” he said.
“Often, it depends on what your time frame is. You might do training in the cloud, and it might take a couple of days, or weeks, to run, while in a data center, you could run the same training in minutes.”
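Lang’s days-versus-minutes point is, at heart, a throughput calculation. A back-of-envelope sketch (with illustrative numbers of our own, not Lang’s) shows how the same training job shrinks as aggregate compute grows:

```python
# Back-of-envelope "time to value" sketch: wall-clock training time as
# total work divided by sustained throughput. All figures are
# hypothetical and chosen only to illustrate the scale of the gap.

def training_hours(total_pflop: float, sustained_pflops: float) -> float:
    """Wall-clock hours = total work (PFLOP) / sustained rate (PFLOP/s),
    converted from seconds to hours."""
    return total_pflop / sustained_pflops / 3600.0

job_pflop = 1_000_000.0  # assumed total compute for one training run

# Hypothetical sustained throughput for two environments.
for label, pflops in [("small cloud allocation", 1.0),
                      ("dense GPU data center", 200.0)]:
    print(f"{label}: {training_hours(job_pflop, pflops):.1f} hours")
```

With these assumptions, the small allocation takes roughly 278 hours (about 11 days) while the dense cluster finishes in under two — the business question is whether that acceleration is worth the infrastructure spend.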
Mix cloud and on-premises approaches to fast-track projects
A third panelist, Laurence Liew, the director of AI Innovation and Makerspace at AI Singapore, outlined his organization’s work with Singapore companies.
Liew said he worked with many small and medium-sized enterprises that initially used the cloud for their AI models but then examined the business case for moving on-premises as their projects matured and were deployed.
Cloud, he said, had the advantage of being able to “spin up fast” in the early stages of a project.
“If there is no preference and this is their first project, then our recommendation is to go to the cloud because it’s a lot easier,” said Liew.
“When you want to buy hardware, if you go on-premises, it could take a year before it arrives, and you don’t want that to slow you down.”
Lachlan Colquhoun is the Australia and New Zealand correspondent for CDOTrends and HR&DigitalTrends, and the editor of NextGen Connectivity. His fascination is with how businesses are reinventing themselves through digital technology and collaborating with others to become completely new organizations. You can reach him at [email protected].
Image credit: iStockphoto/Feodora Chiosea