Careful, AI May Ingest Your Entire IT Budget
- By Sheila Lam
- August 14, 2023
The excitement around AI has drastically raised adoption, and with it, IT spending. IDC reports that the percentage of APAC businesses leveraging AI in their operations jumped from 39% in 2022 to 76% in 2023. The firm also anticipates worldwide revenue for AI software, hardware, and services to surpass USD 500 billion by the end of 2023.
“This is the iPhone moment for AI,” said Samuel Lo, general manager of Nvidia AI Technology Center Hong Kong. “We saw how iPhone brought smartphones into our daily lives. We believe ChatGPT is doing the same with AI. It is bringing AI into our daily and business lives.”
Higher demand for AI applications also drives higher demand for computing power, fueling a boom in the graphics processing unit (GPU) market. GPU designer Nvidia reported record revenue from its data center division, driven by GPUs used to process generative AI workloads. Some crypto mining companies have also redirected their GPU investments to cash in on the AI boom.
AI spikes computing resources
The demand for GPUs is driven by the training of large language models (LLMs). According to Deepika Giri, head of research, big data & AI at IDC APJ, training and fine-tuning LLMs for specific enterprise use cases can be an arduous task involving massive computing and energy resources.
“[GPT-3] takes more than 800 GPUs for 30 days to complete one cycle of training,” added Lo.
Such demand for GPUs is likely to continue and grow. He said most GPT functions currently focus on processing text-based or voice-based information, but the models can potentially support other media, like images, videos, animation, or even 3D models, requiring even more GPU power.
Running these GPUs also requires a considerable amount of electricity. Research indicates GPUs consume an average of 400 watts of power, more than five times that of CPUs, which average only 70 watts. Other researchers calculated that training a medium-sized generative AI model consumes enough energy to generate roughly 626,000 pounds of CO2 emissions, equivalent to the lifetime emissions of five cars.
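Putting the figures above together (a cluster of 800 GPUs drawing roughly 400 watts each for a 30-day training cycle) gives a sense of the energy bill. The electricity price below is an illustrative assumption, not a quoted rate:

```python
# Back-of-envelope energy estimate for one LLM training cycle,
# using the figures quoted above. The electricity price is an
# illustrative assumption; actual rates vary widely by region.
GPUS = 800            # GPUs in the training cluster
WATTS_PER_GPU = 400   # average GPU power draw in watts
DAYS = 30             # length of one training cycle
PRICE_PER_KWH = 0.15  # assumed electricity price in USD per kWh

hours = DAYS * 24
energy_kwh = GPUS * WATTS_PER_GPU * hours / 1000  # watts -> kW, then kWh
cost_usd = energy_kwh * PRICE_PER_KWH

print(f"Energy: {energy_kwh:,.0f} kWh")           # Energy: 230,400 kWh
print(f"Electricity: ~USD {cost_usd:,.0f}")       # Electricity: ~USD 34,560
```

Even before counting the hardware itself, a single training run at this scale consumes hundreds of megawatt-hours.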
All these are adding demand for data centers. “AI workload cannot be processed independently within a single server; it works as a cluster,” said Zena Cheng, vice president for channel at SUNeVision, a Hong Kong-based data center operator.
“Within an AI infrastructure, a data center may have 1,000 server nodes, but they need to be well connected with high-throughput low latency connectivity, especially when you want to train the machine,” added Lo.
Thus, Cheng said infrastructure planning, covering the demand for space, computing resources, and electricity, becomes very important if enterprises or cloud providers are to continue supporting generative AI.
Is AI shooting up your cloud bills?
With the complications, demands, and cost of IT infrastructure for AI, many businesses are turning to public clouds.
“Running their AI workloads under an on-premises environment or in public clouds is really a choice for the business,” said Fred Sheu, national technology officer at Microsoft Hong Kong. But he added that public clouds often provide the scalability enterprises require, along with cloud-based model-as-a-service offerings.
“It takes time to train the model, and time is often a pressing issue for businesses. Thus we are providing different models to help businesses to kickstart their AI journey,” he said.
On top of Microsoft’s Azure OpenAI Service, Google’s Vertex AI and AWS’ Bedrock also provide cloud-based model-as-a-service (MAAS). These offerings charge by usage and allow businesses to take advantage of pre-trained models to build generative AI applications.
Nevertheless, with the popularity of AI, enterprises looking to build game-changing AI applications could quickly see their cloud bills rising. According to a 2022 State of Cloud Cost study, nearly half of global IT executives found it difficult to get cloud costs under control.
“Whatever the approach to leverage the technology, there is an inherent cost associated with the underlying infrastructure, as the model is compute-heavy,” said Giri from IDC. “The price of compute is either in the form of an upfront investment to set up the data center or built into the price of the MAAS offering; there is no escaping it.”
Optimizing AI spending
To manage and optimize IT spending for AI, experts offer different recommendations. Some suggested reviewing your cloud strategy.
“Cloud gives you a taste of how to use AI. [Companies] will eventually run the [AI model] in a semi-private or hybrid cloud environment,” said Cheng. “When it gets to that point, it’s time to look at [your] infrastructure.”
While the starting price of most cloud-based MAAS is low (for example, Microsoft charges only USD 2 per 100 images for the DALL-E model), the total cost still depends on the model size and the parameters used for analysis.
Track, measure, and manage
Part of the excitement around generative AI comes from the availability of ChatGPT, and enterprises are taking a similar approach by making their subscribed AI services freely available so users can explore ways to apply generative AI. But this approach could be costly.
Sheu suggested budgeting cloud resource usage with Microsoft’s pricing calculator. Some businesses also build meters to track usage across users and set alerts or upper limits to keep consumption under control.
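A per-user meter of the kind described above can be sketched in a few lines. The quota values and the alert hook here are illustrative assumptions, not part of any vendor tooling:

```python
# Minimal per-user usage meter with a soft alert and a hard cap.
# Quota numbers and the alert mechanism are illustrative assumptions.
from collections import defaultdict

class UsageMeter:
    def __init__(self, soft_limit: int = 80_000, hard_limit: int = 100_000):
        self.soft_limit = soft_limit   # tokens: warn once crossed
        self.hard_limit = hard_limit   # tokens: block once reached
        self.usage = defaultdict(int)  # tokens consumed per user

    def record(self, user: str, tokens: int) -> bool:
        """Record usage; return False if the request should be blocked."""
        if self.usage[user] + tokens > self.hard_limit:
            return False               # over the cap: reject the call
        self.usage[user] += tokens
        if self.usage[user] > self.soft_limit:
            # In practice this would notify finance or IT, e.g. via email.
            print(f"ALERT: {user} passed the soft limit")
        return True

meter = UsageMeter()
assert meter.record("alice", 50_000)      # within quota
assert meter.record("alice", 40_000)      # crosses soft limit, still allowed
assert not meter.record("alice", 20_000)  # would exceed hard cap: blocked
```

Real deployments would pull usage from the provider's billing or metering APIs rather than counting locally, but the alert-then-cap pattern is the same.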
He added that AI tools can also be embedded within business applications and integrated with relevant business processes. This way, businesses can prioritize the use of AI that aligns with business strategy, better control the parameters for processing, and manage the ROI of its usage.
Others are turning to prompt engineering techniques, like retrieval-augmented generation (RAG), to avoid costly training. Sheu said that instead of fine-tuning and further training a base model, tools like RAG allow enterprises to use pre-trained models as a reasoning engine over enterprise data to generate responses.
“This technique enables in-context learning without the need for expensive fine-tuning, empowering businesses to use LLMs more efficiently,” he said.
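The RAG pattern Sheu describes can be sketched as follows: retrieve the most relevant enterprise documents, then prepend them to the prompt so an unmodified pre-trained model answers from that context. The documents and the keyword-overlap scoring below are toy assumptions; production systems use vector embeddings and a real LLM call:

```python
# Toy sketch of retrieval-augmented generation (RAG): retrieve relevant
# enterprise documents, then build a grounded prompt for a pre-trained
# model, with no fine-tuning required. Documents and scoring are
# illustrative; real systems use embedding-based similarity search.

DOCS = [
    "Refund policy: customers may return goods within 30 days.",
    "Shipping: orders dispatch within 2 business days.",
    "Support hours: 9am to 6pm, Monday to Friday.",
]

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Rank documents by naive keyword overlap with the query."""
    q_words = set(query.lower().split())
    scored = sorted(
        docs,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query: str) -> str:
    """Assemble the grounded prompt to send to the pre-trained model."""
    context = "\n".join(retrieve(query, DOCS))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("What is the refund policy for returns?"))
```

Because the model only reads retrieved context at inference time, the enterprise data never enters a training run, which is where the cost savings come from.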
Sheu added with good practices and management, “there shouldn’t be a spike in the cost of applying AI.”
“AI has a lot of different applications. If we can find a way to use AI efficiently, it will become an asset for the business,” Cheng concluded.
Sheila Lam is the contributing editor of CDOTrends. Covering IT for 20 years as a journalist, she has witnessed the emergence, hype, and maturity of different technologies but is always excited about what's next. You can reach her at [email protected].
Image credit: iStockphoto/superburo