The AlphaGo program by DeepMind leveraged reinforcement learning as part of a repertoire of tricks to beat Go champion Lee Sedol in 2016, while Google uses it to juggle hundreds of disparate system parameters to keep critical systems cooled within its data centers.
Reinforcement learning, which centers on maximizing cumulative reward, is a mature machine learning technology. Unlike supervised machine learning, it thrives in areas where there is no correct answer to learn from. And in a world where there is often no “right” answer, reinforcement learning could be the next big thing in business, according to a recent article in the Harvard Business Review.
Written by Dr. Kathryn Hume and Associate Professor Matthew E. Taylor, the article outlines the potential of reinforcement learning, how top businesses are using it to solve tough problems, and how organizations can spot opportunities to deploy it.
Understanding reinforcement learning
So how can reinforcement learning play a role in business? Answering this question requires first understanding how it works. Hume and Taylor explain reinforcement learning this way: “[With reinforcement learning] there’s no correct answer to learn from. Reinforcement learning systems produce actions, not predictions — they’ll suggest the action most likely to maximize (or minimize) a metric.”
Because reinforcement learning systems figure things out through trial and error, they work best in situations where actions are taken in rapid succession and feedback arrives quickly enough to inform the next action. Reinforcement learning does not need reams of historical data to crunch through.
A stock market algorithm that can take hundreds of actions per day is hence an ideal use case for reinforcement learning, while optimizing customer lifetime value over years is not. It is worth noting that reinforcement learning does not work well with ambiguity but is superb at optimization tasks that use established metrics in the form of inputs, actions, and rewards.
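To make the trial-and-error loop concrete, here is a minimal sketch of one of the simplest reinforcement learning setups, an epsilon-greedy "multi-armed bandit". It is an illustration only, not the method used by Google, DeepMind, or the article's authors. The function name `run_epsilon_greedy`, the three hypothetical actions, their average rewards, and the noise level are all assumptions made up for this example.

```python
import random


def run_epsilon_greedy(true_means, steps=5000, epsilon=0.1, noise=0.5, seed=0):
    """Learn by trial and error which action earns the highest average reward.

    true_means is a hypothetical list of each action's real average reward.
    The learner never sees it directly; it only sees noisy feedback.
    """
    rng = random.Random(seed)
    n_actions = len(true_means)
    counts = [0] * n_actions        # how often each action has been tried
    estimates = [0.0] * n_actions   # running average reward per action
    total_reward = 0.0              # cumulative reward collected so far

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: occasionally try a random action.
            action = rng.randrange(n_actions)
        else:
            # Exploit: take the action that currently looks best.
            action = max(range(n_actions), key=lambda a: estimates[a])

        # Quick feedback: a noisy reward for the chosen action.
        reward = rng.gauss(true_means[action], noise)

        # Update the running average for that action.
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]
        total_reward += reward

    best_action = max(range(n_actions), key=lambda a: estimates[a])
    return best_action, estimates, total_reward


if __name__ == "__main__":
    best, estimates, total = run_epsilon_greedy([0.2, 0.5, 1.0])
    print("Best action found:", best)
    print("Estimated rewards:", [round(e, 2) for e in estimates])
    print("Cumulative reward:", round(total, 1))
```

The sketch needs no historical data. The system starts with no knowledge, acts, receives a reward, and adjusts its estimates. That works only because feedback arrives after every action, which is why slow-feedback problems such as multi-year customer lifetime value are a poor fit.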
This makes reinforcement learning ideal for the automation of processes or for managing dense, data-generating business processes. In Google’s case, systems track outside weather conditions and air pressure alongside the fluid internal dynamics of a typical data center environment to modify controls such as fan speed, air-handling units, and scores of environmental systems to keep servers cooled with a minimum expenditure of energy.
“Reinforcement learning algorithms are able to pick up on nuances that would be too hard to describe with formulas and rules,” explained Hume and Taylor. And as Google demonstrated, it is perfect for working with the unique characteristics and differing physical locations of its global network of data centers.
To be clear, Hume and Taylor do not recommend reinforcement learning if a particular problem can be tackled with other optimization techniques, or even other machine learning approaches. They suggest that businesses start with a list of business processes that involve a sequence of steps, shortlisting those that require frequent actions and opportunities for feedback.
As with all AI approaches, there is a cost to implementing reinforcement learning. Businesses will need to ask whether reinforcement learning is economical for what they have in mind: “To answer whether the investment will pay off, technical teams should take stock of computational resources to ensure you have the compute power required to support trials and allow the system to explore and identify the optimal sequence.”
Finally, organizations will need to be patient. Though a mature technology, reinforcement learning is hardly magic and will not find the optimal path from day one. False starts are possible, too. However, deployed well and given time, reinforcement learning can potentially find surprising, creative solutions to help organizations outpace their competition.
Paul Mah is the editor of DSAITrends. A former system administrator, programmer, and IT lecturer, he enjoys writing both code and prose.
Image credit: iStockphoto/PhonlamaiPhoto