Google Unveils AI Framework for Reinforcement Learning

Google Research has published a new open-source framework for reinforcement learning that can scale to thousands of machines, delivering up to 80 times the performance of previous implementations.


The details are outlined in a white paper titled “SEED RL: Scalable and Efficient Deep-RL with Accelerated Central Inference”. The code for SEED RL is available on GitHub, along with examples that run on Google Cloud.

Reinforcement learning is currently run on a combination of CPUs and GPUs, with the former used to run inference on local copies of the model before sending the resulting data to the GPUs for training. This approach has several drawbacks, however, according to Lasse Espeholt, a research engineer at Google Research.

For a start, CPUs are far slower and less efficient than accelerators at running neural networks, which becomes a problem as models grow larger and more computationally intensive. Moreover, the bandwidth required for sending parameters and intermediate model states can become a bottleneck, and splitting different tasks across machines means that resources are not utilized optimally, he explained.
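The parameter-bandwidth problem can be made concrete with a toy sketch (hypothetical names, not SEED RL code): in the traditional setup each CPU-bound actor holds its own copy of the model, so the learner must ship the full parameter set to every actor after each update.

```python
# Toy sketch of the traditional distributed-RL setup described above:
# each actor keeps a local model copy, so every learner update must be
# shipped over the network to every actor.

class Learner:
    def __init__(self):
        self.weights = [0.0, 0.0]  # stand-in for a large parameter set

    def update(self):
        # A training step on the accelerator side (toy: nudge weights).
        self.weights = [w + 0.1 for w in self.weights]
        return list(self.weights)  # full copy sent over the network

class Actor:
    def __init__(self):
        self.weights = None  # local model copy, refreshed from learner

    def sync(self, new_weights):
        self.weights = new_weights

learner = Learner()
actors = [Actor() for _ in range(3)]
floats_sent = 0
for _ in range(5):  # five training iterations
    params = learner.update()
    for actor in actors:
        actor.sync(list(params))
        floats_sent += len(params)
# Parameter traffic grows with model size * actors * update frequency,
# which is the bandwidth bottleneck Espeholt describes.
```

With a real model holding millions of parameters instead of two, this per-update broadcast dominates network bandwidth as the actor count grows.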

The SEED RL architecture addresses these drawbacks by performing neural network inference centrally on specialized hardware. The result? A single machine can handle up to a million queries per second, scalability extends to thousands of cores, and training can span thousands of machines at a throughput of millions of frames per second.

“[This enables] accelerated inference and [avoids] the data transfer bottleneck by ensuring that the model parameters and state are kept local. While observations are sent to the learner at every environment step, latency is kept low due to a very efficient network library based on the gRPC framework with asynchronous streaming RPCs,” wrote Espeholt.

SEED RL is based on the TensorFlow 2 API and, in the benchmarks released by Google, is accelerated by tensor processing units (TPUs). TPUs are application-specific integrated circuits (ASICs) developed by Google for neural network machine learning.

Photo credit: iStockphoto/torwai