Google Launches TPU v5p AI Chip
- By Paul Mah
- December 20, 2023
Google has launched a new version of its Tensor Processing Unit (TPU), unveiling it alongside the Gemini large language model (LLM) it announced with great fanfare.
Google’s TPUs are custom application-specific integrated circuits (ASICs) developed in-house for machine learning. The new Cloud TPU v5p follows the Cloud TPU v5e, which went into general availability earlier this year.
Unlike the “e” version that was optimized for cost efficiency, the new “p” version is optimized for performance.
Fastest AI chip yet
TPU v5p can push 459 teraFLOPS of bfloat16 performance or 918 teraOPS of Int8, two to five times the throughput of TPU v4. In addition, a v5p pod consists of 8,960 chips in total and is backed by Google’s fastest interconnect yet, with up to 4,800 Gbps of bandwidth per chip.
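Taken together, those per-chip numbers imply a very large aggregate figure at pod scale. The back-of-envelope calculation below simply multiplies the quoted peak per-chip bfloat16 throughput by the pod size; it is a naive peak estimate based only on the figures above, not a sustained benchmark:

```python
# Back-of-envelope arithmetic from the figures quoted above; real
# sustained performance will be lower than this naive peak estimate.
CHIPS_PER_POD = 8_960          # chips in one TPU v5p pod
BF16_TFLOPS_PER_CHIP = 459     # peak bfloat16 teraFLOPS per chip

peak_pod_exaflops = CHIPS_PER_POD * BF16_TFLOPS_PER_CHIP / 1_000_000
print(f"~{peak_pod_exaflops:.1f} exaFLOPS peak bf16 per pod")  # ~4.1
```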
According to Google, this means TPU v5p can train a large language model like GPT-3 175B 2.8 times faster than TPU v4.
“In our early-stage usage, Google DeepMind and Google Research have observed 2X speedups for LLM training workloads using TPU v5p chips compared to the performance on our TPU v4 generation,” says Jeff Dean, the chief scientist of Google DeepMind and Google Research.
“The robust support for ML Frameworks (JAX, PyTorch, TensorFlow) and orchestration tools enables us to scale even more efficiently on v5p. With the 2nd generation of SparseCores we also see significant improvement in the performance of embeddings-heavy workloads. TPUs are vital to enabling our largest-scale research and engineering efforts on cutting-edge models like Gemini.”
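The quote above names JAX among the supported frameworks. As a minimal illustrative sketch (not code from Google’s announcement), this is roughly how a bfloat16 computation, the number format behind the throughput figures above, might be dispatched to TPU devices from JAX; the array shapes here are arbitrary assumptions:

```python
import jax
import jax.numpy as jnp

# On a Cloud TPU VM this lists the TPU cores the runtime can see;
# on a machine without accelerators it falls back to CPU.
print(jax.devices())

# bfloat16 is the 16-bit format the v5p throughput figures are quoted in.
x = jnp.ones((1024, 1024), dtype=jnp.bfloat16)
y = jnp.ones((1024, 1024), dtype=jnp.bfloat16)

# jax.jit compiles the computation through XLA for whichever backend
# (TPU, GPU, or CPU) is available.
matmul = jax.jit(jnp.dot)
print(matmul(x, y).dtype)  # bfloat16
```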
Google also introduced the concept of the “AI Hypercomputer”, a cloud-based supercomputer architecture that combines performance-optimized hardware, open software, ML frameworks, and flexible consumption models. It uses liquid cooling and Google’s Jupiter data center networking technology.
“Today, with Cloud TPU v5p and AI Hypercomputer, we’re excited to extend the result of decades of research in AI and systems design with our customers, so they can innovate with AI faster, more efficiently, and more cost-effectively,” wrote Google’s Amin Vahdat and Mark Lohmeyer in a blog post.
Long used to power machine learning features across Google’s own suite of services, TPUs are now being opened to the public for AI training and inference. Indeed, customers like Salesforce and Lightricks are already training and serving large AI models with Google Cloud’s TPU v5p.
Image credit: iStockphoto/selote
Paul Mah
Paul Mah is the editor of DSAITrends, where he reports on the latest developments in data science and AI. A former system administrator, programmer, and IT lecturer, he enjoys writing both code and prose.