Google Cloud Platform Blog
Product updates, customer stories, and tips and tricks on Google Cloud Platform
Quantifying the performance of the TPU, our first machine learning chip
Wednesday, April 5, 2017
By Norm Jouppi, Distinguished Hardware Engineer, Google
Editor's Note: Learn about our newly available Cloud TPUs.
We’ve been using compute-intensive machine learning in our products for the past 15 years. We use it so much that we even designed an entirely new class of custom machine learning accelerator, the Tensor Processing Unit.
Just how fast is the TPU, actually? Today, in conjunction with a TPU talk for a National Academy of Engineering meeting at the Computer History Museum in Silicon Valley, we’re releasing a study that shares new details on these custom chips, which have been running machine learning applications in our data centers since 2015. This first generation of TPUs targeted inference (the use of an already trained model, as opposed to the training phase of a model, which has somewhat different characteristics), and here are some of the results we’ve seen:
On our production AI workloads that utilize neural network inference, the TPU is 15x to 30x faster than contemporary GPUs and CPUs.
The TPU also achieves much better energy efficiency than conventional chips, with a 30x to 80x improvement in the TOPS/Watt measure (tera-operations [trillion, or 10^12, operations] of computation per second per watt of power consumed).
The neural networks powering these applications require a surprisingly small amount of code: just 100 to 1500 lines. The code is based on TensorFlow, our popular open-source machine learning framework.
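To make the TOPS/Watt measure concrete, here is a minimal sketch of how such a ratio could be computed and compared between two chips. The function name and all of the numbers are hypothetical, chosen only to illustrate the arithmetic; they are not measurements from the study.

```python
def tops_per_watt(operations: float, seconds: float, watts: float) -> float:
    """Return tera-operations per second per watt (TOPS/Watt).

    operations: total operations performed during the measurement
    seconds:    wall-clock duration of the measurement
    watts:      average power drawn during the measurement
    """
    if seconds <= 0 or watts <= 0:
        raise ValueError("seconds and watts must be positive")
    tera_ops_per_second = operations / seconds / 1e12
    return tera_ops_per_second / watts


# Hypothetical figures for illustration only (not taken from the study).
baseline = tops_per_watt(operations=2e12, seconds=1.0, watts=100.0)
accelerator = tops_per_watt(operations=40e12, seconds=1.0, watts=50.0)

# The improvement factor is the ratio of the two efficiency figures.
improvement = accelerator / baseline
print(f"baseline:    {baseline:.3f} TOPS/Watt")
print(f"accelerator: {accelerator:.3f} TOPS/Watt")
print(f"improvement: {improvement:.1f}x")
```

Because the measure divides throughput by power, a chip can improve its score either by doing more operations per second or by drawing less power while doing them.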
More than 70 authors contributed to this report. It really does take a village to design, verify, implement and deploy the hardware and software of a system like this.
The need for TPUs really emerged about six years ago, when we started using computationally expensive deep learning models in more and more places throughout our products. The computational expense of using these models had us worried. If we considered a scenario where people use Google voice search for just three minutes a day and we ran deep neural nets for our speech recognition system on the processing units we were using, we would have had to double the number of Google data centers!
TPUs allow us to make predictions very quickly, and enable products that respond in fractions of a second. TPUs are behind every search query; they power accurate vision models that underlie products like Google Image Search, Google Photos and the Google Cloud Vision API; they underpin the groundbreaking quality improvements that Google Translate rolled out last year; and they were instrumental in Google DeepMind's victory over Lee Sedol, the first instance of a computer defeating a world champion in the ancient game of Go.
We’re committed to building the best infrastructure and sharing those benefits with everyone. We look forward to sharing more updates in the coming weeks and months.