Google Cloud Platform Blog
Product updates, customer stories, and tips and tricks on Google Cloud Platform
Cloud TPU now offers preemptible pricing and global availability
Tuesday, June 19, 2018
By Brennan Saeta, TensorFlow Tech Lead for Cloud TPUs
Deep neural networks have enabled breakthroughs across a variety of business and research challenges, including translating text between languages, transcribing speech, classifying image content, and mastering the game of Go. Because training and running deep learning models can be extremely computationally demanding, we rely on our custom-built Tensor Processing Units (TPUs) to power several of our major products, including Translate, Photos, Search, Assistant, and Gmail.
Cloud TPUs allow businesses everywhere to transform their own products and services with machine learning, and we’re working hard to make Cloud TPUs as widely available and as affordable as possible. As of today, Cloud TPUs are available in two new regions in Europe and Asia, and we are also introducing preemptible pricing for Cloud TPUs that is 70% lower than the normal price.
Cloud TPUs are available in the United States, Europe, and Asia at the following rates, and you can get started in minutes via our Quickstart guide:
One Cloud TPU (v2-8) can deliver up to 180 teraflops and includes 64 GB of high-bandwidth memory. The colorful cables link multiple TPU devices together over a custom 2-D mesh network to form Cloud TPU Pods. These accelerators are programmed via TensorFlow and are widely available today on Google Cloud Platform.
Benchmarking Cloud TPU performance-per-dollar
Training a machine learning model is analogous to compiling code: ML training needs to happen fast for engineers, researchers, and data scientists to be productive, and ML training needs to be affordable for models to be trained over and over as a production application is built, deployed, and refined. Key metrics include time-to-accuracy and training cost.
Researchers at Stanford recently hosted an open benchmarking competition called DAWNBench that focused on time-to-accuracy and training cost, and Cloud TPUs won first place in the large-scale ImageNet Training Cost category. On a single Cloud TPU, our open-source AmoebaNet reference model cost only $49.30 to reach the target accuracy, and our open-source ResNet-50 model cost just $58.53. Our TPU Pods also won the ImageNet Training Time category: the same ResNet-50 code running on just half of a TPU Pod was nearly six times faster than any non-TPU submission, reaching the target accuracy in approximately 30 minutes!
Although we restricted ourselves to standard algorithms and standard learning regimes for the competition, another DAWNBench submission from fast.ai (3rd place in ImageNet Training Cost, 4th place in ImageNet Training Time) altered the standard ResNet-50 training procedure in two clever ways to achieve faster convergence (GPU implementation here). After DAWNBench was over, we easily applied the same optimizations to our Cloud TPU ResNet-50 implementation. This reduced ResNet-50 training time on a single Cloud TPU from 8.9 hours to 3.5 hours, a 2.5X improvement, which made it possible to train ResNet-50 for just $25 with normal pricing.
Preemptible Cloud TPUs make the Cloud TPU platform even more affordable.
You can now train ResNet-50 on ImageNet from scratch for just $7.50.
Preemptible Cloud TPUs allow fault-tolerant workloads to run more cost-effectively than ever before; these TPUs behave similarly to Preemptible VMs. And because TensorFlow has built-in support for saving and restoring from checkpoints, deadline-insensitive workloads can easily take advantage of preemptible pricing. This means you can train cutting-edge deep learning models to achieve DAWNBench-level accuracy for less than you might pay for lunch!
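To make the checkpoint-and-resume idea concrete, here is a minimal, framework-free sketch in plain Python. It is not the TensorFlow API: the JSON file, the `save_checkpoint`, `load_checkpoint`, and `train` helpers, and the toy "weight" update are all hypothetical stand-ins chosen for illustration. The point is only the pattern: save state periodically, and on restart resume from the last saved step rather than from zero.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state to a temp file, then rename, so a preemption
    in the middle of a write cannot leave a corrupt checkpoint."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the last saved state, or a fresh state if none exists."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"step": 0, "weight": 0.0}


def train(path, total_steps, checkpoint_every, preempt_at=None):
    """Toy training loop. Returns True if training finished,
    False if a (simulated) preemption interrupted it."""
    state = load_checkpoint(path)
    while state["step"] < total_steps:
        if preempt_at is not None and state["step"] == preempt_at:
            # Simulated preemption: work since the last save is lost.
            return False
        state["weight"] += 0.5  # stand-in for one real training step
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return True


def demo():
    """Run once with a preemption at step 37, then restart and finish."""
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "ckpt.json")
        finished = train(path, total_steps=100, checkpoint_every=10, preempt_at=37)
        resumed_from = load_checkpoint(path)["step"]
        finished_after_restart = train(path, total_steps=100, checkpoint_every=10)
        final = load_checkpoint(path)
    return finished, resumed_from, finished_after_restart, final
```

In this sketch a preemption costs at most `checkpoint_every` steps of repeated work, which is why the approach suits workloads that can tolerate delay: the job still finishes, just later.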
| Select Open-Source Reference Models | Normal training cost (TF 1.8) | Preemptible training cost (TF 1.8) |
|---|---|---|
| ResNet-50 (with optimizations from fast.ai): Image classification | ~$25 | ~$7.50 |
| ResNet-50 (original implementation): Image classification | ~$59 | ~$18 |
| AmoebaNet: Image classification (model architecture evolved from scratch on TPUs to maximize accuracy) | ~$49 | ~$15 |
| RetinaNet: Object detection | ~$40 | ~$12 |
| Transformer: Neural machine translation | ~$41 | ~$13 |
| ASR Transformer: Speech recognition (transcribe speech to text) | ~$86 | ~$27 |
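As a rough sanity check, the figures listed above can be compared with the announced 70% discount. The sketch below applies that discount to each normal cost and reports the gap to the listed preemptible cost. The listed values are approximate, and this post does not say which charges the discount covers, so treating the whole training cost as discounted is an assumption made here for illustration; small gaps are expected.

```python
# Approximate costs in USD from the figures above: (normal, listed preemptible).
TABLE = {
    "ResNet-50 (fast.ai optimizations)": (25.0, 7.50),
    "ResNet-50 (original)": (59.0, 18.0),
    "AmoebaNet": (49.0, 15.0),
    "RetinaNet": (40.0, 12.0),
    "Transformer": (41.0, 13.0),
    "ASR Transformer": (86.0, 27.0),
}

# "70% lower than the normal price", applied to the full cost (an assumption).
PREEMPTIBLE_DISCOUNT = 0.70


def discounted(normal_cost, discount=PREEMPTIBLE_DISCOUNT):
    """Cost after applying the discount to the entire normal cost."""
    return round(normal_cost * (1.0 - discount), 2)


def compare(table=TABLE):
    """Return (model, normal, estimate, listed, gap) for each row,
    where gap = listed preemptible cost minus the discounted estimate."""
    rows = []
    for model, (normal, listed) in table.items():
        estimate = discounted(normal)
        rows.append((model, normal, estimate, listed, round(listed - estimate, 2)))
    return rows


if __name__ == "__main__":
    for model, normal, estimate, listed, gap in compare():
        print(f"{model}: normal ${normal:.2f}, 70% off ${estimate:.2f}, "
              f"listed ${listed:.2f}, gap ${gap:.2f}")
```

Under this assumption every listed preemptible cost lands at or slightly above the discounted estimate, which is consistent with rounding and with some portion of the cost not being discounted.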
Start using Cloud TPUs today
We aim for Google Cloud to be the best place to run all of your machine learning workloads. Cloud TPUs offer great performance-per-dollar for training and batch inference across a variety of machine learning applications, and we also offer top-of-the-line GPUs with recently-improved preemptible pricing.
We’re excited to see what you build! To get started, please check out the Cloud TPU Quickstart, try our open source reference models, and be sure to sign up for a free trial to start with $300 in cloud credits. Finally, we encourage you to watch our Cloud-TPU-related sessions from Google I/O and the TensorFlow Dev Summit: “Effective machine learning with Cloud TPUs” and “Training Performance: A user’s guide to converge faster.”
A datacenter technician scoots past two rows of Cloud TPUs and supporting equipment.