Google Cloud Platform Blog
Product updates, customer stories, and tips and tricks on Google Cloud Platform
Running Apache Hadoop on Google Cloud Platform
Tuesday, September 10, 2013
Hadoop is a common solution for companies and research organizations taking advantage of the Big Data revolution. Many find that as their data grows, their IT infrastructure and budget cannot keep up with the storage and computational demands. Google Cloud Platform provides a compelling alternative to purchasing and managing more servers and storage devices: consistent, high-performance virtual machines that you pay for by the minute. To help you get started, we are releasing two solution papers and two sample applications for running Hadoop on Google Cloud Platform. Learn more on the solution page.
Apache Hadoop, Hive, and Pig on Google Compute Engine
Have you heard about Hadoop, MapReduce, Hive, or Pig, but aren’t sure why you would use them? Or are you already running Hadoop and related tools on-premises and want to know what that will look like on Google Compute Engine? Read the solution paper, a complete guide to help you.
Managing Hadoop Clusters on Google Compute Engine
Are you already running a long-lived, mission-critical Hadoop cluster on Google Compute Engine and looking for management advice? Read the solution paper for a comprehensive review of how to manage it.
Get the code and get going
Launching and managing multiple machine instances, setting up users, assigning appropriate permissions, and installing and configuring software can be a minefield that lies between you and productivity. Our sample applications get you up and running quickly.
Download or fork the GitHub projects:
Apache Hadoop on Google Compute Engine
Apache Hive and Pig on Google Compute Engine
- Posted by Matt Bookman, Solutions Architect
"Apache," "Apache Hadoop," "Hadoop," "Apache Hive," "Hive," "Apache Pig," and "Pig" are trademarks of the Apache Software Foundation.