Google Cloud Platform Blog: November 2011

Scaling with the Kindle Fire

Tuesday, November 29, 2011

Today’s blog post comes to us from Greg Bayer of Pulse, a popular news reading application for iPhone, iPad and Android devices. Pulse has used Google App Engine as a core part of their infrastructure for over a year and they recently celebrated a significant launch. We hope you find their experiences and tips on scaling useful.

As part of the much anticipated Kindle Fire launch, Pulse was announced as one of the only preloaded apps. When you first un-box the Fire, Pulse will be there waiting for you on the home row, next to Facebook and IMDB!

Scale
The Kindle Fire is projected to sell over five million units this quarter alone. This means that those of us who work on backend infrastructure at Pulse have had to prepare for nearly doubling our user-base in a very short period. We also need to be ready for spikes in load due to press events and the holiday season.

Architecture
As I’ve discussed previously on the Pulse Engineering Blog, Pulse’s infrastructure has been designed with scalability in mind from the beginning. We’ve built our web site and client APIs on top of Google App Engine, which has allowed us to grow steadily from 10s to many 1000s of requests per second, without needing to re-architect our systems.

While restrictive in some ways, we’ve found App Engine’s frontend serving instances (running Python in our case) to be extremely scalable, with minimal operational support from our team. We’ve also found the datastore, memcache, and task queue facilities to be equally scalable.

Pulse’s backend infrastructure provides many critical services to our native applications and web site. For example, we cache and serve optimized feed and image data for each source in our catalog. This allows us to minimize latency and data transfer and is especially important to providing an exceptional user experience on limited mobile connections. Providing this service for millions of users requires us to serve 100Ms of requests per day. As with any well designed App Engine app, the vast majority of these requests are served out of memcache and never hit the datastore. Another useful technique we use is to set public cache control headers wherever possible, to allow Google’s edge cache (shown as cached requests on the graph below) and ISP / mobile carrier caches to serve unchanged content directly to users.

Costs
Based on App Engine’s projected billing statements leading up to the recent pricing changes, we were concerned that our costs might increase significantly. To prepare for these changes and the expected additional load from Kindle Fire users, we invested some time in diagnosing and reducing these costs. In most cases, the increases turned out to be an indicator of inefficiencies in our code and/or in the App Engine scheduler. With a little optimization, we have reduced these costs dramatically.

The new tuning sliders for the scheduler make it possible to rein in overly aggressive instance allocation. In the old pricing structure, idle instance time wasn’t charged for at all, so these inefficiencies were usually ignored. Now App Engine charges for all instance time by default. However, any time App Engine runs more idle instances than you’ve allowed, those hours are free. This acts as a hint to the scheduler, helping it reduce unneeded idle instances. By doing some testing to find the optimal cost vs spike latency tolerance and setting the sliders to those levels, we were able to reduce our frontend instance costs to near original levels. Our heavy usage of memcache (which is still free!) also helps keep our instance hours down.

Since datastore operations used to be charged under the umbrella of CPU hours, it was difficult to know the cost of these operations under the old pricing structure. This meant it was easy to miss application inefficiencies, especially for write-heavy workloads where additional indexes can have a multiplicative effect on costs. In our case, the new datastore write operations metric led us to notice some inefficiencies in our design and a tendency to overuse indexes. We are now working to minimize the number of indexes our queries rely on, and this has started to reduce our write costs.

Preparing for the Kindle Fire Launch
We took a few additional steps to prepare for the expected load increase and spikes associated with the Fire’s launch. First, we contacted App Engine’s support team to warn them of the expected increase. This is recommended for any app at or near 10,000 requests per second (to make sure your application is correctly provisioned). We also signed up for a Premier account which gets us additional support and simpler billing.

Architecturally, we decided to split our load across three primary applications, each serving different use cases. While this makes it harder to access data across these applications, those same boundaries serve to isolate potential load-related problems and make tuning simpler. In our case, we were able to divide certain parts of our infrastructure, where cross application data access was less important and load would be significant. Until App Engine provides more visibility into and control of memcache eviction policies, this approach also helps prevent lower priority data from evicting critical data.

I’m hopeful that in the near future such division of services will not be required. Individually tunable load isolation zones and memcache controls would certainly make it a lot more appealing to have everything in a single application. Until then, this technique works quite well, and helps to simplify how we think about scaling.

To learn more about Pulse, check out our website! If you have comments or questions about this post or just want to reach out directly, you can find me @gregbayer.

New Datastore client library for Python ready for a test drive

Wednesday, November 16, 2011

Last week we announced that App Engine has left preview and is now an officially supported product here at Google. And while the release (and the announcement) was chock-full of great features, one of the features that we’d like to call specific attention to is the new Datastore client library for Python (a.k.a “NDB”).

NDB has been under development for some time and this release marks its availability to a larger audience as an experimental feature. Some of the benefits of this new library include:

The StructuredProperty class, which allows entities to have nested structure
Integrated two-level caching, using both memcache and a per-request in-process cache
High-level asynchronous API using Python generators as coroutines (PEP 342)
New, cleaner implementations of Key, Model, Property and Query classes

The version of NDB contained in the 1.6.0 runtime and SDK corresponds to NDB 0.9.1, which is currently the latest NDB release.

Given that this feature is still experimental, it is subject to change, but that’s exactly why we encourage you to give it a test drive and send us any feedback that you might have. The NDB project hosted on Google Code is the best place to send this feedback. Happy coding!

Posted by Guido van Rossum, Software Engineer on the App Engine Team

Google BigQuery Service: Big data analytics at Google speed

Monday, November 14, 2011

Our post today, cross-posted with the Google Enterprise Blog, comes from one of our sister projects, BigQuery. We know that many of you are interested in processing large volumes of data and we encourage you to try it out.
Rapidly crunching terabytes of big data can lead to better business decisions, but this has traditionally required tremendous IT investments. Imagine a large online retailer that wants to provide better product recommendations Google BigQuery ServiceGoogle I/O last year

We’ve added a graphical user interface for analysts and developers to rapidly explore massive data through a web application.

We’ve made big improvements for customers accessing the service programmatically through the API. The new REST API lets you run multiple jobs in the background and manage tables and permissions with more granularity.

Whether you use the BigQuery web application or API, you can now write even more powerful queries with JOIN statements. This lets you run queries across multiple data tables, linked by data that tables have in common.

It’s also now easy to manage, secure, and share access to your data tables in BigQuery, and export query results to the desktop or to Google Cloud Storage.

remarkedlet us know

App Engine 1.6.0 Out of Preview Release

Monday, November 7, 2011

Three and a half years after App Engine’s first Campfire One, App Engine has graduated from Preview and is now a fully supported Google product. We started out with the simple philosophy that App Engine should be ‘easy to use, easy to scale, and free to get started.’ And with 100 billion+ monthly hits, 300,000+ active apps, and 100,000+ developers using our product every month it’s clear that this philosophy resonates. Thanks to your support, Google is making a long term investment in App Engine! When we announced our plans to leave preview earlier this year, we made a commitment to improving the service by adding support for Python 2.7, Premier Accounts and Backends as well as several changes launching today:

Pricing: The new pricing structure announced in May (and updated based on feedback from the community) will be reflected in your bill starting on Nov 7th as previously announced.

Terms of Service: We have a new business-friendly terms of service and acceptable use policy.

Service Level Agreement: All paid applications on the High Replication Datastore are covered by our 99.95% SLA.

We are also holding a series of App Engine Office hours via Google+ this week for any users who have questions about how these changes impact their applications. The list of times can be found on the Google Developers events page, with links to join the hangout while the office hours are scheduled. Also, please don’t hesitate to contact us at appengine_updated_pricing@google.com with any questions or concerns. In addition to leaving Preview, we have several additional changes to announce today.
Production Changes For billing enabled apps, we are offering two more scheduler controls and some additional changes:

Min Idle Instances: You can now adjust the minimum number of Idle Instances for your application, from 1 to 100. Users who had previously signed up for “Always On” can now set the number of idle instances for their applications using this setting.

Max Pending Latency: For applications that care about user facing latency, this slider allows you to set a limit to the amount of time a request spends in the pending queue before starting up a new instance.

Blobstore API: You can now use the Blobstore API without signing up for billing.

Datastore Changes

High Replication Datastore Migration Tool: We are releasing an experimental tool that allows you to easily migrate your data from Master/Slave to High Replication Datastore, and seamlessly switch your application’s serving to the new HRD application.

Query Planning Improvements: We’ve published an article that details recent improvements to our query planner that eliminate the need for exploding indexes.

Python

MapReduce: We are releasing the full MapReduce framework in experimental for Python. The framework includes the Map, Shuffle, and Reduce phases.

Python 2.7 in the SDK: The SDK now supports the Python 2.7 runtime, so you can test out your changes before uploading them to production.

Java™

Memcache API Improvements: The Memcache API for Java now supports asynchronous calls. Additionally, putIfUntouched() and getIdentifiable() now support batch operations.

Capability Testing: We’ve added the ability to simulate the capability state of local API implementations to test your application’s behavior if a service is unavailable.

Datastore Callbacks: You can now specify actions to perform before or after a put() or delete() call.

The full list of changes with this release can be found in the release notes (Python, Java). We’d love to hear your feedback about this release in the groups. And we’d like to thank you all for investing in our platform for the last three years. We’re excited for this milestone in App Engine history, and we look forward to what the future will bring.

Posted by The App Engine Team

Oracle and Java are registered trademarks of Oracle and/or its affiliates.