Google Cloud Platform Blog: May 2011

App Engine at I/O 2011, Day 2

Wednesday, May 11, 2011

Good morning I/O-ers and welcome to day two! We hope you got a chance to see all of the great sessions on day one and, if you missed our big announcements from yesterday, take a look at all the great features in 1.5.0 including Backends and our plans for App Engine to leave preview later this year.We’ve got a great set of sessions lined up for day two including updates on our progress with Full-text Search and MapReduce. We’ve also got two great sessions on a subject close to developers’ hearts: reliability.Under the Covers with the High Replication Datastore - At 10:45 am, we jump into the internals of the High Replication Datastore (HRD), explaining how it works, how it differs from the Master/Slave configuration, and why developers love it. Now that HRD is the default configuration and cheaper to use, come find out what you’re missing.Life in App Engine Production - At 3:00 pm, come meet the team wearing the pagers so you don’t have to, App Engine’s Site Reliability Engineers (SREs). In this session, they’ll give you a view of what life behind the scenes is like and why you should concentrate on your application and let the SREs take care of keeping the lights on. Learn how building on App Engine means your application gets its very own Devops team.For those of you that couldn't make it to I/O this year, don’t stress. While we wish you were here, the I/O video team will soon have videos of all our sessions available so you can catch up from the comfort of your own home. We’ve even captured a few of our Developer Sandbox companies so you’ll get the full experience!Posted by the App Engine team

The Year Ahead for Google App Engine!

Tuesday, May 10, 2011

Google App Enginelaunched in Preview status in 2008Google App Engine for BusinessGoogle App Engine for Business (GAE4B) was announced at Google I/O last year

Our customers are really excited about many of the features of GAE4B: SLA, SSL for custom domains, SQL, etc

The per-user, per-app pricing is not appropriate for many customers that are excited about the GAE4B features because they are not always building applications focused on users within their organization.

Because App Engine is currently a Preview product, some businesses are reluctant to heavily invest in the platform.

App Engine graduating from Preview later this year!

We will be leaving Preview to become an official Google product! This is the top priority for our team and we are on schedule to graduate in the second half of this year. This does not mean we’ve stopped development of other much needed features, for example today we are releasing Backends.

Most of the GAE4B features are going to be rolled into App Engine, enabling all paying customers to take advantage of them. This includes: SLA, SSL for custom domains, SQL, Operational and Developer support, and the new business-oriented Terms of Service. The domain console is currently in trusted tester and will remain so for the time being. GAE4B will no longer be offered as a standalone product.

In order to become an official Google product we must restructure our pricing model to obtain sustainable revenue. Based on customer feedback this means focusing on usage-based pricing and placing per-user, per-app pricing on hold until further notice.

99.95% uptime service level agreementnew Terms of Service agreementNew App Engine Pricing Modeldetails of the new pricing model

Free Apps: Similar to the Free Apps of today but with more restrictive quotas

Paid Apps: Similar to Apps which have Billing Enabled today, except they will have a 99.95% SLA and cost $9/app/month in addition to usage fees. Customers will need to opt-in to this.

Premier Accounts: A new offering which will allow a company to not pay for individual Apps but rather use as many as they need. Premier Accounts will be eligible to receive Operational Support for $500/month in addition to usage fees.

Instances: Due to customer feedback and to better service memory intensive applications, we will be eliminating CPU hours. Instead, our serving infrastructure will charge for the number of Instances running as a new unit called “Instance-hours (IH)” (1 instance running for 1 hour). These instances will be similar to the instances you can see in the Admin Console today with the exception that we will be improving our scheduler to ensure each instance has an appropriate level of utilization. IHs can either be pay-as-you-go ($0.08/IH) or you can commit to a minimum number of IHs over the course of a week and pay less for those reserved IHs ($0.05/IH).

APIs: All APIs which are currently charged as CPU hours will instead be charged per operation. This is intended to make it easier to predict and understand the cost of an application. Our website contains the details of the API pricing.

Datastore Storage: High Replication Datastore (HRD) storage is being reduced in price from $0.45/G/month to $0.24/G/month as of today. HRD has delivered essentially 99.999% uptime since we launched it in January. As of today HRD is the default option for new applications, and we strongly encourage users of the M/S Datastore to migrate to HRD. Master/Slave Datastore storage will increase to $0.24/G/month when App Engine leaves Preview. We are actively working on better tools to make this migration easier.

Thank you!The App Engine Team

App Engine 1.5.0 Release

Tuesday, May 10, 2011

The App Engine team has been working furiously in preparation for Google I/O time and today, we are excited to announce the release of App Engine 1.5.0, complete with a bunch of new features. This release brings a whole new dimension to App Engine Applications with the introduction of Backends, some big improvements to Task Queues, a completely new, experimental runtime for the Go language, High Replication Datastore as the new default configuration (and a lower price!), and even more tweaks and bug fixes.Serving Changes

Backends: Until now all App Engine applications have been running on short-lived dynamic instances that we spin up and down in response to application requests. This is great for building scalable web applications, but has a number of limitations if you are looking to build larger, long-lived, and/or memory intensive infrastructure. With 1.5.0, we are introducing Backends which will allow developers to do precisely this! Backends are developer-controlled, long-running, addressable sets of instances which are as large as the developer specifies. There are no request deadlines, they can be started and stopped (or dynamically start when called), they can use between 128M and 1G of memory and proportional CPU. If you’d like to find out more, have a read through our Backend docs for Java and Python.

Pull Queues: Most of our users are heavily using Task Queues in their applications today, but there is lots of room for more flexibility. With 1.5.0 we are introducing Pull Queues to allow developers to “pull” tasks from a queue as applications are ready to process them, rather than relying on Task Queues to push tasks at a pre-configured rate. This means you can write a Backend to do some background processing and pull 1, 10, or 100s of tasks off the Pull Queue when the Backend is ready for more. In addition, we’ve introduced a REST API which will allow external services to do the same thing. For example, if you have an external server running to do image conversion or OCR, you can now use the REST API to pull tasks off, run them, and return the results. In conjunction with these 2 improvements, we’ve also increased the payload limits and processing rate. We are excited both about expanding the use of Task Queues as well as improving the ease of integration between App Engine and the rest of the cloud.

Datastore

High Replication Datastore as default: After months of usage and feedback on the High Replication datastore (as well as a record of 99.999% uptime so far) we are now confident that it is the right path forward for the majority of our users. So, today we are doing two things: setting HRD as the default for all new apps created, lowering the price of HRD storage from $0.45 down to $0.24, and encouraging everybody to begin plans to migrate. We really appreciate all the time that early users of HRD put into trying it out and finding issues and have fixed a number of those issues with this release.

Changed APIs

In response to popular demand, the HTTP request and response sizes have been increased to 32 MB.

Mail API: We have added a few restrictions to the Mail API to improve the reliability and reputation of the service for all applications. First, emails must be sent from email accounts managed by Google (either Gmail, or a domain signed up for Google Apps). Second, we’ve reduced the number of free recipients per day from 2000 to 100 for newly created applications. Both of these will help ensure mail from your application arrives at the destination reliably.

Administration

Code downloads: As of 1.5.0, we have expanded the ability to download an Application’s source code to include both the user who uploaded the code to download it as well as the Owner(s) of the project as listed in the Admin Console. Owners were introduced in 1.4.2 as an admin role.

New runtime: With 1.5.0 we are launching an experimental runtime for the Go Programming Language. Go is an open source, statically typed, compiled language with a dynamic and lightweight feel. It’s also an interesting new option for App Engine because Go apps will be compiled to native code, making Go a good choice for more CPU-intensive tasks. As of today, the App Engine SDK for Go is available for download, and we will soon enable deployment of Go apps into the App Engine infrastructure. If you’re interested in starting early, sign up to be first through the door when we open it up to early testers. See the announcement on the Go Blog for more details.

JavaPythonother announcement at I/O 2011The App Engine Team

Who's at Google I/O: Evite

Monday, May 9, 2011

Our initial motivation to use Google App Engine was fast provisioning. We introduced App Engine as a temporary solution while we increased scale and optimized our proprietary data store. However, once we imported user profile data and put App Engine backend services in production, we never looked back.Importing a large data set of user profiles from Oracle RAC into App Engine was challenging at first because we had to perform a bulk import and then keep the data synchronized between the two datastores. As we developed our data synchronization tools we gained better understanding of the API and performance characteristics of the App Engine datastore.Once we synchronized profile data and enabled production traffic, App Engine really started to impress. We would watch our daily traffic grow and observe App Engine’s automatic scaling in action. Additional server instances would come online to meet increased demand and disappear as traffic lowered, without any sysadmin intervention. Despite our proficiency in the use of cloud computing resources and system automation, it was never this easy to provision new servers. As Grig, Evite’s devops guru, likes to say "It's not in production until it's monitored and graphed." App Engine’s dashboard automatically manages instances and graphs system usage data.

Following our positive experience with Evite profiles, we have continued to use App Engine for other services. Occasional frustrations with elevated error rates on the Master/Slave Datastore disappeared as we switched to the High Replication Datastore. At this point we see no technical obstacles to using App Engine more extensively. Architectural decisions and best practices implemented by the App Engine team are very well aligned with the choices we made in designing Evite’s new platform. This makes it easy to deploy additional services and data sets to App Engine.The real obstacle to using App Engine exclusively for most Evite services is risk management. As reliable and cost-effective as App Engine has been for us, it is difficult to depend completely on a single vendor/service provider. To ensure availability of our application we use multiple cloud providers. Unfortunately, this strategy prevents us from using certain App Engine features and API's because we do not have adequate replacements for them in other deployment environments.Evite’s new platform built on Python, NoSQL and App Engine has been a success. Our new web application has been well received by users, and our first iPhone app is receiving positive reviews. (Android version is in the works). We look forward to continued use of App Engine for data warehousing, rapid launches of new services and other projects.Come see Evite in the Developer Sandbox at Google I/O on May 10-11.Dan Mesh is a Vice President of Engineering at Evite. His extended responsibilities include espresso deliveries and release day food orders.Posted by Scott Knaster, Editor

Accessing Gmail accounts from App Engine with Context.IO

Monday, May 9, 2011

This is part of our on going series of blog posts from guest authors highlighting success stories from applications and services built on or targeting App Engine developers. Today, we have a post from Bruno Morency of Context.IO. Bruno has been involved in startups since graduating from McGill Engineering in 2001. And since being introduced to Pine on UNIX terminals, he’s had a love-hate relationship with email.Email mailboxes contain years of important conversations and business information yet there are no easy ways for App Engine developers to find and use that information. This is what Context.IO does. It’s the missing email API that turns mailboxes into a data source developers can leverage. This is particularly interesting for Google App Engine developers since it makes content of Gmail accounts (and any other IMAP accessible email account) available through a set of HTTP calls.In this post, we’re going to walk you through using Context.IO to build an app that lets users search for contacts from their inbox then easily get the history of recent emails and attachments exchanged with these contacts. A working demo is available at http://contextio-demo.appspot.com. The code for this application is available on Context.IO’s GitHub account.Querying the mailboxOur demo is quite simple in its functionality: there’s a search box used to find contacts, and once a contact has been found, we list recent emails and attachments associated with the contact. To do this, the application offers 3 urls that are called by the JavaScript running in the browser to obtain the data: search.json, messages.json and files.json.The actual UI formatting is all implemented in the JavaScript running in the browser, but our focus here is how we get that data out of the mailbox and return it to the browser.Let’s see how we respond to the request to get message history for a given contact. This is done by calling /messages.json which accepts an email address as a GET parameter. Note, this functionality requires an authentication step not shown here. The code behind that call is as follows:class MessagesHandler(webapp.RequestHandler): def get(self): current_user = users.get_current_user() current_email = current_user.email() emailAddr = self.request.get('email') contextIO = ContextIO(api_key=settings.CONTEXTIO_OAUTH_KEY, api_secret=settings.CONTEXTIO_OAUTH_SECRET, api_url=settings.CONTEXTIO_API_URL) response = contextIO.contactmessages(emailAddr,account=current_email) self.response.out.write(simplejson.dumps(response.get_data())) The code simply uses the contactmessages.json API call of and returns all the messages including the subject, other recipients, thread ID, and even attachments in JSON format. Check the documentation for a complete breakdown of the response.You can see how we handle the other cases similarly in handlers.py. To get a complete overview of other calls offered by Context.IO, please refer to the documentation available here http://context.io/docs/latest.Building your own email based applicationThe complete code for this demo application has been made available by the Context.IO team on our GitHub account. Feel free to use it as the base for your own application. To get started, you’ll need to setup your application with your own Google API authorization and Context.IO API keys.If you have questions about using Context.IO to embed your user’s email in your app, feel free to contact us through http://support.context.io.Posted by Bruno Morency, Co-founder of Context.IO

Royal Wedding Bells in the Cloud

Friday, May 6, 2011

As the centuries pass, waves of change wash over us - the invention of the steam engine, photography, space travel, the Internet. We may think these inventions change everything. But some things stay the same. Like our world-wide romantic fascination with a Princess finding her Prince, and their First Royal Kiss.

And so it was only fitting that the highest traffic levels ever on Google App Engine occurred at the moment of the first (and second!) kiss between The Duke and Duchess of Cambridge. The Royal Household announced in March, that the Royal Wedding would have its own official web site, and it would be hosted on Google App Engine. The site was a great team effort between The Royal Household, Accenture, who built the site, the web design agency Reading Room who led on the design and creative work, and the Google App Engine team. The Official Royal Wedding web site generated over 2,000 requests per second, serving 15 million page views from 5.6 million visitors on the day of the wedding. Despite this load, the web site ran smoothly, serving off Google’s shared platform without disturbing the other 200,000+ apps, which collectively serve over 1.5 billion views per day. However, from a load perspective, the really busy Google App Engine app was the YouTube Royal Channel site. This app's traffic peaked at 32,000 requests per second (!) - with an additional 10,000 requests per second during the 10 seconds around the kiss(es). Whilst others were watching for a glimpse of The Dress, we were watching for a cloud platform non-event: the Google App Engine platform continued to operate normally. Both our wishes came true. If you’re planning an event of similar scale yourself, we’d love to hear about it. Our congratulations to the Royal Couple, and best wishes for the future! Peter S Magnusson
Engineering Director
On behalf of the App Engine and Google team

Who's at Google I/O: Elastic Path

Thursday, May 5, 2011

As the diagram above shows, our goal is to have Elastic Path running entirely on the App Engine cloud. The storefronts have already been migrated, and the database and remaining parts of the Elastic Path platform will be fully on the cloud soon. Why are we doing this? There are many benefits to being on App Engine:

Increased security

Easier deployments and operations

Scalability

Cost-effectiveness

Built-in monitoring

Our migration’s high-level approach was to move everything except the persistence layer onto App Engine, and then resolve issues with the technical limitations such as the class whitelist and request length. We also had to modify some third-party libraries to work around App Engine’s restrictions on operations such as class loading, threads, and sockets.We didn’t migrate the persistence layer because Elastic Path uses a relational database; converting our entire object graph to the Datastore is not feasible now. We are working closely with Google on alternatives. In the interim, we are still using a MySQL database and have kept our persistence layer running within a Tomcat application in the colo. We implemented a creative solution: the non-persistence layers of Elastic Path run on App Engine and communicate with the Tomcat-hosted persistence services via Spring Remoting. The back-and-forth remoting was expensive and impacted the performance of our application so we implemented some data caching. For this, we turned to App Engine’s Memcache, which improved performance by an order of magnitude (less than 2 seconds average response times vs. 2 minutes or more without Memcache).Other App Engine technologies we use heavily include AppStats for performance tuning, URL Fetch for the Spring Remoting described above, and the fantastic Maven GAE plugin that we use for packaging and automated deployments. As we continue to push our platform up to the cloud, we hope to utilize more of App Engine’s cool features. If you’d like to learn more about Elastic Path, how we are migrating our Java platform to run on the cloud, and how you might be able to migrate your application to App Engine, drop by our booth in the App Engine section of the Developer Sandbox. See you there!Come see Elastic Path Software in the Developer Sandbox at Google I/O on May 10-11.Eddie Chan is an ecommerce developer at Elastic Path Software in beautiful Vancouver, Canada. He and his brilliant team work closely with Google and are currently focused on migrating existing online stores to App Engine.Posted by Scott Knaster, Editor

Who's at Google I/O: Mojo Helpdesk

Thursday, May 5, 2011

This post is part of Who's at Google I/O, a series of guest blog posts written by developers who are appearing in the Developer Sandbox at Google I/O. It's also cross-posted to the Google Code blog which has similar posts for all sorts of Google developer products.Mojo Helpdesk from Metadot is an RDBMS-based Rails application for ticket tracking and management that can handle millions of tickets. We are migrating this application to run on Google App Engine (GAE), Java, and Google Web Toolkit (GWT). We were motivated to make this move because of the application’s need for scalability in data management and request handling, the benefits from access to GAE’s services and administrative tools, and GWT’s support for easy development of a rich application front-end.In this post, we focus on GAE and share some techniques that have been useful in the migration process.Task failure managementOur application makes heavy use of the Task Queue service, and must detect and manage tasks that are being retried multiple times but aren’t succeeding. To do this, we extended Deferred, which allows easy task definition and deployment. We defined a new Task abstraction, which implements an extended Deferrable and requires that every Task implement an onFailure method. Our extension of Deferred then terminates a Task permanently if it exceeds a threshold on retries, and calls its onFailure method.This allows permanent task failure to be reliably exposed as an application-level event, and handled appropriately. (Similar techniques could be used to extend the new official Deferred API).

From the existing Mojo Helpdesk: a view of a user’s assigned tickets.

Appengine-mapreduceMojo Helpdesk needs to run many types of batch jobs, and appengine-mapreduce is of great utility. However, we often want to map over a filtered subset of Datastore entities, and our map implementations are JDO-based (to enforce consistent application semantics), so we don’t need low-level Entities prefetched.  So, we made two extensions to the mapper libraries. First, we support the specification of filters on the mapper’s Datastore sharding and fetch queries, so that a job need not iterate over all the entities of a Kind. Second, our mapper fetch does a keys-only Datastore query; only the keys are provided to the map method, then the full data objects are obtained via JDO. These changes let us run large JDO-based mapreduce jobs with much greater efficiency.Supporting transaction semanticsThe Datastore supports transactions only on entities in the same entity group. Often, operations on multiple entities must be performed atomically, but grouping is infeasible due to the contention that would result. We make heavy use of transactional tasks to circumvent this restriction. (If a task is launched within a transaction, it will be run if and only if the transaction commits). A group of activities performed in this manner – the initiating method and its transactional tasks – can be viewed as a “transactional unit” with shared semantics.We have made this concept explicit by creating a framework to support definition, invocation, and automatic logging of transactional units. (The Task abstraction above is used to identify cases where a transactional task does not succeed). All Datastore-related application actions – both in RPC methods and "offline" activities like mapreduce – use this framework. This approach has helped to make our application robust, by enforcing application-wide consistency in transaction semantics, and in the process, standardizing the events and logging which feed the app’s workflow systems.

From the existing Mojo Helpdesk: a view of the unassigned tickets for a work group.

Entity DesignTo support join-like functionality, we can exploit multi-valued Entity properties (list properties) and the query support they provide. For example, a Ticket includes a list of associated Tag IDs, and Tag objects include a list of Ticket IDs they’re used with. This lets us very efficiently fetch, for example, all Tickets tagged with a conjunction of keywords, or any Tags that a set of tickets has in common. (We have found the use of "index entities" to be effective in this context). We also store derived counts and categorizations in order to sidestep Datastore restrictions on query formulation.These patterns have helped us build an app whose components run efficiently and robustly, interacting in a loosely coupled manner.Come see Mojo Helpdesk in the Developer Sandbox at Google I/O on May 10-11.Amy (@amygdala) has recently co-authored (with Daniel Guermeur) a book on Google App Engine and GWT application development. She has worked at several startups, in academia, and in industrial R&D labs; consults and does technical training and course development in web technologies; and is a contributor to the @thinkupapp open source project.Posted by Scott Knaster, Editor