Google Cloud Platform Blog: Developer Insights: Teaching thousands of students to program on Udacity with App Engine (part 2)

Wednesday, October 31, 2012

This post is the second of our two-part series discussing how Udacity uses Google App Engine.

Today’s guest blogger is Chris Chew, senior software engineer at Udacity, which offers free online courses in programming and other subjects. Chris shares how Udacity itself is built using App Engine.

Steve Huffman blogged yesterday about how App Engine enables the project-based learning that makes his web development course so powerful. People are often surprised to learn that Udacity itself is built on App Engine.

The choice to use App Engine originally came from Mike Sokolsky, our CTO and cofounder, after his experience keeping the original version of our extremely popular AI course running on a series of virtual machines. Mike found App Engine’s operational simplicity extremely compelling after weeks of endlessly spinning up additional servers and administering MySQL replication in order to meet the crazy scale patterns we experience.

Close to a year later, with ten months of live traffic on App Engine, we continue to be satisfied customers. While there are a few things we do outside App Engine, our choice to continue using App Engine for our core application is clear: We prefer to spend our time figuring out how to scale personalized education, not memcached. App Engine’s infrastructure is better than what we could build ourselves, and it frees us to focus on behavior rather than operations.

How Udacity Uses App Engine

The App Engine features we use most include a pretty broad swath of the platform:

High Replication Datastore with NDB

Memcache

Task Queues - Deferred execution, MapReduce, batch jobs

App Engine Search API - Indexing both course content and student résumés

Blobstore API - Lecture videos, résumés, data exportation

Image API - Thumbnail generation

MapReduce API - Daily usage analytics, data migrations, data maintenance

A high-level representation of our “stack” looks something like this:

Trails and Trove are two libraries developed in-house mainly by Piotr Kaminski. Trails supplies very clean semantics for creating families of RESTful endpoints on top of a webapp2.RequestHandler with automagic marshalling. Trove is a wrapper around NDB that adds common property types (e.g. efficient dereferencing of key properties), yet another layer of caching for entities with relations (both in-process and memcache), and an event “watcher” framework for reliably triggering out-of-band processing when data changes.
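To make the layered caching idea concrete, here is a minimal sketch of a read-through cache with an in-process tier in front of a shared tier. It is not Trove's code, which is not shown in this post. The `TwoTierCache` class, its `loader` callback, and the plain dict standing in for memcache are all assumptions made for illustration.

```python
class TwoTierCache:
    """Read-through cache: in-process dict first, then a shared cache, then the loader.

    Hypothetical illustration only. `shared` is a plain dict standing in for
    memcache, and `loader` stands in for a datastore read.
    """

    def __init__(self, shared, loader):
        self.local = {}       # in-process tier, private to this instance
        self.shared = shared  # shared tier, visible to every instance
        self.loader = loader  # slowest path, called only on a full miss

    def get(self, key):
        # Fastest path: this process has already seen the value.
        if key in self.local:
            return self.local[key]
        # Next: another instance may have populated the shared tier.
        value = self.shared.get(key)
        if value is None:
            # Full miss: load once, then publish to the shared tier.
            value = self.loader(key)
            self.shared[key] = value
        self.local[key] = value
        return value

    def invalidate(self, key):
        # Drop the key from this instance's tier and from the shared tier.
        self.local.pop(key, None)
        self.shared.pop(key, None)
```

Note that `invalidate` cannot reach the in-process tier of other instances, so a real implementation would need an expiry or versioning scheme for that tier. The sketch leaves that out.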

Something notable that is not represented in the drawing above is a specific set of monkey patches from Trove we apply to NDB to create better hooks similar to the existing pre/post-put/delete hooks. These custom hooks power a “watcher” abstraction that provides targeted pieces of code the opportunity to react to changes in the data layer. Execution of each watcher is deferred and runs outside the scope of the request so as to not increase response times.
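The watcher pattern described above can be sketched in a few lines of plain Python. This is an assumption-laden illustration, not the Trove monkey patches themselves: the `watches` decorator, the toy `Entity` class, and the in-memory `_deferred` queue (standing in for a task queue) are all hypothetical names.

```python
from collections import defaultdict, deque

# Hypothetical registry: entity kind -> watcher functions interested in it.
_watchers = defaultdict(list)
# Stand-in for a task queue: work recorded here runs after the request ends.
_deferred = deque()


def watches(kind):
    """Register the decorated function as a watcher for one entity kind."""
    def register(func):
        _watchers[kind].append(func)
        return func
    return register


class Entity:
    """Toy entity; a real model would write to the datastore inside put()."""

    def __init__(self, kind, **fields):
        self.kind = kind
        self.fields = fields

    def put(self):
        # The datastore write would happen here.
        # Post-put hook: enqueue each watcher instead of calling it inline,
        # so the request does not pay for the watcher's work.
        for watcher in _watchers[self.kind]:
            _deferred.append((watcher, dict(self.fields)))


def run_deferred():
    """Drain the queue, as a task queue worker would outside the request."""
    ran = 0
    while _deferred:
        watcher, snapshot = _deferred.popleft()
        watcher(snapshot)
        ran += 1
    return ran


progress_log = []


@watches("Enrollment")
def record_progress(snapshot):
    # Example watcher: reacts to a changed Enrollment after the fact.
    progress_log.append(snapshot["course"])
```

Calling `Entity("Enrollment", course="cs253").put()` only enqueues the watcher. `record_progress` runs later, when `run_deferred()` drains the queue, which is the property that keeps watcher work out of the response time.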

Latency

During our first year of scaling on App Engine, we learned that its performance is complex to understand. Response time is a function of several factors, some inside and some outside our control. App Engine’s ability to scale out is undeniable, but we have observed high variance in response times for a given request, even during periods of low load on the system. As a consequence, we have learned to do a number of things to minimize the impact of latency variance:

Converting usage of the old datastore API to the new NDB API

Using ndb.tasklet coroutines as much as possible to enable parallelism during blocking RPC operations

Not indexing fields by default and adding an index only when we need it for a query

Carefully avoiding index hotspots by indexing fields with predictable values only when necessary (e.g. auto-now DateTime and enumerated “choices” String properties)

Materializing data views very aggressively so we can limit each request to the fewest datastore queries possible

This last point is obvious in the sense that naturally you get faster responses when you do less work. But we have taken pre-materializing views to an extreme level by denormalizing several aspects of our domain into read-optimized records. For example, the read-optimized version of a user’s profile record might contain standard profile information, plus privacy configuration, course enrollment information, course progress, and permissions -- all things a data modeler would normally want to store separately. We pile it together into the equivalent of a materialized view so we can fetch it all in one query.
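A rough sketch of the idea, using plain dicts in place of datastore kinds: the record names, fields, and helper functions below are invented for illustration and are not Udacity's actual schema. The point is only that the join happens when data is written, so a read is a single lookup.

```python
# Hypothetical normalized records, keyed by user id. Each dict stands in for
# a separate entity kind that a data modeler would normally keep apart.
PROFILES = {"u1": {"name": "Ada", "email": "ada@example.com"}}
PRIVACY = {"u1": {"show_email": False}}
ENROLLMENTS = {"u1": [{"course": "cs101", "progress": 0.4},
                      {"course": "cs253", "progress": 1.0}]}
PERMISSIONS = {"u1": ["student"]}

# Read-optimized store: one denormalized record per user.
PROFILE_VIEWS = {}


def materialize_profile_view(user_id):
    """Rebuild the denormalized record; meant to run when source data changes."""
    view = {
        "name": PROFILES[user_id]["name"],
        "privacy": dict(PRIVACY[user_id]),
        "courses": {e["course"]: e["progress"] for e in ENROLLMENTS[user_id]},
        "permissions": list(PERMISSIONS[user_id]),
    }
    PROFILE_VIEWS[user_id] = view
    return view


def get_profile_view(user_id):
    """Serve a request with one lookup instead of four."""
    return PROFILE_VIEWS[user_id]
```

The cost of this design is that every write to a source record has to trigger `materialize_profile_view` again, or the view goes stale. That is the kind of out-of-band work a deferred change handler is suited for.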

Conclusion

App Engine is an amazingly complete and reliable platform that works astonishingly well for a huge number of use cases. It is very apparent the services and APIs have been designed by people who know how to scale web applications, and we feel lucky to have the opportunity to ride on the shoulders of such giants. It is trivial to whip together a proof-of-concept for almost any idea, and the subsequent work to scale your app is significantly less than if you had rolled your own infrastructure.

As with any platform, there are tradeoffs. The tradeoff with App Engine is that you get an amazing suite of scale-ready services at the cost of relentlessly optimizing to minimize latency spikes. This is an easy tradeoff for us because App Engine has served us well through several exciting usage spikes and there is no question the progress we have already made towards our mission is significantly more than if we were also building our own infrastructure. Like most choices in life, this choice can be boiled down to a bumper sticker:

Editor’s note: Chris Chew and Steve Huffman will be participating in a Google Developers Live Hangout tomorrow, Thursday, November 1st. Check it out here and submit your questions for them to answer live on air.

-Contributed by Chris Chew, Senior Software Engineer, Udacity

Posted by Zafir Khan, Product Marketing Manager, Google App Engine

Labels: Developer Tools & Insights
