Google Cloud Platform Blog
Product updates, customer stories, and tips and tricks on Google Cloud Platform
Developer Insights: Teaching thousands of students to program on Udacity with App Engine (part 2)
Wednesday, October 31, 2012
This post is the second of our two-part series discussing how Udacity uses Google App Engine.
Today’s guest blogger is Chris Chew, senior software engineer at
Udacity
, which offers free online courses in programming and other subjects. Chris shares how Udacity itself is built using App Engine.
Steve Huffman blogged yesterday about how App Engine enables the project-based learning that makes his
web development course
so powerful. People are often surprised to learn that
Udacit
y
itself is built on App Engine.
The choice to use App Engine originally came from Mike Sokolsky, our CTO and cofounder, after his experience keeping the original version of our extremely popular
AI course
running on a series of virtual machines. Mike found App Engine’s operational simplicity extremely compelling after weeks of endlessly spinning up additional servers and administering MySQL replication in order to meet the crazy scale patterns we experience.
Close to a year later, with ten months of live traffic on App Engine, we continue to be satisfied customers. While there are a few things we do outside App Engine, our choice to continue using App Engine for our core application is clear: We prefer to spend our time figuring out how to scale
personalized education
, not memcached. App Engine’s infrastructure is better than what we could build ourselves, and it frees us to focus on behavior rather than operations.
How Udacity Uses App Engine
The App Engine features we use most include a pretty broad swath of the platform:
High Replication
Datastore
with
NDB
Memcache
Task Queues
-
Deferred execution
, MapReduce, batch jobs
App Engine Search API
-- Indexing both course content and student résumés
Blobstore API
-- Lecture videos, résumés, data exportation
Image API
- Thumbnail generation
MapReduce API
- Daily usage analytics, data migrations, data maintenance
A high-level representation of our “stack” looks something like this:
Trails and Trove are two libraries developed in-house mainly by Piotr Kaminski. Trails supplies very clean semantics for creating families of RESTful endpoints on top of a webapp2.RequestHandler with automagic marshalling. Trove is a wrapper around NDB that adds common property types (e.g. efficient dereferencing of key properties), yet another layer of caching for entities with relations (both in-process and memcache), and an event “watcher” framework for reliably triggering out-of-band processing when data changes.
Something notable that is not represented in the drawing above is a specific set of monkey patches from Trove we apply to NDB to create better hooks similar to the existing pre/post-put/delete hooks. These custom hooks power a “watcher” abstraction that provides targeted pieces of code the opportunity to react to changes in the data layer. Execution of each watcher is deferred and runs outside the scope of the request so as to not increase response times.
Latency
During our first year of scaling on App Engine we learned its performance is a complex thing to understand. Response time is a function of several factors both inside and outside our control. App Engine’s ability to “scale-out” is undeniable, but we have observed high variance in response times for a given request, even during periods with low load on the system. As a consequence we have learned to do a number of things to minimize the impact of latency variance:
Converting usage of the old datastore API to the new
NDB API
Using
NDB.tasklet
coroutines as much as possible to enable parallelism during blocking RPC operations
Not indexing fields by default and adding an index only when we need it for a query
Carefully avoiding index hotspots by indexing fields with predictable values only when necessary (i.e. auto-now DateTime and enumerated “choices” String properties).
Materializing
data views very aggressively so we can limit each request to the fewest datastore queries possible
This last point is obvious in the sense that naturally you get faster responses when you do less work. But we have taken pre-materializing views to an extreme level by denormalizing several aspects of our domain into read-optimized records. For example, the read-optimized version of a user’s profile record might contain standard profile information, plus privacy configuration, course enrollment information, course progress, and permissions -- all things a data modeler would normally want to store separately. We pile it together into the equivalent of a materialized view so we can fetch it all in one query.
Conclusion
App Engine is an amazingly complete and reliable platform that works astonishingly well for a huge number of use cases. It is very apparent the services and APIs have been designed by people who know how to scale web applications, and we feel lucky to have the opportunity to ride on the shoulders of such giants. It is trivial to whip together a proof-of-concept for almost any idea, and the subsequent work to scale your app is significantly less than if you had rolled your own infrastructure.
As with any platform, there are tradeoffs. The tradeoff with App Engine is that you get an amazing suite of scale-ready services at the cost of relentlessly optimizing to minimize latency spikes. This is an easy tradeoff for us because App Engine has served us well through several exciting usage spikes and there is no question the progress we have already made towards our mission is significantly more than if we were also building our own infrastructure. Like most choices in life, this choice can be boiled down to a bumper sticker:
Editor’s note: Chris Chew and Steve Huffman will be participating in a Google Developers Live Hangout tomorrow, Thursday, November 1st, check it out
here
and submit your questions for them to answer live on air.
-Contributed by Chris Chew, Senior Software Engineer, Udacity
Posted by Zafir Khan, Product Marketing Manager, Google App Engine
Free Trial
GCP Blogs
Big Data & Machine Learning
Kubernetes
GCP Japan Blog
Firebase Blog
Apigee Blog
Popular Posts
World's largest event dataset now publicly available in BigQuery
A look inside Google’s Data Center Networks
Enter the Andromeda zone - Google Cloud Platform’s latest networking stack
Using labels to organize Google Cloud Platform resources
New in Google Cloud Storage: auto-delete, regional buckets and faster uploads
Labels
Announcements
193
Big Data & Machine Learning
134
Compute
271
Containers & Kubernetes
92
CRE
27
Customers
107
Developer Tools & Insights
151
Events
38
Infrastructure
44
Management Tools
87
Networking
43
Open
1
Open Source
135
Partners
102
Pricing
28
Security & Identity
85
Solutions
24
Stackdriver
24
Storage & Databases
164
Weekly Roundups
20
Feed
Subscribe by email
Demonstrate your proficiency to design, build and manage solutions on Google Cloud Platform.
Learn More
Technical questions? Check us out on
Stack Overflow
.
Subscribe to
our monthly newsletter
.
Google
on
Follow @googlecloud
Follow
Follow