Google Cloud Platform Blog
Product updates, customer stories, and tips and tricks on Google Cloud Platform
Building a humanitarian project monitoring tool on App Engine
Thursday, May 30, 2013
Today’s guest post is from Alex Bertram of Bedatadriven, who helps clients leverage data and analysis to achieve their goals with software development, consulting and training. In this post, Alex describes why they chose to use Google App Engine and Google Cloud SQL.
One of Bedatadriven’s core projects is ActivityInfo, a database platform for humanitarian relief operations and development assistance.
Affected populations plotted by size and type on a base map of Health Zones in Eastern DRC
Originally developed for UNICEF’s emergency program in eastern Congo, today the system is used by over 75 organizations working in Africa and Asia to track relief and development activities across more than 10,000 project sites. With ActivityInfo, project managers can quickly establish an online database that reports the results of educational projects, maps activities that improve water and hygiene, tracks the delivery of equipment to clinics, or covers any other humanitarian activity a project undertakes.
Field offices can collect key data about a relief operation’s activities through an offline-capable web interface, or push results through a RESTful API. These results are then available to managers at a project or programme level and to the donor organisations that fund the operations and assistance.
Using ActivityInfo:
Reduces the time spent on reporting and collecting data, leaving more for delivering practical aid and support to vulnerable people and communities
Builds a unified view of a humanitarian programme’s progress, across partners, regions and countries
Improves program quality, with faster and more accurate feedback into the project cycle
Choosing our Architecture
Although the code for ActivityInfo is open sourced, our vision is to offer the system as a central service to the UN, NGOs and others at ActivityInfo.org, allowing them to focus on delivering humanitarian programmes to some of the world’s most vulnerable populations. In choosing our infrastructure for ActivityInfo.org, we had several criteria:
Given the challenging environments that ActivityInfo users work in and the nature of the crises, we needed a platform that could ensure that the system was highly available.
Minimal system administration, allowing Bedatadriven’s focus to remain on product development: delivering the tools and functions users need to manage successful relief operations.
A platform that could scale up and down according to the load, with minimal human intervention. The platform had to scale automatically, because during a peak in a humanitarian crisis load can increase by an order of magnitude or more.
Clear monitoring tools to help pinpoint performance problems. Physics imposes a minimum latency of nearly 900 milliseconds per request for satellite connections, so it’s essential for us to keep the server response time as low as possible to ensure a responsive experience for users.
As our user base grew, we moved first from a single machine to another Java PaaS meant to provide dynamic scaling. Unfortunately, we found we were still spending far too much time on server administration, fussing with auto scaling triggers and responding to alerts when the platform failed to scale up the number of application servers sufficiently. Our goal of minimal system administration had been overtaken by the need to keep the system up and running.
Even worse, we lacked decent monitoring tools to identify and resolve the performance problems. There are some great open source tools out there like StatsD and Graphite, but the investment to get them up and running was more than we wanted to spend.
We had used Google App Engine for other projects and were impressed by its simplicity and stability. When the MySQL-based Google Cloud SQL service became available, we were quick to make the move.
App Engine has proved to be available and stable. Instances scale up and down with the load appropriately, without having to monkey with configuration or specify triggers through trial and error. New instances come online to serve requests in under 30 seconds, keeping request latency low even when we experience very sudden spikes in utilization.
More importantly, the strong monitoring tools have helped us quickly find and eliminate performance bottlenecks. App Engine collects logs from all running instances in near real time and has a clean interface that allows you to review and search logs, aggregated by request. This allows us to flag all requests that exceed a certain latency and drill down to the causes very quickly.
The App Engine metrics enabled us to pinpoint the MySQL queries that needed tuning, so they no longer tied up threads on the application servers. With a minimal investment of time, we now have ActivityInfo running better than ever before.
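The idea of flagging every request that exceeds a latency threshold can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions: the RequestLog type, its fields, the sample paths and the 1,000 ms threshold are all hypothetical, and nothing here is the App Engine logs interface or the code ActivityInfo uses.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: RequestLog is NOT an App Engine type.
final class SlowRequestFilter {

    /** One aggregated log entry per request (fields are assumed for this sketch). */
    static final class RequestLog {
        final String path;
        final long latencyMillis;

        RequestLog(String path, long latencyMillis) {
            this.path = path;
            this.latencyMillis = latencyMillis;
        }
    }

    /** Returns the requests slower than the threshold, slowest first. */
    static List<RequestLog> slowerThan(List<RequestLog> logs, long thresholdMillis) {
        List<RequestLog> slow = new ArrayList<>();
        for (RequestLog log : logs) {
            if (log.latencyMillis > thresholdMillis) {
                slow.add(log);
            }
        }
        // Sort descending by latency so the worst offenders are reviewed first.
        slow.sort(Comparator.comparingLong((RequestLog log) -> log.latencyMillis).reversed());
        return slow;
    }

    public static void main(String[] args) {
        // Made-up sample entries, purely for illustration.
        List<RequestLog> logs = new ArrayList<>();
        logs.add(new RequestLog("/report/pivot", 4200));
        logs.add(new RequestLog("/sites", 180));
        logs.add(new RequestLog("/export/pdf", 9100));
        for (RequestLog log : slowerThan(logs, 1000)) {
            System.out.println(log.path + " took " + log.latencyMillis + " ms");
        }
    }
}
```

Sorting slowest first mirrors the drill-down workflow described above: the requests most likely to be tying up threads surface at the top of the list.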
App Engine does impose some limitations in exchange for this reliability. Some of these, like the restrictions on the Java imaging libraries, we’ve been able to work around by using pure-Java libraries to render the images and PDF exports for users (see https://github.com/bedatadriven/appengine-export).
Others, like the 30-second request limit, have made us true believers. One of our problems turned out to be a few MySQL queries that worked fine in development, but degraded under load, requiring several minutes to complete. When we got hit with a few hundred of these queries concurrently, they quickly tied up all available threads on the application servers and maxed out the connection limits on MySQL, requiring manual intervention to avoid downtime. On App Engine, these cancerous requests were shut down after thirty seconds and flagged in the logs, allowing other requests to complete normally and giving us time to optimize the queries.
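The benefit of a hard deadline can be imitated in plain Java with a worker thread and a timeout. The sketch below is not how App Engine enforces its request limit; DeadlineRunner, runWithDeadline and the sleeping stand-in for a slow query are hypothetical names invented for illustration, and the deadline is shortened to milliseconds so the example finishes quickly.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch of deadline enforcement; not an App Engine API.
final class DeadlineRunner {

    /**
     * Runs the task on a worker thread and returns its result, or the
     * fallback if the task has not finished within the deadline.
     */
    static <T> T runWithDeadline(Callable<T> task, long deadlineMillis, T fallback) {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        Future<T> future = worker.submit(task);
        try {
            return future.get(deadlineMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Interrupt the worker so the slow task stops holding its thread.
            future.cancel(true);
            return fallback;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for task", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        } finally {
            worker.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Stand-in for a query that degrades under load: sleeps far past the deadline.
        Callable<String> slowQuery = () -> {
            Thread.sleep(5_000);
            return "rows";
        };
        System.out.println(runWithDeadline(slowQuery, 100, "cancelled"));
        System.out.println(runWithDeadline(() -> "rows", 2_000, "cancelled"));
    }
}
```

The point of the design is the same one made above: a slow task is cut off and its thread is released, so other work can proceed while the underlying query is being optimized.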
Our move to Google App Engine has proven to be a successful one, improving the quality of service to our users and allowing us to focus on software development.
-Contributed by Alexander Bertram, Partner, Bedatadriven