Google Cloud Platform Blog
Product updates, customer stories, and tips and tricks on Google Cloud Platform
How to Troubleshoot Latency in Your App Engine Application
Monday, August 3, 2015
Customers occasionally contact Google Cloud Platform Support to ask for help with troubleshooting latency issues in a Google App Engine application. In this post, I'll discuss how I typically isolate the root cause of this type of problem.
I start by creating a dynamic script that returns only a short text string, and then add it to the customer’s App Engine app so that it can be accessed through a known URL. For an example of such a page in Python, see the hello world tutorial.
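As a rough illustration, such a test page can be as small as the following sketch. It is written as a plain WSGI callable rather than with the framework used in the tutorial; the name `application` and the response body are assumptions chosen for this example, and the URL mapping in the app's configuration is not shown.

```python
# Minimal WSGI app that returns a short, constant text string.
# It performs no I/O and makes no API calls, so any latency measured
# against it comes from the network and the serving infrastructure,
# not from application code.

def application(environ, start_response):
    body = b"ok\n"  # hypothetical response text
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

Keeping the handler this trivial is the point: if requests to it are still slow, the application code can be ruled out.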
Then, I run this curl command from a terminal window, appending the URL of the test page:
curl -s -o /dev/null -w "@curl-format.txt"
The curl command uses a format file to define its output. Here are the contents of the format file. You need to create and save this file as curl-format.txt before you run curl:
\n
time_namelookup: %{time_namelookup}\n
time_connect: %{time_connect}\n
time_appconnect: %{time_appconnect}\n
time_pretransfer: %{time_pretransfer}\n
time_redirect: %{time_redirect}\n
time_starttransfer: %{time_starttransfer}\n
--------\n
time_total: %{time_total}\n
\n
The output will look something like this, showing latencies in seconds:
time_namelookup: 0.060
time_connect: 0.098
time_appconnect: 0.000
time_pretransfer: 0.099
time_redirect: 0.000
time_starttransfer: 0.144
--------
time_total: 0.144
The value for time_connect generally represents the latency of the client’s connection to the nearest Google data center. If this connection is slow, you can troubleshoot further using traceroute to determine which hop on the network causes the delay, as packets traverse your ISP’s network and Google’s production network to reach the Google frontend server.
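curl's timers are cumulative: each one is measured from the start of the request, so time_connect also includes the DNS lookup. The sketch below is one way to turn the output above into per-phase durations. The function names and the phase labels are assumptions made for this example, not part of curl.

```python
import re

def parse_curl_timings(text):
    """Return {name: seconds} for every 'time_xxx: value' line in text."""
    pairs = re.findall(r"(time_\w+):\s*([0-9]+(?:\.[0-9]+)?)", text)
    return {name: float(value) for name, value in pairs}

def phase_durations(t):
    """Convert curl's cumulative timers (in seconds) into per-phase durations."""
    # time_appconnect stays at 0 for plain HTTP, so there is no TLS phase then.
    tls = t["time_appconnect"] - t["time_connect"] if t["time_appconnect"] > 0 else 0.0
    return {
        "dns_lookup": t["time_namelookup"],
        "tcp_connect": t["time_connect"] - t["time_namelookup"],
        "tls_handshake": tls,
        "server_wait": t["time_starttransfer"] - t["time_pretransfer"],
        "content_transfer": t["time_total"] - t["time_starttransfer"],
    }

# The sample output from this post.
sample = """
time_namelookup: 0.060
time_connect: 0.098
time_appconnect: 0.000
time_pretransfer: 0.099
time_redirect: 0.000
time_starttransfer: 0.144
time_total: 0.144
"""

for phase, seconds in phase_durations(parse_curl_timings(sample)).items():
    # curl reports seconds; print milliseconds for readability.
    print("%-17s %6.1f ms" % (phase, seconds * 1000))
```

In this breakdown, tcp_connect approximates the round trip to the Google frontend, while server_wait covers everything between sending the request and receiving the first byte of the response.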
You can run tests from clients in different geographical locations. Google Cloud Platform will automatically route requests to the closest data center, which will vary based on the client’s location.
If packets reach the Google frontend server with acceptable latency, then you need to troubleshoot the source of latency problems within App Engine’s serving infrastructure or your application code or configuration.
Look at your logs for the corresponding request in the Google Developers Console. It may help to print out the time when you ran the curl command, so that you can find the matching log entry.
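One way to record that time is to wrap the curl call in a small script. The following is a sketch for Python 3.7 or later; the function names are assumptions for this example, and UTC is used only as a convenient default, so convert it to whatever time zone your log viewer displays.

```python
import datetime
import subprocess

def curl_command(url, format_file="curl-format.txt"):
    """Build the curl invocation used in this post for the given URL."""
    return ["curl", "-s", "-o", "/dev/null", "-w", "@" + format_file, url]

def run_with_timestamp(url):
    """Print the UTC start time, then curl's timing output for url."""
    started = datetime.datetime.now(datetime.timezone.utc)
    print("request started (UTC):", started.strftime("%Y-%m-%d %H:%M:%S"))
    # Requires curl on the PATH and curl-format.txt in the current directory.
    result = subprocess.run(
        curl_command(url), capture_output=True, text=True, check=True
    )
    print(result.stdout)
    return started
```

Calling `run_with_timestamp()` with the URL of your test page prints the start time immediately above the timing breakdown, which makes it easier to pair each run with its log entry.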
The key field is the wall clock time for the request. This value doesn't include time spent between the client and the server that's running your application. You can calculate the time that the request spent within App Engine's serving infrastructure before reaching your application: subtract the time to reach the Google frontend server from the wall clock time.
All App Engine applications are hosted in the United States, unless their app ID is prefixed by e~, which signifies that the application is hosted in Europe. If your client is in a different geographical region from your application, you will see a significant delay as packets traverse Google’s internal network between the Google frontend server and the server running your application. You will see this delay, for example, if your application is in the US and your client is in Europe or Asia. One of the advantages of hosting your application on App Engine is that this latency is usually significantly less than if you used the public Internet to route requests to an application in another region.
Assuming that your client is in the same geographical region as your application, you can expect the App Engine serving infrastructure to add negligible latency.
Here are some additional troubleshooting tips for isolating latency problems:
Was the latency caused by the time to start up a new instance of your application? You will see these start-ups flagged as loading requests in the logs. Try running your tests with the default scheduler settings. In most cases, the default scheduler settings will provide an optimal tradeoff between cost and latency. If you make changes to these settings, run load tests to determine the impact. Also consider adding resident instances.
Do the logs show high pending time for a slow request? This is the time that your request spends in the queue waiting for an instance to be available. You can usually avoid this by reverting to the default scheduler settings. In some cases, you may need to add resident instances.
Are you serving a static file or using the Blobstore API to serve the request? Both of these approaches use a serving path that doesn't run any of your application’s code. Run separate tests for latency in these cases. For images, use Google’s high performance image serving infrastructure to reduce latency.
Do slow requests have a large response size, according to the logs? If so, determine whether there is a bandwidth limitation between your client and Google.
For consistency during tests, ensure that your requests aren't cached. When running in production, add a Cache-Control HTTP header to your response in order to improve latency.
Does your request make API calls? If so, use Appstats to determine the time taken for API calls.
Do you see a high value in the CPU milliseconds field in your logs? If so, your request might be CPU-bound. Using a higher instance class may reduce latency.
Are you using HTTPS or a custom domain? Compare latency with HTTP requests to your appspot.com domain to isolate whether the latency is caused by these factors.
If you think the slowdown occurs in your code, add application logging to record timing events in your code.
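For the last tip, timing events can be recorded with a small helper like the one sketched below. The name `log_duration` and the label passed to it are assumptions for this example; it relies only on Python's standard logging module, and the INFO level is an arbitrary choice.

```python
import contextlib
import logging
import time

@contextlib.contextmanager
def log_duration(label, log=logging.info):
    """Log how long the enclosed block took, in milliseconds."""
    start = time.time()
    try:
        yield
    finally:
        # Runs even if the block raises, so failed sections are timed too.
        elapsed_ms = (time.time() - start) * 1000.0
        log("%s took %.1f ms", label, elapsed_ms)

# Example: wrap a suspect section of a request handler.
with log_duration("build_response"):
    sum(range(100000))  # stand-in for the code being measured
```

Wrapping a few coarse sections first, then narrowing down inside the slowest one, usually finds the hot spot with only a handful of log lines per request.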
If you have purchased a support package, you can contact Google Cloud Platform's support team for further help. Here is information you should have at hand to help us quickly diagnose latency caused by network issues:
Your IP address. You can get that by looking at the Developers Console logs for a request sent to App Engine.
The URL of your App Engine application.
The IP address to which the domain name in the above URL resolves.
The output of ping and traceroute from your client to the above IP address.
The output from running the curl command, shown earlier in this blog post. You may want to run this a few times to ensure you have a representative result.
The Developers Console logs for the above request.
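Part of this checklist can be prepared with a short script. The sketch below resolves the host name from an application URL and builds the matching ping and traceroute commands. The function name is an assumption for this example, and the flags assume a Linux or macOS client (Windows uses `ping -n` and `tracert` instead).

```python
import socket
from urllib.parse import urlparse

def diagnostic_commands(app_url, resolve=socket.gethostbyname):
    """Return (ip, commands) for the host named in app_url."""
    host = urlparse(app_url).hostname
    ip = resolve(host)  # the address the domain name currently resolves to
    commands = [
        ["ping", "-c", "5", ip],  # five echo requests
        ["traceroute", ip],
    ]
    return ip, commands
```

Each command list can be passed to `subprocess.run()` and its output saved alongside the curl results before you contact support.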
If you’d like to explore this topic further, check out our methodology for YouTube video quality and read about Mobile analysis in PageSpeed Insights.
- Posted by John Lowry, Technical Account Manager