Google Cloud Platform Blog
Product updates, customer stories, and tips and tricks on Google Cloud Platform
Aerospike demonstrates RAM-like performance with Local SSDs
Wednesday, January 21, 2015
Today’s post is by Sunil Sayyaparaju, Director of Product and Technology at Aerospike, the open source, flash-optimized, in-memory NoSQL database.
Aerospike, now available as a Click to Deploy solution on Google Compute Engine, is an open source NoSQL database built to push the limits of modern processors and storage technologies, including SSDs. Developers are increasingly choosing NoSQL databases to power cloud applications, and with Click to Deploy you get an Aerospike cluster deployed to your specifications in a few minutes. Each node is configured with Aerospike Server Community Edition and Aerospike Management Console. The available tuning parameters can be found in the Click to Deploy Aerospike documentation.
In addition to the rapid deployment provided by Click to Deploy, we are also excited by the results we are seeing in our performance testing on Google Cloud Platform. Back in 2009, the founders of Aerospike saw that SSDs would be the future of storage, offering data persistence with better read/write access times than rotational hard disks, greater capacity than RAM, and a price/performance ratio that would fuel the development of applications that were previously not economically viable to run. The current proliferation of SSDs, now available on Google Compute Engine, validates this vision, and this level of price/performance will enable a new category of real-time, data-intensive applications.
In this post, we will showcase the performance characteristics of Local SSDs on Google Compute Engine and demonstrate RAM-like performance with a 15x storage cost advantage using Local SSDs. We repeated recent tests published in “Aerospike Hits 1 Million Writes Per Second With Just 50 Nodes,” using Local SSDs instead of RAM.
Aerospike certifies Local SSDs on Google Compute Engine
When the first Aerospike customers deployed the Aerospike database in 2010, there was no way to benchmark SSDs. The standard fio (Flexible IO) tool for benchmarking disks did not fit our needs, so Aerospike developed and open sourced the Aerospike Certification Tool (ACT) for SSDs. This tool simulates typical database workloads:
Reads small objects (default 1500 bytes) using multiple threads (default 16).
Writes large blocks (default 128KB) to simulate a buffered write mechanism in a DBMS.
Reads large blocks (default 128KB) to simulate typical background processing.
ACT is used to test SSDs from different manufacturers, understand their characteristics and select configuration values that maximize the performance of each model. The test is run for 24-48 hours because the characteristics of an SSD change over time, especially in the first few hours. In addition, different SSDs handle garbage collection differently, resulting in wide variability in performance. To help customers select drives that pass our performance criteria, Aerospike certifies drives based on ACT results and publishes a list of recommended SSDs.
Aerospike Certification Tool (ACT) for SSDs: Setup
The following server and storage configurations were used to run the ACT test:
Machine: n1-standard-4 with 1 Local SSD provisioned (4 vCPU, 15 GB memory)
SSD size: 375GB
Read/Write size: 1500 bytes (all reads hit disk, but writes are buffered)
Large block read size: 128KB
Load: 6000 reads/s, 3000 writes/s, 71 large block reads per sec
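To put the configured load in perspective, the rates above can be converted into approximate bandwidth figures. The following is a minimal sketch using only the numbers listed in the setup; it assumes that "128KB" means 128 × 1024 bytes, and the helper name `bytes_per_second` is ours, not part of ACT.

```python
# Convert the ACT load settings quoted above into approximate bandwidth.
# Assumption: "128KB" is read as 128 * 1024 bytes.
SMALL_OBJECT_BYTES = 1500
LARGE_BLOCK_BYTES = 128 * 1024


def bytes_per_second(ops_per_second, bytes_per_op):
    """Bandwidth implied by an operation rate and a fixed operation size."""
    return ops_per_second * bytes_per_op


load = {
    "small reads": bytes_per_second(6000, SMALL_OBJECT_BYTES),
    "small writes (buffered)": bytes_per_second(3000, SMALL_OBJECT_BYTES),
    "large block reads": bytes_per_second(71, LARGE_BLOCK_BYTES),
}

for name, rate in load.items():
    print(f"{name}: {rate / 1e6:.2f} MB/s")
```

Under these assumptions, the small reads and the large block reads each account for roughly 9 MB/s, while the buffered small writes add about half of that.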
ACT results show that 95% of Local SSD reads complete in under 1 ms
The results are shown in the graph below. The y axis shows the percentage of database read transactions that take longer than 1, 2, 4, or 8 milliseconds to complete. The x axis shows how performance changes during the first few hours and how consistent performance is as the benchmark continues to run for 24 hours.
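The graph shows that after the first few hours, 95% of reads complete in under 1 ms. Put differently:
Only 5% take longer than 1 ms.
Only 3% take longer than 2 ms.
Only 1% take longer than 4 ms.
A negligible number take longer than 8 ms.
(Note: these percentages are cumulative. The reads that take longer than 1 ms include those that take longer than 2 ms, which in turn include those that take longer than 4 ms, and so on.)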
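Because the percentages are cumulative, it can help to convert them into disjoint latency buckets. The sketch below does that with the figures quoted above. The "> 8 ms" share is described only as negligible, so the `0.0` used here is a placeholder assumption rather than a measured value, and the function name `to_buckets` is ours.

```python
# Percent of reads slower than each threshold (in ms), as quoted in the post.
# Assumption: the "> 8 ms" share is only described as negligible, so 0.0 is a placeholder.
slower_than = {1: 5.0, 2: 3.0, 4: 1.0, 8: 0.0}


def to_buckets(slower_than):
    """Turn cumulative 'slower than X ms' percentages into disjoint buckets."""
    thresholds = sorted(slower_than)
    buckets = {f"< {thresholds[0]} ms": 100.0 - slower_than[thresholds[0]]}
    for lo, hi in zip(thresholds, thresholds[1:]):
        # Reads slower than lo but not slower than hi fall in the lo-hi bucket.
        buckets[f"{lo}-{hi} ms"] = slower_than[lo] - slower_than[hi]
    buckets[f"> {thresholds[-1]} ms"] = slower_than[thresholds[-1]]
    return buckets


buckets = to_buckets(slower_than)
for label, percent in buckets.items():
    print(f"{label}: {percent:.1f}%")
```

With these inputs, about 2% of reads land between 1 and 2 ms, about 2% between 2 and 4 ms, and about 1% between 4 and 8 ms.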
As with other SSDs that Aerospike has tested, the performance of Local SSDs in Google Compute Engine starts out very high and then, as is normal for SSDs, decreases slightly over time. Performance stabilizes quickly, in about 10 hours, which, based on our experience benchmarking numerous SSDs, is very good.
Comparing Aerospike performance on Local SSDs vs. RAM
An earlier post showed how Aerospike hit 1 million writes per second with just 50 nodes on Google Compute Engine and 1 million reads per second with just 10 nodes running in RAM. Aerospike’s disk storage layer was designed to take advantage of SSDs, keeping in mind their unique characteristics. For this blog post, we repeated the performance test with 10 nodes, using Local SSDs instead of RAM, which yielded the following results:
15x price advantage in storage costs with Local SSDs vs RAM
Achieved roughly the same write throughput using Local SSDs compared to RAM
Achieved half the read throughput using Local SSDs compared to RAM
Aerospike delivers 15x storage cost advantage with Local SSDs vs. RAM
The table below shows the hardware specifications of the machines used in our testing. Using Local SSDs instead of RAM, we got 25x the capacity (750GB/30GB) at 1.64x the cost ($417.50/$254), for a 15x advantage in price per GB ($8.46 per GB for RAM vs. $0.56 per GB for Local SSD). We used 20 clients of type n1-highcpu-8.
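The arithmetic behind the 15x figure can be reproduced directly from the capacities and dollar amounts quoted in the paragraph. This is a small sketch of that calculation; the variable and function names are ours.

```python
def cost_per_gb(cost_dollars, capacity_gb):
    """Price of one GB of storage at the given cost and capacity."""
    return cost_dollars / capacity_gb


# Figures quoted in the post: RAM configuration vs. Local SSD configuration.
ram_per_gb = cost_per_gb(254.00, 30)
ssd_per_gb = cost_per_gb(417.50, 750)

capacity_ratio = 750 / 30            # how much more capacity the SSD setup has
cost_ratio = 417.50 / 254.00         # how much more the SSD setup costs
price_advantage = ram_per_gb / ssd_per_gb

print(f"RAM: ${ram_per_gb:.2f} per GB, Local SSD: ${ssd_per_gb:.2f} per GB")
print(f"capacity {capacity_ratio:.0f}x, cost {cost_ratio:.2f}x, "
      f"price advantage {price_advantage:.1f}x")
```

The per-GB advantage is simply the capacity ratio divided by the cost ratio, which is why 25x the capacity at about 1.64x the cost works out to roughly 15x.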
Aerospike demonstrates RAM-like Latencies for Local SSDs vs. RAM
The graph below shows the percentage of reads >1ms and writes >8ms, for a number of read-write workloads.
Write latencies for Local SSDs are similar to RAM because in both cases, writes are first written in memory and then flushed to disk. Although read latencies are higher with Local SSDs, the differences are not noticeable here because most reads using Local SSDs finish in under 1 ms, so the percentage of reads taking more than 1 ms is similar for both RAM and Local SSDs.
Aerospike demonstrates RAM-like Throughput for Writes on Local SSDs vs. RAM
The graph below compares throughput for different Read-Write workloads. The results show:
1.0x write throughput (while doing 100% writes) using Local SSDs compared to RAM. Aerospike is able to achieve the same write throughput because of buffered writes, where writes are first written in memory and subsequently flushed to disk.
0.5x read throughput (while doing 100% reads) using Local SSDs compared to RAM. Aerospike is able to achieve such high performance using Local SSDs because it stores indexes in RAM and they point to data on disk. The disk is accessed exactly once per read operation, resulting in highly predictable performance.
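The two mechanisms described above, buffered writes and an in-RAM index pointing to data on disk, can be illustrated with a toy example. The sketch below is not Aerospike's implementation; the class `ToyStore`, its methods and the `demo` helper are hypothetical names used only to show the idea, and the 128 KB block size is borrowed from the ACT defaults mentioned earlier.

```python
import os
import tempfile

BLOCK_SIZE = 128 * 1024  # assumed flush threshold, borrowed from the ACT default


class ToyStore:
    """Toy key-value store: index in RAM, records on disk, buffered writes."""

    def __init__(self, path):
        self.f = open(path, "w+b")
        self.index = {}            # key -> (file offset, length), kept in RAM
        self.buffer = bytearray()  # records written but not yet flushed
        self.buffer_start = 0      # file offset where the buffer will land
        self.disk_reads = 0        # counts how often a read touches the file

    def put(self, key, value):
        # Flush first if this record would overflow the in-memory block.
        if len(self.buffer) + len(value) > BLOCK_SIZE:
            self.flush()
        self.index[key] = (self.buffer_start + len(self.buffer), len(value))
        self.buffer += value

    def flush(self):
        # One large sequential write instead of many small ones.
        if self.buffer:
            self.f.seek(self.buffer_start)
            self.f.write(self.buffer)
            self.f.flush()
            self.buffer_start += len(self.buffer)
            self.buffer.clear()

    def get(self, key):
        offset, length = self.index[key]      # index lookup, no disk access
        if offset >= self.buffer_start:       # record is still in the buffer
            start = offset - self.buffer_start
            return bytes(self.buffer[start:start + length])
        self.disk_reads += 1                  # exactly one file read per lookup
        self.f.seek(offset)
        return self.f.read(length)

    def close(self):
        self.flush()
        self.f.close()


def demo():
    """Write 200 records of 1500 bytes, then read an old and a recent one."""
    path = os.path.join(tempfile.mkdtemp(), "toy.dat")
    store = ToyStore(path)
    for i in range(200):
        store.put(i, bytes([i]) * 1500)
    old = store.get(0)       # already flushed: served from the file
    recent = store.get(199)  # not yet flushed: served from the buffer
    reads = store.disk_reads
    store.close()
    return old, recent, reads
```

In this toy, a write returns as soon as the record is in memory, which is why write throughput does not depend on the device, while each read of flushed data costs one file access and nothing more.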
Surprisingly, when doing 100% reads with Local SSDs, over 55% complete in under 1 ms. Most reads from SSDs may take 0.5-1 ms, while reads from RAM may take less than 0.5 ms. That may be why there is a drop in read throughput without a corresponding increase in the percentage of reads taking longer than 1 ms.
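One way to see how throughput can halve while the share of reads over 1 ms stays flat is Little's law: with a fixed number of requests in flight, throughput is the concurrency divided by the mean latency. The sketch below uses purely illustrative numbers (64 requests in flight, 0.4 ms vs. 0.8 ms mean latency); none of them are measurements from these tests, and the function name is ours.

```python
def closed_loop_throughput(in_flight, mean_latency_ms):
    """Little's law: requests per second sustained by a fixed concurrency."""
    return in_flight / (mean_latency_ms / 1000.0)


# Illustrative assumptions only, not measured values from the post.
IN_FLIGHT = 64
ram_tps = closed_loop_throughput(IN_FLIGHT, 0.4)  # mean read latency under 0.5 ms
ssd_tps = closed_loop_throughput(IN_FLIGHT, 0.8)  # mean read latency between 0.5 and 1 ms

print(f"RAM: {ram_tps:,.0f} reads/s, SSD: {ssd_tps:,.0f} reads/s, "
      f"ratio {ssd_tps / ram_tps:.2f}")
```

In this illustration both mean latencies sit below the 1 ms threshold, so a latency histogram that starts at 1 ms would look about the same for both, even though throughput differs by a factor of two.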
Summary
This post documented results of the Aerospike Certification Tool (ACT) for SSDs and demonstrated a 15x storage cost advantage and RAM-like performance with Local SSDs vs. RAM. This game-changing price/performance ratio will power a new category of applications that analyse behavior, anticipate the future, engage users and monetize real-time, big-data-driven opportunities across the Internet.
You can deploy an Aerospike cluster today by taking advantage of the Google Cloud Platform free trial with support for Standard Persistent Disk and SSD Persistent Disk.
-Posted by Sunil Sayyaparaju, Director of Product and Technology at Aerospike
Aerospike is the registered trademark of Aerospike, Inc. All other trademarks cited here are the property of their respective owners.