Google Cloud Platform Blog
Product updates, customer stories, and tips and tricks on Google Cloud Platform
Verifying PostgreSQL backups made easier with new open-source tool
Wednesday, July 11, 2018
By Brett Hesterberg, Product Manager and Alexis Guajardo, Senior Software Engineer
When was the last time you verified a database backup? If that question causes you to break into a cold sweat, rest assured you’re not alone.
Verifying backups should be a common practice, but it often isn’t. This can be an issue if there’s a disaster or—as is more likely at most companies—if someone makes a mistake when deploying database changes.
One industry survey indicates
that data loss is one of the biggest risks when making database changes.
PostgreSQL Page Verification Tool
At
Google Cloud Platform
(GCP), we recently wrote a tool to fight data loss and help detect data corruption early in the change process. We made it open source, because data corruption can happen to anybody, and we’re committed to making code available to ensure secure, reliable backups. If you
use Google Cloud SQL for PostgreSQL
, then you’re in luck—we’re already running the PostgreSQL Page Verification Tool on your behalf. It’s also available now as open source code.
This new
PostgreSQL Page Verification tool
is a command-line tool that you can execute against a Postgres database. Since PostgreSQL version 9.3, it’s been possible to enable checksums on data pages to avoid ignoring data corruption. However, with the release of this utility, you can now verify all data files, online or offline. The Page Verification Tool can calculate and verify checksums for each data page.
How the Page Verification tool works
To use the PostgreSQL Page Verification tool, you must enable checksums during initialization of a new PostgreSQL database cluster. You can’t go back in and do it after the fact. Once checksums are turned on, the Page Verification tool computes its own checksum and compares it to the Postgres checksum to confirm that they are identical. If the checksum does not match, the tool identifies which data page is at fault and causing the corruption.
The Page Verification Tool can be run against a database that’s online or offline. It verifies checksums on PostgreSQL data pages without having to load each page into a shared buffer cache, and supports subsequent segments for tables larger than 1GB.
The tool skips Free Space Map, Visibility Map and pg_internal.init files, since they can be regenerated. While the tool can run against a database continuously, it does have a performance overhead associated with it, so we advise incorporating the tool into your backup process and running it on a separate server.
How to start using the PostgreSQL Page Verification tool
The Page Verification tool is integrated into Google Cloud SQL, so it runs automatically. We’re using the tool at scale to validate our customers’ backups. We do the verification process on internal instances of Cloud SQL to make sure your database doesn’t take a performance hit.
The value of the PostgreSQL Page Verification Tool comes from detecting data corruption early to minimize data loss resulting from data corruption. Organizations that use the tool and achieve a successful verification have assurance of a useful backup in case disaster strikes.
At Google, when we make a database better, we make it better for everyone, so the PostgreSQL Page Verification tool is available to you via open source. We encourage Postgres users to download the tool at
Google Open Source
or
GitHub
. The best detection is early detection, not when you need to restore a backup.
Free Trial
GCP Blogs
Big Data & Machine Learning
Kubernetes
GCP Japan Blog
Firebase Blog
Apigee Blog
Popular Posts
World's largest event dataset now publicly available in BigQuery
A look inside Google’s Data Center Networks
Understanding Cloud Pricing
Enter the Andromeda zone - Google Cloud Platform’s latest networking stack
Getting your data on, and off, of Google App Engine
Labels
Announcements
193
Big Data & Machine Learning
134
Compute
271
Containers & Kubernetes
92
CRE
27
Customers
107
Developer Tools & Insights
151
Events
38
Infrastructure
44
Management Tools
87
Networking
43
Open
1
Open Source
135
Partners
102
Pricing
28
Security & Identity
85
Solutions
24
Stackdriver
24
Storage & Databases
164
Weekly Roundups
20
Feed
Subscribe by email
Demonstrate your proficiency to design, build and manage solutions on Google Cloud Platform.
Learn More
Technical questions? Check us out on
Stack Overflow
.
Subscribe to
our monthly newsletter
.
Google
on
Follow @googlecloud
Follow
Follow