It is extremely important to have a system in place to make regular, automated backups of your production database. Even if no one is maliciously targeting your data, accidents happen all the time.
In this guide, we'll use a free, open-source program called Duplicity to set this up. The goal is a system that runs automatically, keeps a history of database snapshots, encrypts everything before it leaves the server, and stores the backups off-site in Google Drive.
Choose an existing, or create a new Google account where you will store your backups. I like to have a separate account so my business backup data doesn't get mixed in with my personal Google account stuff.
We will refer to the Google account name as <google_account_name>.
On the production server, we need to set up the database to dump its contents into a compressed .sql file. To do this, we'll create a special directory for database dumps, and use the mysqldump command.
Ubuntu should automatically come with a backup user and group and a /var/backups directory, so let's use those!
Create /var/backups/<repo name> and give ownership to the backup group:

sudo mkdir /var/backups/<repo name>
sudo chgrp backup /var/backups/<repo name>

Give the backup group read and write permissions on this directory:

sudo chmod g+rw /var/backups/<repo name>

Add your user to the backup group:

sudo usermod -aG backup <username>
This will make it easier for you to access the contents of the backup directory.
You may need to log out and then log back in for the new group membership to take effect. To check your group membership, use the command groups.
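As a quick sanity check (using the same placeholder paths as above), you can confirm the directory's group and permissions, and that your user is now in the backup group:

ls -ld /var/backups/<repo name>
id -nG | grep -w backup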
sudo mysql -u root -p
You'll be prompted for your sudo password first, followed by your root MySQL user password.
We don't want to make ourselves vulnerable by giving any more permissions than we absolutely have to. Thus, we will create a new database user account with read-only privileges. To stay consistent, I'm also calling this database user backup. Pick a very strong password for <db_password>.
CREATE USER backup@localhost IDENTIFIED BY '<db_password>';
GRANT SELECT,EVENT,TRIGGER,SHOW DATABASES ON *.* TO backup@localhost;
To quit the database shell, use the command quit;.
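One detail to take care of first: the dump command below writes into an sql subdirectory of our backup directory, which we haven't created yet. Assuming the same paths and group setup as above:

sudo mkdir /var/backups/<repo name>/sql
sudo chgrp backup /var/backups/<repo name>/sql
sudo chmod g+rw /var/backups/<repo name>/sql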
Test out the dump command (replace <db_password> with the password you set earlier for the new MySQL user):
mysqldump --single-transaction --routines --events --triggers --add-drop-table --extended-insert --max-allowed-packet=1000000000 -u backup -h 127.0.0.1 -p<db_password> --all-databases | gzip -9 | sudo tee /var/backups/<repo name>/sql/all_$(date +"%Y_week_%U").sql.gz > /dev/null
Note that there is no space between the -p flag and your password. This command will dump all databases to a single file, labeled with the year and current week number. Each time we run this, it will update the current dump file. However, when a new week commences, it will create a new file instead. Thus, we maintain a history of weekly snapshots of our databases. You can adjust the date portion to make these snapshots more or less frequent, depending on the size of your database and the space you're willing to allocate for these snapshots.
We use sudo tee here to write the output into a directory that our shell user would otherwise not have permission to write to.
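To confirm the dump worked, you can list the file and peek at its first few lines (same placeholder paths as before):

ls -lh /var/backups/<repo name>/sql/
zcat /var/backups/<repo name>/sql/all_$(date +"%Y_week_%U").sql.gz | head -n 20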
Next, install Duplicity from its PPA. The current stable version as of June 2017 is 0.7.13.1.
sudo add-apt-repository ppa:duplicity-team/ppa
sudo apt-get update
sudo apt-get install duplicity
PyDrive is a library that handles the OAuth2 negotiation between Duplicity and the Google Drive API. We'll install it with the pip package manager for Python, which we need to install first:
sudo apt-get install python-pip
Now we can install PyDrive:
sudo -H pip install pydrive
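To make sure the module is importable (a quick check, not part of the original steps):

python -c "from pydrive.auth import GoogleAuth; print('PyDrive OK')"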
Do this through Google's Developer Console; you'll need an OAuth2 client ID and client secret for the next step.
PyDrive uses a credentials file to store credentials and configuration settings for the Google API. Create it:
mkdir ~/.duplicity
nano ~/.duplicity/credentials
Add the following:
client_config_backend: settings
client_config:
  client_id: <your client ID>.apps.googleusercontent.com
  client_secret: <your client secret>
save_credentials: True
save_credentials_backend: file
save_credentials_file: /home/<username>/.duplicity/gdrive.cache
get_refresh_token: True
Replace <username> with your non-root user account name. Replace <your client ID> and <your client secret> with the values obtained in the previous step.
Set the GOOGLE_DRIVE_SETTINGS environment variable:

export GOOGLE_DRIVE_SETTINGS=/home/<username>/.duplicity/credentials
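This export only lasts for the current shell session. If you want it available in future sessions too (an addition to the original steps, assuming bash), append it to your ~/.bashrc:

echo 'export GOOGLE_DRIVE_SETTINGS=/home/<username>/.duplicity/credentials' >> ~/.bashrc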
I would also recommend adding GOOGLE_DRIVE_SETTINGS to the sudo environment variables:
sudo visudo
Add the following line at the end:
# PyDrive settings
Defaults env_keep += "GOOGLE_DRIVE_SETTINGS"
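To check that the env_keep rule took effect (a quick verification, not an original step), this should print the variable:

sudo env | grep GOOGLE_DRIVE_SETTINGS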
You will need a GPG key to encrypt your backup data before it is sent to Google Drive. To generate the key, simply run the command:
gpg --gen-key
Follow the instructions it provides, choosing the defaults for key type, size, and expiration. Make sure you choose a good passphrase. If it gets stuck with a message about "not enough entropy", you can try running sudo apt-get install rng-tools from a separate terminal. The installation itself should generate enough entropy for GPG to generate a truly random key. See this article.
The GPG "fingerprint" will be displayed after this completes. You will need the primary public key ID from this fingerprint. This is simply the 8-digit hex code after the / on the line that begins with pub. See this explanation.
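If you need to find it again later, gpg can list your keys:

gpg --list-keys

The line to look for resembles the following (A1B2C3D4 is a made-up example ID, and the exact format varies by GPG version):

pub   2048R/A1B2C3D4 2017-06-10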
Store the passphrase in a file that only root can access, so the cron scripts we create later can source it:

sudo nano /root/.passphrase
sudo chmod 700 /root/.passphrase
In this file, simply add:
PASSPHRASE="<my passphrase>"
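You can check that the file loads correctly, the same way the cron scripts below will (a quick sanity check, not in the original steps):

sudo sh -c '. /root/.passphrase; [ -n "$PASSPHRASE" ] && echo "passphrase loaded"'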
If you lose your GPG key, your encrypted backups will become useless. So, you should back up your GPG key to some place besides your VPS.
For example, to back up to your local machine:
gpg --list-keys
gpg -ao ~/gpg-public.key --export <gpg_public_key_id>
gpg --list-secret-keys
gpg -ao ~/gpg-private.key --export-secret-keys <gpg_private_key_id>
Then on your local machine:
scp <username>@<hostname>:~/gpg-public.key ~/gpg-public.key
scp <username>@<hostname>:~/gpg-private.key ~/gpg-private.key
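Should you ever need the key again (for example, to restore backups on a new machine), you can re-import both halves with standard gpg commands:

gpg --import ~/gpg-public.key
gpg --import ~/gpg-private.key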
See this article for more information on backing up your GPG key. Depending on the nature of your data, you may want to consider putting the private portion of your GPG key on a piece of paper, and then storing that piece of paper in a safe.
Remove these backups from your home directory on the remote machine:
rm ~/gpg-private.key ~/gpg-public.key
We'll create some test files, just to check that we can transfer them to Google Drive using Duplicity successfully.
cd ~
mkdir test
touch test/file{1..100}
duplicity ~/test gdocs://<google_account_name>@gmail.com/backup
Follow the verification link it creates, and copy-paste the verification code you receive back into the prompt. Duplicity should store the auth token it creates in /home/<username>/.duplicity/gdrive.cache so that we don't have to do the verification step again (and so our system can automatically do this every night without our input).
You should see three files show up in your Google Drive backup directory:
duplicity-full.<time>.manifest.gpg
duplicity-full-signatures.<time>.sigtar.gpg
duplicity-full.<time>.vol1.difftar.gpg
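You can also confirm the backup from the server side with Duplicity's collection-status command:

duplicity collection-status gdocs://<google_account_name>@gmail.com/backup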
Delete these files, so that Duplicity won't try to synchronize them when we change our target path for the real data.
Now run the first real backup of your database dumps:

duplicity --encrypt-key <gpg_public_key_id> /var/backups/<repo name>/sql gdocs://<google_account_name>@gmail.com/backup
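Duplicity can also compare what's stored remotely against the local directory, which makes a useful check here (same placeholders as above):

duplicity verify gdocs://<google_account_name>@gmail.com/backup /var/backups/<repo name>/sql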
We'll use a cron script to automatically perform the database dumps and run Duplicity.

This will run every night, creating incremental backups of everything in our target path. Duplicity by default tries to back up ALL files in the target path. If you only want certain subdirectories, you can use the --exclude parameter so that Duplicity ignores everything except the directories you include via --include. You can use multiple --include parameters to include multiple directories.

Instead of adding to our normal crontab, we'll create a dedicated cron script in the cron.daily directory:
sudo nano /etc/cron.daily/duplicity-inc
The -inc stands for "incremental". Add the following:
#!/bin/sh
test -x "$(which duplicity)" || exit 0
. /root/.passphrase
export PASSPHRASE
export GOOGLE_DRIVE_SETTINGS=/home/<username>/.duplicity/credentials
# This lets the script find your GPG keys when it is running as root
export GNUPGHOME=/home/<username>/.gnupg
# Run MySQL dump. This will create a weekly file, and then update the file every additional time this script is run
mysqldump --single-transaction --routines --events --triggers --add-drop-table --extended-insert --max-allowed-packet=1000000000 -u backup -h 127.0.0.1 -p<db_password> --all-databases | gzip -9 > /var/backups/<repo name>/sql/all_$(date +"%Y_week_%U").sql.gz
# Performs an incremental backup by default. Since we create a new dump file every week, we have a history
# of weekly snapshots, and the current week is incrementally updated each day.
duplicity --encrypt-key <gpg_public_key_id> /var/backups/<repo name>/sql gdocs://<google_account_name>@gmail.com/backup
Again, be sure to replace the <> placeholders with their actual values.
Make the script executable:

sudo chmod 755 /etc/cron.daily/duplicity-inc
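Before relying on cron, it's worth running the script once by hand to catch problems early (not an original step):

sudo /etc/cron.daily/duplicity-inc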
Next, create the weekly full-backup script. This will run once a week, creating a full backup and clearing out all but the last three full backups to save space. Again, you can adjust this frequency and the number of backups to retain to your situation.
sudo nano /etc/cron.weekly/duplicity-full
In this file, write:
#!/bin/sh
test -x "$(which duplicity)" || exit 0
. /root/.passphrase
export PASSPHRASE
export GOOGLE_DRIVE_SETTINGS=/home/<username>/.duplicity/credentials
# This lets the script find your GPG keys when it is running as root
export GNUPGHOME=/home/<username>/.gnupg
# Run MySQL dump. This will create a weekly file, and then update the file every additional time this script is run
mysqldump --single-transaction --routines --events --triggers --add-drop-table --extended-insert --max-allowed-packet=1000000000 -u backup -h 127.0.0.1 -p<db_password> --all-databases | gzip -9 > /var/backups/<repo name>/sql/all_$(date +"%Y_week_%U").sql.gz
# Create a brand new full backup, which contains all the weekly dumps located in /var/backups/sql
duplicity full --encrypt-key <gpg_public_key_id> /var/backups/<repo name>/sql gdocs://<google_account_name>@gmail.com/backup
# Clean out old full backups
duplicity remove-all-but-n-full 3 --force gdocs://<google_account_name>@gmail.com/backup
Make this script executable as well:

sudo chmod 755 /etc/cron.weekly/duplicity-full
If your tasks in these cron.* directories aren't being run automatically for some reason (oftentimes due to problems with permissions), you can add these tasks to the root cron file:
sudo crontab -e
Add the lines (replace MM and HH with the minute and hour of the day you want to run the backup; try to pick odd times):
# Incremental backup every day at HH:MM
MM HH * * * /etc/cron.daily/duplicity-inc >> /var/log/duplicity.log 2>&1
# Full backup every Saturday at HH:MM
MM HH * * 6 /etc/cron.weekly/duplicity-full >> /var/log/duplicity.log 2>&1
Save and exit.
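Once the first scheduled run has happened, the log file is the quickest way to confirm everything worked:

sudo tail -n 50 /var/log/duplicity.log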
You can try downloading your backup from Google Drive back into ~/test:
If ~/test still exists from the earlier test, remove it first; Duplicity won't restore over an existing directory unless you pass --force.

sudo duplicity gdocs://<google_account_name>@gmail.com/backup ~/test
Duplicity should fetch and decrypt your latest backup into ~/test. Unzip it using gzip:
ls ~/test
gzip -d <filename>.sql.gz
And check to make sure it looks ok:
head -n 100 <filename>.sql
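Duplicity can also restore from a particular point in time rather than the latest backup, using its -t (time) option. For example, to pull the version from three days ago into a hypothetical ~/test-old directory:

sudo duplicity -t 3D gdocs://<google_account_name>@gmail.com/backup ~/test-old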