Phindee users recently got the ability to “like” happy hours. Up until that point, all my happy hour data was safely stored in a version-controlled seed.rb file, but now I was dealing with data that was dynamically generated and not being backed up anywhere. And that is not a good thing.
So I went over to ruby-toolbox.com to familiarize myself with the various backup tools available for Ruby projects. The Backup gem caught my eye as it was (and is) the most popular one by far. After reading a bit about it, I was impressed by its ease of use and its extensive list of features. I knew I had to try it out.
Having now used it for a few weeks, I’d like to explain how I set it up, so you can take advantage of it as well.
Setting Up Backup
Setting up Backup is as straightforward as it gets. Log in to the VPS running your database and install Backup:
    gem install backup
You can then run backup to familiarize yourself with all the commands it provides. We’ll start out by creating a Backup model, which is simply a description of how a backup will work. If you run
    backup help generate:model
you’ll see all the options available for describing how we want our backup to function. I created my model with backup generate:model, passing --trigger db_backup, --databases postgresql, and --storages scp, along with options selecting gzip compression and email notifications.
I’m first using the --trigger option to create a model called db_backup. Then I’m using the --databases option to specify that I’ll be backing up a PostgreSQL database. (Besides PostgreSQL, Backup also supports MySQL, MongoDB, Redis, and Riak.)
Next, I use --storages to tell Backup where and how to store the backup. By specifying scp, I’m saying that the backup file should be stored on a secondary VPS, and it should be transferred there via SCP. (Ideally, your secondary VPS should be in a location that’s different from the VPS running your database.) In addition to SCP, Backup also supports rsync, FTP/SFTP, S3, Dropbox, and a few others.
I then specify that I want my backup to be compressed with gzip (you could also use bzip2, if you’d like), and finally, I tell Backup to notify me via email if the backup succeeded or failed. If you dislike email, your other options include Twitter, Prowl, Campfire, Hipchat, and others.
Once this command runs, it’ll create a ~/Backup directory containing two files: config.rb and models/db_backup.rb (named after our trigger). The latter will hold configuration specific to the model we just created, while the former is for common configuration across multiple models. Since we’re only creating a single model, we’ll only modify the models/db_backup.rb file, which will already contain some code corresponding to the options we just specified. The rest of this section walks through that file as I ended up configuring it.
Since I store my database information in the database.yml file and my email and VPS information in application.yml, I added two lines in the beginning to load the necessary login information from these files using the load_file() method from the YAML module. I recommend you do the same because it’s best to keep these things in a dedicated file, instead of hard-coding them in every time.
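To make the YAML step concrete, here is a minimal, self-contained sketch of what load_file() hands back. The helper name load_settings, the temporary file, and the values in it are all made up for illustration; in a real model you would point load_file() at your app’s own database.yml instead.

```ruby
require "yaml"
require "tempfile"

# Hypothetical helper: load a YAML file and return the settings
# for a single environment (for example "production").
def load_settings(path, environment)
  YAML.load_file(path).fetch(environment)
end

# Stand-in for a Rails-style database.yml; every value is a placeholder.
file = Tempfile.new(["database", ".yml"])
file.write(<<~YAML)
  production:
    database: phindee_production
    username: deploy
    password: secret
    host: localhost
YAML
file.close

db_config = load_settings(file.path, "production")

# The result is a plain Hash keyed by strings.
puts db_config["database"]
puts db_config["host"]
```

Because the result is an ordinary Hash, the model can read values such as db_config["username"] wherever it would otherwise contain a hard-coded string.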
Let’s now go over our db_backup model, which consists of four sections. Because we specified PostgreSQL for the --databases option, the first section contains configuration that is specific to PostgreSQL. It collects our database name, username, password, and host, along with an array of tables to back up. This array is optional and should be used only if you don’t want your entire database backed up. (I used it because the ip_addresses table is the only table I’m interested in backing up since the data for all my other tables is saved in seed.rb.)
The second section describes how to connect to our secondary VPS. After setting the username, password, IP address, and port, I specify the path where the backups will be stored, and I tell Backup to keep only the five most recent ones. The third section simply tells Backup to use gzip for compression, while the last contains the email notification settings, which tell Backup to only send an email if a warning or a failure occurs.
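Putting the four sections together, a model along these lines would match the description above. Treat it as a sketch rather than a copy of my file: the YAML file paths, the key names read from application.yml, the backup path, and the mail server address are all placeholders, and option names can differ between Backup versions, so check them against the file the generator produced for you.

```ruby
# ~/Backup/models/db_backup.rb (sketch; paths and YAML keys are hypothetical)
require "yaml"

db_config  = YAML.load_file("/path/to/app/config/database.yml")["production"]
app_config = YAML.load_file("/path/to/app/config/application.yml")

Model.new(:db_backup, "Backs up the ip_addresses table") do
  # 1. What to back up
  database PostgreSQL do |db|
    db.name        = db_config["database"]
    db.username    = db_config["username"]
    db.password    = db_config["password"]
    db.host        = db_config["host"]
    db.only_tables = ["ip_addresses"] # optional; omit to dump everything
  end

  # 2. Where to send it (the secondary VPS)
  store_with SCP do |server|
    server.username = app_config["backup_user"]     # hypothetical key
    server.password = app_config["backup_password"] # hypothetical key
    server.ip       = app_config["backup_ip"]       # hypothetical key
    server.port     = 22
    server.path     = "~/backups/"
    server.keep     = 5 # keep only the five most recent backups
  end

  # 3. Compression
  compress_with Gzip

  # 4. Email only on warnings and failures
  notify_by Mail do |mail|
    mail.on_success = false
    mail.on_warning = true
    mail.on_failure = true
    mail.from       = app_config["mail_from"]       # hypothetical key
    mail.to         = app_config["mail_to"]         # hypothetical key
    mail.address    = "smtp.example.com"            # placeholder server
    mail.port       = 587
    mail.user_name  = app_config["mail_user"]       # hypothetical key
    mail.password   = app_config["mail_password"]   # hypothetical key
  end
end
```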
Once our db_backup.rb file is configured, we can run it with the following command:
    backup perform -t db_backup
If all went well, you should be able to find a gzipped backup file on your secondary VPS.
Setting Up Whenever
Okay, this is all great, but wouldn’t it be cool if the backup was done automatically without you having to trigger it? Well, this is possible with a tool called cron. If you’re not familiar with it, cron is a scheduling utility that allows you to run tasks (which are known as cron jobs) at specified times. You can use it to automate any task that needs to be run at regular intervals. If you’ve never used it before, DigitalOcean has a good introductory article that’s definitely worth a read.
To write our cron jobs, we’ll be using a gem called Whenever, because it allows us to write them in a simpler, more expressive Ruby syntax, instead of the standard cron syntax.
Go ahead and install Whenever on the server running Backup:
    gem install whenever
When that finishes, create a /config directory for Whenever inside ~/Backup:
    cd ~/Backup
    mkdir config
Then run:
    wheneverize .
This will create a schedule.rb file in ~/Backup/config for writing your cron jobs. I added a single job to mine.
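A schedule.rb along the following lines would define that job. It’s a sketch using Whenever’s every, command, and set :output helpers; the set :output line is my assumption about how the log file mentioned later in this post gets written, so adjust or drop it to suit your setup.

```ruby
# ~/Backup/config/schedule.rb (sketch)

# Assumption: append the job's output to a log file next to the schedule.
set :output, "~/Backup/config/cron.log"

# Run the db_backup trigger once a day at 11pm.
every 1.day, at: "11:00 pm" do
  command "backup perform -t db_backup"
end
```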
The job is simple: every day at 11pm, cron will run the backup perform -t db_backup command. If you’d like to see this converted to cron syntax, run whenever, which prints the resulting entry without installing it.
The output uses the format of your crontab (which stands for cron table), the file that lists all the jobs cron is scheduled to run, along with the time and day they’ll run.
The first column, for example, defines the minute (0-59) at which the command will run, while the second defines the hour (0-23) in military time. The third column defines the day of the month, the fourth defines the month itself (1-12), and the fifth is used to specify the day of the week (with Sunday being represented by both 0 and 7).
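To see how those columns line up, here is a small Ruby sketch that splits a crontab entry into its five time fields and the command. The parse_cron_line helper is made up for illustration, and the sample entry is a simplified stand-in for what whenever prints (the real output wraps the command in a shell invocation).

```ruby
# Names for the five time columns, in the order cron expects them.
CRON_FIELDS = %i[minute hour day_of_month month day_of_week].freeze

# Hypothetical helper: split a crontab entry into named fields.
def parse_cron_line(line)
  parts = line.split(" ", 6) # five time fields, then the rest is the command
  raise ArgumentError, "expected five time fields and a command" if parts.size < 6

  CRON_FIELDS.zip(parts.first(5)).to_h.merge(command: parts.last)
end

# Simplified entry for a job that runs every day at 11pm.
entry = parse_cron_line("0 23 * * * backup perform -t db_backup")

entry.each { |name, value| puts "#{name}: #{value}" }
```

Minute 0 of hour 23 is 11pm, and the three asterisks mean “every day of the month, every month, every day of the week.”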
Because running whenever didn’t actually write our job to crontab, we’ll need to run
    whenever --update-crontab
to do so. Having done that, cron will now know about our job, and it’ll get executed at the specified time and day. When it runs, it’ll also log its activity in a ~/Backup/config/cron.log file for future reference.
Hooking Things Up with Capistrano
To make it easier to edit these files in the future, I decided to recreate them on my local computer and store them in my app’s /config directory in a folder called /backup, which means they’ll now be under version control as well. And since I use Capistrano for deployment, I wrote two tasks to automate the process of uploading these files back to the server. They reside in a file called backup.cap in my app’s /lib/capistrano/tasks directory.
And inside my /config/deploy.rb file, I then define the backup_path variable that these tasks rely on.
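As a rough idea of what two such tasks can look like, here is a sketch assuming Capistrano 3. The task names, the :db role, the local file layout under config/backup, and the example backup_path value are all assumptions for illustration, not a copy of my own tasks.

```ruby
# lib/capistrano/tasks/backup.cap (sketch; Capistrano 3 assumed)
#
# Assumes config/deploy.rb defines the remote Backup directory, e.g.:
#   set :backup_path, "/home/deploy/Backup"   # hypothetical path

namespace :backup do
  desc "Upload the Backup model to the server"
  task :upload_config do
    on roles(:db) do # assumption: the database server has the :db role
      execute :mkdir, "-p", "#{fetch(:backup_path)}/models"
      upload! "config/backup/models/db_backup.rb",
              "#{fetch(:backup_path)}/models/db_backup.rb"
    end
  end

  desc "Upload the Whenever schedule and refresh the crontab"
  task :upload_cron do
    on roles(:db) do
      execute :mkdir, "-p", "#{fetch(:backup_path)}/config"
      upload! "config/backup/schedule.rb",
              "#{fetch(:backup_path)}/config/schedule.rb"

      # Whenever reads config/schedule.rb relative to the current directory.
      within fetch(:backup_path) do
        execute :whenever, "--update-crontab"
      end
    end
  end
end
```

With tasks like these in place, running cap production backup:upload_config (and likewise backup:upload_cron) pushes the local copies to the server, assuming a stage named production.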
(If this is all new to you, feel free to read my posts explaining how to configure Capistrano and how to write Capistrano tasks to quickly get up to speed.)
And with that, our backup functionality is complete. You’ll now have a backup of your database stored on a secondary VPS every 24 hours without you having to lift a finger! And it even notifies you if it fails!
Life is good.