Introduction

Welcome

Backup is the easiest and most flexible backup, archive and rotate tool. It’s a beginning-to-end solution for scheduled backups in a clean Ruby package that is simple to use and powerful when customized. Backup allows you to specify each of the following options:

  • what is being archived (files, folders, arbitrary scripts)
  • how it’s being archived (tar gzip, bz2)
  • where the archive is going (multiple backup servers? easy)
  • how the archive is going to get there (scp, ftp, mv)
  • where it will be stored when it gets there
  • how it’s going to be rotated when it gets there (grandfather-father-son, etc.)
  • how often this process will happen (customizable cycles)
  • what happens to the working copy after the process (recreate files and folders, restart daemons, etc.)

Backup is a collection of scripts that is complete enough to save you time, but flexible enough to work with any situation. This is an early version of Backup. Please feel free to ask questions on how to use Backup or post suggestions for this manual on the mailing lists.

Housekeeping

Prerequisites

Backup makes the following assumptions about your machines:

  • server and client understand POSIX commands
  • passwords and paths are the same on each backup server

Backup depends on the following libraries:

These are listed as dependencies in the gem file so you should be prompted to install them when you install Backup.

Using RubyGems

If you have RubyGems installed, installing Backup is simple:

sudo gem install backupgem

Using svn

If you prefer, you can check out backupgem from the RubyForge Repository. Feel free to browse the releases or trunk here.

svn checkout svn+ssh://rubyforge.org/var/svn/backupgem

License Information

Backup is made available under either the BSD license or the same license as Ruby (which, by extension, also allows the GPL as a permissible license). You can view the full text of any of these licenses in the doc subdirectory of the Backup distribution. The texts of the BSD and GPL licenses are also available online: BSD and GPL.

If you desire permission to use Backup in a manner incompatible with these licenses, please contact the copyright holder, Nate Murray, in order to negotiate a more compatible license.

Support

Mailing lists, bug trackers, feature requests, and public forums are all available courtesy of RubyForge at the BackupGem project page.

Mailing Lists

List Name   Desc.
backupgem-users subscribe / unsubscribe The BackupGem users list is devoted to the discussion of and questions about the usage of Backup. If you can’t quite figure out how to get a feature of Backup to work, this is the list you would go to in order to ask your questions.
backupgem-devel subscribe / unsubscribe The Backup developers list is devoted to the discussion of Backup’s implementation. If you have created a patch that you would like to discuss, or if you would like to discuss a new feature, this is the list for you.

About the Author

Backup was written by Nate Murray.

Nate currently works at an internet retailer in Southern California. Feel free to send him compliments, money, praise, or new feature patches. You can also send questions and suggestions. However, for bug reports and general feature requests, please use the trackers on the BackupGem project page.

Special Thanks

Special thanks to:

  • Matt Pulver for help with various technical problems and ideas.
  • Jamis Buck for writing Capistrano. Capistrano provided the inspiration and some code for this work. Additionally, the Net::SSH manual provided partial inspiration for this manual.
  • Matthew Lipper for writing the Runt Ruby Temporal Expressions Library.
  • why the lucky stiff for inspiration and some of the code to generate this manual.

How Backup Works

A Typical Backup

A typical backup has the following sequence:

  • content
  • compress
  • encrypt
  • deliver
  • rotate
  • cleanup

This order is the default but, like most things in Backup, it is customizable. Think of it like a pipeline: the input of each step is the output of the previous step. Each of these steps is specified in a recipe file, which is described below.
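
The pipeline idea can be illustrated with a few lines of plain Ruby. This is only a sketch of the concept, not Backup's actual implementation: the step names mirror the default sequence, but the lambdas, the run_pipeline helper and the paths are hypothetical stand-ins.

```ruby
# Conceptual sketch only: each step receives the previous step's return value.
# The lambdas below are hypothetical stand-ins, not Backup's real actions.
STEPS = {
  content:  ->(_)    { "/tmp/site" },        # locate the content
  compress: ->(path) { path + ".tar.bz2" },  # pretend to compress it
  encrypt:  ->(path) { path + ".gpg" },      # pretend to encrypt it
  deliver:  ->(path) { "/var/local/backups/" + File.basename(path) }
}

def run_pipeline(steps, order)
  # Thread the result of each step into the next one, starting from nil.
  order.inject(nil) { |last_result, name| steps.fetch(name).call(last_result) }
end

final = run_pipeline(STEPS, [:content, :compress, :encrypt, :deliver])
puts final # prints /var/local/backups/site.tar.bz2.gpg
```

Changing the order array is all it takes to skip or reorder steps in this toy model, which is the same idea as customizing the sequence in a recipe.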

CLI

  Usage: ./backup [options]
  Recipe Options -----------------------
      -r, --recipe RECIPE              A recipe file to load.

      -g, --global FILE                Specify the global recipe file to work
                                       with. Defaults to the file +global.rb+
                                       in the directory of +recipe+
      -s, --set NAME=VALUE             Specify a variable and it's value to
                                       set. This will be set after loading all
                                       recipe files.
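
For readers curious how options like these can be handled, here is a minimal sketch using Ruby's standard OptionParser. It is an assumption-laden illustration, not Backup's own option handling: the parse_backup_args helper and the shape of the returned hash are invented for this example.

```ruby
require "optparse"

# Hypothetical sketch of parsing the options listed above.
# Backup's own command-line handling may differ.
def parse_backup_args(argv)
  options = { vars: {} }
  OptionParser.new do |opts|
    opts.on("-r", "--recipe RECIPE") { |file| options[:recipe] = file }
    opts.on("-g", "--global FILE")   { |file| options[:global] = file }
    opts.on("-s", "--set NAME=VALUE") do |pair|
      # Split only on the first "=" so values may contain "=" themselves.
      name, value = pair.split("=", 2)
      options[:vars][name.to_sym] = value
    end
  end.parse!(argv.dup)

  # Mirror the documented default: global.rb in the directory of the recipe.
  if options[:recipe] && !options[:global]
    options[:global] = File.join(File.dirname(options[:recipe]), "global.rb")
  end
  options
end

opts = parse_backup_args(%w[--recipe /path/to/examples/mediawiki.rb -s tmp_dir=/var/tmp])
p opts
```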

Backup is still in an early version. For now, I run the script directly from its bin folder in $GEMSROOT. This will be moved to something like /usr/local/lib in a future version. Patches to do this are welcome.

For now, you can run backup like so:

  /usr/local/lib/ruby/gems/1.8/gems/backupgem-0.0.2/bin/backup --recipe /path/to/examples/mediawiki.rb

Usually, you would set this as a cron job or something similar.

Backup Recipe File Format

Recipes Explained

  • The Backup Recipe format is pure Ruby code. Anything that is valid Ruby is valid in the recipe file. There are, however, a number of shortcuts that will make your life easier.
  • Each of the steps is specified as an action. (An action is really nothing more than a method that becomes defined in the Actor instance. See the API docs if you’re interested.)
  • You may create “hook” actions for any of the actions. So if you define a method before_content it will be called immediately before #content is called. A method named after_rotation would be called after #rotation. Hooks may not always be needed, as you can customize the order of the actions by setting action_order. See the recipes/standard.rb file for examples.
  • In each action, the variable last_result is set. This is the return value of the previously called method in the chain. Note that this includes the output of the “hook” methods.
  • All configuration variables are available to actions via the hash c[]. For example, the backup path is available to your actions as c[:backup_path].
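
To make the points above concrete, here is a small model of actions, hooks, last_result and c[] in plain Ruby. It is a sketch under stated assumptions, not Backup's Actor class: MiniActor, its run method and its log are all invented for illustration.

```ruby
# Minimal, hypothetical model of the behaviour described above.
# It is not Backup's Actor class; MiniActor, run and log are invented names.
class MiniActor
  attr_reader :c, :last_result, :log

  def initialize(config, action_order)
    @c = config
    @action_order = action_order
    @log = []
  end

  # Define an action as a method on this particular instance.
  def action(name, &block)
    define_singleton_method(name, &block)
  end

  # Call each action in order, wrapped by its optional before_/after_ hooks.
  # Every call, hooks included, overwrites last_result.
  def run
    @action_order.each do |name|
      ["before_#{name}", name.to_s, "after_#{name}"].each do |step|
        next unless respond_to?(step)
        @last_result = send(step)
        @log << step
      end
    end
    @last_result
  end
end

actor = MiniActor.new({ backup_path: "/var/local/backups/demo" }, [:content, :deliver])
actor.action(:before_content) { "preparing" }
actor.action(:content)        { "/tmp/demo.sql" }
actor.action(:deliver)        { c[:backup_path] + "/" + File.basename(last_result) }

puts actor.run # prints /var/local/backups/demo/demo.sql
```

Because a hook's return value also becomes last_result in this model, an after_ hook would need to return the path it was given if the next action depends on it.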

Variables

Setting variables is the bulk of the work you have to do when configuring your Backup scripts. These first two variables are required for all configurations.

The second variable could be omitted completely if the code was updated to use tmpdir. This is currently a TODO.

Name Desc. Example
:backup_path The path to place backup archives. set :backup_path, "/var/local/backups/mediawiki"
:tmp_dir Specify a directory that backup can use as a temporary directory. Default /tmp. set :tmp_dir, File.dirname(__FILE__) + "/../tmp"

The user running the script must have write permissions for each of these directories.

Content

The first step in any backup is to locate (or create) the content that is to be backed up. Backup provides a couple of shortcuts for common ways to locate content and allows you to arbitrarily define your own. Some typical types of content are:

  • a particular file
  • a particular folder
  • the contents of a particular folder
These could be specified like so:
  action :content, :is_file   => "/path/to/file"               # content is a single file
  action :content, :is_folder => "/path/to/folder"             # content is the folder itself 
  action :content, :is_contents_of => "/path/to/other/folder"  # content is folder/* , recursive option
If you want :content to be a series of shell commands just pass #action a block:
  action(:content) do 
    sh "echo \"hello $HOSTNAME\"" 
    sh "mysqldump -uroot database > /path/to/db.sql" 
    "/path/to/db.sql" # make sure you return the full path to the folder/file you wish to be the content 
  end

Compress

Next you may want to compress your content. Again, there are a few one-liners for common cases and you can create your own.
  action :compress, :method => :tar_bz2  # actually calls a method named tar_bz2 with output of ":content" ( or ":after_content" ) 
  # or
  action :compress, :method => :tar_gzip
Again, you can create your own.
  action(:compress) do
    sh "my_tar  #{last_result} #{last_result}.tar" 
    sh "my_bzip #{last_result}.tar #{last_result}.tar.bz2" 
    last_result + ".tar.bz2" 
  end

Encrypt

Encryption is available to you, if you wish to use it.
  set :encrypt, true                              # default is +false+
  set :gpg_encrypt_options, "--default-recipient" # default is an empty string
  action :encrypt, :method => :gpg # default, none
or your own:
  action(:encrypt) do
    sh "gpg #{c[:gpg_encrypt_options]} --encrypt #{last_result}" 
    last_result + ".gpg"  
  end

I would recommend that you think seriously about how you wish to manage your keys for this backup process. If you are backing up encrypted data then you need to back up your keys as well, or else risk losing access to your data. Secure key management is beyond the scope of this document.

Delivery

Action

Delivery is supported via scp, ftp, and mv.

Full FTP support is unfinished. Currently, only scp and mv fully support rotation.

  action :deliver, :method => :scp
  action :deliver, :method => :ftp
  action :deliver, :method => :mv
The :mv action is defined (in backup/recipes/standard.rb) like any user-defined action:
  action(:mv) do
    sh "mv #{last_result} #{c[:backup_path]}/" 
    c[:backup_path] + "/" + File.basename(last_result)
  end

Variables

Name Desc. Example
:servers An array of host names to deliver the data to. TODO: this currently supports only one server. set :servers, %w{ localhost }
:ssh_user The name of the ssh user on the foreign server. Default ENV['USER']. set :ssh_user, ENV['USER']
:identity_key The path to the key to use when ssh’ing into a foreign server. set :identity_key, ENV['HOME'] + "/.ssh/id_rsa"

Rotate

Rotation of your backups is a way to keep snapshot copies of your backups in time while not keeping every single backup for every single day. Currently the only form of rotation Backup supports is grandfather-father-son. See Wikipedia if you are unfamiliar with how this works.

  set :rotation_method,  :gfs # this is the default. you don't need to set it and there are no other supported options

By default, a son is created daily, unless it is a day to create a father or grandfather. It is assumed that every time you run Backup you want to create a backup. Therefore, if you do not want to create a son (etc.), do not run the program. You can specify when a son is promoted to a father with the following variable:

  set :son_promoted_on,    :fri

You specify when fathers are promoted to grandfathers with something like the following:

  set :father_promoted_on, :last_fri_of_the_month

Valid arguments for specifying these promotions are as follows:

  • :mon-:sun – A symbol of the abbreviation of any day of the week
  • :last_[mon-sun]_of_the_month – A symbol, replacing [mon-sun] with the abbreviation for the day of the week.
    Example: :last_thu_of_the_month.
  • Any valid Runt object.
Representing these temporal ranges is done internally by using Runt. You are, therefore, allowed to pass in your own arbitrarily complex Runt object. For example, say that you wanted to promote to father on Monday, Wednesday and Friday. You could do the following:
  mon_wed_fri = Runt::DIWeek.new(Runt::Mon) | 
                Runt::DIWeek.new(Runt::Wed) | 
                Runt::DIWeek.new(Runt::Fri)
  set :son_promoted_on, mon_wed_fri

See the Runt documentation for more information on this.
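
To see what these promotion rules mean for a given day, here is a sketch that uses Ruby's standard Date class in place of Runt. It assumes :son_promoted_on is :fri and :father_promoted_on is :last_fri_of_the_month; the helper names rank_for and last_friday_of_month? are hypothetical and not part of Backup.

```ruby
require "date"

# Hypothetical sketch: which rank does a backup taken on a given date get,
# assuming :son_promoted_on => :fri and
# :father_promoted_on => :last_fri_of_the_month?
# Plain Date arithmetic stands in for Runt here.
def last_friday_of_month?(date)
  # A Friday is the last one if the following Friday falls in another month.
  date.friday? && (date + 7).month != date.month
end

def rank_for(date)
  return :grandfather if last_friday_of_month?(date)
  return :father      if date.friday?
  :son
end

puts rank_for(Date.new(2024, 5, 29)) # a Wednesday: son
puts rank_for(Date.new(2024, 5, 24)) # a Friday, not the last in May: father
puts rank_for(Date.new(2024, 5, 31)) # the last Friday in May: grandfather
```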

You can set how many of each rank to keep:

  # assuming daily backups...
  set :sons_to_keep,         14   # this would be two weeks
  set :fathers_to_keep,       6
  set :grandfathers_to_keep,  6
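
The effect of these limits can be sketched as a simple pruning step. This is an illustration only, assuming archive names that sort chronologically (for example, with a date stamp in the name); the prune helper and the file names are hypothetical and say nothing about how Backup names or removes files internally.

```ruby
# Hypothetical sketch of pruning: keep only the newest `keep` archives of a rank.
# Assumes the names sort chronologically, e.g. a date stamp in the file name.
def prune(archives, keep)
  sorted = archives.sort
  # Everything older than the newest `keep` entries is stale.
  stale = sorted.first([sorted.size - keep, 0].max)
  [sorted - stale, stale]
end

sons = (1..16).map { |day| format("logs-2024-05-%02d.tar.bz2", day) }
kept, removed = prune(sons, 14)
puts "kept #{kept.size}, removed #{removed.inspect}"
```

With 16 daily archives and a limit of 14, the two oldest files end up in the removed list.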

Examples

Good things come in threes

Here we will cover three examples.

  • a super-simple backup to a local directory. This will show how easy things can be.
  • a more complex implementation. This will show some of the variables you can set and show how to use a foreign server.
  • an even more complex configuration. This will show how to define your own methods and set more advanced variables.

Example One | Backup a Folder of Logs

Our first example will be backing up a folder of logs. Say we have a folder /var/my_logs/ and it is full of log files.

What we want to do is:

  • move out all the old log files
  • compress them and store them in a local folder
  • store 2 weeks of daily backups (sons)
  • store a weekly backup (father) going back 6 weeks
  • and create a monthly backup on the last Friday of every month (grandfather) for 6 months
Thankfully, this is incredibly simple:
  set :backup_path, "/var/local/backups/my_old_logs" 
  set :tmp_dir,     "/tmp" # this is the default so you actually don't have to specify it
  action :content,  :is_contents_of => "/var/my_logs"

And that's it!

Notice a couple of things here.

  1. Each time we set a variable, it becomes available to the actions as c[:var].
  2. In this case, make sure that :backup_path and :tmp_dir are writable by the user that is running the backup script.

Example Two | SQL Backup

Our second example will be backing up a MediaWiki installation. Say we have a MySQL database named mediawiki.

This time we want to:

  • create a dump of the database every day
  • compress this backup and deliver it to a remote server via scp
  • store 3 weeks of daily backups (sons)
  • promote a “father” backup every Sunday and keep 12 of them
  • and promote a monthly backup on the last Sunday of every month (grandfather), keeping 12 of them
  set :backup_path, "/var/local/backups/mediawiki" 

  action(:content) do
    dump = c[:tmp_dir] + "/mediawiki.sql" 
    sh "mysqldump -uroot mediawiki > #{dump}" 
    dump # make sure you return the name of the file
  end

  action :deliver,  :method => :scp
  action :rotate,   :method => :via_ssh

  set :servers,           %w{ my.server.com }

  set :son_promoted_on,    :sun
  set :father_promoted_on, :last_sun_of_the_month

  set :sons_to_keep,         21  
  set :fathers_to_keep,      12
  set :grandfathers_to_keep, 12

A couple things to note:

  1. In this example, it is assumed that the user running the script has ssh keys configured for password-less login to my.server.com. To log in as a different user, set the variable :ssh_user. You can also specify a key with :identity_key.
  2. Number 1 is the only thing

Example Three | Something more complex

By now it should be easy to see what is going on here:

  action :content,  :is_file => "/path/to/file.abc" 
  action :compress, :method  => :my_tar_gzip

  action(:my_tar_gzip) do
    name = c[:tmp_dir] + "/" + File.basename(last_result) + ".tar.gzip" 
    sh "tar -czv --exclude .DS* --exclude CVS  #{last_result} > #{name}" 
    name # make sure you return the name of the tar.gzip file
  end

  set :encrypt, true

  action :deliver,  :method => :scp  
  action :rotate,   :method => :via_ssh

  set :ssh_user,          "backup_user" 
  set :identity_key,      ENV['HOME'] + "/.ssh/backup_key"

Known Bugs and Limitations

TODO

What is left to do:

  • Finish FTP support
  • Add support for multiple backup servers
  • Add in better logging
  • Remove all the tmp_dir references. Use and test the standard library tmpdir
  • Continue testing it

If you find yourself writing the code for Backup to fix these things, please consider submitting your patch for the benefit of the community.

BUGS

  • You can’t use return in user-defined actions for some reason. I think this has to do with the instance_eval. But still, I wouldn’t think it would matter. I’d be interested in any suggestions on how to fix this.
  • Please submit other bugs via the trackers at the BackupGem project page.