Introduction
Welcome
Backup is the easiest and most flexible backup, archive and rotate tool. It’s a beginning-to-end solution for scheduled backups in a clean Ruby package that is simple to use and powerful when customized.
Backup allows you to specify each of the following options:
- what is being archived (files, folders, arbitrary scripts)
- how it’s being archived (tar gzip, bz2)
- where the archive is going (multiple backup servers? easy)
- how the archive is going to get there (scp, ftp, mv)
- where it will be stored when it gets there
- how it’s going to be rotated when it gets there (grandfather-father-son, etc)
- how often this process will happen (customizable cycles)
- what happens to the working copy after the process (recreate files, folders etc. restart daemons)
Backup is a collection of scripts that is complete enough to save you time, but flexible enough to work with any situation.
This is an early version of Backup. Please feel free to ask questions on how to use Backup or post suggestions for this manual in the mailing lists.
Housekeeping
Prerequisites
Backup makes the following assumptions about your machines:
- server and client understand POSIX commands
- passwords and paths are the same on each backup server
Backup depends on the following libraries:
These are listed as dependencies in the gem file so you should be prompted to install them when you install Backup.
Using RubyGems
If you have RubyGems installed, installing Backup is simple:
sudo gem install backupgem
Using svn
If you prefer, you can check out backupgem from the RubyForge Repository. Feel free to browse the releases or trunk here.
svn checkout svn+ssh://rubyforge.org/var/svn/backupgem
License Information
Backup is made available under either the BSD license or the same license as Ruby (which, by extension, also allows the GPL as a permissible license). You can view the full text of any of these licenses in the doc subdirectory of the Backup distribution. The texts of the BSD and GPL licenses are also available online: BSD and GPL.
If you desire permission to use Backup in a manner incompatible with these licenses, please contact the copyright holder Nate Murray in order to negotiate a more compatible license.
Support
Mailing lists, bug trackers, feature requests, and public forums are all available courtesy of RubyForge at the BackupGem project page.
Mailing Lists
List Name | Subscription | Desc. |
---|---|---|
backupgem-users | subscribe / unsubscribe | The BackupGem users list is devoted to the discussion of and questions about the usage of Backup. If you can’t quite figure out how to get a feature of Backup to work, this is the list you would go to in order to ask your questions. |
backupgem-devel | subscribe / unsubscribe | The Backup developers list is devoted to the discussion of Backup’s implementation. If you have created a patch that you would like to discuss, or if you would like to discuss a new feature, this is the list for you. |
About the Author
Backup was written by Nate Murray.
Nate currently works at an internet retailer in Southern California. Feel free to send him compliments, money, praise, or new feature patches. You can also send questions and suggestions. However, for bug reports and general feature requests, please use the trackers on the BackupGem project page.
Special Thanks
Special thanks to:
- Matt Pulver for help with various technical problems and ideas.
- Jamis Buck for writing Capistrano. Capistrano provided the inspiration and some code for this work. Additionally, the Net::SSH manual provided partial inspiration for this manual.
- Matthew Lipper for writing the Runt Ruby Temporal Expressions Library
- why the lucky stiff for inspiration and some of the code to generate this manual.
How Backup Works
A Typical Backup
A typical backup has the following sequence:
- content
- compress
- encrypt
- deliver
- rotate
- cleanup
This order is the default; however, like most things, it is customizable. Think of it like a pipeline: the input of each step is the output of the last step.
Each of these steps is specified in a recipe file, which is described below.
CLI
Usage: ./backup [options]
Recipe Options -----------------------
-r, --recipe RECIPE A recipe file to load.
-g, --global FILE Specify the global recipe file to work
with. Defaults to the file +global.rb+
in the directory of +recipe+
-s, --set NAME=VALUE Specify a variable and it's value to
set. This will be set after loading all
recipe files.
Backup is still in an early version. For now, I run the script directly from its bin folder in $GEMSROOT. This will be moved to something like /usr/local/lib in a future version. Patches to do this are welcome.
For now, you can run backup like so:
/usr/local/lib/ruby/gems/1.8/gems/backupgem-0.0.2/bin/backup --recipe /path/to/examples/mediawiki.rb
Usually, you would set this as a cron job or something similar.
Backup Recipe File Format
Recipes Explained
- The Backup recipe format is pure Ruby code. Anything that is valid Ruby is valid in the recipe file. There are, however, a number of shortcuts that will make your life easier.
- Each of the steps is specified as an action. (An action is really nothing more than a method that becomes defined in the Actor instance. See the API docs if you’re interested.)
- You may create “hook” actions for any of the actions. So if you define a method before_content, it will be called immediately before #content is called. A method named after_rotation would be called after #rotation. This may not always be needed, as you can customize the rotation order by setting the action_order. See the recipes/standard.rb file for examples.
- In each action, the variable last_result is set. This is the return value of the previously called method in the chain. Note that this includes the output of the “hook” methods.
- All configuration variables are available to actions via the hash c[]. For example, the backup path is available to your actions as c[:backup_path].
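The chaining described above can be illustrated with a small, self-contained sketch. This is not Backup's own code: MiniPipeline and its methods are hypothetical names invented here, and the sketch only assumes the behavior stated in the list (actions run in order, optional before_/after_ hooks, and last_result carrying the previous return value, hooks included).

```ruby
# A toy stand-in for Backup's actor; MiniPipeline is a hypothetical name.
class MiniPipeline
  attr_reader :last_result, :log

  def initialize(action_order)
    @action_order = action_order
    @last_result = nil
    @log = []
  end

  # Define an action (or a before_/after_ hook) as a method on this instance.
  def action(name, &block)
    define_singleton_method(name, &block)
  end

  # Run each action in order, calling its hooks only when they are defined.
  # Every step's return value becomes last_result for the next step.
  def run
    @action_order.each do |name|
      ["before_#{name}", name.to_s, "after_#{name}"].each do |step|
        next unless respond_to?(step)
        @last_result = send(step)
        @log << step
      end
    end
    @last_result
  end
end

pipeline = MiniPipeline.new([:content, :compress])
pipeline.action(:content)       { "/tmp/db.sql" }
pipeline.action(:after_content) { last_result }             # a hook must pass the path along
pipeline.action(:compress)      { last_result + ".tar.bz2" }
final_path = pipeline.run
```

Because a hook's return value also becomes last_result, a hook that does not return the path it received would break the next step. That is why the after_content hook in the sketch returns last_result unchanged.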
Variables
Setting variables is the bulk of the work you have to do when configuring your Backup scripts. These first two variables are required for all configurations.
The second variable could be omitted completely if the code was updated to use tmpdir. This is currently a TODO.
Name | Desc. | Example |
---|---|---|
:backup_path | The path to place backup archives. | set :backup_path, "/var/local/backups/mediawiki" |
:tmp_dir | Specify a directory that backup can use as a temporary directory. Default /tmp. | set :tmp_dir, File.dirname(__FILE__) + "/../tmp" |
The user running the script must have write permissions for each of these directories.
Content
The first step in any backup is to locate (or create) the content that is to be backed up. Backup provides a couple of shortcuts for common ways to locate content and allows you to arbitrarily define your own. Some typical types of content are:
- a particular file
- a particular folder
- the contents of a particular folder
action :content, :is_file => "/path/to/file" # content is a single file
action :content, :is_folder => "/path/to/folder" # content is the folder itself
action :content, :is_contents_of => "/path/to/other/folder" # content is folder/* , recursive option
If you want :content to be a series of shell commands, just pass #action a block:
action(:content) do
  sh "echo \"hello $HOSTNAME\""
  sh "mysqldump -uroot database > /path/to/db.sql"
  "/path/to/db.sql" # make sure you return the full path to the folder/file you wish to be the content
end
Compress
Next you may want to compress your content. Again, there are a few one-liners for common cases and you can create your own.
action :compress, :method => :tar_bz2 # actually calls a method named tar_bz2 with output of ":content" ( or ":after_content" )
# or
action :compress, :method => :tar_gzip
# or define your own compression with a block
action(:compress) do
  sh "my_tar #{last_result} #{last_result}.tar"
  sh "my_bzip #{last_result}.tar #{last_result}.tar.bz2"
  last_result + ".tar.bz2" # return the path of the compressed file
end
Encrypt
Encryption is available to you, if you wish to use it.
set :encrypt, true # default is +false+
set :gpg_encrypt_options, "--default-recipient" # default is an empty string
action :encrypt, :method => :gpg # default, none
# or define your own encryption with a block
action(:encrypt) do
  sh "gpg #{c[:gpg_encrypt_options]} --encrypt #{last_result}"
  last_result + ".gpg" # return the path of the encrypted file
end
I would recommend that you think seriously about how you wish to manage your keys for this backup process. If you are backing up encrypted data then you need to back up your keys or else risk losing access to your data. Secure key management is beyond the scope of this document.
Delivery
Action
Delivery is supported via scp, ftp, and mv. Full FTP support is unfinished. Currently, only scp and mv fully support rotation.
action :deliver, :method => :scp
action :deliver, :method => :ftp
action :deliver, :method => :mv
The :mv action is defined (in backup/recipes/standard.rb) like any user-defined action:
action(:mv) do
  sh "mv #{last_result} #{c[:backup_path]}/"
  c[:backup_path] + "/" + File.basename(last_result)
end
Variables
Name | Desc. | Example |
---|---|---|
:servers | An array of host names to deliver the data to. TODO: this currently supports only one server. | set :servers, %w{ localhost } |
:ssh_user | The name of the ssh user on the foreign server. Default ENV['USER']. | set :ssh_user, ENV['USER'] |
:identity_key | The path to the key to use when ssh’ing into a foreign server. | set :identity_key, ENV['HOME'] + "/.ssh/id_rsa" |
Rotate
Rotation of your backups is a way to keep snapshot copies of your backups in time while not keeping every single backup for every single day. Currently the only form of rotation Backup supports is grandfather-father-son. See Wikipedia if you are unfamiliar with how this works.
set :rotation_method, :gfs # this is the default. you don't need to set it and there are no other supported options
By default, a son is created daily, unless it is a day to create a father or grandfather. It is assumed that every time you run Backup you want to create a backup. Therefore, if you do not want to create a son (etc), do not run the program. You can specify when a son is promoted to a father by the following variable:
set :son_promoted_on, :fri
You specify when fathers are promoted to grandfathers by something like the following:
set :father_promoted_on, :last_fri_of_the_month
Valid arguments for specifying these promotions are as follows:
- :mon - :sun – A symbol of the abbreviation of any day of the week.
- :last_[mon-sun]_of_the_month – A symbol, replacing [mon-sun] with the abbreviation for the day of the week. Example: :last_thu_of_the_month.
- Any valid Runt object.
Promotion schedules are handled by Runt. You are, therefore, allowed to pass in your own arbitrarily complex Runt object. For example, say that you wanted to promote to father on Monday, Wednesday and Friday. You could do the following:
mon_wed_fri = Runt::DIWeek.new(Runt::Mon) |
Runt::DIWeek.new(Runt::Wed) |
Runt::DIWeek.new(Runt::Fri)
set :son_promoted_on, mon_wed_fri
See the Runt documentation for more information on this.
You can set how many of each rank to keep:
# assuming daily backups...
set :sons_to_keep, 14 # this would be two weeks
set :fathers_to_keep, 6
set :grandfathers_to_keep, 6
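To make the promotion rules in this section concrete, here is a minimal sketch of how a date might be classified as a son, father, or grandfather day. It is not how Backup does it (Backup relies on Runt); it uses only Ruby's standard Date class. The helper names matches_rule? and rank_for are hypothetical, and the sketch assumes that a grandfather day takes precedence over a father day.

```ruby
require "date"

# Maps the day symbols used in this manual to Date#wday values.
DAY_INDEX = { sun: 0, mon: 1, tue: 2, wed: 3, thu: 4, fri: 5, sat: 6 }.freeze

# True when +date+ matches a rule such as :fri or :last_fri_of_the_month.
# (Hypothetical helper; Runt objects are not handled in this sketch.)
def matches_rule?(date, rule)
  if (m = rule.to_s.match(/\Alast_(\w{3})_of_the_month\z/))
    wday = DAY_INDEX.fetch(m[1].to_sym)
    # It is the last such weekday if the same weekday one week later
    # already falls in the next month.
    date.wday == wday && (date + 7).month != date.month
  else
    date.wday == DAY_INDEX.fetch(rule)
  end
end

# Which rank would a backup created on +date+ receive?
# Assumption: grandfather days win over father days, which win over sons.
def rank_for(date, son_promoted_on:, father_promoted_on:)
  return :grandfather if matches_rule?(date, father_promoted_on)
  return :father if matches_rule?(date, son_promoted_on)
  :son
end

rules = { son_promoted_on: :fri, father_promoted_on: :last_fri_of_the_month }
friday_rank      = rank_for(Date.new(2024, 1, 19), **rules) # a Friday, not the last of the month
last_friday_rank = rank_for(Date.new(2024, 1, 26), **rules) # last Friday of January 2024
weekday_rank     = rank_for(Date.new(2024, 1, 24), **rules) # a Wednesday
```

With the settings shown, an ordinary Friday yields a father, the last Friday of the month yields a grandfather, and every other day yields a son.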
Examples
Good things come in threes
Here we will cover three examples.
- a super-simple backup to a local directory. This will show how easy things can be.
- a more complex implementation. This will show some of the variables you can set and show how to use a foreign server.
- an even more complex configuration. This will show how to define your own methods and set more advanced variables.
Example One | Backup a Folder of Logs
Our first example will be backing up a folder of logs. Say we have a folder /var/my_logs/ and it is full of log files.
What we want to do is:
- move out all the old log files
- compress them and store them in a local folder
- store 2 weeks of daily backups (sons)
- store a weekly backup (father) going back 6 weeks
- and create a monthly backup on the last Friday of every month (grandfather) for 6 months
set :backup_path, "/var/local/backups/my_old_logs"
set :tmp_dir, "/tmp" # this is the default so you actually don't have to specify it
action :content, :is_contents_of => "/var/my_logs"
And that’s it!
Notice a couple of things here.
- Each time we set a variable, it becomes available to the actions as c[:var].
- In this case, make sure that :backup_path and :tmp_dir are writable by the user that is running the backup script.
Example Two | SQL Backup
Our second example will be backing up a MediaWiki installation. Say we have a MySQL database named mediawiki.
This time we want to:
- create a dump of the database every day
- compress this backup and deliver it to a backup server via scp
- store 3 weeks of daily backups (sons)
- promote a son to a “father” backup every Sunday and keep 12 of them
- and create a monthly backup on the last Sunday of every month (grandfather), keeping 12 of them
set :backup_path, "/var/local/backups/mediawiki"
action(:content) do
  dump = c[:tmp_dir] + "/mediawiki.sql"
  sh "mysqldump -uroot mediawiki > #{dump}"
  dump # make sure you return the name of the file
end
action :deliver, :method => :scp
action :rotate, :method => :via_ssh
set :servers, %w{ my.server.com }
set :son_promoted_on, :sun
set :father_promoted_on, :last_sun_of_the_month
set :sons_to_keep, 21
set :fathers_to_keep, 12
set :grandfathers_to_keep, 12
A couple things to note:
- In this example, it is assumed that the current user running the script has ssh keys configured for password-less login to my.server.com. If you want to change the user to log in as, set the variable :ssh_user. You can also specify a key with :identity_key.
- Number 1 is the only thing
Example Three | Something more complex
By now it should be easy to see what is going on here:
action :content, :is_file => "/path/to/file.abc"
action :compress, :method => :my_tar_gzip
action(:my_tar_gzip) do
  name = c[:tmp_dir] + "/" + File.basename(last_result) + ".tar.gzip"
  sh "tar -czv --exclude .DS* --exclude CVS #{last_result} > #{name}"
  name # make sure you return the name of the tar.gzip file
end
set :encrypt, true
action :deliver, :method => :scp
action :rotate, :method => :via_ssh
set :ssh_user, "backup_user"
set :identity_key, ENV['HOME'] + "/.ssh/backup_key"
Known Bugs and Limitations
TODO
What is left to do:
- Finish FTP support
- Add support for multiple backup servers
- Add in better logging
- Remove all the tmp_dir references. Use and test the standard library tmpdir
- Continue testing it
If you find yourself writing the code for Backup to fix these things, please consider submitting your patch for the benefit of the community.
BUGS
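- You can’t return in the user-defined actions for some reason. I think this has to do with the instance_eval. But still, I wouldn’t think it would matter. I’d be interested in any suggestions on how to fix this.
- Please submit other bugs via the trackers at the BackupGem project page.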
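On the return limitation listed above, one possible explanation (offered as a guess, not a diagnosis of Backup's internals) is ordinary Ruby block semantics: a return inside a block tries to return from the method where the block was written, and raises LocalJumpError if that method has already finished. The sketch below reproduces this with a hypothetical MiniActor class and load_recipe method, both invented here, and shows that next exits a block early with a value.

```ruby
# MiniActor is a hypothetical stand-in, not Backup's Actor class.
class MiniActor
  def initialize
    @actions = {}
  end

  # Store the block so it can be evaluated later.
  def action(name, &block)
    @actions[name] = block
  end

  # Evaluate the stored block with the actor as self.
  def run(name)
    instance_eval(&@actions[name])
  end
end

# Stands in for loading a recipe: the blocks outlive this method call.
def load_recipe(actor)
  actor.action(:with_return) { return "/tmp/db.sql" }
  actor.action(:with_next)   { next "/tmp/db.sql" }
end

actor = MiniActor.new
load_recipe(actor)

# `return` fails because load_recipe, where the block was written, is done.
begin
  actor.run(:with_return)
  return_error = nil
rescue LocalJumpError => e
  return_error = e
end

# `next` leaves only the block and hands its value back to the caller.
next_value = actor.run(:with_next)
```

If this is indeed the cause, writing next value instead of return value in a user-defined action, or simply making the value the last expression of the block as the examples in this manual do, would sidestep the problem.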
. But still, I wouldn’t think it would matter. I’d be interested in any suggestions on how to fix this. - Please submit other bugs via the trackers at the BackupGem project page.