BackupPC

From Smop.co.uk

I've used three open source backup products and I'm afraid I'm not very happy with any of them:

  • amanda - one tape per day philosophy - that's enough to make it unusable for me
  • bacula - better than amanda, but I have had various issues (normally tape related)
    • better admin tools out now for this
  • backuppc - massively hardlinked tree causes problems with many/large hosts


What I'd like is:

  • easy to use for restores (backuppc GUI is good in that regard, new bacula interfaces look good too)
  • database for saving metadata (massively hardlinked trees cause issues)
    • how do you replicate the backup to another server?
  • de-duplication - backuppc does this
  • binary diffs - particularly of log files
  • retention policy
    • with bacula you can only purge a whole job
    • with backuppc you have to remove it yourself from the tree and await the nightly cleanup
    • I want to be able to say which files to back up, for how long, which copies to keep
      • I might want some files backed up each hour, others each day, others each week
      • some files I just want the most recent copy, others I might need to keep for seven years
  • hierarchical storage - e.g. disk, tape, offsite
    • preferably with higher compression and/or binary diff mechanisms.

BackupPC setup

  • apt-get install backuppc rsync libfile-rsyncp-perl
  • set up passwordless ssh on the backup server
    • the "backuppc" user needs to be able to log in to clients
    • I normally log in as "backup" and grant sudo access to rsync (NOPASSWD:)
  • add rsync details to /etc/backuppc/config.pl:
$Conf{XferMethod} = 'rsync';
$Conf{RsyncClientCmd} = '$sshPath -q -x -l backup $host sudo $rsyncPath $argList+';
$Conf{RsyncClientRestoreCmd} = '$sshPath -q -x -l backup $host sudo $rsyncPath $argList+';
  • tell backuppc when to run:
$Conf{WakeupSchedule} = [2.5,3..6];     # first number = nightly admin task
$Conf{MaxBackupPCNightlyJobs} = 2;      # run two admin tasks in parallel
$Conf{BackupPCNightlyPeriod} = 1;       # go through pool every X days (1,2,4,8,16)
$Conf{MaxBackups} = 4;                  # how many machines to backup in parallel
$Conf{BlackoutGoodCnt} = 0;             # 0 = hosts are always subject to blackout periods
  • and how long to keep backups for:
$Conf{FullKeepCnt} = [4, 2, 3];  # entry n = how many fulls to keep at a spacing of 2^n * $Conf{FullPeriod}
# e.g. [2, 3, 4] with weekly fulls means keeping weeks (1,2),(4,6,8),(12,16,20,24)

# Very old full backups are removed after $Conf{FullAgeMax} days.  However,
# we keep at least $Conf{FullKeepCntMin} full backups no matter how old
# they are.
$Conf{FullKeepCntMin} = 1;
$Conf{FullAgeMax}     = 90;

$Conf{IncrKeepCnt} = 6;  # how many incrementals to keep
$Conf{IncrKeepCntMin} = 1;   # minimum number of incrementals to keep
$Conf{IncrAgeMax}     = 30;  # maximum age of incrementals

# how long to keep partial backups around before binning them
# these only occur when a full backup fails half way through
$Conf{PartialAgeMax} = 3;
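
The FullKeepCnt arithmetic above is easier to check with a small calculation. This is only a sketch of the steady-state schedule described in the comment, assuming one full backup per week; the function name full_backup_ages is made up for illustration and is not part of BackupPC. Real spacing can be shorter than shown while backups age through each level, and $Conf{FullAgeMax} can remove the oldest entries sooner.

```python
def full_backup_ages(keep_counts, full_period_weeks=1):
    """Return the approximate ages (in weeks) of kept full backups.

    keep_counts mirrors $Conf{FullKeepCnt}: entry n is how many fulls
    are kept at a spacing of 2**n full periods. One list is returned
    per entry, so the grouping matches the comment in config.pl.
    """
    groups = []
    age = 0
    for level, count in enumerate(keep_counts):
        interval = full_period_weeks * 2 ** level  # 1, 2, 4, ... periods
        group = []
        for _ in range(count):
            age += interval
            group.append(age)
        groups.append(group)
    return groups


# The example from the config comment, then the value actually configured.
print(full_backup_ages([2, 3, 4]))
print(full_backup_ages([4, 2, 3]))
```

With [4, 2, 3] the last group ends at 20, i.e. the oldest full backup would be roughly 20 weeks old in this simplified model.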

BackupPC client config

Since you've set up ssh and sudo already, we just need to tell backuppc about the client.

There are some gotchas with backing up via rsync, so I find it easiest to define common settings in /etc/backuppc/common.pl and then override them as required.

  • first add "hostname 1 backuppc" to /etc/backuppc/hosts.
  • set up /etc/backuppc/common.pl - note the WARNING notes!
# Common hosts settings - must be explicitly sourced by each host.pl file

# We have to set this here as the push below just appends
# to an empty list otherwise
$Conf{RsyncArgs} = [
            #
            # Do not edit these!
            #
            '--numeric-ids',
            '--perms',
            '--owner',
            '--group',
            '-D',
            '--links',
            '--hard-links',
            '--times',
            '--block-size=2048',
            '--recursive',
            '-C',

            #
            # Rsync >= 2.6.3 supports the --checksum-seed option
            # which allows rsync checksum caching on the server.
            # Uncomment this to enable rsync checksum caching if
            # you have a recent client rsync version and you want
            # to enable checksum caching.
            #
            #'--checksum-seed=32761',

            #
            # Add additional arguments here
            #
];

##### WARNING  #####
# rsync matches as it goes so given "+ /a/b/c, -/a/b" 
# it doesn't even look inside /a/b and hence /a/b/c is excluded!
# 
# For this reason, BackupFilesOnly has unexpected behaviour.
# e.g. BackupFilesOnly "/var/cache/debconf" adds 
# excludes for /*, /var/*, /var/cache/*
#
# To backup /var/cache/debconf:
# add '/var' to $Conf{BackupFilesOnly}
# add '--include=/var/cache/debconf' to $Conf{RsyncArgs}
# add '/var/cache/*' to $Conf{BackupFilesExclude}


$Conf{BackupFilesOnly} = ['/etc/','/boot/grub','/root/','/home',
                          '/opt', '/usr/local', '/var/'];

# Need dummy first argument since it's dropped for some reason
push @{$Conf{RsyncArgs}}, '',
                          '--include=/var/cache/debconf';

$Conf{BackupFilesExclude} = [
      '/var/db/nscd', '/var/cache/*','/var/lib/apt','/var/lib/dpkg/info',
      '/var/lock', '/var/log', '/var/run', '/var/tmp'];
  • finally import this into /etc/backuppc/hostname.pl and override as required:
# get common host settings
do "/etc/backuppc/common.pl";

# must exclude the BackupPC data directory! (main backup server only)
push @{$Conf{BackupFilesExclude}}, '/var/lib/backuppc';
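
The WARNING in common.pl is easier to see with a toy model of how rsync applies its rules. The sketch below is a deliberately simplified stand-in, not rsync's real filter engine: it only handles anchored patterns, treats "*" as "anything except /", and the directory tree (including config.dat) is invented for illustration. It shows why excluding /var/cache hides /var/cache/debconf, while excluding /var/cache/* does not.

```python
import re


def pattern_to_regex(pattern):
    """Translate an anchored pattern; '*' matches within one path component."""
    parts = (re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + "[^/]*".join(parts) + "$")


def is_included(path, rules):
    """The first matching rule wins; a path matching no rule is included."""
    for action, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            return action == "+"
    return True


def transferred(paths, rules):
    """Walk top-down and never look inside an excluded directory."""
    kept, excluded = [], []
    for path in sorted(paths):  # sorting puts parents before children
        if any(path.startswith(parent + "/") for parent in excluded):
            continue  # parent was excluded, so this path is never examined
        if is_included(path, rules):
            kept.append(path)
        else:
            excluded.append(path)
    return kept


# Hypothetical tree used only for this illustration.
tree = [
    "/var",
    "/var/cache",
    "/var/cache/apt",
    "/var/cache/debconf",
    "/var/cache/debconf/config.dat",
]

# Include listed first, but the parent directory itself is excluded.
naive_rules = [("+", "/var/cache/debconf"), ("-", "/var/cache")]
# Exclude the children of /var/cache instead, as common.pl does.
working_rules = [("+", "/var/cache/debconf"), ("-", "/var/cache/*")]

print(transferred(tree, naive_rules))
print(transferred(tree, working_rules))
```

In the first case only /var survives, because /var/cache is excluded before rsync ever reaches debconf. In the second case /var/cache itself matches no rule, so the walk continues and the include for debconf takes effect.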

Useful scripts

I've written some scripts that I use to check backup disk usage:

Media:Backuppc_stats

backuppc_stats -a|-h hostname [-i|-f]

reports backup stats for all hosts or a single host.
will report last five incrementals [-i or default]
or last five full backups [-f]

Sample output (backuppc_stats -a -f):

Mins  Files  Size  Comp
  5    314    24     6 ash
  6    658    54    11 ash
  3     76     7     2 ash
Mins  Files  Size  Comp
  0     30     0     0 ripley
  0     88    28    24 ripley
  0     17     0     0 ripley

Media:Backuppc_new

backuppc_new -a|-h hostname [-s size] [-b last|full|incr|number]

reports new files (above a given size, default=1048576)
in a given backup (defaults to "last") 

Sample output (backuppc_new -h vasquez):

vasquez
=========
 create   644   106/110     1453401 var/backups/clamav/phish.ndb
 create   644   106/110     1502482 var/backups/clamav/scam.ndb
 create   600   107/111     5218304 var/lib/amavis/.spamassassin/bayes_seen
 create   660   102/106     5242880 var/lib/mysql/ib_logfile0
 create   600   107/111     5607424 var/lib/amavis/.spamassassin/bayes_toks
 create   660   102/106     8541184 var/lib/mysql/gld/greylist.MYI
 create   600   107/111    10604544 var/lib/amavis/.spamassassin/auto-whitelist
 create   660   102/106    18874368 var/lib/mysql/ibdata1
 create   660   102/106    56762937 var/lib/mysql/gld/greylist.MYD

Media:Backuppc_ls

backuppc_ls -a|-h hostname path

returns a list of matching files for the given hosts

Sample output (backuppc_ls -a "/var/lib/mysql/gld/grey*"):

/var/lib/backuppc/pc/vasquez/112/f%2f/fvar/flib/fmysql/fgld/fgreylist.MYD
/var/lib/backuppc/pc/vasquez/112/f%2f/fvar/flib/fmysql/fgld/fgreylist.MYI
/var/lib/backuppc/pc/vasquez/112/f%2f/fvar/flib/fmysql/fgld/fgreylist.frm
/var/lib/backuppc/pc/vasquez/124/f%2f/fvar/flib/fmysql/fgld/fgreylist.MYD
/var/lib/backuppc/pc/vasquez/124/f%2f/fvar/flib/fmysql/fgld/fgreylist.MYI
/var/lib/backuppc/pc/vasquez/124/f%2f/fvar/flib/fmysql/fgld/fgreylist.frm

I normally feed this to a script using xargs.
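
The paths in that output use BackupPC's mangled on-disk names: each name gets a leading "f", and characters such as "/" in the share name are %-escaped (hence "f%2f" for the "/" share). If the consuming script needs the original host, backup number and file path, something like the following sketch could decode them. It assumes the /var/lib/backuppc/pc layout shown above; the function name unmangle is invented here and is not part of BackupPC or of the scripts on this page.

```python
from urllib.parse import unquote


def unmangle(mangled, top_dir="/var/lib/backuppc/pc"):
    """Split a mangled path into (host, backup number, share, file path)."""
    prefix = top_dir.rstrip("/") + "/"
    if not mangled.startswith(prefix):
        raise ValueError("not under %s: %s" % (top_dir, mangled))
    host, number, share, *components = mangled[len(prefix):].split("/")

    def decode(name):
        # Drop the leading 'f', then undo the %xx escapes.
        return unquote(name[1:])

    path = "/".join(decode(component) for component in components)
    return host, int(number), decode(share), path


sample = ("/var/lib/backuppc/pc/vasquez/112/"
          "f%2f/fvar/flib/fmysql/fgld/fgreylist.MYD")
print(unmangle(sample))
```

For the sample line this gives host vasquez, backup 112, share "/" and the relative path var/lib/mysql/gld/greylist.MYD.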

Media:Backuppc_prune

This script removes selected old files from backuppc.

I use it to drop large files (e.g. database backups) sooner than small files, which I can keep for longer.

The files to delete are currently listed at the bottom of the script but should be moved into a configuration file.
