GNU tar is distributed along with scripts for performing backups and restores. Although there is a good chance those scripts will suit you, they are not the only scripts or methods available for doing backups and restores. You may well create your own, or use more sophisticated packages dedicated to that purpose.
Some users are enthusiastic about Amanda (The Advanced Maryland Automatic Network Disk Archiver), a backup system developed by James da Silva ‘jds@cs.umd.edu’ and available on many Unix systems. This is free software, and it is available from http://www.amanda.org.
This chapter documents both the provided shell scripts and the tar options that are most relevant when tar is used as a backup tool.
To back up a file system means to create archives that contain all the files in that file system. Those archives can then be used to restore any or all of those files (for instance if a disk crashes or a file is accidentally deleted). File system backups are also called dumps.
Using tar to Perform Full Dumps
Full dumps should only be made when no other people or programs are modifying files in the file system. If files are modified while tar is making the backup, they may not be stored properly in the archive, in which case you won’t be able to restore them if you have to. (Files not being modified are written with no trouble, and do not corrupt the entire archive.)
You will want to use the ‘--label=archive-label’ (‘-V archive-label’) option to give the archive a volume label, so you can tell what this archive is even if the physical label falls off the tape.
Unless the file system you are dumping is guaranteed to fit on one volume, you will need to use the ‘--multi-volume’ (‘-M’) option. Make sure you have enough tapes on hand to complete the backup.
If you want to dump each file system separately you will need to use the ‘--one-file-system’ option to prevent tar from crossing file system boundaries when storing (sub)directories.
The ‘--incremental’ (‘-G’) option (see section Using tar to Perform Incremental Dumps) is not needed, since this is a complete copy of everything in the file system, and a full restore from this backup would only be done onto a completely empty disk.
Unless you are in a hurry, and trust the tar program (and your tapes), it is a good idea to use the ‘--verify’ (‘-W’) option, to make sure your files really made it onto the dump properly. This will also detect cases where the file was modified while (or just after) it was being archived. Not all media (notably cartridge tapes) are capable of being verified, unfortunately.
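The options discussed in this section can be combined in a single invocation. The following is a minimal sketch rather than a production backup: it dumps a small scratch directory instead of a real file system, and the directory name, archive name and label text are made up for illustration. ‘--multi-volume’ is left out because the scratch archive fits in one file.

```shell
#!/bin/sh
# Sketch: full dump of a scratch directory with a volume label and
# verification. All names and the label text are illustrative assumptions.
set -e
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

mkdir -p fs/etc
echo "hello" > fs/etc/motd

# --label gives the archive a volume label, --one-file-system keeps tar
# from crossing file system boundaries, and --verify re-reads the archive
# after writing it and compares it with the files on disk.
tar --create \
    --file=full.tar \
    --label="full dump of fs" \
    --one-file-system \
    --verify \
    fs

# Show what ended up in the archive.
tar --list --verbose --file=full.tar
```

On a real system the last argument would be the mount point of the file system being dumped, and ‘--file’ would name the tape device.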
Using tar to Perform Incremental Dumps
Incremental backup is a special form of GNU tar archive that stores additional metadata so that the exact state of the file system can be restored when extracting the archive.
GNU tar currently offers two options for handling incremental backups: ‘--listed-incremental=snapshot-file’ (‘-g snapshot-file’) and ‘--incremental’ (‘-G’).
The option ‘--listed-incremental’ instructs tar to operate on an incremental archive with additional metadata stored in a standalone file, called a snapshot file. The purpose of this file is to help determine which files have been changed, added or deleted since the last backup, so that the next incremental backup will contain only modified files. The name of the snapshot file is given as an argument to the option:
‘--listed-incremental=file’ (‘-g file’): Handle incremental backups with snapshot data in file.
To create an incremental backup, you would use ‘--listed-incremental’ together with ‘--create’ (see section How to Create Archives). For example:
    $ tar --create \
          --file=archive.1.tar \
          --listed-incremental=/var/log/usr.snar \
          /usr
This will create in ‘archive.1.tar’ an incremental backup of the ‘/usr’ file system, storing additional metadata in the file ‘/var/log/usr.snar’. If this file does not exist, it will be created. The created archive will then be a level 0 backup; please see the next section for more on backup levels.
Otherwise, if the file ‘/var/log/usr.snar’ exists, it determines which files are modified. In this case only these files will be stored in the archive. Suppose, for example, that after running the above command, you delete file ‘/usr/doc/old’ and create directory ‘/usr/local/db’ with the following contents:
    $ ls /usr/local/db
    /usr/local/db/data
    /usr/local/db/index
Some time later you create another incremental backup. You will then see:
    $ tar --create \
          --file=archive.2.tar \
          --listed-incremental=/var/log/usr.snar \
          /usr
    tar: usr/local/db: Directory is new
    usr/local/db/
    usr/local/db/data
    usr/local/db/index
The created archive ‘archive.2.tar’ will contain only these three members. This archive is called a level 1 backup. Notice that ‘/var/log/usr.snar’ will be updated with the new data, so if you plan to create more ‘level 1’ backups, it is necessary to create a working copy of the snapshot file before running tar. The above example will then be modified as follows:
    $ cp /var/log/usr.snar /var/log/usr.snar-1
    $ tar --create \
          --file=archive.2.tar \
          --listed-incremental=/var/log/usr.snar-1 \
          /usr
You can force ‘level 0’ backups either by removing the snapshot file before running tar, or by supplying the ‘--level=0’ option, e.g.:
    $ tar --create \
          --file=archive.2.tar \
          --listed-incremental=/var/log/usr.snar-0 \
          --level=0 \
          /usr
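The commands above can be tried end to end without touching ‘/usr’. The sketch below makes a level 0 dump of a scratch directory, then a level 1 dump from a working copy of the snapshot file, so the level 0 snapshot stays untouched. The names src, level0.snar and level1.snar are assumptions chosen for illustration.

```shell
#!/bin/sh
# Sketch: level 0 dump followed by a level 1 dump that works on a copy
# of the snapshot file. Runs entirely inside a temporary directory.
set -e
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

mkdir src
echo "one" > src/kept

# Level 0: level0.snar does not exist yet, so tar creates it and
# dumps everything under src.
tar --create --file=archive.0.tar --listed-incremental=level0.snar src

# Change the tree after the level 0 dump.
echo "two" > src/added

# Level 1: use a copy so level0.snar still describes the level 0 state
# and can serve as the base for further level 1 dumps.
cp level0.snar level1.snar
tar --create --file=archive.1.tar --listed-incremental=level1.snar src

# The level 1 archive lists the directory and the newly added file.
tar --list --file=archive.1.tar
```

Keeping one snapshot file per level is a naming choice, not something tar requires; any scheme works as long as the level 0 snapshot is not overwritten.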
Incremental dumps depend crucially on time stamps, so the results are unreliable if you modify a file’s time stamps during dumping (e.g., with the ‘--atime-preserve=replace’ option), or if you set the clock backwards.
Metadata stored in snapshot files include device numbers, which are supposed to be non-volatile values. However, it turns out that NFS devices have undependable values when an automounter gets in the picture. This can lead to a great deal of spurious redumping in incremental dumps, so it is of little use to compare two NFS device numbers over time. The solution currently implemented is to consider all NFS devices as being equal when it comes to comparing directories; this is fairly gross, but there does not seem to be a better way to go.
Apart from using NFS, there are a number of cases where relying on device numbers can cause spurious redumping of unmodified files. For example, this occurs when archiving LVM snapshot volumes. To avoid this, use the ‘--no-check-device’ option:
‘--no-check-device’: Do not rely on device numbers when preparing a list of changed files for an incremental dump.
‘--check-device’: Use device numbers when preparing a list of changed files for an incremental dump. This is the default behavior. The purpose of this option is to undo the effect of ‘--no-check-device’ if it was given in the TAR_OPTIONS environment variable (see TAR_OPTIONS).
There is also another way to cope with changing device numbers. It is described in detail in Fixing Snapshot Files.
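As a sketch of how the two settings interact, assume a site sets ‘--no-check-device’ for every run through TAR_OPTIONS, and one particular run wants the default back by passing ‘--check-device’, the GNU tar option that re-enables the device number check. The scratch paths are illustrative.

```shell
#!/bin/sh
# Sketch: --no-check-device as a site-wide default via TAR_OPTIONS,
# overridden for a single run. Paths are illustrative assumptions.
set -e
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

mkdir data
echo "payload" > data/file

# Site-wide default: ignore device numbers when deciding what changed.
TAR_OPTIONS=--no-check-device
export TAR_OPTIONS

tar --create --file=a.tar --listed-incremental=data.snar data

# One-off override: options on the command line are processed after
# those from TAR_OPTIONS, so --check-device wins for this run only.
tar --create --file=b.tar --listed-incremental=data.snar --check-device data
```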
Note that incremental archives use tar extensions and may not be readable by non-GNU versions of the tar program.
To extract from the incremental dumps, use ‘--listed-incremental’ together with the ‘--extract’ option (see section Extracting Specific Files). In this case, tar does not need to access the snapshot file, since all the data necessary for extraction are stored in the archive itself. So, when extracting, you can give any argument to ‘--listed-incremental’; the usual practice is to use ‘--listed-incremental=/dev/null’.
Alternatively, you can use ‘--incremental’, which needs no arguments. In general, ‘--incremental’ (‘-G’) can be used as a shortcut for ‘--listed-incremental’ when listing or extracting incremental backups (for more information regarding this option, see incremental-op).
When extracting from the incremental backup, GNU tar attempts to restore the exact state the file system had when the archive was created. In particular, it will delete those files in the file system that did not exist in their directories when the archive was created. If you have created several levels of incremental files, then in order to restore the exact contents the file system had when the last level was created, you will need to restore from all backups in turn. Continuing our example, to restore the state of the ‘/usr’ file system, one would do:
    $ tar --extract \
          --listed-incremental=/dev/null \
          --file archive.1.tar
    $ tar --extract \
          --listed-incremental=/dev/null \
          --file archive.2.tar
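The effect of restoring all levels in turn, including the deletion of files that no longer existed at the time of the last dump, can be observed on a scratch tree. This is a sketch under assumed names (tree, tree.snar, restored); it updates the snapshot file in place, which is fine here because only one level 1 dump is made.

```shell
#!/bin/sh
# Sketch: restore a level 0 and a level 1 archive in order and observe
# that a file deleted between the two dumps is removed again.
set -e
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

mkdir -p tree/doc
echo "old" > tree/doc/old
echo "keep" > tree/doc/keep
tar --create --file=archive.1.tar --listed-incremental=tree.snar tree

# Between the dumps one file disappears and another appears.
rm tree/doc/old
echo "new" > tree/doc/new
tar --create --file=archive.2.tar --listed-incremental=tree.snar tree

# Restore into an empty directory, lowest level first.
mkdir restored
for archive in archive.1.tar archive.2.tar; do
    tar --extract --listed-incremental=/dev/null \
        --file="$archive" --directory=restored
done

# Show the restored directory contents.
ls restored/tree/doc
```

Extracting only ‘archive.2.tar’ would bring back the new file but none of the unchanged ones, which is why the order matters.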
To list the contents of an incremental archive, use ‘--list’ (see section How to List Archives), as usual. To obtain more information about the archive, use ‘--listed-incremental’ or ‘--incremental’ combined with two ‘--verbose’ options:
tar --list --incremental --verbose --verbose --file archive.tar
This command will print, for each directory in the archive, the list of files in that directory at the time the archive was created. This information is put out in a format which is both human-readable and unambiguous for a program: each file name is printed as
x file
where x is a letter describing the status of the file: ‘Y’ if the file is present in the archive, ‘N’ if the file is not included in the archive, or a ‘D’ if the file is a directory (and is included in the archive). See section Dumpdir, for the detailed description of dumpdirs and status codes. Each such line is terminated by a newline character. The last line is followed by an additional newline to indicate the end of the data.
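Because each status line has the fixed shape of a letter, a space and a file name, the lines are easy to pick out of the listing. The sketch below builds a small incremental archive and keeps only the ‘Y’, ‘N’ and ‘D’ lines; the tree layout and file names are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: list an incremental archive with two --verbose options and
# filter the per-directory status lines.
set -e
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

mkdir -p tree/sub
echo "data" > tree/hello.txt
tar --create --file=archive.tar --listed-incremental=tree.snar tree

# Keep only lines of the form "x file" where x is Y, N or D.
tar --list --incremental --verbose --verbose --file=archive.tar |
    grep -E '^[YND] '
```

A file whose own name happens to start with such a letter followed by a space would also match, so this filter is only suitable for quick inspection.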
The option ‘--incremental’ (‘-G’) gives the same behavior as ‘--listed-incremental’ when used with the ‘--list’ and ‘--extract’ options. When used with the ‘--create’ option, it creates an incremental archive without creating a snapshot file. Thus, it is impossible to create several levels of incremental backups with the ‘--incremental’ option.
Levels of Backups
An archive containing all the files in the file system is called a full backup or full dump. You could protect your data by creating a full dump every day. This strategy, however, would waste a substantial amount of archive media and user time, as unchanged files would be re-archived daily.
It is more efficient to do a full dump only occasionally. To back up files between full dumps, you can use incremental dumps. A level one dump archives all the files that have changed since the last full dump.
A typical dump strategy would be to perform a full dump once a week, and a level one dump once a day. This means some versions of files will in fact be archived more than once, but this dump strategy makes it possible to restore a file system to within one day of accuracy by only extracting two archives—the last weekly (full) dump and the last daily (level one) dump. The only information lost would be in files changed or created since the last daily backup. (Doing dumps more than once a day is usually not worth the trouble.)
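The weekly-full, daily-level-one schedule amounts to choosing a dump level from the day of the week. A minimal sketch follows; the function name dump_level_for_day and the choice of Sunday for the full dump are assumptions, not something the provided scripts prescribe.

```shell
#!/bin/sh
# dump_level_for_day DAY: print 0 (full dump) on the weekly dump day
# and 1 (level one dump) on every other day. DAY is 1..7 with
# Monday = 1, as printed by "date +%u". Sunday (7) is an arbitrary choice.
dump_level_for_day() {
    if [ "$1" -eq 7 ]; then
        echo 0
    else
        echo 1
    fi
}

level=$(dump_level_for_day "$(date +%u)")
echo "today's dump level: $level"
```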
GNU tar comes with scripts you can use to do full and level-one (actually, even level-two and so on) dumps. Using scripts (shell programs) to perform backups and restoration is a convenient and reliable alternative to typing out file name lists and tar commands by hand.
Before you use these scripts, you need to edit the file ‘backup-specs’, which specifies parameters used by the backup scripts and by the restore script. This file is usually located in the ‘/etc/backup’ directory. See section Setting Parameters for Backups and Restoration, for its detailed description. Once the backup parameters are set, you can perform backups or restoration by running the appropriate script.
The name of the backup script is backup. The name of the restore script is restore. The following sections describe their use in detail.
Please Note: The backup and restoration scripts are designed to be used together. While it is possible to restore files by hand from an archive which was created using a backup script, and to create an archive by hand which could then be extracted using the restore script, it is easier to use the scripts. See section Using tar to Perform Incremental Dumps, before making such an attempt.
Setting Parameters for Backups and Restoration
The file ‘backup-specs’ specifies backup parameters for the backup and restoration scripts provided with tar. You must edit ‘backup-specs’ to fit your system configuration and schedule before using these scripts.
Syntactically, ‘backup-specs’ is a shell script, containing mainly variable assignments. However, any valid shell construct is allowed in this file. In particular, you may wish to define functions within that script (e.g., see RESTORE_BEGIN below).
For more information about shell script syntax, please refer to the definition of the Shell Command Language. See also Bash Features in Bash Reference Manual.
The shell variables controlling the behavior of backup and restore are described in the following subsections.
5.4.1 General-Purpose Variables
5.4.2 Magnetic Tape Control
5.4.3 User Hooks
5.4.4 An Example Text of ‘Backup-specs’
General-Purpose Variables
ADMINISTRATOR: The user name of the backup administrator. The backup scripts send a backup report to this address.
BACKUP_HOUR: The hour at which the backups are done. This can be a number from 0 to 23, or a time specification in the form hours:minutes, or the string ‘now’. This variable is used by backup. Its value may be overridden using the ‘--time’ option (see section Using the Backup Scripts).
TAPE_FILE: The device tar writes the archive to. If TAPE_FILE is a remote archive (see remote-dev), the backup script will assume that your mt is able to access remote devices. If RSH (see RSH) is set, the ‘--rsh-command’ option will be added to invocations of mt.
BLOCKING: The blocking factor tar will use when writing the dump archive. See section The Blocking Factor of an Archive.
BACKUP_DIRS: A list of file systems to be dumped (for backup), or restored (for restore). Each entry names a host and a file system on it, as in ‘albert:/fs/fsf’. You can include any directory name in the list; subdirectories on that file system will be included, regardless of how they may look to other networked machines. Subdirectories on other file systems will be ignored.
The host name specifies which host to run tar on, and should normally be the host that actually contains the file system. However, the host machine must have GNU tar installed, and must be able to access the directory containing the backup scripts and their support files using the same file name that is used on the machine where the scripts are run (i.e., what pwd will print when in that directory on that machine). If the host that contains the file system does not have this capability, you can specify another host as long as it can access the file system through NFS.
If the list of file systems is very long you may wish to put it in a separate file. This file is usually named ‘/etc/backup/dirs’, but this name may be overridden in ‘backup-specs’ using the DIRLIST variable.
DIRLIST: The name of the file that contains a list of file systems to back up or restore. By default it is ‘/etc/backup/dirs’.
BACKUP_FILES: A list of individual files to be dumped (for backup), or restored (for restore). These should be accessible from the machine on which the backup script is run.
If the list of individual files is very long you may wish to store it in a separate file. This file is usually named ‘/etc/backup/files’, but this name may be overridden in ‘backup-specs’ using the FILELIST variable.
FILELIST: The name of the file that contains a list of individual files to back up or restore. By default it is ‘/etc/backup/files’.
Full file name of the mt binary.
RSH: Full file name of the rsh binary or its equivalent. You may wish to set it to ssh to improve security. In this case you will have to use public key authentication.
RSH_COMMAND: Full file name of the rsh binary on remote machines. This will be passed via the ‘--rsh-command’ option to the remote invocation of GNU tar.
Name of temporary file to hold volume numbers. This needs to be accessible by all the machines which have file systems to be dumped.
Name of exclude file list. An exclude file list is a file located on the remote machine and containing the list of files to be excluded from the backup. Exclude file lists are searched for in the ‘/etc/tar-backup’ directory. A common use for exclude file lists is to exclude files containing security-sensitive information (e.g., ‘/etc/shadow’) from backups. This variable affects only backup.
Time to sleep between dumps of any two successive file systems. This variable affects only backup.
Script to be run when it’s time to insert a new tape for the next volume. Administrators may want to tailor this script for their site. If this variable isn’t set, GNU tar will display its built-in prompt, and will expect confirmation from the console. For the description of the default prompt, see change volume prompt.
Message to display on the terminal while waiting for dump time. Usually this will just be some literal text.
Full file name of the GNU tar executable. If this is not set, the backup scripts will search for tar in the current shell path.
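Entries in the file system list pair a host with a directory, as in ‘albert:/fs/fsf’. The sketch below shows one way such an entry can be split in plain shell; it illustrates the format only and is not taken from the backup scripts. The function name split_entry and the sample list are hypothetical.

```shell
#!/bin/sh
# Sample list in host:file-system form; the entries are placeholders.
BACKUP_DIRS="
  albert:/fs/fsf
  apple-gunkies:/usr"

# split_entry ENTRY: print the host and the file system of one entry.
split_entry() {
    # Everything before the first colon is the host, the rest the path.
    host=${1%%:*}
    fs=${1#*:}
    printf '%s %s\n' "$host" "$fs"
}

# $BACKUP_DIRS is left unquoted on purpose so that it splits on whitespace.
for entry in $BACKUP_DIRS; do
    split_entry "$entry"
done
```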
Magnetic Tape Control
Backup scripts access the tape device using special hook functions. These functions take a single argument: the name of the tape device. Their names are kept in the following variables:
MT_BEGIN: The name of the begin function. This function is called before accessing the drive. By default it retensions the tape:
    MT_BEGIN=mt_begin

    mt_begin() {
        mt -f "$1" retension
    }
MT_REWIND: The name of the rewind function. The default definition is as follows:
    MT_REWIND=mt_rewind

    mt_rewind() {
        mt -f "$1" rewind
    }
MT_OFFLINE: The name of the function switching the tape off line. By default it is defined as follows:
    MT_OFFLINE=mt_offline

    mt_offline() {
        mt -f "$1" offl
    }
MT_STATUS: The name of the function used to obtain the status of the archive device, including error count. Default definition:
    MT_STATUS=mt_status

    mt_status() {
        mt -f "$1" status
    }
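When dumps go to a regular file rather than a tape drive, the mt calls in the default definitions have nothing to act on. One possible override for ‘backup-specs’ is sketched below, assuming that skipping the operation is acceptable at your site; the function name my_mt_begin is hypothetical.

```shell
#!/bin/sh
# Retension only when the archive device is a character device (such as
# a tape drive); for regular files or missing devices just say so.
my_mt_begin() {
    if [ -c "$1" ]; then
        mt -f "$1" retension
    else
        echo "skipping retension: $1 is not a character device" >&2
    fi
}

MT_BEGIN=my_mt_begin
```

The same pattern could be applied to the rewind, offline and status functions.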
User Hooks
User hooks are shell functions executed before and after each tar invocation. Thus, there are backup hooks, which are executed before and after dumping each file system, and restore hooks, executed before and after restoring a file system. Each user hook is a shell function taking four arguments. Its arguments are, in order:
Current backup or restore level.
Name or IP address of the host machine being dumped or restored.
Full file name of the file system being dumped or restored.
File system name with directory separators replaced with colons. This is useful, e.g., for creating unique files.
The following variables keep the names of user hook functions:
Dump begin function. It is executed before dumping the file system.
Executed after dumping the file system.
Executed before restoring the file system.
Executed after restoring the file system.
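A hook for ‘backup-specs’ might look like the sketch below, which logs what is about to be dumped. The function name log_dump_begin is hypothetical. The variable name DUMP_BEGIN is the one the GNU tar backup scripts use for the dump begin function; since the names are not shown in the list above, treat it as an assumption and check it against your copy of the scripts.

```shell
#!/bin/sh
# Hypothetical dump-begin hook. Arguments, in order: level, host,
# file system, and the file system name with directory separators
# replaced by colons.
log_dump_begin() {
    level=$1 host=$2 fs=$3 fsname=$4
    echo "level $level dump of $host:$fs starting (tag $fsname)"
}

# Assumed variable name for the dump begin function.
DUMP_BEGIN=log_dump_begin

# Manual invocation with sample arguments, to see what the hook prints.
log_dump_begin 0 albert /fs/fsf :fs:fsf
```

The fourth argument is convenient as part of a file name, for instance for a per-file-system lock or log file.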
An Example Text of ‘Backup-specs’
The following is an example of ‘backup-specs’:
    # site-specific parameters for file system backup.

    ADMINISTRATOR=friedman
    BACKUP_HOUR=1
    TAPE_FILE=/dev/nrsmt0

    # Use ssh instead of the less secure rsh
    RSH=/usr/bin/ssh
    RSH_COMMAND=/usr/bin/ssh

    # Override MT_STATUS function:
    my_status() {
        mts -t $TAPE_FILE
    }
    MT_STATUS=my_status

    # Disable MT_OFFLINE function
    MT_OFFLINE=:

    BLOCKING=124
    BACKUP_DIRS="
        albert:/fs/fsf
        apple-gunkies:/gd
        albert:/fs/gd2
        albert:/fs/gp
        geech:/usr/jla
        churchy:/usr/roland
        albert:/
        albert:/usr
        apple-gunkies:/
        apple-gunkies:/usr
        gnu:/hack
        gnu:/u
        apple-gunkies:/com/mailer/gnu
        apple-gunkies:/com/archive/gnu"

    BACKUP_FILES="/com/mailer/aliases /com/mailer/league*[a-z]"
Using the Backup Scripts
The syntax for running a backup script is:
backup --level=level --time=time
The ‘--level’ option requests the dump level. Thus, to produce a full dump, specify ‘--level=0’ (this is the default, so ‘--level’ may be omitted if its value is 0).
The ‘--time’ option determines when the backup should be run. Time may take three forms:
‘hh:mm’: The dump must be run at hh hours mm minutes.
‘hh’: The dump must be run at hh hours.
‘now’: The dump must be run immediately.
You should start a script with a tape or disk mounted. Once you start a script, it prompts you for new tapes or disks as it needs them. Media volumes don’t have to correspond to archive files; a multi-volume archive can be started in the middle of a tape that already contains the end of another multi-volume archive. The restore script prompts for media by its archive volume, so to avoid an error message you should keep track of which tape (or disk) contains which volume of the archive (see section Using the Restore Script).
The backup scripts write two files on the file system. The first is a record file in ‘/etc/tar-backup/’, which is used by the scripts to store and retrieve information about which files were dumped. This file is not meant to be read by humans, and should not be deleted by them. See section Format of the Incremental Snapshot Files, for a more detailed explanation of this file.
The second file is a log file containing the names of the file systems and files dumped, what time the backup was made, and any error messages that were generated, as well as how much space was left in the media volume after the last volume of the archive was written. You should check this log file after every backup. The file name is ‘log-mm-dd-yyyy-level-n’, where mm-dd-yyyy represents current date, and n represents current dump level number.
The script also prints the name of each system being dumped to the standard output.
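Since the log file name follows a fixed pattern, the name for today's dump at a given level can be computed when checking logs from another script. A small sketch follows; the function name log_name is hypothetical, and only the base name is built because the directory holding the logs is not stated here.

```shell
#!/bin/sh
# log_name LEVEL: print a name of the form log-mm-dd-yyyy-level-n
# for today's date and dump level n.
log_name() {
    printf 'log-%s-level-%s\n' "$(date +%m-%d-%Y)" "$1"
}

log_name 0
```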
The following is the full list of options accepted by the backup script:
Do backup level level (default 0).
Force backup even if today’s log file already exists.
Set verbosity level. The higher the level is, the more debugging information will be output during execution. Default level is 100, which means the highest debugging level.
Wait till time, then do backup.
Display short help message and exit.
Display information about the program’s name, version, origin and legal status, all on standard output, and then exit successfully.
Using the Restore Script
To restore files that were archived using a scripted backup, use the restore script. Its usage is quite straightforward. In the simplest form, invoke ‘restore --all’; it will then restore all the file systems and files specified in ‘backup-specs’ (see section BACKUP_DIRS).
You may select the file systems (and/or files) to restore by giving restore a list of patterns on its command line. For example, running
restore 'albert:*'
will restore all file systems on the machine ‘albert’. A more complicated example:
restore 'albert:*' '*:/var'
This command will restore all file systems on the machine ‘albert’ as well as the ‘/var’ file system on all machines.
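The selection rule can be mimicked with ordinary shell pattern matching. The sketch below illustrates the pattern semantics only; the matching code of the restore script itself is not shown in this text, and the function name matches_any and the sample entries are hypothetical.

```shell
#!/bin/sh
# matches_any ENTRY PATTERN...: succeed if ENTRY matches at least one
# of the given shell patterns.
matches_any() {
    entry=$1
    shift
    for pattern in "$@"; do
        # An unquoted $pattern is interpreted as a glob by "case".
        case $entry in
            $pattern) return 0 ;;
        esac
    done
    return 1
}

# Apply the two patterns from the example to a sample list of entries.
for entry in albert:/usr albert:/var geech:/var geech:/usr; do
    if matches_any "$entry" 'albert:*' '*:/var'; then
        echo "restore $entry"
    fi
done
```

The patterns are quoted on the command line so that the invoking shell does not expand them against file names in the current directory.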
By default restore will start restoring files from the lowest available dump level (usually zero) and will continue through all available dump levels. There may be situations where such a thorough restore is not necessary. For example, you may wish to restore only files from the recent level one backup. To do so, use the ‘--level’ option, as shown in the example below:
restore --level=1
The full list of options accepted by restore follows:
Restore all file systems and files specified in ‘backup-specs’.
Start restoring from the given backup level, instead of the default 0.
Set verbosity level. The higher the level is, the more debugging information will be output during execution. Default level is 100, which means the highest debugging level.
Display short help message and exit.
Display information about the program’s name, version, origin and legal status, all on standard output, and then exit successfully.
You should start the restore script with the media containing the first volume of the archive mounted. The script will prompt for other volumes as they are needed. If the archive is on tape, you don’t need to rewind the tape to its beginning: if the tape head is positioned past the beginning of the archive, the script will rewind the tape as needed. See section Tape Positions and Tape Marks, for a discussion of tape positioning.
Warning: The script will delete files from the active file system if they were not in the file system when the archive was made.
See section Using tar to Perform Incremental Dumps, for an explanation of how the script makes that determination.
This document was generated on August 23, 2023 using texi2html 5.0.