6 Choosing Files and Names for tar
Certain options to tar enable you to specify a name for your
archive. Other options let you decide which files to include or exclude
from the archive, based on when or whether files were modified, whether
the file names do or don’t match specified patterns, or whether files
are in specified directories.
This chapter discusses these options in detail.
6.1 Choosing and Naming Archive Files: Choosing the Archive’s Name
6.2 Selecting Archive Members
6.3 Reading Names from a File
6.4 Excluding Some Files
6.5 Wildcards Patterns and Matching
6.6 Quoting Member Names: Ways of Quoting Special Characters in Names
6.7 Modifying File and Member Names
6.8 Operating Only on New Files
6.9 Descending into Directories
6.10 Crossing File System Boundaries
6.1 Choosing and Naming Archive Files
By default, tar uses an archive file name that was compiled in when it was built on the system; usually this name refers to some physical tape drive on the machine. However, the person who installed tar on the system may not have set the default to a value that is meaningful to most users. As a result, you will usually want to tell tar where to find (or create) the archive. The ‘--file=archive-name’ (‘-f archive-name’) option lets you name a file to use as the archive instead of the default archive file location.
‘--file=archive-name’
‘-f archive-name’
    Name the archive to create or operate on. Use in conjunction with any operation.
For example, in this tar command,
$ tar -cvf collection.tar blues folk jazz
‘collection.tar’ is the name of the archive. It must directly
follow the ‘-f’ option, since whatever directly follows ‘-f’
will end up naming the archive. If you neglect to specify an archive name, you may end up overwriting a file in the working directory with the archive you create since tar will use this file’s name for the archive name.
An archive can be saved as a file in the file system, sent through a pipe or over a network, or written to an I/O device such as a tape, floppy disk, or CD write drive.
If you do not name the archive, tar uses the value of the environment variable TAPE as the file name for the archive. If that is not available, tar uses a default, compiled-in archive name, usually that for tape unit zero (i.e., ‘/dev/tu00’).
If you use ‘-’ as an archive-name, tar reads the archive from standard input (when listing or extracting files), or writes it to standard output (when creating an archive). If you use ‘-’ as an archive-name when modifying an archive, tar reads the original archive from its standard input and writes the entire new archive to its standard output.
The following example is a convenient way of copying directory hierarchy from ‘sourcedir’ to ‘targetdir’.
$ (cd sourcedir; tar -cf - .) | (cd targetdir; tar -xpf -)
The ‘-C’ option lets you avoid using subshells:
$ tar -C sourcedir -cf - . | tar -C targetdir -xpf -
In both examples above, the leftmost tar invocation archives the contents of ‘sourcedir’ to the standard output, while the rightmost one reads this archive from its standard input and extracts it. The ‘-p’ option tells it to restore permissions of the extracted files.
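As a minimal sketch of the pipe-based copy described above, the following script builds a throwaway source tree and copies it using the ‘-C’ form. It assumes that GNU tar is installed as tar; the directory and file names created under the temporary directory are illustrative only.

```shell
# Sketch: copy a directory hierarchy through a pipe (assumes GNU tar as `tar`).
set -e
work=$(mktemp -d)                 # scratch area; all names below are hypothetical
mkdir -p "$work/sourcedir/sub" "$work/targetdir"
echo hello > "$work/sourcedir/sub/file.txt"

# The left tar writes the archive to standard output;
# the right tar reads it from standard input and extracts it.
tar -C "$work/sourcedir" -cf - . | tar -C "$work/targetdir" -xpf -

copied=$(cat "$work/targetdir/sub/file.txt")
printf 'copied content: %s\n' "$copied"
rm -rf "$work"
```

Because no intermediate archive file is written, this form is convenient when disk space is tight.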
To specify an archive file on a device attached to a remote machine, use the following:
--file=hostname:/dev/file-name
tar will set up the remote connection, if possible, and prompt you for a username and password. If you use ‘--file=@hostname:/dev/file-name’, tar will attempt to set up the remote connection using your username as the username on the remote machine.
If the archive file name includes a colon (‘:’), then it is assumed to be a file on another machine. If the archive file is ‘user@host:file’, then file is used on the host host. The remote host is accessed using the rsh program, with a username of user. If the username is omitted (along with the ‘@’ sign), then your user name will be used. (This is the normal rsh behavior.) It is necessary for the remote machine, in addition to permitting your rsh access, to have the ‘rmt’ program installed (this command is included in the GNU tar distribution and by default is installed under ‘prefix/libexec/rmt’, where prefix means your installation prefix). If you need to use a file whose name includes a colon, then the remote tape drive behavior can be inhibited by using the ‘--force-local’ option.
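The effect of ‘--force-local’ can be sketched as follows. The script assumes GNU tar is installed as tar; the archive name ‘backup:2024.tar’ and the file ‘notes.txt’ are made up for the illustration.

```shell
# Sketch: use a local archive whose name contains a colon (assumes GNU tar as `tar`).
set -e
work=$(mktemp -d)
cd "$work"
echo data > notes.txt             # hypothetical file to archive

# Without --force-local, tar would take "backup" to be a remote host name.
tar --force-local -cf 'backup:2024.tar' notes.txt
listing=$(tar --force-local -tf 'backup:2024.tar')
printf '%s\n' "$listing"

cd /
rm -rf "$work"
```

The option is needed on every invocation that names the archive, not only when creating it.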
When the archive is being created to ‘/dev/null’, GNU tar tries to minimize input and output operations. The Amanda backup system, when used with GNU tar, has an initial sizing pass which uses this feature.
6.2 Selecting Archive Members
File Name arguments specify which files in the file system tar operates on, when creating or adding to an archive, or which archive members tar operates on, when reading or deleting from an archive. See section The Five Advanced tar Operations.
To specify file names, you can include them as the last arguments on the command line, as follows:
tar operation [option1 option2 …] [file name-1 file name-2 …]
If a file name begins with a dash (‘-’), precede it with the ‘--add-file’ option to prevent it from being treated as an option.
By default, GNU tar attempts to unquote each file or member name, replacing escape sequences according to the following table:
Escape | Replaced with |
---|---|
\a | Audible bell (ASCII 7) |
\b | Backspace (ASCII 8) |
\f | Form feed (ASCII 12) |
\n | New line (ASCII 10) |
\r | Carriage return (ASCII 13) |
\t | Horizontal tabulation (ASCII 9) |
\v | Vertical tabulation (ASCII 11) |
\? | ASCII 127 |
\NNN | ASCII character NNN (NNN should be an octal number of up to 3 digits) |
A backslash followed by any other symbol is retained.
This default behavior is controlled by the following command line options:
‘--unquote’
    Enable unquoting input file or member names (default).
‘--no-unquote’
    Disable unquoting input file or member names.
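The following sketch shows one situation where ‘--no-unquote’ matters: a file whose name contains a literal backslash followed by ‘t’. It assumes GNU tar is installed as tar; the file name ‘back\tslash’ is invented for the example.

```shell
# Sketch: archive a file whose name contains a literal backslash (assumes GNU tar).
set -e
work=$(mktemp -d)
cd "$work"
: > 'back\tslash'                 # the name holds a backslash and a "t", not a tab

# --no-unquote stops tar from turning "\t" in the argument into a tab character.
tar --no-unquote -cf quoted.tar 'back\tslash'

# List the member exactly as stored, without output quoting.
listing=$(tar -tf quoted.tar --quoting-style=literal)
printf '%s\n' "$listing"

cd /
rm -rf "$work"
```

With the default ‘--unquote’ behavior, the same argument would be looked up as a name containing a tab character instead.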
If you specify a directory name as a file name argument, all the files in that directory are operated on by tar.
If you do not specify files, the behavior of tar differs depending on the operation mode, as described below:
When tar is invoked with ‘--create’ (‘-c’), tar will stop immediately, reporting the following:
$ tar cf a.tar
tar: Cowardly refusing to create an empty archive
Try 'tar --help' or 'tar --usage' for more information.
If you specify either ‘--list’ (‘-t’) or ‘--extract’ (‘--get’, ‘-x’), tar operates on all the archive members in the archive.
If run with the ‘--diff’ option, tar will compare the archive with the contents of the current working directory.
If you specify any other operation, tar does nothing.
By default, tar takes file names from the command line. However, there are other ways to specify file or member names, or to modify the manner in which tar selects the files or members upon which to operate. In general, these methods work both for specifying the names of files and archive members.
6.3 Reading Names from a File
Instead of giving the names of files or archive members on the command line, you can put the names into a file, and then use the ‘--files-from=file-of-names’ (‘-T file-of-names’) option to tar. Give the name of the file which contains the list of files to include as the argument to ‘--files-from’. In the list, the file names should be separated by newlines. You will frequently use this option when you have generated the list of files to archive with the find utility.
‘--files-from=file-name’
‘-T file-name’
    Get names to extract or create from file file-name.
If you give a single dash as a file name for ‘--files-from’ (i.e., you specify either --files-from=- or -T -), then the file names are read from standard input.
Unless you are running tar with ‘--create’, you cannot use both --files-from=- and --file=- (-f -) in the same command.
Any number of ‘-T’ options can be given in the command line.
The following example shows how to use find to generate a list of files smaller than 400 blocks in length(15) and put that list into a file called ‘small-files’. You can then use the ‘-T’ option to tar to specify the files from that file, ‘small-files’, to create the archive ‘little.tgz’. (The ‘-z’ option to tar compresses the archive with gzip; see section Creating and Reading Compressed Archives for more information.)
$ find . -size -400 -print > small-files
$ tar -c -v -z -T small-files -f little.tgz
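A self-contained variant of this workflow might look like the sketch below. It assumes GNU tar and a find that supports ‘-name’ and ‘-print’; the ‘data’ directory, its files, and the list name ‘file-list’ are hypothetical.

```shell
# Sketch: build a file list with find and archive exactly those files.
set -e
export LC_ALL=C                   # make the sort order below predictable
work=$(mktemp -d)
cd "$work"
mkdir data
echo a > data/one.txt
echo b > data/two.txt
echo c > data/skip.bin            # not selected by find, so not archived

# One file name per line, as --files-from expects by default.
find data -type f -name '*.txt' -print > file-list
tar -c -T file-list -f listed.tar

members=$(tar -tf listed.tar | sort)
printf '%s\n' "$members"

cd /
rm -rf "$work"
```

Only the names written to the list end up in the archive; the directory ‘data’ itself is not recorded as a separate member because it was never listed.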
By default, each line read from the file list is first stripped of any leading and trailing whitespace. If the resulting string begins with a ‘-’ character, it is considered a tar option and is processed accordingly(16). Only a subset of GNU tar options is allowed for use in file lists. For a list of such options, see Position-Sensitive Options.
For example, a common use of this feature is to change to another directory by specifying the ‘-C’ option:
$ cat list
-C/etc
passwd
hosts
-C/lib
libc.a
$ tar -c -f foo.tar --files-from list
In this example, tar will first switch to the ‘/etc’ directory and add files ‘passwd’ and ‘hosts’ to the archive. Then it will change to the ‘/lib’ directory and will archive the file ‘libc.a’. Thus, the resulting archive ‘foo.tar’ will contain:
$ tar tf foo.tar
passwd
hosts
libc.a
Note that any options used in the file list remain in effect for the rest of the command line. For example, using the same ‘list’ file as above, the following command
$ tar -c -f foo.tar --files-from list libcurses.a
will look for file ‘libcurses.a’ in the directory ‘/lib’, because it was used with the last ‘-C’ option (see section Position-Sensitive Options).
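The ‘-C’ handling in file lists can be tried safely with scratch directories standing in for ‘/etc’ and ‘/lib’. This is a sketch only: it assumes GNU tar is installed as tar and that mktemp returns an absolute path without whitespace; the directories and empty files are placeholders.

```shell
# Sketch: a file list that switches directories with -C (assumes GNU tar).
set -e
work=$(mktemp -d)
cd "$work"
mkdir etc lib                     # stand-ins for /etc and /lib
: > etc/passwd
: > etc/hosts
: > lib/libc.a

# Each -C line changes the directory used for the names that follow it.
printf '%s\n' "-C$work/etc" passwd hosts "-C$work/lib" libc.a > list
tar -c -f foo.tar --files-from list

listing=$(tar -tf foo.tar)
printf '%s\n' "$listing"

cd /
rm -rf "$work"
```

The members are stored without any directory prefix, because each name was resolved relative to the directory selected by the preceding ‘-C’ line.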
If such option handling is undesirable, use the ‘--verbatim-files-from’ option. When this option is in effect, each line read from the file list is treated as a file name. Notice that this means, in particular, that no whitespace trimming is performed.
The ‘--verbatim-files-from’ option affects all ‘-T’ options that follow it in the command line. The default behavior can be restored using the ‘--no-verbatim-files-from’ option.
To disable option handling for a single file name, use the ‘--add-file’ option, e.g.: --add-file=--my-file.
You can use any GNU tar command line options in the file list file, including the ‘--files-from’ option itself. This allows you to include the contents of one file list in another. Note, however, that options that control file list processing, such as ‘--verbatim-files-from’ or ‘--null’, won’t affect the file they appear in. They will affect the next ‘--files-from’ option, if there is any.
6.3.1 NUL-Terminated File Names
The ‘--null’ option causes ‘--files-from=file-of-names’ (‘-T file-of-names’) to read file names terminated by a NUL instead of a newline, so files whose names contain newlines can be archived using ‘--files-from’.
‘--null’
    Only consider NUL-terminated file names, instead of file names that terminate in a newline.
‘--no-null’
    Undo the effect of any previous ‘--null’ option.
The ‘--null’ option is just like the one in GNU xargs and cpio, and is useful with the ‘-print0’ predicate of GNU find. In tar, ‘--null’ also disables special handling for file names that begin with a dash (similar to the ‘--verbatim-files-from’ option).
This example shows how to use find to generate a list of files larger than 800 blocks in length and put that list into a file called ‘long-files’. The ‘-print0’ option to find is just like ‘-print’, except that it separates files with a NUL rather than with a newline. You can then run tar with both the ‘--null’ and ‘-T’ options to specify that tar gets the files from that file, ‘long-files’, to create the archive ‘big.tar’. The ‘--null’ option to tar will cause tar to recognize the NUL separator between files.
$ find . -size +800 -print0 > long-files
$ tar -c -v --null --files-from=long-files --file=big.tar
The ‘--no-null’ option can be used if you need to read both NUL-terminated and newline-terminated files on the same command line. For example, if ‘flist’ is a newline-terminated file, then the following command can be used to combine it with the above command:
$ find . -size +800 -print0 | tar -c -f big.tar --null -T - --no-null -T flist
This example uses short options for typographic reasons, to avoid very long lines.
GNU tar tries to automatically detect NUL-terminated file lists, so in many cases it is safe to use them even without the ‘--null’ option. In this case tar will print a warning and continue reading such a file as if ‘--null’ were actually given:
$ find . -size +800 -print0 | tar -c -f big.tar -T -
tar: -: file name read contains nul character
The null terminator, however, remains in effect only for this particular file; any following ‘-T’ options will assume newline termination. Of course, the null autodetection applies to these eventual surplus ‘-T’ options as well.
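To see why NUL termination is useful, consider this sketch, which archives a file whose name contains a newline. It assumes GNU tar and a find that supports ‘-print0’; the ‘music’ directory and both file names are hypothetical.

```shell
# Sketch: archive names that contain spaces and newlines via a NUL-separated list.
set -e
work=$(mktemp -d)
cd "$work"
mkdir music
: > 'music/a song.ogg'
: > "$(printf 'music/two\nlines.ogg')"     # file name with an embedded newline

# find emits NUL-terminated names; --null tells tar to split on NUL only.
find music -type f -print0 | tar -c -f null.tar --null -T -

# With escape quoting, each member is printed on exactly one line.
count=$(tar -tf null.tar --quoting-style=escape | wc -l | tr -d ' ')
printf 'members: %s\n' "$count"

cd /
rm -rf "$work"
```

A newline-separated list would have split the second name into two nonexistent files.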
6.4 Excluding Some Files
To avoid operating on files whose names match a particular pattern, use the ‘--exclude’ or ‘--exclude-from’ options.
‘--exclude=pattern’
    Causes tar to ignore files that match the pattern.
The ‘--exclude=pattern’ option prevents any file or member whose name matches the shell wildcard (pattern) from being operated on. For example, to create an archive with all the contents of the directory ‘src’ except for files whose names end in ‘.o’, use the command ‘tar -cf src.tar --exclude='*.o' src’.
You may give multiple ‘--exclude’ options.
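The ‘--exclude’ example above can be reproduced with the following sketch. It assumes GNU tar is installed as tar; the ‘src’ directory and its two empty files are placeholders.

```shell
# Sketch: exclude object files while archiving a directory (assumes GNU tar).
set -e
export LC_ALL=C                   # make the sort order below predictable
work=$(mktemp -d)
cd "$work"
mkdir src
: > src/main.c
: > src/main.o

# The pattern is quoted so that the shell passes it to tar unexpanded.
tar -cf src.tar --exclude='*.o' src

members=$(tar -tf src.tar | sort)
printf '%s\n' "$members"

cd /
rm -rf "$work"
```

The directory member and ‘src/main.c’ are archived; ‘src/main.o’ is skipped.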
‘--exclude-from=file’
‘-X file’
    Causes tar to ignore files that match the patterns listed in file.
Use the ‘--exclude-from’ option to read a list of patterns, one per line, from file; tar will ignore files matching those patterns. Thus if tar is called as ‘tar -c -X foo .’ and the file ‘foo’ contains a single line ‘*.o’, no files whose names end in ‘.o’ will be added to the archive.
Note that lines from file are read verbatim. A frequent error is leaving extra whitespace after a file name, which is difficult to spot in a text editor.
However, empty lines are OK.
When archiving directories that are under some version control system (VCS), it is often convenient to read exclusion patterns from this VCS’ ignore files (e.g. ‘.cvsignore’, ‘.gitignore’, etc.). The following options make this possible:
‘--exclude-vcs-ignores’
    Before archiving a directory, see if it contains any of the following files: ‘.cvsignore’, ‘.gitignore’, ‘.bzrignore’, or ‘.hgignore’. If so, read ignore patterns from these files.
    The patterns are treated much as the corresponding VCS would treat them, i.e.:
    ‘.cvsignore’
        Contains shell-style globbing patterns that apply only to the directory where this file resides. No comments are allowed in the file. Empty lines are ignored.
    ‘.gitignore’
        Contains shell-style globbing patterns. Applies to the directory where ‘.gitignore’ is located and all its subdirectories.
        Any line beginning with a ‘#’ is a comment. Backslash escapes the comment character.
    ‘.bzrignore’
        Contains shell globbing-patterns and regular expressions (if prefixed with ‘RE:’(17)). Patterns affect the directory and all its subdirectories.
        Any line beginning with a ‘#’ is a comment.
    ‘.hgignore’
        Contains POSIX regular expressions(18). The line ‘syntax: glob’ switches to shell globbing patterns. The line ‘syntax: regexp’ switches back. Comments begin with a ‘#’. Patterns affect the directory and all its subdirectories.
‘--exclude-ignore=file’
    Before dumping a directory, tar checks if it contains file. If so, exclusion patterns are read from this file. The patterns affect only the directory itself.
‘--exclude-ignore-recursive=file’
    Same as ‘--exclude-ignore’, except that the patterns read affect both the directory where file resides and all its subdirectories.
‘--exclude-vcs’
    Exclude files and directories used by the following version control systems: ‘CVS’, ‘RCS’, ‘SCCS’, ‘SVN’, ‘Arch’, ‘Bazaar’, ‘Mercurial’, and ‘Darcs’.
    As of version 1.35, the following files are excluded:
‘--exclude-backups’
    Exclude backup and lock files. This option causes exclusion of files that match the following shell globbing patterns:
When creating an archive, the ‘--exclude-caches’ option family causes tar to exclude all directories that contain a cache directory tag. A cache directory tag is a short file with the well-known name ‘CACHEDIR.TAG’ and having a standard header specified in http://www.brynosaurus.com/cachedir/spec.html. Various applications write cache directory tags into directories they use to hold regenerable, non-precious data, so that such data can be more easily excluded from backups.
There are three ‘exclude-caches’ options, each providing different exclusion semantics:
‘--exclude-caches’
    Do not archive the contents of the directory, but archive the directory itself and the ‘CACHEDIR.TAG’ file.
‘--exclude-caches-under’
    Do not archive the contents of the directory, nor the ‘CACHEDIR.TAG’ file; archive only the directory itself.
‘--exclude-caches-all’
    Omit directories containing the ‘CACHEDIR.TAG’ file entirely.
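The sketch below tags a directory as a cache and then archives its parent with ‘--exclude-caches-all’. It assumes GNU tar is installed as tar; the directory layout is hypothetical, and the signature line is the header given by the cache directory tagging specification referenced above.

```shell
# Sketch: skip a tagged cache directory entirely (assumes GNU tar).
set -e
export LC_ALL=C                   # make the sort order below predictable
work=$(mktemp -d)
cd "$work"
mkdir -p dir/cache
: > dir/keep.txt
: > dir/cache/blob                # regenerable data we do not want to back up

# A valid tag file must begin with this exact signature line.
printf 'Signature: 8a477f597d28d172789f06886806bc55\n' > dir/cache/CACHEDIR.TAG

# tar reports the skipped directory on standard error; hide that here.
tar -cf backup.tar --exclude-caches-all dir 2>/dev/null

members=$(tar -tf backup.tar | sort)
printf '%s\n' "$members"

cd /
rm -rf "$work"
```

A file merely named ‘CACHEDIR.TAG’ without the signature is not treated as a tag, so the header line is essential.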
Another option family, ‘--exclude-tag’, provides a generalization of this concept. It takes a single argument, a file name to look for. Any directory that contains this file will be excluded from the dump. As with ‘exclude-caches’, there are three options in this option family:
‘--exclude-tag=file’
    Do not dump the contents of the directory, but dump the directory itself and the file.
‘--exclude-tag-under=file’
    Do not dump the contents of the directory, nor the file; archive only the directory itself.
‘--exclude-tag-all=file’
    Omit directories containing file entirely.
Multiple ‘--exclude-tag*’ options can be given.
For example, given this directory:
$ find dir
dir
dir/blues
dir/jazz
dir/folk
dir/folk/tagfile
dir/folk/sanjuan
dir/folk/trote
The ‘--exclude-tag’ option will produce the following:
$ tar -cf archive.tar --exclude-tag=tagfile -v dir
dir/
dir/blues
dir/jazz
dir/folk/
tar: dir/folk/: contains a cache directory tag tagfile; contents not dumped
dir/folk/tagfile
Both the ‘dir/folk’ directory and its tagfile are preserved in the archive, but the rest of the files in this directory are not.
Now, using the ‘--exclude-tag-under’ option will exclude ‘tagfile’ from the dump, while still preserving the directory itself, as shown in this example:
$ tar -cf archive.tar --exclude-tag-under=tagfile -v dir
dir/
dir/blues
dir/jazz
dir/folk/
./tar: dir/folk/: contains a cache directory tag tagfile; contents not dumped
Finally, using ‘--exclude-tag-all’ omits the ‘dir/folk’ directory entirely:
$ tar -cf archive.tar --exclude-tag-all=tagfile -v dir
dir/
dir/blues
dir/jazz
./tar: dir/folk/: contains a cache directory tag tagfile; directory not dumped
Problems with Using the exclude Options
Some users find ‘exclude’ options confusing. Here are some common pitfalls:
tar does not act on a file name explicitly listed on the command line if one of its file name components is excluded. In the example above, if you create an archive and exclude files that end with ‘.o’, but explicitly name the file ‘dir.o/foo’ after all the options have been listed, ‘dir.o/foo’ will be excluded from the archive.
When you use ‘--exclude=pattern’, quote the pattern so that tar sees wildcard characters like ‘*’. If you do not do this, the shell might expand the ‘*’ itself using files at hand, so tar might receive a list of files instead of one pattern, or none at all, making the command invalid. This might not correspond to what you want.
For example, write:
$ tar -c -f archive.tar --exclude '*.o' directory
rather than:
# Wrong!
$ tar -c -f archive.tar --exclude *.o directory
Use shell (globbing) syntax rather than regexp syntax when using exclude options in tar. If you try to use regexp syntax to describe files to be excluded, your command might fail.
In earlier versions of tar, what is now the ‘--exclude-from’ option was called ‘--exclude’ instead. Now, ‘--exclude’ applies to patterns listed on the command line and ‘--exclude-from’ applies to patterns listed in a file.
6.5 Wildcards Patterns and Matching
Globbing is the operation by which wildcard characters, ‘*’ or ‘?’ for example, are replaced and expanded into all existing files matching the given pattern. GNU tar can use wildcard patterns for matching (or globbing) archive members when extracting from or listing an archive. Wildcard patterns are also used for verifying volume labels of tar archives. This section explains wildcard syntax for tar.
A pattern should be written according to shell syntax, using wildcard characters to effect globbing. Most characters in the pattern stand for themselves in the matched string, and case is significant: ‘a’ will match only ‘a’, and not ‘A’. The character ‘?’ in the pattern matches any single character in the matched string. The character ‘*’ in the pattern matches zero, one, or more single characters in the matched string. The character ‘\’ says to take the following character of the pattern literally; it is useful when one needs to match the ‘?’, ‘*’, ‘[’ or ‘\’ characters, themselves.
The character ‘[’, up to the matching ‘]’, introduces a character class. A character class is a list of acceptable characters for the next single character of the matched string. For example, ‘[abcde]’ would match any of the first five letters of the alphabet. Note that within a character class, all of the “special characters” listed above other than ‘\’ lose their special meaning; for example, ‘[-\\[*?]]’ would match any of the characters, ‘-’, ‘\’, ‘[’, ‘*’, ‘?’, or ‘]’. (Due to parsing constraints, the characters ‘-’ and ‘]’ must either come first or last in a character class.)
If the first character of the class after the opening ‘[’ is ‘!’ or ‘^’, then the meaning of the class is reversed. Rather than listing the characters to match, it lists those characters which are forbidden as the next single character of the matched string.
Other characters of the class stand for themselves. The special construction ‘[a-e]’, using a hyphen between two letters, is meant to represent all characters between a and e, inclusive.
Periods (‘.’) or forward slashes (‘/’) are not considered special for wildcard matches. However, if a pattern completely matches a directory prefix of a matched string, then it matches the full matched string: thus, excluding a directory also excludes all the files beneath it.
Controlling Pattern-Matching
For the purposes of this section, we call exclusion members all member names obtained while processing ‘--exclude’ and ‘--exclude-from’ options, and inclusion members those member names that were given in the command line or read from the file specified with ‘--files-from’ option.
These two member lists are used in the following operations: ‘--diff’, ‘--extract’, ‘--list’, ‘--update’.
There are no inclusion members in create mode (‘--create’ and ‘--append’), since in this mode the names obtained from the command line refer to files, not archive members.
By default, inclusion members are compared with archive members literally (19) and exclusion members are treated as globbing patterns. For example:
$ tar tf foo.tar
a.c
b.c
a.txt
[remarks]
# Member names are used verbatim:
$ tar -xf foo.tar -v '[remarks]'
[remarks]
# Exclude member names are globbed:
$ tar -xf foo.tar -v --exclude '*.c'
a.txt
[remarks]
This behavior can be altered by using the following options:
‘--wildcards’
    Treat all member names as wildcards.
‘--no-wildcards’
    Treat all member names as literal strings.
Thus, to extract files whose names end in ‘.c’, you can use:
$ tar -xf foo.tar -v --wildcards '*.c'
a.c
b.c
Notice the quoting of the pattern to prevent the shell from interpreting it.
The effect of the ‘--wildcards’ option is canceled by ‘--no-wildcards’. This can be used to pass part of the command line arguments verbatim and the other part as globbing patterns. For example, the following invocation:
$ tar -xf foo.tar --wildcards '*.txt' --no-wildcards '[remarks]'
instructs tar to extract from ‘foo.tar’ all files whose names end in ‘.txt’ and the file named ‘[remarks]’.
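The difference between literal and wildcard member selection can be checked with a small scratch archive, as in this sketch. It assumes GNU tar is installed as tar; the archive is built from empty placeholder files and listed with ‘-t’ rather than extracted.

```shell
# Sketch: literal versus wildcard member names when listing (assumes GNU tar).
set -e
work=$(mktemp -d)
cd "$work"
: > a.c
: > b.c
: > a.txt
: > '[remarks]'
tar -cf foo.tar a.c b.c a.txt '[remarks]'

# Inclusion members are literal by default, so the brackets match themselves.
literal=$(tar -tf foo.tar '[remarks]')

# --wildcards turns the following member names into globbing patterns.
globbed=$(tar -tf foo.tar --wildcards '*.c')

# --no-wildcards switches back to literal matching for later arguments.
mixed=$(tar -tf foo.tar --wildcards '*.txt' --no-wildcards '[remarks]')

printf '%s\n' "$literal" "$globbed" "$mixed"

cd /
rm -rf "$work"
```

Listing with ‘-t’ first is a cheap way to confirm which members a pattern selects before running the same command with ‘-x’.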
Normally, a pattern matches a name if an initial subsequence of the name’s components matches the pattern, where ‘*’, ‘?’, and ‘[...]’ are the usual shell wildcards, ‘\’ escapes wildcards, and wildcards can match ‘/’.
Other than optionally stripping leading ‘/’ from names (see section Absolute File Names), patterns and names are used as-is. For example, trailing ‘/’ is not trimmed from a user-specified name before deciding whether to exclude it.
However, this matching procedure can be altered by the options listed below. These options accumulate. For example:
--ignore-case --exclude='makefile' --no-ignore-case --exclude='readme'
ignores case when excluding ‘makefile’, but not when excluding ‘readme’.
‘--anchored’
‘--no-anchored’
    If anchored, a pattern must match an initial subsequence of the name’s components. Otherwise, the pattern can match any subsequence. Default is ‘--no-anchored’ for exclusion members and ‘--anchored’ for inclusion members.
‘--ignore-case’
‘--no-ignore-case’
    When ignoring case, upper-case patterns match lower-case names and vice versa. When not ignoring case (the default), matching is case-sensitive.
‘--wildcards-match-slash’
‘--no-wildcards-match-slash’
    When wildcards match slash (the default for exclusion members), a wildcard like ‘*’ in the pattern can match a ‘/’ in the name. Otherwise, ‘/’ is matched only by ‘/’.
The ‘--recursion’ and ‘--no-recursion’ options (see section Descending into Directories) also affect how member patterns are interpreted. If recursion is in effect, a pattern matches a name if it matches any of the name’s parent directories.
The following table summarizes pattern-matching default values:
Members | Default settings |
---|---|
Inclusion | ‘--no-wildcards --anchored --no-wildcards-match-slash’ |
Exclusion | ‘--wildcards --no-anchored --wildcards-match-slash’ |
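The exclusion defaults in the table can be illustrated by comparing an unanchored and an anchored exclude pattern. This is a sketch that assumes GNU tar is installed as tar; the directory tree and file names are invented for the purpose.

```shell
# Sketch: effect of --anchored on an exclusion pattern (assumes GNU tar).
set -e
work=$(mktemp -d)
cd "$work"
mkdir -p dir/sub
: > dir/readme
: > dir/sub/readme
: > dir/sub/notes

# Default for exclusion is --no-anchored: 'readme' matches at any depth.
tar -cf unanchored.tar --exclude='readme' dir

# With --anchored the pattern must match from the start of the name,
# so neither dir/readme nor dir/sub/readme is excluded.
tar -cf anchored.tar --anchored --exclude='readme' dir

unanchored_count=$(tar -tf unanchored.tar | wc -l | tr -d ' ')
anchored_count=$(tar -tf anchored.tar | wc -l | tr -d ' ')
printf 'unanchored: %s members, anchored: %s members\n' \
    "$unanchored_count" "$anchored_count"

cd /
rm -rf "$work"
```

Note that ‘--anchored’ has to precede the ‘--exclude’ option it is meant to affect, since these matching options apply to the patterns that follow them.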
6.6 Quoting Member Names
When displaying member names, tar takes care to avoid ambiguities caused by certain characters. This is called name quoting. The characters in question are:
Character | ASCII | Character name |
---|---|---|
\a | 7 | Audible bell |
\b | 8 | Backspace |
\f | 12 | Form feed |
\n | 10 | New line |
\r | 13 | Carriage return |
\t | 9 | Horizontal tabulation |
\v | 11 | Vertical tabulation |
The exact way tar quotes these characters depends on the quoting style. The default quoting style, called escape (see below), uses backslash notation to represent control characters and backslash.
GNU tar offers seven distinct quoting styles, which can be selected using the ‘--quoting-style’ option:
‘--quoting-style=style’
    Sets quoting style. Valid values for style argument are: literal, shell, shell-always, c, escape, locale, clocale.
These styles are described in detail below. To illustrate their effect, we will use an imaginary tar archive ‘arch.tar’ containing the following members:
# 1. Contains horizontal tabulation character.
a       tab
# 2. Contains newline character
a
newline
# 3. Contains a space
a space
# 4. Contains double quotes
a"double"quote
# 5. Contains single quotes
a'single'quote
# 6. Contains a backslash character:
a\backslash
Here is how the usual ls command would have listed them, if they had existed in the current working directory:
$ ls
a\ttab
a\nnewline
a\ space
a"double"quote
a'single'quote
a\\backslash
Quoting styles:
‘literal’
    No quoting, display each character as is:
$ tar tf arch.tar --quoting-style=literal
./
./a space
./a'single'quote
./a"double"quote
./a\backslash
./a       tab
./a
newline
‘shell’
    Display characters the same way Bourne shell does: control characters, except ‘\t’ and ‘\n’, are printed using backslash escapes, ‘\t’ and ‘\n’ are printed as is, and a single quote is printed as ‘\'’. If a name contains any quoted characters, it is enclosed in single quotes. In particular, if a name contains single quotes, it is printed as several single-quoted strings:
$ tar tf arch.tar --quoting-style=shell
./
'./a space'
'./a'\''single'\''quote'
'./a"double"quote'
'./a\backslash'
'./a       tab'
'./a
newline'
‘shell-always’
    Same as ‘shell’, but the names are always enclosed in single quotes:
$ tar tf arch.tar --quoting-style=shell-always
'./'
'./a space'
'./a'\''single'\''quote'
'./a"double"quote'
'./a\backslash'
'./a       tab'
'./a
newline'
‘c’
    Use the notation of the C programming language. All names are enclosed in double quotes. Control characters are quoted using backslash notations, double quotes are represented as ‘\"’, backslash characters are represented as ‘\\’. Single quotes and spaces are not quoted:
$ tar tf arch.tar --quoting-style=c
"./"
"./a space"
"./a'single'quote"
"./a\"double\"quote"
"./a\\backslash"
"./a\ttab"
"./a\nnewline"
‘escape’
    Control characters are printed using backslash notation, and a backslash as ‘\\’. This is the default quoting style, unless it was changed when the package was configured.
$ tar tf arch.tar --quoting-style=escape
./
./a space
./a'single'quote
./a"double"quote
./a\\backslash
./a\ttab
./a\nnewline
‘locale’
    Control characters, single quote and backslash are printed using backslash notation. All names are quoted using left and right quotation marks, appropriate to the current locale. If it does not define quotation marks, use ‘'’ as left and as right quotation marks. Any occurrences of the right quotation mark in a name are escaped with ‘\’. For example:
$ tar tf arch.tar --quoting-style=locale
'./'
'./a space'
'./a\'single\'quote'
'./a"double"quote'
'./a\\backslash'
'./a\ttab'
'./a\nnewline'
‘clocale’
    Same as ‘locale’, but ‘"’ is used for both left and right quotation marks, if not provided by the currently selected locale:
$ tar tf arch.tar --quoting-style=clocale
"./"
"./a space"
"./a'single'quote"
"./a\"double\"quote"
"./a\\backslash"
"./a\ttab"
"./a\nnewline"
You can specify which characters should be quoted in addition to those implied by the current quoting style:
‘--quote-chars=string’
    Always quote characters from string, even if the selected quoting style would not quote them.
For example, using ‘escape’ quoting (compare with the usual escape listing above):
$ tar tf arch.tar --quoting-style=escape --quote-chars=' "'
./
./a\ space
./a'single'quote
./a\"double\"quote
./a\\backslash
./a\ttab
./a\nnewline
To disable quoting of such additional characters, use the following option:
‘--no-quote-chars=string’
    Remove characters listed in string from the list of quoted characters set by the previous ‘--quote-chars’ option.
This option is particularly useful if you have added ‘--quote-chars’ to your TAR_OPTIONS (see TAR_OPTIONS) and wish to disable it for the current invocation.
Note that ‘--no-quote-chars’ does not disable those characters that are quoted by default in the selected quoting style.
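The quoting styles are easy to compare on a small archive. The sketch below assumes GNU tar is installed as tar and uses two placeholder members, one containing a space and one containing a tab.

```shell
# Sketch: compare quoting styles on names with a space and a tab (assumes GNU tar).
set -e
work=$(mktemp -d)
cd "$work"
tabbed=$(printf 'a\ttab')         # name with an embedded tab character
: > 'a space'
: > "$tabbed"
tar -cf arch.tar 'a space' "$tabbed"

# C style: double quotes around every name, backslash escapes inside.
c_style=$(tar -tf arch.tar --quoting-style=c)

# Escape style: only control characters and backslashes are escaped.
escape_style=$(tar -tf arch.tar --quoting-style=escape)

# Additionally force the space character to be quoted.
extra=$(tar -tf arch.tar --quoting-style=escape --quote-chars=' ')

printf '%s\n' "$c_style" "$escape_style" "$extra"

cd /
rm -rf "$work"
```

Selecting the style explicitly, as done here, keeps scripts independent of whatever default was chosen when tar was configured.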
6.7 Modifying File and Member Names
Tar archives contain detailed information about files stored in them and full file names are part of that information. When storing a file to an archive, its file name is recorded in it, along with the actual file contents. When restoring from an archive, a file is created on disk with exactly the same name as that stored in the archive. In the majority of cases this is the desired behavior of a file archiver. However, there are some cases when it is not.
First of all, it is often unsafe to extract archive members with absolute file names or those that begin with a ‘../’. GNU tar takes special precautions when extracting such names and provides a special option for handling them, which is described in Absolute File Names.
Secondly, you may wish to extract file names without some leading directory components, or with otherwise modified names. In other cases it is desirable to store files under differing names in the archive.
GNU tar provides several options for these needs.
‘--strip-components=number’
    Strip given number of leading components from file names before extraction.
For example, suppose you have archived the whole ‘/usr’ hierarchy to a tar archive named ‘usr.tar’. Among other files, this archive contains ‘usr/include/stdlib.h’, which you wish to extract to the current working directory. To do so, you type:
$ tar -xf usr.tar --strip=2 usr/include/stdlib.h
The option ‘--strip=2’ instructs tar to strip the two leading components (‘usr/’ and ‘include/’) off the file name.
If you add the ‘--verbose’ (‘-v’) option to the invocation above, you will note that the verbose listing still contains the full file name, with the two removed components still in place. This can be inconvenient, so tar provides a special option for altering this behavior:
‘--show-transformed-names’
    Display file or member names with all requested transformations applied.
For example:
$ tar -xf usr.tar -v --strip=2 usr/include/stdlib.h
usr/include/stdlib.h
$ tar -xf usr.tar -v --strip=2 --show-transformed usr/include/stdlib.h
stdlib.h
Notice that in both cases the file ‘stdlib.h’ is extracted to the current working directory; ‘--show-transformed-names’ affects only the way its name is displayed.
This option is especially useful for verifying whether the invocation will have the desired effect. Thus, before running
$ tar -x --strip=n
it is often advisable to run
$ tar -t -v --show-transformed --strip=n
to make sure the command will produce the intended results.
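The preview-then-extract routine described above can be tried safely in a scratch directory. The following is only a sketch, assuming GNU tar and mktemp are available; the tree it builds (a placeholder ‘usr/include/stdlib.h’) and the ‘out’ directory are made up for the illustration.

```shell
#!/bin/sh
# Sketch: preview the effect of --strip-components before extracting.
# All file names below are placeholders created for this example.
set -eu
work=$(mktemp -d)
cd "$work"
mkdir -p usr/include out
echo '/* placeholder header */' > usr/include/stdlib.h
tar -cf usr.tar usr

# Dry run: list the member with the requested stripping applied to its name.
preview=$(tar -tf usr.tar --show-transformed-names --strip-components=2 \
              usr/include/stdlib.h)
echo "would be extracted as: $preview"

# Real run: extract into ./out with the two leading components removed.
tar -xf usr.tar -C out --strip-components=2 usr/include/stdlib.h
ls out
```

If the previewed name is not the one you expected, adjust the component count and list again before extracting anything.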
In case you need to apply more complex modifications to the file name, GNU tar provides a general-purpose transformation option:
‘--transform=expression’
‘--xform=expression’
Modify file names using supplied expression.
The expression is a sed-like replace expression of the form:
s/regexp/replace/[flags]
where regexp is a regular expression and replace is a replacement for each file name part that matches regexp. Both regexp and replace are described in detail in The ‘s’ Command in GNU sed.
Any delimiter can be used in lieu of ‘/’, the only requirement being that it be used consistently throughout the expression. For example, the following two expressions are equivalent:
s/one/two/
s,one,two,
Changing delimiters is often useful when the regexp contains slashes. For example, it is more convenient to write ‘s,/,-,’ than ‘s/\//-/’.
As in sed, you can give several replace expressions, separated by a semicolon.
Supported flags are:
‘g’
Apply the replacement to all matches to the regexp, not just the first.
‘i’
Use case-insensitive matching.
‘x’
regexp is an extended regular expression (see Extended regular expressions in GNU sed).
‘number’
Only replace the numberth match of the regexp.
Note: the POSIX standard does not specify what should happen when you mix the ‘g’ and number modifiers. GNU tar follows the GNU sed implementation in this regard, so the interaction is defined to be: ignore matches before the numberth, and then match and replace all matches from the numberth on.
In addition, several transformation scope flags are supported, which control which names the transformations apply to. These are:
‘r’
Apply transformation to regular archive members.
‘R’
Do not apply transformation to regular archive members.
‘s’
Apply transformation to symbolic link targets.
‘S’
Do not apply transformation to symbolic link targets.
‘h’
Apply transformation to hard link targets.
‘H’
Do not apply transformation to hard link targets.
Default is ‘rsh’, which means to apply transformations to both archive members and targets of symbolic and hard links.
Default scope flags can also be changed using a ‘flags=’ statement in the transform expression. The flags set this way remain in force until the next ‘flags=’ statement or the end of the expression, whichever occurs first. For example:
--transform 'flags=S;s|^|/usr/local/|'
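Here are several examples of ‘--transform’ usage:
Extract the ‘usr’ directory hierarchy into ‘usr/local’:
$ tar --transform='s,usr/,usr/local/,' -x -f arch.tar
Strip two leading directory components (equivalent to ‘--strip-components=2’):
$ tar --transform='s,/*[^/]*/[^/]*/,,' -x -f arch.tar
Convert each file name to lower case:
$ tar --transform 's/.*/\L&/' -x -f arch.tar
Prepend ‘/prefix/’ to each file name:
$ tar --transform 's,^,/prefix/,' -x -f arch.tar
Archive the ‘/lib’ directory, prepending ‘/usr/local’ to each archive member:
$ tar --transform 's,^,/usr/local/,S' -c -f arch.tar /lib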
Notice the use of flags in the last example. The ‘/lib’ directory often contains many symbolic links to files within it. It may look, for example, like this:
$ ls -l
drwxr-xr-x root/root       0 2008-07-08 16:20 /lib/
-rwxr-xr-x root/root 1250840 2008-05-25 07:44 /lib/libc-2.3.2.so
lrwxrwxrwx root/root       0 2008-06-24 17:12 /lib/libc.so.6 -> libc-2.3.2.so
...
Using the expression ‘s,^,/usr/local/,’ would mean adding ‘/usr/local’ to both regular archive members and to link targets. In this case, ‘/lib/libc.so.6’ would become:
/usr/local/lib/libc.so.6 -> /usr/local/libc-2.3.2.so
This is definitely not desired. To avoid this, the ‘S’ flag is used, which excludes symbolic link targets from filename transformations. The result is:
$ tar --transform 's,^,/usr/local/,S' -c -v -f arch.tar \
     --show-transformed /lib
drwxr-xr-x root/root       0 2008-07-08 16:20 /usr/local/lib/
-rwxr-xr-x root/root 1250840 2008-05-25 07:44 /usr/local/lib/libc-2.3.2.so
lrwxrwxrwx root/root       0 2008-06-24 17:12 /usr/local/lib/libc.so.6 -> libc-2.3.2.so
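The difference made by the ‘S’ scope flag can be reproduced without touching a real ‘/lib’. The sketch below assumes GNU tar; the library names (‘libdemo-1.0.so’, ‘libdemo.so’) and the ‘opt/’ prefix are invented for the example. It archives the same small tree twice and compares how the symbolic link target is stored.

```shell
#!/bin/sh
# Sketch: compare the default scope flags with the 'S' flag.
# File names and the 'opt/' prefix are placeholders for this example.
set -eu
work=$(mktemp -d)
cd "$work"
mkdir lib
echo data > lib/libdemo-1.0.so
ln -s libdemo-1.0.so lib/libdemo.so

# Default scope (rsh): member names and link targets are both rewritten.
tar -cf default.tar --transform='s,^,opt/,' lib
# 'S' excludes symbolic link targets from the transformation.
tar -cf scoped.tar --transform='s,^,opt/,S' lib

# Pick the symbolic link line out of each verbose listing,
# then keep only the text after ' -> ', which is the stored target.
default_link=$(tar -tvf default.tar | grep ' -> ')
scoped_link=$(tar -tvf scoped.tar | grep ' -> ')
default_target=${default_link##* -> }
scoped_target=${scoped_link##* -> }
echo "default flags: link target stored as $default_target"
echo "with S flag:   link target stored as $scoped_target"
```

Only the second archive keeps a link that still resolves after extraction, since its target remains relative to the directory holding the link.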
Unlike ‘--strip-components’, ‘--transform’ can be used
in any GNU tar
operation mode. For example, the following command
adds files to the archive while replacing the leading ‘usr/’
component with ‘var/’:
$ tar -cf arch.tar --transform='s,^usr/,var/,' /
To test the effect of ‘--transform’, we suggest using the ‘--show-transformed-names’ option:
$ tar -cf arch.tar --transform='s,^usr/,var/,' \
     --verbose --show-transformed-names /
If both ‘--strip-components’ and ‘--transform’ are used together, then ‘--transform’ is applied first, and the required number of components is then stripped from its result.
You can use as many ‘--transform’ options in a single command line as you want. The specified expressions will then be applied in order of their appearance. For example, the following two invocations are equivalent:
$ tar -cf arch.tar --transform='s,/usr/var,/var/,' \
     --transform='s,/usr/local,/usr/,'
$ tar -cf arch.tar \
     --transform='s,/usr/var,/var/,;s,/usr/local,/usr/,'
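One way to convince yourself that several ‘--transform’ options behave like a single semicolon-separated expression is to build both archives from the same tree and compare their member lists. This is a sketch under assumptions: GNU tar is installed, and the relative tree (‘usr/var’, ‘usr/local’) and the expressions are chosen only for the demonstration.

```shell
#!/bin/sh
# Sketch: two --transform options versus one combined expression.
# The directory tree and expressions are placeholders for this example.
set -eu
work=$(mktemp -d)
cd "$work"
mkdir -p usr/var usr/local
echo a > usr/var/a.txt
echo b > usr/local/b.txt

# Two separate options, applied in order of appearance.
tar -cf two.tar --transform='s,^usr/var,var,' \
    --transform='s,^usr/local,usr,' usr
# One option holding both expressions, separated by a semicolon.
tar -cf one.tar --transform='s,^usr/var,var,;s,^usr/local,usr,' usr

# Sort the listings so that directory traversal order does not matter.
two=$(tar -tf two.tar | sort)
one=$(tar -tf one.tar | sort)
printf '%s\n' "$two"
```

Each expression carries its closing delimiter; tar rejects an expression that omits it.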
6.8 Operating Only on New Files
The ‘--after-date=date’ (‘--newer=date’, ‘-N date’) option causes tar to work only on files whose data modification or status change times are newer than the date given. If date starts with ‘/’ or ‘.’, it is taken to be a file name; the data modification time of that file is used as the date. If you use this option when creating or appending to an archive, the archive will only include new files. If you use ‘--after-date’ when extracting an archive, tar will only extract files newer than the date you specify.
If you want tar to make the date comparison based only on modification of the file’s data (rather than status changes), then use the ‘--newer-mtime=date’ option.
You may use these options with any operation. Note that these options differ from the ‘--update’ (‘-u’) operation in that they allow you to specify a particular date against which tar can compare when deciding whether or not to archive the files.
‘--after-date=date’
‘--newer=date’
‘-N date’
Only store files newer than date.
Acts on files only if their data modification or status change times are later than date. Use in conjunction with any operation.
If date starts with ‘/’ or ‘.’, it is taken to be a file name; the data modification time of that file is used as the date.
‘--newer-mtime=date’
Acts like ‘--after-date’, but looks only at data modification times.
These options limit tar to operate only on files which have been modified after the date specified. A file’s status is considered to have changed if its contents have been modified, or if its owner, permissions, and so forth, have been changed. (For more information on how to specify a date, see Date input formats; remember that the entire date argument must be quoted if it contains any spaces.)
Gurus would say that ‘--after-date’ tests both the data modification time (mtime, the time the contents of the file were last modified) and the status change time (ctime, the time the file’s status was last changed: owner, permissions, etc.) fields, while ‘--newer-mtime’ tests only the mtime field.
To be precise, ‘--after-date’ checks both mtime and ctime and processes the file if either one is more recent than date, while ‘--newer-mtime’ checks only mtime and disregards ctime. Neither option uses atime (the last time the contents of the file were looked at).
Date specifiers can have embedded spaces. Because of this, you may need to quote date arguments to keep the shell from parsing them as separate arguments. For example, the following command will add to the archive all the files modified less than two days ago:
$ tar -cf foo.tar --newer-mtime '2 days ago' .
When any of these options is used with the option ‘--verbose’ (see section The ‘--verbose’ Option), GNU tar converts the specified date back to a textual form and compares that with the one given with the option. If the two forms differ, tar prints both forms in a message, to help the user check that the right date is being used. For example:
$ tar -c -f archive.tar --after-date='10 days ago' .
tar: Option --after-date: Treating date '10 days ago' as 2006-06-11 13:19:37.232434
Please note: ‘--after-date’ and ‘--newer-mtime’ should not be used for incremental backups. See section Using tar to Perform Incremental Dumps, for the proper way of creating incremental backups.
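The mtime/ctime distinction between the two options is easy to observe on a file whose modification time has been set back. The following sketch assumes GNU tar and a ‘touch’ that accepts free-form dates with ‘-d’ (as GNU coreutils does); the ‘docs’ directory and its files are placeholders.

```shell
#!/bin/sh
# Sketch: --newer-mtime looks at mtime only, --after-date at mtime or ctime.
# The directory and file names are placeholders for this example.
set -eu
work=$(mktemp -d)
cd "$work"
mkdir docs
echo old > docs/old.txt
echo new > docs/new.txt
# Set the mtime back ten days; this also updates the ctime to "now".
touch -d '10 days ago' docs/old.txt

# Only the data modification time is tested, so old.txt is skipped.
tar -cf by-mtime.tar --newer-mtime='2 days ago' docs
# Either timestamp may qualify, and the ctime of old.txt is recent.
tar -cf by-either.tar --after-date='2 days ago' docs

by_mtime=$(tar -tf by-mtime.tar)
by_either=$(tar -tf by-either.tar)
printf 'newer-mtime:\n%s\n' "$by_mtime"
printf 'after-date:\n%s\n' "$by_either"
```

The directory entry itself appears in both archives, because directories are stored regardless of their timestamps.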
6.9 Descending into Directories
Usually, tar will recursively explore all directories (either those given on the command line or through the ‘--files-from’ option) for the various files they contain. However, you may not always want tar to act this way.
The ‘--no-recursion’ option inhibits tar’s recursive descent into specified directories. If you specify ‘--no-recursion’, you can use the find utility (see find in GNU Find Manual) for hunting through levels of directories to construct a list of file names which you could then pass to tar. find allows you to be more selective when choosing which files to archive; see Reading Names from a File, for more information on using find with tar.
‘--no-recursion’
Prevents tar from recursively descending directories.
‘--recursion’
Requires tar to recursively descend directories. This is the default.
When you use ‘--no-recursion’, GNU tar grabs directory entries themselves, but does not descend into them recursively. Many people use find for locating files they want to back up, and since tar usually recursively descends into directories, they have to use the ‘-not -type d’ test in their find invocation (see Type test in Finding Files), as they usually do not want all the files in a directory. They then use the ‘--files-from’ option to archive the files located via find.
The problem when restoring files archived in this manner is that the directories themselves are not in the archive, so the ‘--same-permissions’ (‘--preserve-permissions’, ‘-p’) option does not affect them, although users might really like it to. Specifying ‘--no-recursion’ is a way to tell tar to grab only the directory entries given to it, adding no new files on its own. To summarize, if you use find to create a list of files to be stored in an archive, use it as follows:
$ find dir tests | \
  tar -cf archive --no-recursion -T -
The ‘--no-recursion’ option also applies when extracting: it
causes tar
to extract only the matched directory entries, not
the files under those directories.
The ‘--no-recursion’ option also affects how globbing patterns are interpreted (see section Controlling Pattern-Matching).
The ‘--no-recursion’ and ‘--recursion’ options apply to later options and operands, and can be overridden by later occurrences of ‘--no-recursion’ and ‘--recursion’. For example:
$ tar -cf jams.tar --no-recursion grape --recursion grape/concord
creates an archive with one entry for ‘grape’, and the recursive contents of ‘grape/concord’, but no entries under ‘grape’ other than ‘grape/concord’.
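Both uses of ‘--no-recursion’ discussed in this section can be checked on a throwaway tree. The sketch assumes GNU tar and find; the ‘grape’ subdirectories and file names, and the ‘-maxdepth 1’ test, are illustrative choices only.

```shell
#!/bin/sh
# Sketch: find-driven archiving and position-sensitive recursion control.
# The tree below is a placeholder created for this example.
set -eu
work=$(mktemp -d)
cd "$work"
mkdir -p grape/concord grape/muscat
echo 1 > grape/concord/jam.txt
echo 2 > grape/muscat/jelly.txt
echo 3 > grape/notes.txt

# find selects 'grape' and its immediate children; tar stores exactly
# those names, directory entries included, without descending further.
find grape -maxdepth 1 | tar -cf picked.tar --no-recursion -T -
picked=$(tar -tf picked.tar)

# --no-recursion covers 'grape'; --recursion re-enables descent for
# the operand that follows it.
tar -cf jams.tar --no-recursion grape --recursion grape/concord
jams=$(tar -tf jams.tar)

printf 'picked.tar:\n%s\n' "$picked"
printf 'jams.tar:\n%s\n' "$jams"
```

Because the directory entries are present in ‘picked.tar’, options such as ‘--same-permissions’ can act on them at extraction time.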
6.10 Crossing File System Boundaries
tar will normally automatically cross file system boundaries in order to archive files which are part of a directory tree. You can change this behavior by running tar and specifying ‘--one-file-system’. This option only affects files that are archived because they are in a directory that is being archived; tar will still archive files explicitly named on the command line or through ‘--files-from’, regardless of where they reside.
‘--one-file-system’
Prevents tar from crossing file system boundaries when archiving. Use in conjunction with any write operation.
The ‘--one-file-system’ option causes tar to modify its normal behavior in archiving the contents of directories. If a file in a directory is not on the same file system as the directory itself, then tar will not archive that file. If the file is a directory itself, tar will not archive anything beneath it; in other words, tar will not cross mount points.
This option is useful for making full or incremental archival backups of a file system. If this option is used in conjunction with ‘--verbose’ (‘-v’), files that are excluded are mentioned by name on the standard error.
6.10.1 Changing the Working Directory
6.10.2 Absolute File Names
6.10.1 Changing the Working Directory
To change the working directory in the middle of a list of file names, either on the command line or in a file specified using ‘--files-from’ (‘-T’), use ‘--directory’ (‘-C’). This will change the working directory to the specified directory after that point in the list.
‘--directory=directory’
‘-C directory’
Changes the working directory in the middle of a command line.
For example,
$ tar -c -f jams.tar grape prune -C food cherry
will place the files ‘grape’ and ‘prune’ from the current directory into the archive ‘jams.tar’, followed by the file ‘cherry’ from the directory ‘food’. This option is especially useful when you have several widely separated files that you want to store in the same archive.
Note that the file ‘cherry’ is recorded in the archive under the precise name ‘cherry’, not ‘food/cherry’. Thus, the archive will contain three files that all appear to have come from the same directory; if the archive is extracted with plain ‘tar --extract’, all three files will be written in the current directory.
Contrast this with the command,
$ tar -c -f jams.tar grape prune -C food red/cherry
which records the third file in the archive under the name ‘red/cherry’ so that, if the archive is extracted using ‘tar --extract’, the third file will be written in a subdirectory named ‘red’.
You can use the ‘--directory’ option to make the archive independent of the original name of the directory holding the files. The following command places the files ‘/etc/passwd’, ‘/etc/hosts’, and ‘/lib/libc.a’ into the archive ‘foo.tar’:
$ tar -c -f foo.tar -C /etc passwd hosts -C /lib libc.a
However, the names of the archive members will be exactly what they were on the command line: ‘passwd’, ‘hosts’, and ‘libc.a’. They will not appear to be related by file name to the original directories where those files were located.
Note that ‘--directory’ options are interpreted consecutively. If ‘--directory’ specifies a relative file name, it is interpreted relative to the then current directory, which might not be the same as the original current working directory of tar, due to a previous ‘--directory’ option.
When using ‘--files-from’ (see section Reading Names from a File), you can put various tar options (including ‘-C’) in the file list. Notice, however, that in this case the option and its argument may not be separated by whitespace. If you use a short option, its argument must either follow the option letter immediately, without any intervening whitespace, or occupy the next line. If you use a long option, separate its argument from the option name with an equals sign.
For instance, the file list for the above example will be:
-C/etc
passwd
hosts
--directory=/lib
libc.a
To use it, you would invoke tar as follows:
$ tar -c -f foo.tar --files-from list
The interpretation of options in file lists is disabled by the ‘--verbatim-files-from’ and ‘--null’ options.
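A file list with embedded ‘-C’ and ‘--directory’ options can be exercised against stand-in directories rather than the real ‘/etc’ and ‘/lib’. This sketch assumes a GNU tar recent enough to accept options in file lists; the scratch ‘etc’ and ‘lib’ directories and their contents are placeholders.

```shell
#!/bin/sh
# Sketch: options inside a --files-from list.
# The scratch directories stand in for /etc and /lib.
set -eu
work=$(mktemp -d)
cd "$work"
mkdir etc lib
echo root > etc/passwd
echo localhost > etc/hosts
echo archive > lib/libc.a

# One entry per line. The short option has its argument attached;
# the long option uses an equals sign.
cat > list <<EOF
-C$work/etc
passwd
hosts
--directory=$work/lib
libc.a
EOF

tar -c -f foo.tar --files-from list
members=$(tar -tf foo.tar)
printf '%s\n' "$members"
```

The member names carry no trace of the directories they came from, which is the point of using ‘--directory’ this way.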
6.10.2 Absolute File Names
By default, GNU tar drops a leading ‘/’ on input or output, and complains about file names containing a ‘..’ component. There is an option that turns off this behavior:
‘--absolute-names’
‘-P’
Do not strip leading slashes from file names, and permit file names containing a ‘..’ file name component.
When tar
extracts archive members from an archive, it strips any
leading slashes (‘/’) from the member name. This causes absolute
member names in the archive to be treated as relative file names. This
allows you to have such members extracted wherever you want, instead of
being restricted to extracting the member in the exact directory named
in the archive. For example, if the archive member has the name
‘/etc/passwd’, tar
will extract it as if the name were
really ‘etc/passwd’.
File names containing ‘..’ can cause problems when extracting, so tar normally warns you about such files when creating an archive, and rejects attempts to extract such files.
Other tar programs do not do this. As a result, if you create an archive whose member names start with a slash, they will be difficult for other people with a non-GNU tar program to use. Therefore, GNU tar also strips leading slashes from member names when putting members into the archive. For example, if you ask tar to add the file ‘/bin/ls’ to an archive, it will do so, but the member name will be ‘bin/ls’.
Symbolic links containing ‘..’ or leading ‘/’ can also cause
problems when extracting, so tar
normally extracts them last;
it may create empty files as placeholders during extraction.
If you use the ‘--absolute-names’ (‘-P’) option,
tar
will do none of these transformations.
To archive or extract files relative to the root directory, specify the ‘--absolute-names’ (‘-P’) option.
Normally, tar
acts on files relative to the working
directory—ignoring superior directory names when archiving, and
ignoring leading slashes when extracting.
When you specify ‘--absolute-names’ (‘-P’),
tar
stores file names including all superior directory
names, and preserves leading slashes. If you only invoked
tar
from the root directory you would never need the
‘--absolute-names’ option, but using this option
may be more convenient than switching to root.
‘--absolute-names’
‘-P’
Preserves full file names (including superior directory names) when archiving and extracting files.
tar prints out a message about removing the ‘/’ from file names. This message appears once per GNU tar invocation. It represents something which ought to be told; ignoring what it means can cause very serious surprises later.
Some people, nevertheless, do not want to see this message. Wanting to play really dangerously, one may of course redirect tar’s standard error to the sink. For example, under sh:
$ tar -c -f archive.tar /home 2> /dev/null
Another solution, both nicer and simpler, would be to change to the ‘/’ directory first, and then avoid absolute notation. For example:
$ tar -c -f archive.tar -C / home
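The three ways of naming an absolute file can be compared side by side by listing the resulting archives. The sketch below assumes GNU tar; the scratch directory and ‘file.txt’ are placeholders, and the diagnostic is captured into a variable rather than discarded so that it can still be inspected.

```shell
#!/bin/sh
# Sketch: how an absolute file name is stored by default, with -P,
# and with -C. The scratch file is a placeholder for this example.
set -eu
work=$(mktemp -d)
cd "$work"
echo hello > file.txt

# Default: the leading '/' is dropped; keep the diagnostic for display.
msg=$(tar -cf stripped.tar "$work/file.txt" 2>&1)
# -P stores the absolute name unchanged.
tar -cPf absolute.tar "$work/file.txt"
# -C avoids absolute notation altogether.
tar -cf relative.tar -C "$work" file.txt

stripped=$(tar -tf stripped.tar)
absolute=$(tar -tPf absolute.tar)
relative=$(tar -tf relative.tar)
echo "diagnostic: $msg"
echo "default:    $stripped"
echo "with -P:    $absolute"
echo "with -C:    $relative"
```

Only the ‘-P’ archive would write back to the original absolute location on extraction, which is why it deserves the most care.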
See section Integrity, for some of the security-related implications of using this option.
This document was generated on August 23, 2023 using texi2html 5.0.