The previous chapter showed how to use ‘--extract’ to extract
an archive into the file system. Various options cause tar to
extract more information than just file contents, such as the owner,
the permissions, the modification date, and so forth. This section
presents options to be used with ‘--extract’ when certain special
considerations arise. You may review the information presented in
How to Extract Members from an Archive for more basic information about the
‘--extract’ operation.
4.4.1 Options to Help Read Archives
4.4.2 Changing How tar Writes Files
4.4.3 Coping with Scarce Resources
Normally, tar will request data in full record increments from
an archive storage device. If the device cannot return a full record,
tar will report an error. However, some devices do not always
return full records, or do not require the last record of an archive to
be padded out to the next record boundary. To keep reading until you
obtain a full record, or to accept an incomplete record if it contains
an end-of-archive marker, specify the ‘--read-full-records’ (‘-B’) option
in conjunction with the ‘--extract’ or ‘--list’ operations.
See section Blocking.
The ‘--read-full-records’ (‘-B’) option is turned on by default when
tar reads an archive from standard input, or from a remote
machine. This is because on BSD Unix systems, attempting to read a
pipe returns however much happens to be in the pipe, even if it is
less than was requested. If this option were not enabled, tar
would fail as soon as it read an incomplete record from the pipe.
If you’re not sure of the blocking factor of an archive, you can read the archive by specifying ‘--read-full-records’ (‘-B’) together with ‘--blocking-factor=512-size’ (‘-b 512-size’), where 512-size is a blocking factor larger than the one the archive uses. This saves you from having to determine the archive’s actual blocking factor. See section The Blocking Factor of an Archive.
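The technique above can be sketched as follows. This is only an illustration, assuming GNU tar is installed as tar; the names note.txt and small.tar are made up. The archive is written with a blocking factor of 2 and then read from a pipe with a deliberately larger one:

```shell
set -eu
work=$(mktemp -d)
cd "$work"
printf 'hello\n' > note.txt

# Write an archive whose records are only 2 x 512 bytes long.
tar -c -b 2 -f small.tar note.txt

# Read it from a pipe while asking for a much larger blocking factor (64).
# -B lets tar accept the short record instead of reporting an error.
listing=$(cat small.tar | tar -t -B -b 64 -f -)
echo "$listing"

cd / && rm -rf "$work"
```

Since the archive arrives on standard input here, ‘-B’ would be enabled by default anyway; it is spelled out only to make the intent explicit.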
Reading Full Records
Ignoring Blocks of Zeros
‘--read-full-records’, ‘-B’
Use in conjunction with ‘--extract’ (‘--get’, ‘-x’) to read an archive which contains incomplete records, or one which has a blocking factor less than the one specified.
Normally, tar stops reading when it encounters a block of zeros
between file entries (which usually indicates the end of the archive).
‘--ignore-zeros’ (‘-i’) allows tar to
completely read an archive which contains a block of zeros before the
end (i.e., a damaged archive, or one that was created by concatenating
several archives together). This option also suppresses warnings
about missing or incomplete zero blocks at the end of the archive.
These warnings can be turned back on, if need be, using the
‘--warning=alone-zero-block --warning=missing-zero-blocks’
options (see section Controlling Warning Messages).
The ‘--ignore-zeros’ (‘-i’) option is turned off by default because many
versions of tar write garbage after the end-of-archive entry,
since that part of the media is never supposed to be read. GNU tar
does not write after the end of an archive, but seeks to
maintain compatibility among archiving utilities.
‘--ignore-zeros’, ‘-i’
Ignore blocks of zeros (i.e., end-of-archive entries) which may be encountered while reading an archive. Use in conjunction with ‘--extract’ or ‘--list’.
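A short sketch of the concatenation case, assuming GNU tar is installed as tar; the file and archive names are illustrative only:

```shell
set -eu
work=$(mktemp -d)
cd "$work"
printf 'one\n' > a.txt
printf 'two\n' > b.txt
tar -cf a.tar a.txt
tar -cf b.tar b.txt

# Plain concatenation leaves a.tar's end-of-archive zero blocks in the middle.
cat a.tar b.tar > both.tar

# Without -i, listing stops at the first end-of-archive marker.
plain=$(tar -tf both.tar)
# With -i, tar reads past the zero blocks and finds the second archive too.
full=$(tar -tif both.tar)

printf 'without -i:\n%s\n' "$plain"
printf 'with -i:\n%s\n' "$full"

cd / && rm -rf "$work"
```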
4.4.2 Changing How tar Writes Files
When extracting files, if tar discovers that the extracted
file already exists, it normally replaces the file by removing it before
extracting it, to prevent confusion in the presence of hard or symbolic
links. (If the existing file is a symbolic link, it is removed, not
followed.) However, if a directory cannot be removed because it is
nonempty, tar normally overwrites its metadata (ownership,
permission, etc.). The ‘--overwrite-dir’ option enables this
default behavior. To be more cautious and preserve the metadata of
such a directory, use the ‘--no-overwrite-dir’ option.
To be even more cautious and prevent existing files from being replaced, use
the ‘--keep-old-files’ (‘-k’) option. It causes tar
to refuse to replace or update a file that already
exists, i.e., a file with the same name as an archive member prevents
extraction of that archive member. Instead, it reports an error. For
example:
$ ls
blues
$ tar -x -k -f archive.tar
tar: blues: Cannot open: File exists
tar: Exiting with failure status due to previous errors
If you wish to preserve old files untouched, but don’t want
tar to treat them as errors, use the
‘--skip-old-files’ option. This option causes tar to
silently skip extracting over existing files.
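The three behaviors described so far can be compared side by side. The following is a sketch only, assuming a GNU tar recent enough to provide ‘--skip-old-files’ is installed as tar; the file name blues follows the example above:

```shell
set -u
work=$(mktemp -d)
cd "$work"
printf 'archived\n' > blues
tar -cf archive.tar blues
printf 'local edit\n' > blues      # the copy on disk now differs

# -k refuses to replace the existing file and makes tar exit with an error.
keep_status=0
tar -x -k -f archive.tar 2>/dev/null || keep_status=$?
after_keep=$(cat blues)

# --skip-old-files also leaves the file alone, but tar exits successfully.
skip_status=0
tar -x --skip-old-files -f archive.tar || skip_status=$?
after_skip=$(cat blues)

# Default behavior: the existing file is removed and replaced.
tar -x -f archive.tar
after_default=$(cat blues)

echo "-k:               status=$keep_status content='$after_keep'"
echo "--skip-old-files: status=$skip_status content='$after_skip'"
echo "default:          content='$after_default'"

cd / && rm -rf "$work"
```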
To be more aggressive about altering existing files, use the
‘--overwrite’ option. It causes tar to overwrite
existing files and to follow existing symbolic links when extracting.
Some people argue that GNU tar should not hesitate
to overwrite files with other files when extracting. When extracting
a tar archive, they expect to see a faithful copy of the
state of the file system when the archive was created. It is debatable
whether this would always be proper behavior. For example, suppose one
has an archive in which ‘usr/local’ is a link to
‘usr/local2’. Since then, maybe the site removed the link and
renamed the whole hierarchy from ‘/usr/local2’ to
‘/usr/local’. Such things happen all the time. It would hardly
be welcome for GNU tar to remove the
whole hierarchy just to make room for the link to be reinstated
(unless it also simultaneously restores the full
‘/usr/local2’, of course!). GNU tar is indeed
able to remove a whole hierarchy to reestablish a symbolic link, for
example, but only if ‘--recursive-unlink’ is specified
to allow this behavior. In any case, single files are silently
removed.
Finally, the ‘--unlink-first’ (‘-U’) option can improve performance in
some cases by causing tar to remove files unconditionally
before extracting them.
‘--overwrite’
Overwrite existing files and directory metadata when extracting files from an archive.
This causes tar to write extracted files into the file system without
regard to the files already on the system; i.e., files with the same
names as archive members are overwritten when the archive is extracted.
It also causes tar to extract the ownership, permissions,
and time stamps onto any preexisting files or directories.
If the name of a corresponding file is a symbolic link, the file
pointed to by the symbolic link will be overwritten instead of the
symbolic link itself (if this is possible). Moreover, special devices,
empty directories and even symbolic links are automatically removed if
they are in the way of extraction.
Be careful when using the ‘--overwrite’ option, particularly when combined with the ‘--absolute-names’ (‘-P’) option, as this combination can change the contents, ownership or permissions of any file on your system. Also, many systems do not take kindly to overwriting files that are currently being executed.
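The difference between the default remove-then-extract behavior and ‘--overwrite’ is easy to observe with a hard link. A minimal sketch, assuming GNU tar is installed as tar; the names blues and alias are illustrative only:

```shell
set -eu
work=$(mktemp -d)
cd "$work"
mkdir src default_dir overwrite_dir
printf 'from archive\n' > src/blues
tar -C src -cf archive.tar blues

# In each target directory, "alias" is a hard link to an existing "blues".
for d in default_dir overwrite_dir; do
    printf 'on disk\n' > "$d/blues"
    ln "$d/blues" "$d/alias"
done

# Default: the old blues is removed first, so alias keeps the old contents.
tar -x -f archive.tar -C default_dir
# --overwrite: blues is rewritten in place, so alias changes as well.
tar -x --overwrite -f archive.tar -C overwrite_dir

default_alias=$(cat default_dir/alias)
overwrite_alias=$(cat overwrite_dir/alias)
echo "default:     alias contains '$default_alias'"
echo "--overwrite: alias contains '$overwrite_alias'"

cd / && rm -rf "$work"
```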
‘--overwrite-dir’
Overwrite the metadata of directories when extracting files from an archive, but remove other files before extracting.
GNU tar provides two options to control what it does when
it is about to extract a file which already exists on disk.
‘--keep-old-files’, ‘-k’
Do not replace existing files from archive. When such a file is
encountered, tar issues an error message. Upon end of
extraction, tar exits with code 2 (see exit status).
‘--skip-old-files’
Do not replace existing files from archive, but do not treat that
as an error. Such files are silently skipped and do not affect
the tar exit status.
Additional verbosity can be obtained using ‘--warning=existing-file’ together with ‘--skip-old-files’ (see section Controlling Warning Messages).
‘--keep-newer-files’
Do not replace existing files that are newer than their archive copies. This option is meaningless with ‘--list’ (‘-t’).
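A brief sketch of this option, assuming GNU tar is installed as tar; report.txt is a made-up name, and the archived copy is given an artificially old modification time so that the copy on disk is clearly newer:

```shell
set -eu
work=$(mktemp -d)
cd "$work"
printf 'old archived text\n' > report.txt
# Give the archived copy an old modification time (1 January 2000).
touch -t 200001010000 report.txt
tar -cf archive.tar report.txt

# The copy on disk is then edited, so it is newer than the archive member.
printf 'fresh local text\n' > report.txt

# tar notes the skipped member on stderr and leaves the newer file in place.
tar -x --keep-newer-files -f archive.tar 2>/dev/null
kept=$(cat report.txt)
echo "$kept"

cd / && rm -rf "$work"
```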
‘--unlink-first’, ‘-U’
Remove files before extracting over them.
This can make tar run a bit faster if you know in advance
that the extracted files all need to be removed. Normally this option
slows tar down slightly, so it is disabled by default.
‘--recursive-unlink’
When this option is specified, try removing files and directory hierarchies before extracting over them. This is a dangerous option!
If you specify the ‘--recursive-unlink’ option,
tar removes anything that keeps you from extracting a file
as far as current permissions will allow it. This could include removal
of the contents of a full directory hierarchy.
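The following sketch shows the effect inside a throwaway directory, where nothing of value can be lost. It assumes GNU tar is installed as tar; the name cache is illustrative, and ‘--unlink-first’ is given alongside ‘--recursive-unlink’ here as a precaution so that removal is attempted before extraction:

```shell
set -u
work=$(mktemp -d)
cd "$work"
mkdir src dest
printf 'a regular file\n' > src/cache
tar -C src -cf archive.tar cache

# On disk, "cache" is a non-empty directory that is in the way.
mkdir dest/cache
printf 'stale\n' > dest/cache/entry

# By default tar cannot remove the non-empty directory, so extraction fails.
default_status=0
tar -x -f archive.tar -C dest 2>/dev/null || default_status=$?

# With --recursive-unlink the whole directory is deleted and the
# archived regular file takes its place.
tar -x --unlink-first --recursive-unlink -f archive.tar -C dest
if [ -f dest/cache ]; then result=file; else result=directory; fi

echo "default exit status: $default_status"
echo "cache is now a $result"

cd / && rm -rf "$work"
```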
Normally, tar sets the data modification times of extracted
files to the corresponding times recorded for the files in the archive, but
limits the permissions of extracted files by the current umask
setting.
To set the data modification times of extracted files to the time when the files were extracted, use the ‘--touch’ (‘-m’) option in conjunction with ‘--extract’ (‘--get’, ‘-x’).
‘--touch’, ‘-m’
Sets the data modification time of extracted archive members to the time they were extracted, not the time recorded for them in the archive. Use in conjunction with ‘--extract’ (‘--get’, ‘-x’).
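As an illustration, the sketch below extracts the same member with and without ‘-m’ and compares the resulting modification times against a marker file. It assumes GNU tar is installed as tar and a shell whose test command supports ‘-nt’; all file names are made up:

```shell
set -eu
work=$(mktemp -d)
cd "$work"
mkdir as_archived touched
printf 'data\n' > log.txt
touch -t 200001010000 log.txt      # archived mtime: 1 January 2000
touch -t 201001010000 reference    # marker file: 1 January 2010
tar -cf archive.tar log.txt

tar -x -f archive.tar -C as_archived     # restores the year-2000 mtime
tar -x -m -f archive.tar -C touched      # stamps the file with the current time

if [ reference -nt as_archived/log.txt ]; then plain=old; else plain=new; fi
if [ touched/log.txt -nt reference ]; then with_m=new; else with_m=old; fi
echo "default mtime: $plain, with -m: $with_m"

cd / && rm -rf "$work"
```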
To set the modes (access permissions) of extracted files to those recorded for those files in the archive, use ‘--same-permissions’ in conjunction with the ‘--extract’ (‘--get’, ‘-x’) operation.
‘--same-permissions’, ‘--preserve-permissions’, ‘-p’
Set modes of extracted archive members to those recorded in the archive, instead of current umask settings. Use in conjunction with ‘--extract’ (‘--get’, ‘-x’).
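A small sketch of the effect of the umask, assuming GNU tar is installed as tar; shared.txt is a made-up name. Because tar run by the superuser behaves as if ‘-p’ were given, the umask-limited case is requested explicitly with ‘--no-same-permissions’ so that the comparison does not depend on who runs it:

```shell
set -eu
work=$(mktemp -d)
cd "$work"
mkdir masked exact
printf 'data\n' > shared.txt
chmod 644 shared.txt
tar -cf archive.tar shared.txt

umask 077    # a restrictive umask for the extracting user

# Archived mode 644 is limited by the umask, giving 600.
tar -x --no-same-permissions -f archive.tar -C masked
# With -p the archived mode 644 is restored regardless of the umask.
tar -x -p -f archive.tar -C exact

masked_mode=$(ls -l masked/shared.txt | cut -c1-10)
exact_mode=$(ls -l exact/shared.txt | cut -c1-10)
echo "umask applied: $masked_mode"
echo "with -p:       $exact_mode"

cd / && rm -rf "$work"
```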
After successfully extracting a file member, GNU tar normally
restores its permissions and modification times, as described in the
previous sections. This cannot be done for directories, because
after extracting a directory tar will almost certainly
extract files into that directory and this will cause the directory
modification time to be updated. Moreover, restoring the directory’s
permissions may prevent files from being created within it. Thus, restoring
directory permissions and modification times must be delayed at least
until all files have been extracted into that directory. GNU tar
restores directories using the following approach.
The extracted directories are created with the mode specified in the
archive, as modified by the umask of the user, which gives sufficient
permissions to allow file creation. The meta-information about the
directory is recorded in a temporary list of directories. When
preparing to extract the next archive member, GNU tar checks if the
directory prefix of this file contains the remembered directory. If
it does not, the program assumes that all files have been extracted
into that directory, restores its modification time and permissions
and removes its entry from the internal list. This approach allows
directory meta-information to be restored correctly in the majority of
cases, while keeping memory requirements sufficiently small. It is
based on the fact that most tar archives use the predefined
order of members: first the directory, then all the files and
subdirectories in that directory.
However, this is not always true. The most important exceptions are
incremental archives (see section Using tar to Perform Incremental Dumps).
The member order in
an incremental archive is reversed: first all directory members are
stored, followed by other (non-directory) members. So, when extracting
from incremental archives, GNU tar alters the above procedure. It
remembers all restored directories, and restores their meta-data
only after the entire archive has been processed. Note that you do
not need to specify any special options for that, as GNU tar
automatically detects archives in incremental format.
There may be cases when such processing is required for normal archives too. Consider the following example:
$ tar --no-recursion -cvf archive \
    foo foo/file1 bar bar/file foo/file2
foo/
foo/file1
bar/
bar/file
foo/file2
During normal operation, after encountering ‘bar’
GNU tar will assume that all files from the directory ‘foo’
were already extracted and will therefore restore its timestamp and
permission bits. However, after extracting ‘foo/file2’ the
directory timestamp will be changed again.
To correctly restore directory meta-information in such cases, use the ‘--delay-directory-restore’ command line option:
‘--delay-directory-restore’
Delays restoring of the modification times and permissions of extracted directories until the end of extraction. This way, correct meta-information is restored even if the archive has unusual member ordering.
‘--no-delay-directory-restore’
Cancel the effect of the previous ‘--delay-directory-restore’.
Use this option if you have used ‘--delay-directory-restore’ in
the TAR_OPTIONS variable (see TAR_OPTIONS) and wish to
temporarily disable it.
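The unusual member ordering discussed above can be reproduced in a scratch directory. The sketch below assumes GNU tar is installed as tar and a shell whose test command supports ‘-nt’; it gives ‘foo’ an artificially old modification time and checks that the time survives extraction when ‘--delay-directory-restore’ is used:

```shell
set -eu
work=$(mktemp -d)
cd "$work"
mkdir src out src/foo src/bar
printf '1\n' > src/foo/file1
printf '2\n' > src/foo/file2
printf 'b\n' > src/bar/file
touch -t 200001010000 src/foo      # directory mtime to be preserved
touch -t 201001010000 reference    # marker file: 1 January 2010

# Store the members in the unusual order: foo/file2 comes after bar.
tar -C src --no-recursion -cf archive.tar \
    foo foo/file1 bar bar/file foo/file2

# Directory metadata is restored only after every member is extracted.
tar -x --delay-directory-restore -f archive.tar -C out

if [ reference -nt out/foo ]; then foo_time=restored; else foo_time=clobbered; fi
echo "foo modification time: $foo_time"

cd / && rm -rf "$work"
```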
To write the extracted files to the standard output, instead of creating the files on the file system, use ‘--to-stdout’ (‘-O’) in conjunction with ‘--extract’ (‘--get’, ‘-x’). This option is useful if you are extracting files to send them through a pipe, and do not need to preserve them in the file system. If you extract multiple members, they appear on standard output concatenated, in the order they are found in the archive.
‘--to-stdout’, ‘-O’
Writes files to the standard output. Use only in conjunction with
‘--extract’ (‘--get’, ‘-x’). When this option is
used, instead of creating the files specified, tar writes
the contents of the files extracted to its standard output. This may
be useful if you are only extracting the files in order to send them
through a pipe. This option is meaningless with ‘--list’
(‘-t’).
This can be useful, for example, if you have a tar archive containing a big file and don’t want to store the file on disk before processing it. You can use a command like this:
tar -xOzf foo.tgz bigfile | process
or even like this if you want to process the concatenation of the files:
tar -xOzf foo.tgz bigfile1 bigfile2 | process
However, ‘--to-command’ may be more convenient for use with multiple files. See the next section.
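A self-contained sketch of extraction to standard output, assuming GNU tar is installed as tar; the file names are illustrative, and an uncompressed archive is used so that no compression program is needed. The members are requested in reverse order to show that the output still follows the order in the archive:

```shell
set -eu
work=$(mktemp -d)
cd "$work"
mkdir src
printf 'one\n' > src/first.txt
printf 'two\n' > src/second.txt
tar -C src -cf archive.tar first.txt second.txt

# Names are given in reverse; output follows the order in the archive.
joined=$(tar -xO -f archive.tar second.txt first.txt)
echo "$joined"

# Nothing is created in the file system.
if [ -e first.txt ] || [ -e second.txt ]; then created=yes; else created=no; fi
echo "files created: $created"

cd / && rm -rf "$work"
```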
You can instruct tar to send the contents of each extracted
file to the standard input of an external program:
‘--to-command=command’
Extract files and pipe their contents to the standard input of
command. When this option is used, instead of creating the
files specified, tar invokes command and pipes the
contents of each file to its standard input. The command may
contain command line arguments (see Running External Commands,
for more detail).
Note that command is executed once for each regular file extracted. Non-regular files (directories, etc.) are ignored when this option is used.
The command can obtain the information about the file it processes from the following environment variables:
TAR_FILETYPE
Type of the file. It is a single letter with the following meaning:
f   Regular file
d   Directory
l   Symbolic link
h   Hard link
b   Block device
c   Character device
Currently only regular files are supported.
TAR_MODE
File mode, an octal number.
TAR_FILENAME
The name of the file.
TAR_REALNAME
Name of the file as stored in the archive.
TAR_UNAME
Name of the file owner.
TAR_GNAME
Name of the file owner group.
TAR_ATIME
Time of last access. It is a decimal number, representing seconds since the Epoch. If the archive provides times with nanosecond precision, the nanoseconds are appended to the timestamp after a decimal point.
TAR_MTIME
Time of last modification.
TAR_CTIME
Time of last status change.
TAR_SIZE
Size of the file.
TAR_UID
UID of the file owner.
TAR_GID
GID of the file owner.
Additionally, the following variables contain information about tar mode and the archive being processed:
TAR_VERSION
GNU tar version number.
TAR_ARCHIVE
The name of the archive tar is processing.
TAR_BLOCKING_FACTOR
Current blocking factor (see section Blocking).
TAR_VOLUME
Ordinal number of the volume tar is processing.
TAR_FORMAT
Format of the archive being processed. See section Controlling the Archive Format, for a complete list of archive format names.
These variables are defined prior to executing the command, so you can pass them as arguments, if you prefer. For example, if the command proc takes the member name and size as its arguments, then you could do:
$ tar -x -f archive.tar \
    --to-command='proc $TAR_FILENAME $TAR_SIZE'
Notice the single quotes, which prevent variable names from being
expanded by the shell when invoking tar.
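Since proc above is only a placeholder, here is a runnable sketch that uses an inline shell command instead. It assumes GNU tar is installed as tar; the file names are made up. The command discards the member contents arriving on its standard input and reports three of the variables listed above:

```shell
set -eu
work=$(mktemp -d)
cd "$work"
printf 'one\n' > a.txt
printf 'three\n' > b.txt
tar -cf archive.tar a.txt b.txt

# The single quotes keep the $TAR_* names away from the invoking shell;
# they are expanded by the command that tar runs for each member.
report=$(tar -x -f archive.tar \
    --to-command='cat >/dev/null; printf "%s %s %s\n" "$TAR_FILETYPE" "$TAR_FILENAME" "$TAR_SIZE"')
echo "$report"

cd / && rm -rf "$work"
```

The command runs once per regular file, so the report has one line for each member.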
If command exits with a nonzero status, tar will print
an error message similar to the following:
tar: 2345: Child returned status 1
Here, ‘2345’ is the PID of the finished process.
If this behavior is not wanted, use ‘--ignore-command-error’:
‘--ignore-command-error’
Ignore exit codes of subprocesses. Note that if the program is terminated by a signal or otherwise terminates abnormally, the error message will be printed even if this option is used.
‘--no-ignore-command-error’
Cancel the effect of any previous ‘--ignore-command-error’
option. This option is useful if you have set
‘--ignore-command-error’ in TAR_OPTIONS
(see TAR_OPTIONS) and wish to temporarily cancel it.
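A sketch contrasting the two cases, assuming GNU tar is installed as tar; the command given to ‘--to-command’ is a made-up one that always exits with status 1:

```shell
set -u
work=$(mktemp -d)
cd "$work"
printf 'one\n' > a.txt
tar -cf archive.tar a.txt

# The command consumes its input and then fails with status 1.
plain_err=$(tar -x -f archive.tar \
    --to-command='cat >/dev/null; exit 1' 2>&1)

# The same extraction, with the exit code of the command ignored.
ignored_status=0
ignored_err=$(tar -x -f archive.tar --ignore-command-error \
    --to-command='cat >/dev/null; exit 1' 2>&1) || ignored_status=$?

echo "without the option: $plain_err"
echo "with the option: status=$ignored_status message='$ignored_err'"

cd / && rm -rf "$work"
```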
‘--remove-files’
Remove files after adding them to the archive.
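A minimal sketch, assuming GNU tar is installed as tar; draft.txt is a made-up name and the whole thing runs in a scratch directory:

```shell
set -eu
work=$(mktemp -d)
cd "$work"
printf 'data\n' > draft.txt

# Archive the file and delete the original once it has been stored.
tar -c --remove-files -f archive.tar draft.txt

if [ -e draft.txt ]; then on_disk=yes; else on_disk=no; fi
in_archive=$(tar -tf archive.tar)
echo "still on disk: $on_disk, in archive: $in_archive"

cd / && rm -rf "$work"
```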
Starting File
Same Order
‘--starting-file=name’, ‘-K name’
Starts an operation in the middle of an archive. Use in conjunction with ‘--extract’ (‘--get’, ‘-x’) or ‘--list’ (‘-t’).
If a previous attempt to extract files failed due to lack of disk
space, you can use ‘--starting-file=name’ (‘-K name’)
to start extracting only after member name of the
archive. This assumes, of course, that there is now free space, or
that you are now extracting into a different file system. (You could
also choose to suspend tar, remove unnecessary files from
the file system, and then resume the same tar operation.
In this case, ‘--starting-file’ is not necessary.) See also
Asking for Confirmation During Operations, and Excluding Some Files.
‘--same-order’, ‘--preserve-order’, ‘-s’
Process large lists of file names on machines with small amounts of memory. Use in conjunction with ‘--compare’ (‘--diff’, ‘-d’), ‘--list’ (‘-t’) or ‘--extract’ (‘--get’, ‘-x’).
The ‘--same-order’ (‘--preserve-order’, ‘-s’) option tells tar
that the list of file
names to be listed or extracted is sorted in the same order as the
files in the archive. This allows a large list of names to be used,
even on a small machine that would not otherwise be able to hold all
the names in memory at the same time. Such a sorted list can easily be
created by running ‘tar -t’ on the archive and editing its output.
This option is probably never needed on modern computer systems.
This document was generated on August 23, 2023 using texi2html 5.0.