10.3 Updating A Timestamp File

Suppose we have a directory full of files which is maintained with a set of automated tools; perhaps one set of tools updates them and another set of tools uses the result. In this situation, it might be useful for the second set of tools to know if the files have recently been changed. It might be useful, for example, to have a ‘timestamp’ file which gives the timestamp on the newest file in the collection.

We can use find to achieve this, but there are several different ways to do it.

10.3.1 Updating the Timestamp The Wrong Way

The obvious but wrong answer is just to use ‘-newer’:

find subdir -newer timestamp -exec touch -r {} timestamp \;

This does the right sort of thing but has a bug. Suppose that two files in the subdirectory have been updated, and that these are called file1 and file2. The ‘-newer’ test reads the modification time of timestamp only once, when find starts, so both files pass the test and touch is run for each of them. The timestamp file therefore ends up with the modification time of whichever of the two find happens to visit last, and we don’t know which one that is. Since the timestamps on file1 and file2 will in general be different, this could well be the wrong value.

One solution to this problem is to modify find to recheck the modification time of timestamp every time a file is to be compared against it, but that will reduce the performance of find.
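
The behaviour described above can be observed in a scratch directory. The following is only a sketch: the directory layout, the file names and the dates are invented for the demonstration, and it assumes GNU touch (for the ‘-d’ option). The ‘-print’ action is added so that we can count how many times touch was run.

```shell
# Scratch area; every name and date below is made up for this demonstration.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir subdir
touch -d '2001-01-01 00:00:00' timestamp
touch -d '2003-01-01 00:00:00' subdir/file1
touch -d '2002-01-01 00:00:00' subdir/file2
# Creating the files changed the directory's own modification time;
# make it old so that only the two files are newer than timestamp.
touch -d '2000-01-01 00:00:00' subdir

# -print lists every file for which touch was run.
touched=$(find subdir -newer timestamp -exec touch -r {} timestamp \; -print | wc -l)
echo "touch ran $touched times"

cd / && rm -rf "$work"
```

Both files are reported, whichever is visited first, because each is compared against the value that timestamp had when find started.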

10.3.2 Using the test utility to compare timestamps

The test command can be used to compare timestamps:

find subdir -exec test {} -nt timestamp \; -exec touch -r {} timestamp \;

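This will ensure that any changes made to the modification time of timestamp that take place during the execution of find are taken into account, because test examines the timestamp file afresh each time it is run. This resolves our earlier problem, but unfortunately it runs much more slowly, since a separate test process is started for every file that find examines.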
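
As a rough check of this approach, the sketch below runs the command in a scratch directory and then confirms that timestamp ends up with the modification time of the newest file. The names and dates are invented for the demonstration, and GNU touch and a test utility that supports ‘-nt’ are assumed.

```shell
# Scratch area; every name and date below is made up for this demonstration.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir subdir
touch -d '2001-01-01 00:00:00' timestamp
touch -d '2003-01-01 00:00:00' subdir/file1
touch -d '2002-01-01 00:00:00' subdir/file2
# Make the directory itself older than timestamp, so only the files count.
touch -d '2000-01-01 00:00:00' subdir

find subdir -exec test {} -nt timestamp \; -exec touch -r {} timestamp \;

# Neither file is newer than the other exactly when the times are equal.
if test ! timestamp -nt subdir/file1 && test ! subdir/file1 -nt timestamp
then
    result=same
else
    result=different
fi
echo "$result"

cd / && rm -rf "$work"
```

The result does not depend on the order in which find visits the two files: if file2 is seen first, timestamp is set to its time and is then advanced again when file1 is reached.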

10.3.3 A combined approach

We can of course still use ‘-newer’ to cut down on the number of calls to test:

find subdir -newer timestamp -and \
     -exec test {} -nt timestamp \; -and \
     -exec touch -r {} timestamp \;

Here, the ‘-newer’ test excludes all the files which are definitely older than the timestamp. Only the files which are newer than the original value of the timestamp are passed to test, which compares them against the current, possibly updated, timestamp.

This is indeed faster in general, but the speed difference will depend on how many updated files there are.

10.3.4 Using -printf and sort to compare timestamps

It is possible to use the ‘-printf’ action to abandon the use of test entirely:

newest=$(find subdir -newer timestamp -printf "%T@:%p\n" |
           sort -n |
           tail -n1 |
           cut -d: -f2- )
touch -r "${newest:-timestamp}" timestamp

The command above works by generating a list of the timestamps and names of all the files which are newer than the timestamp. The sort, tail and cut commands simply pull out the name of the file with the largest timestamp value (that is, the latest file). The touch command is then used to update the timestamp.

The "${newest:-timestamp}" expression simply expands to the value of $newest if that variable is set and not empty, but to timestamp otherwise. When find prints nothing, the command substitution leaves $newest empty, so this ensures that an argument is always given to the ‘-r’ option of the touch command.
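
The selection step and the fallback can be tried without touching any files. In the sketch below the three input lines are invented sample data in the "seconds:name" form; one of the names deliberately contains a colon to show why cut is given ‘-f2-’ rather than ‘-f2’.

```shell
# Made-up sample of what find might print: one "seconds:name" line per file.
sample='1009843200.5:subdir/file2
1041379200.25:subdir/a:b
978307200.0:subdir/file3'

# Numeric sort, keep the last (largest) line, then drop the leading stamp.
newest=$(printf '%s\n' "$sample" | sort -n | tail -n1 | cut -d: -f2-)
echo "${newest:-timestamp}"

# With no input at all the variable is empty, so the fallback is used.
none=$(printf '' | sort -n | tail -n1 | cut -d: -f2-)
echo "${none:-timestamp}"
```

The first echo prints subdir/a:b and the second prints timestamp.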

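This approach seems quite efficient, but unfortunately it has a problem. Many operating systems now keep file modification time information at a granularity which is finer than one second. Findutils version 4.3.3 and later will print a fractional part with %A@ and %T@, but older versions will not, so with an older find two files modified within the same second cannot be ranked correctly. The pipeline also splits its input into lines, so it is confused by file names that contain newlines.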
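
One way to see which behaviour the installed find has is to print the modification time of the current directory and look for a decimal point. This is only a sketch; the variable names are arbitrary, and it assumes a find that supports ‘-printf’ at all.

```shell
# Print the modification time of "." in seconds since the epoch.
stamp=$(find . -maxdepth 0 -printf '%T@\n')

# A decimal point means this find reports sub-second precision.
case "$stamp" in
    *.*) precision=sub-second ;;
    *)   precision=whole-second ;;
esac
echo "$precision"
```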

10.3.5 Solving the problem with make

Another tool which often works with timestamps is make. We can use find to generate a makefile on the fly and then use make to update the timestamps:

makefile=$(mktemp)
find subdir \
        \( \! -xtype l \) \
        -newer timestamp \
        -printf "timestamp:: %p\n\ttouch -r %p timestamp\n\n" > "$makefile"
make -f "$makefile"
rm   -f "$makefile"

Unfortunately, although the solution above is quite elegant, it fails to cope with white space within file names, and adjusting it to do so would require a rather complex shell script.
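
The white space problem is easy to see by looking at the rule that find generates for a file whose name contains a space. The sketch below only prints the rule and does not run make; the file name and dates are invented, and GNU touch is assumed.

```shell
# Scratch area; every name and date below is made up for this demonstration.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir subdir
touch -d '2001-01-01 00:00:00' timestamp
touch -d '2002-01-01 00:00:00' 'subdir/two words'
# Make the directory itself older than timestamp, so only the file is listed.
touch -d '2000-01-01 00:00:00' subdir

# Same find expression as above, but captured instead of written to a file.
rule=$(find subdir \( \! -xtype l \) -newer timestamp \
            -printf "timestamp:: %p\n\ttouch -r %p timestamp\n\n")
printf '%s\n' "$rule"

cd / && rm -rf "$work"
```

The rule names ‘subdir/two words’ without any quoting, so make would treat it as two separate prerequisites and touch would receive the two halves as separate arguments.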

10.3.6 Coping with odd filenames too

We can fix both of these problems (looping and problems with white space), and do things more efficiently too. The following command copes with file names containing newlines and doesn’t need to sort the list of file names.

find subdir -newer timestamp -printf "%T@:%p\0" |
   perl -0 newest.pl |
   xargs --no-run-if-empty --null --replace \
      find {} -maxdepth 0 -newer timestamp -exec touch -r {} timestamp \;

The first find command generates a list of the files which are newer than the original timestamp file, each prefixed with its timestamp. The newest.pl script keeps only the names of the files whose timestamp equals the largest one seen and discards the rest:

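#! /usr/bin/perl -0
# Read NUL-terminated "stamp:name" records and print the names which
# share the largest timestamp, each terminated by NUL.
use strict;
use warnings;
my @newest = ();
my $latest_stamp = undef;
while (<>) {
    chomp;    # with -0 this removes the trailing NUL
    # Split into two fields only, so colons in the name are preserved.
    my ($stamp, $name) = split(/:/, $_, 2);
    if (!defined($latest_stamp) || ($stamp > $latest_stamp)) {
        $latest_stamp = $stamp;
        @newest = ();
    }
    if ($stamp >= $latest_stamp) {
        push @newest, $name;
    }
}
print map { "$_\0" } @newest;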

This prints a list of zero or more files, all of which are newer than the original timestamp file, and which have the same timestamp as each other, to the precision that find printed. The second find command takes each resulting file one at a time, and if that file is newer than the timestamp file, the timestamp is updated.