10.2.8 Reading a Whole File at Once

Often, it is convenient to have the entire contents of a file available in memory as a single string. A straightforward but naive way to do that might be as follows:

function readfile1(file,    tmp, contents)
{
    if ((getline tmp < file) < 0)
        return

    contents = tmp RT
    while ((getline tmp < file) > 0)
        contents = contents tmp RT

    close(file)
    return contents
}

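This function reads from file one record at a time, building up the full contents of the file in the local variable contents. Appending RT after each record preserves whatever text actually terminated that record. It works, but is not necessarily efficient: every record costs a separate getline call and another string concatenation.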
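To make the role of RT concrete, here is a minimal sketch of the same record-by-record loop, run against a small scratch file whose last line has no trailing newline. The scratch path and its contents are made up for this illustration, and the code relies on RT, so it assumes gawk:

```awk
# Sketch: watch RT while reading a file record by record (gawk only).
BEGIN {
    scratch = "/tmp/rt-demo.txt"        # hypothetical scratch file
    printf "alpha\nbeta" > scratch      # note: no trailing newline
    close(scratch)

    while ((getline tmp < scratch) > 0) {
        # RT holds the text that terminated this record
        printf "record=\"%s\" RT length=%d\n", tmp, length(RT)
        contents = contents tmp RT
    }
    close(scratch)

    # Appending RT instead of a literal newline reproduces the file exactly
    print (contents == "alpha\nbeta") ? "round trip ok" : "mismatch"
}
```

The final record should report an RT length of 0, because nothing terminated it. Appending a literal "\n" instead of RT would have added a newline that is not in the file.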

The following function, based on a suggestion by Denis Shirokov, reads the entire contents of the named file in one shot:

# readfile.awk --- read an entire file at once

function readfile(file,     tmp, save_rs)
{
    save_rs = RS
    RS = "^$"
    getline tmp < file
    close(file)
    RS = save_rs

    return tmp
}

It works by setting RS to ‘^$’, a regular expression that will never match if the file has contents. gawk reads data from the file into tmp, attempting to match RS. The match fails after each read, but fails quickly, so gawk ends up filling tmp with the entire contents of the file, including any trailing newline. The function saves RS on entry and restores it before returning, so the caller's record splitting is unaffected. (See How Input Is Split into Records for information on RT and RS.)
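As an illustration, the following sketch calls readfile() on a small scratch file and then splits the result into lines. The function body is repeated so that the example runs on its own; the scratch path and the variable names in the BEGIN rule are assumptions made for this example only:

```awk
# Sketch: exercise readfile() on a small scratch file (gawk only).

function readfile(file,     tmp, save_rs)
{
    save_rs = RS
    RS = "^$"
    getline tmp < file
    close(file)
    RS = save_rs

    return tmp
}

BEGIN {
    scratch = "/tmp/readfile-demo.txt"      # hypothetical scratch file
    printf "line one\nline two\n" > scratch
    close(scratch)

    contents = readfile(scratch)

    # The whole file, trailing newline included, is now one string.
    # Splitting on newline therefore yields one empty final element.
    n = split(contents, lines, "\n")
    printf "%d characters, %d lines\n", length(contents), n - 1
}
```

Because the trailing newline is part of the returned string, code that splits the contents into lines has to allow for the empty last element, as the `n - 1` above does.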

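In the case that file is empty, the return value is the null string. The same is true if the file cannot be read, because the function does not check the return value of getline. Thus, calling code may use something like: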

contents = readfile("/some/path")
if (length(contents) == 0) {
    # file was empty (or could not be read) ...
}

This tests the result to see if it is empty or not. An equivalent test would be ‘contents == ""’.
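If the caller needs to tell an empty file apart from one that could not be read, one possible approach is to check the return value of getline. The following is only a sketch: readfile_checked() and the "data" array index are hypothetical names chosen for this example, not part of gawk or its library.

```awk
# Sketch only: readfile_checked() is a hypothetical helper, not part of gawk.
# Returns 1 and stores the file contents in result["data"] on success;
# returns 0 if getline reports an error (for example, an unreadable file).

function readfile_checked(file, result,     tmp, save_rs, status)
{
    save_rs = RS
    RS = "^$"
    status = (getline tmp < file)
    RS = save_rs

    if (status < 0)
        return 0              # could not open or read the file

    close(file)
    result["data"] = tmp      # null string when the file is empty
    return 1
}

BEGIN {
    path = "/some/path"
    if (! readfile_checked(path, r))
        print path ": cannot read" > "/dev/stderr"
    else if (r["data"] == "")
        print path ": empty"
    else
        printf "%s: %d characters\n", path, length(r["data"])
}
```

Returning a status and passing the contents back through an array keeps the null string available as a legitimate result for an empty file.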

See Reading an Entire File for an extension function that also reads an entire file into memory.