Often, it is convenient to have the entire contents of a file available in memory as a single string. A straightforward but naive way to do that might be as follows:
    function readfile1(file,     tmp, contents)
    {
        if ((getline tmp < file) < 0)
            return

        contents = tmp RT
        while ((getline tmp < file) > 0)
            contents = contents tmp RT

        close(file)
        return contents
    }
This function reads from file one record at a time, building up the full contents of the file in the local variable contents. It works, but is not necessarily efficient.
The following function, based on a suggestion by Denis Shirokov, reads the entire contents of the named file in one shot:
    # readfile.awk --- read an entire file at once

    function readfile(file,     tmp, save_rs)
    {
        save_rs = RS
        RS = "^$"
        getline tmp < file
        close(file)
        RS = save_rs

        return tmp
    }
It works by setting RS to ‘^$’, a regular expression that will never match if the file has contents. gawk reads data from the file into tmp, attempting to match RS. The match fails after each read, but fails quickly, such that gawk fills tmp with the entire contents of the file. (See How Input Is Split into Records for information on RT and RS.)
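As a minimal sketch of how the function might be used, assuming gawk (the ‘^$’ trick relies on RS being treated as a regular expression), the following program writes a small scratch file, reads it back in one shot, and splits the result into lines. The path /tmp/readfile-demo.txt and the names demo, contents, lines, and n are illustrative assumptions, not part of the original library:

```awk
# readfile() is repeated here only so that the sketch is self-contained.
function readfile(file,     tmp, save_rs)
{
    save_rs = RS
    RS = "^$"
    getline tmp < file
    close(file)
    RS = save_rs

    return tmp
}

BEGIN {
    demo = "/tmp/readfile-demo.txt"    # hypothetical scratch path
    printf "first line\nsecond line\n" > demo
    close(demo)

    contents = readfile(demo)

    # The newlines are part of the returned string, so splitting on
    # "\n" leaves an empty final element after the last newline.
    n = split(contents, lines, "\n")
    if (n > 0 && lines[n] == "")
        n--

    printf "%d characters, %d lines\n", length(contents), n
}
```

Because the whole file arrives as a single string, any per-line processing has to split it again; the sketch discards the empty element that split() produces after a trailing newline.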
In the case that file is empty, the return value is the null string. Thus, calling code may use something like:
    contents = readfile("/some/path")
    if (length(contents) == 0)
        # file was empty ...
This tests the result to see if it is empty or not. An equivalent test would be ‘contents == ""’.
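Note that a null result alone cannot tell an empty file from one that could not be opened, because getline leaves tmp unset in both cases. If that distinction matters, one possible approach is to return getline's status and hand the contents back through an array. The function readfile_status(), its result array, and the paths below are hypothetical names for illustration; this is a sketch under those assumptions, not part of the original library:

```awk
# readfile_status --- hypothetical variant of readfile()
#
# Stores the file's contents in result["data"] and returns the
# status reported by getline:
#    1   data was read
#    0   the file exists but is empty
#   -1   the file could not be read

function readfile_status(file, result,     tmp, save_rs, status)
{
    save_rs = RS
    RS = "^$"
    status = (getline tmp < file)
    close(file)
    RS = save_rs

    result["data"] = tmp    # null string unless status is 1
    return status
}

BEGIN {
    path = "/nonexistent/dir/no-such-file"    # hypothetical path
    status = readfile_status(path, r)
    if (status < 0)
        print "cannot read " path > "/dev/stderr"
    else if (status == 0)
        print path " is empty"
    else
        printf "%s", r["data"]
}
```

The contents come back through an array because awk passes arrays by reference, which leaves the function's return value free to carry the status.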
See Reading an Entire File for an extension function that also reads an entire file into memory.