awk Statements Versus Lines
Most often, each line in an awk program is a separate statement or separate rule, like this:
awk '/12/  { print $0 }
     /21/  { print $0 }' mail-list inventory-shipped
However, gawk ignores newlines after any of the following symbols and keywords:
, { ? : || && do else
A newline at any other point is considered the end of the statement.[9]
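As a minimal sketch of the rule above, the following program breaks lines only after ‘{’, ‘&&’, and ‘,’, so no continuation characters are needed. The variable names x and out are made up for this illustration, and a POSIX-compliant shell is assumed:

```shell
# Newlines after '{', '&&' and ',' do not end the statement.
out=$(awk 'BEGIN {
    x = 5
    if (x > 1 &&
        x < 10)
        print "in range",
              x
}')
echo "$out"
```

Breaking the line before ‘&&’ or before the comma instead would end the statement early and produce a syntax error.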
If you would like to split a single statement into two lines at a point where a newline would terminate it, you can continue it by ending the first line with a backslash character (‘\’). The backslash must be the final character on the line in order to be recognized as a continuation character. A backslash followed by a newline is allowed anywhere in the statement, even in the middle of a string or regular expression. For example:
awk '/This regular expression is too long, so continue it\
 on the next line/ { print $1 }'
We have generally not used backslash continuation in our sample programs. gawk places no limit on the length of a line, so backslash continuation is never strictly necessary; it just makes programs more readable. For this same reason, as well as for clarity, we have kept most statements short in the programs presented throughout the Web page.
Backslash continuation is most useful when your awk program is in a separate source file instead of entered from the command line. You should also note that many awk implementations are more particular about where you may use backslash continuation. For example, they may not allow you to split a string constant using backslash continuation. Thus, for maximum portability of your awk programs, it is best not to split your lines in the middle of a regular expression or a string.
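One portable way to follow this advice is to end the string constant first and continue the line between two tokens, letting awk concatenate the adjacent strings. This is only a sketch; the variable name msg is invented for the example, and a POSIX-compliant shell is assumed:

```shell
# Continue between two string constants instead of inside one.
# awk joins adjacent strings by concatenation.
msg=$(awk 'BEGIN {
    msg = "first half, " \
          "second half"
    print msg
}')
echo "$msg"
```

Because the backslash falls between tokens rather than inside the string, this form should not depend on how a particular awk treats continuation within string constants.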
CAUTION: Backslash continuation does not work as described with the C shell. It works for awk programs in files and for one-shot programs, provided you are using a POSIX-compliant shell, such as the Unix Bourne shell or Bash. But the C shell behaves differently! There you must use two backslashes in a row, followed by a newline. Note also that when using the C shell, every newline in your awk program must be escaped with a backslash. To illustrate:
% awk 'BEGIN { \
?   print \\
?       "hello, world" \
? }'
-| hello, world
Here, the ‘%’ and ‘?’ are the C shell’s primary and secondary prompts, analogous to the standard shell’s ‘$’ and ‘>’.
Compare the previous example to how it is done with a POSIX-compliant shell:
$ awk 'BEGIN {
>   print \
>       "hello, world"
> }'
-| hello, world
awk is a line-oriented language. Each rule’s action has to begin on the same line as the pattern. To have the pattern and action on separate lines, you must use backslash continuation; there is no other option.
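The difference can be seen by running the same pattern and action with and without the backslash. The sketch below uses made-up two-line input and hypothetical shell variables named joined and split, and assumes a POSIX-compliant shell:

```shell
# With the backslash, the pattern and action form a single rule.
joined=$(printf 'x 12\ny 99\n' | awk '/12/ \
{ print $1 }')
echo "$joined"

# Without it, awk sees two rules: a bare pattern (which prints the
# whole matching record) and a patternless action (which runs for
# every record).
split=$(printf 'x 12\ny 99\n' | awk '/12/
{ print $1 }')
echo "$split"
```

The first command prints only ‘x’. The second prints ‘x 12’, ‘x’, and ‘y’, which is rarely what was intended.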
Another thing to keep in mind is that backslash continuation and comments do not mix. As soon as awk sees the ‘#’ that starts a comment, it ignores everything on the rest of the line. For example:
$ gawk 'BEGIN { print "dont panic" # a friendly \
>                                    BEGIN rule
> }'
error→ gawk: cmd. line:2:                BEGIN rule
error→ gawk: cmd. line:2:                ^ syntax error
In this case, it looks like the backslash would continue the comment onto the next line. However, the backslash-newline combination is never even noticed because it is “hidden” inside the comment. Thus, the BEGIN is noted as a syntax error.
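One way to avoid the problem, sketched here under the assumption of a POSIX-compliant shell, is to give the comment a line of its own, or to place it after the last piece of a continued statement, so that no backslash ever follows a ‘#’. The shell variable panic is a name chosen only for this example:

```shell
# Comments sit on their own line or at the end of the final
# continued line; no backslash appears after a '#'.
panic=$(awk 'BEGIN {
    # a friendly BEGIN rule
    print "dont" \
          " panic"    # a trailing comment is safe on the last line
}')
echo "$panic"
```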
Backslash continuation comes into play in an additional, unexpected situation. Consider:
gawk -F'\
a' '...'
This command line assigns a value to FS. But what value? There are several possibilities, and in fact different versions of awk do different things. gawk treats this as if it were written:
BEGIN { FS = "\
a" }
...
In short, the backslash and newline are removed, assigning "a" to FS. This same treatment applies to variable assignments made with the -v option (see Command-Line Options) and to regular command-line variable assignments (see Assigning Variables on the Command Line).
If you’re interested, see https://lists.gnu.org/archive/html/bug-gawk/2022-10/msg00025.html for a source code patch that allows lines to be continued when inside parentheses. This patch was not added to gawk since it would quietly decrease the portability of awk programs.
When awk statements within one rule are short, you might want to put more than one of them on a line. This is accomplished by separating the statements with a semicolon (‘;’). This also applies to the rules themselves. Thus, the program shown at the start of this section could also be written this way:
/12/ { print $0 } ; /21/ { print $0 }
NOTE: The requirement that rules on the same line be separated with a semicolon was not in the original awk language; it was added for consistency with the treatment of statements within an action.
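As a small sketch of both uses of the semicolon, the one-line program below separates statements inside each action and also separates the rules themselves. The counter n, the printed labels, and the three-line input are invented for the illustration:

```shell
# Semicolons separate statements within an action and rules on one line.
counts=$(printf '12\n21\n33\n' |
    awk '/12/ { n++; print "twelve" } ; /21/ { n++; print "twenty-one" } ; END { print n }')
echo "$counts"
```

Each of the first two input lines matches one rule, so the final line of output is the count 2.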
[9] The ‘?’ and ‘:’ referred to here are the operators of the three-operand conditional expression described in Conditional Expressions. Splitting lines after ‘?’ and ‘:’ is a minor gawk extension; if --posix is specified (see Command-Line Options), then this extension is disabled.