4.10.1 Using getline with No Arguments

The getline command can be used without arguments to read input from the current input file. In this form, it simply reads the next input record and splits it into fields. This is useful if you’ve finished processing the current record but want to do some special processing on the next record right away. For example:

# Remove text between /* and */, inclusive
{
    while ((start = index($0, "/*")) != 0) {
        out = substr($0, 1, start - 1)  # leading part of the string
        rest = substr($0, start + 2)    # ... */ ...
        while ((end = index(rest, "*/")) == 0) {  # is */ in trailing part?
            # get more text
            if (getline <= 0) {
                print("unexpected EOF or error:", ERRNO) > "/dev/stderr"
                exit
            }
            # build up the line using string concatenation
            rest = rest $0
        }
        rest = substr(rest, end + 2)  # remove comment
        # build up the output line using string concatenation
        $0 = out rest
    }
    print $0
}

This awk program deletes C-style comments (‘/* … */’) from the input. It uses a number of features we haven’t covered yet, including string concatenation (see String Concatenation) and the index() and substr() built-in functions (see String-Manipulation Functions). By replacing the ‘print $0’ with other statements, you could perform more complicated processing on the decommented input, such as searching for matches of a regular expression.

Here is some sample input:

mon/*comment*/key
rab/*commen
t*/bit
horse /*comment*/more text
part 1 /*comment*/part 2 /*comment*/part 3
no comment

When run, the output is:

$ awk -f strip_comments.awk example_text
-| monkey
-| rabbit
-| horse more text
-| part 1 part 2 part 3
-| no comment
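
As suggested earlier, the final ‘print $0’ can be replaced with other processing. The following is a minimal sketch, not a definitive implementation: it reuses the comment-stripping loop and prints only decommented lines that match a regular expression. The function name grep_decommented, the pattern, and the inlined input (a subset of the sample above) are assumptions made for illustration.

```shell
# Hypothetical helper: strip C-style comments, then print only the
# decommented lines that match a pattern, prefixed with NR.
grep_decommented() {
    printf 'mon/*comment*/key\nrab/*commen\nt*/bit\nno comment\n' | awk '
    {
        while ((start = index($0, "/*")) != 0) {
            out = substr($0, 1, start - 1)      # text before the comment
            rest = substr($0, start + 2)        # text after the opening marker
            while ((end = index(rest, "*/")) == 0) {
                if ((getline) <= 0)             # EOF or error inside a comment
                    exit 1
                rest = rest $0                  # append the newly read record
            }
            $0 = out substr(rest, end + 2)      # drop the comment
        }
        if ($0 ~ /^(monkey|rabbit)$/)           # illustrative pattern only
            print NR ": " $0
    }'
}

grep_decommented
```

This prints ‘1: monkey’ and ‘3: rabbit’. The second line is numbered 3 rather than 2 because getline advanced NR while reading the continuation of the comment.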

This form of the getline command sets NF, NR, FNR, RT, and the value of $0.
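
A short sketch can make these side effects visible. It prints NR, NF, and $0 before and after a plain getline; the function name show_getline_vars and the two input lines are invented for illustration. RT is specific to gawk and is left out so that the sketch should also run under other awk implementations.

```shell
# Hypothetical demo: observe NR, NF, and $0 change across a plain getline.
show_getline_vars() {
    printf 'alpha beta\ngamma delta epsilon\n' | awk '
    NR == 1 {
        printf "before: NR=%d NF=%d $0=%s\n", NR, NF, $0
        if ((getline) > 0)      # read the next record into $0
            printf "after:  NR=%d NF=%d $0=%s\n", NR, NF, $0
    }'
}

show_getline_vars
```

The second record is consumed by getline inside the rule, so the main loop never starts a new cycle for it.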

NOTE: The new value of $0 is used to test the patterns of any subsequent rules. The original value of $0 that triggered the rule that executed getline is lost. By contrast, the next statement reads a new record but immediately begins processing it normally, starting with the first rule in the program. See The next Statement.
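
The difference between the two can be sketched with a pair of otherwise identical three-rule programs. The function names, the input lines, and the ‘rule 1’ and ‘rule 3’ labels are assumptions chosen only for this illustration.

```shell
# Hypothetical comparison of getline and next in the middle rule.
with_getline() {
    printf 'skip\nA\nB\n' | awk '
    { print "rule 1:", $0 }
    /skip/ { getline }      # replace $0, then continue with the rules below
    { print "rule 3:", $0 }'
}

with_next() {
    printf 'skip\nA\nB\n' | awk '
    { print "rule 1:", $0 }
    /skip/ { next }         # read a new record and restart at the first rule
    { print "rule 3:", $0 }'
}

with_getline
echo "---"
with_next
```

With getline, the record ‘A’ is seen only by the third rule, and the original ‘skip’ record never reaches it. With next, ‘A’ is processed by both the first and the third rule.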