It is important to remember that when you assign a string constant
as the value of FS, it undergoes normal awk string processing.
For example, with Unix awk and gawk, the assignment ‘FS = "\.."’
assigns the character string ".." to FS (the backslash is stripped).
This creates a regexp meaning “fields are separated by occurrences of
any two characters.” If instead you want fields to be separated by a
literal period followed by any single character, use ‘FS = "\\.."’.
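The difference can be checked from the shell. The following is a minimal sketch, not part of the original text: the sample record ‘a.bc.de’ and the variable names are assumptions chosen for illustration. It compares ‘FS = "\\.."’ with ‘FS = ".."’, the string that ‘"\.."’ collapses to in Unix awk and gawk.

```shell
# Assumed sample record; any text containing periods would do.
record='a.bc.de'

# "\\.." reaches FS as the three characters \.. and is used as a regexp:
# a literal period followed by any single character.
literal_dot=$(printf '%s\n' "$record" |
    awk 'BEGIN { FS = "\\.." } { print NF ": " $2 }')

# ".." is used as a regexp matching any two characters.
any_two=$(printf '%s\n' "$record" |
    awk 'BEGIN { FS = ".." } { print NF ": " $NF }')

echo "$literal_dot"   # 3: c
echo "$any_two"       # 4: e
```

With the doubled backslash the record splits into ‘a’, ‘c’, and ‘e’. With the two-dot regexp, every pair of characters is a separator, so the first three fields are empty and only the final ‘e’ survives.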
The following list summarizes how fields are split, based on the value
of FS (‘==’ means “is equal to”):
gawk was invoked with --csv
Field splitting follows the rules given in Working With Comma Separated Value Files. The value of FS is ignored.
FS == " "
Fields are separated by runs of whitespace. Leading and trailing whitespace are ignored. This is the default.
FS == any other single character
Fields are separated by each occurrence of the character. Multiple successive occurrences delimit empty fields, as do leading and trailing occurrences. The character can even be a regexp metacharacter; it does not need to be escaped.
FS == regexp
Fields are separated by occurrences of characters that match regexp. Leading and trailing matches of regexp delimit empty fields.
FS == ""
Each individual character in the record becomes a separate field. (This is a common extension; it is not specified by the POSIX standard.)
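The rules in this list can be explored with a small shell helper. This is only a sketch under stated assumptions: ‘count_fields’ is a hypothetical function name, the sample records are made up, and the --csv case is left out because it needs a recent gawk. The last call relies on the empty-FS extension, so its result may differ in an awk that does not support it.

```shell
# Hypothetical helper: print the number of fields awk finds in
# record $2 when FS is set to $1.
count_fields() {
    printf '%s\n' "$2" | awk -v fs="$1" 'BEGIN { FS = fs } { print NF }'
}

# FS == " ": runs of whitespace separate fields; leading and trailing
# whitespace is ignored.
count_fields ' ' '  a  b  '       # 2

# Any other single character: every occurrence counts, so leading,
# trailing, and repeated separators delimit empty fields.
count_fields ':' ':a::b:'         # 5

# A single regexp metacharacter is taken literally.
count_fields '|' 'a|b'            # 2

# A regexp: the leading match delimits an empty first field.
count_fields '[0-9]+' '1a22b'     # 3

# FS == "": each character becomes a field (an extension, not POSIX).
count_fields '' 'abc'
```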
FS and IGNORECASE |
---|
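The IGNORECASE variable affects field splitting only when the value of FS is a regexp; it has no effect when FS is a single character, even if that character is a letter. Thus, after ‘FS = "c"; IGNORECASE = 1; $0 = "aCa"; print $1’, the output is ‘aCa’. If you really want to split fields on an
alphabetic character while ignoring case, use a regexp that will
do it for you (e.g., ‘FS = "[c]"’). In this case, IGNORECASE takes effect. |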