gawk Features

This section describes the features in gawk over and above those in POSIX awk, in the order they were added to gawk.
Version 2.10 of gawk introduced the following features:
The AWKPATH environment variable for specifying a path search for the -f command-line option (see Command-Line Options).
The IGNORECASE variable and its effects (see Case Sensitivity in Matching).
gawk).
Version 2.13 of gawk introduced the following features:
The FIELDWIDTHS variable and its effects (see Reading Fixed-Width Data).
The systime() and strftime() built-in functions for obtaining and printing timestamps (see Time Functions).
Version 2.14 of gawk introduced the following feature:
The next file statement for skipping to the next data file (see The nextfile Statement).
Version 2.15 of gawk introduced the following features:
ARGIND, which tracks the movement of FILENAME through ARGV.
ERRNO, which contains the system error message when getline returns −1 or close() fails.
delete Statement).
Version 3.0 of gawk introduced the following features:
IGNORECASE changed, now applying to string comparison as well as regexp operations (see Case Sensitivity in Matching).
RT, which contains the input text that matched RS (see How Input Is Split into Records).
The gensub() function for more powerful text manipulation (see String-Manipulation Functions).
The strftime() function acquired a default time format, allowing it to be called with no arguments (see Time Functions).
The ability for FS and for the third argument to split() to be null strings (see Making Each Character a Separate Field).
The ability for RS to be a regexp (see How Input Is Split into Records).
The next file statement became nextfile (see The nextfile Statement).
The fflush() function from BWK awk (then at Bell Laboratories; see Input/Output Functions).
awk (see Major Changes Between V7 and SVR3.1).
awk. (Brian was still at Bell Laboratories at the time.) This was later removed from both his awk and from gawk.
gawk for Unix-Like Systems).
Version 3.1 of gawk introduced the following features:
BINMODE, for non-POSIX systems, which allows binary I/O for input and/or output files (see Using gawk on PC Operating Systems).
LINT, which dynamically controls lint warnings.
PROCINFO, an array for providing process-related information.
TEXTDOMAIN, for setting an application’s internationalization text domain (see Internationalization with gawk).
Octal and hexadecimal constants in awk program source code (see Octal and Hexadecimal Numbers).
gawk for Network Programming).
The optional second argument to close() that allows closing one end of a two-way pipe to a coprocess (see Two-Way Communications with Another Process).
The optional third argument to the match() function for capturing text-matching subexpressions within a regexp (see String-Manipulation Functions).
Positional specifiers in printf formats for making translations easier (see Rearranging printf Arguments).
The asort() and asorti() functions for sorting arrays (see Controlling Array Traversal and Array Sorting).
The bindtextdomain(), dcgettext(), and dcngettext() functions for internationalization (see Internationalizing awk Programs).
The extension() function and the ability to add new built-in functions dynamically. This has since been removed. It was replaced by the new extension mechanism. See Writing Extensions for gawk.
The mktime() function for creating timestamps (see Time Functions).
The and(), or(), xor(), compl(), lshift(), rshift(), and strtonum() functions (see Bit-Manipulation Functions).
nextfile Statement).
pgawk, the profiling version of gawk, for producing execution profiles of awk programs (see Profiling Your awk Programs).
gawk to use the locale’s decimal point for parsing input data (see Conversion of Strings and Numbers).
gawk for Unix-Like Systems).
The use of gettext for gawk’s own message output (see gawk Can Speak Your Language).
sub() and gsub() (see More about ‘\’ and ‘&’ with sub(), gsub(), and gensub()).
The length() function was extended to accept an array argument and return the number of elements in the array (see String-Manipulation Functions).
The strftime() function acquired a third argument to enable printing times as UTC (see Time Functions).
Version 4.0 of gawk introduced the following features:
FPAT, which allows you to specify a regexp that matches the fields, instead of matching the field separator (see Defining Fields by Content).
If PROCINFO["sorted_in"] exists, ‘for (iggy in foo)’ loops sort the indices before looping over them. The value of this element provides control over how the indices are sorted before the loop traversal starts (see Using Predefined Array Scanning Orders with gawk).
PROCINFO["strftime"], which holds the default format for strftime() (see Time Functions).
gawk for Network Programming).
gawk-Specific Regexp Operators).
break and continue became invalid outside a loop, even with --traditional (see The break Statement, and also see The continue Statement).
fflush(), nextfile, and ‘delete array’ are allowed if --posix or --traditional, since they are all now part of POSIX.
An optional third argument to asort() and asorti(), specifying how to sort (see String-Manipulation Functions).
The behavior of fflush() changed to match BWK awk and for POSIX; now both ‘fflush()’ and ‘fflush("")’ flush all open output redirections (see Input/Output Functions).
The isarray() function, which distinguishes if an item is an array or not, to make it possible to traverse arrays of arrays (see Getting Type Information).
The patsplit() function, which gives the same capability as FPAT, for splitting (see String-Manipulation Functions).
An optional fourth argument to the split() function, which is an array to hold the values of the separators (see String-Manipulation Functions).
The BEGINFILE and ENDFILE special patterns (see The BEGINFILE and ENDFILE Special Patterns).
switch / case are enabled by default (see The switch Statement).
gawk from treating input as a multibyte string.
The gawk internals were rewritten, bringing the dgawk debugger and possibly improved performance (see Debugging awk Programs).
strcoll() / wcscoll() (see String Comparison Based on Locale Collating Order).
gawk for Network Programming).
Version 4.1 of gawk introduced the following features:
SYMTAB, FUNCTAB, and PROCINFO["identifiers"] (see Built-in Variables That Convey Information).
The three executables gawk, pgawk, and dgawk were merged into one, named just gawk. As a result, the command-line options changed.
awk library files.
gawk).
The and(), or(), and xor() functions changed to allow any number of arguments, with a minimum of two (see Bit-Manipulation Functions).
gawk).
getline became allowed inside BEGINFILE and ENDFILE (see The BEGINFILE and ENDFILE Special Patterns).
The where command was added to the debugger (see Working with the Stack).
Version 4.2 of gawk introduced the following changes:
Changes to ENVIRON are reflected into gawk’s environment and that of programs that it runs. See Built-in Variables That Convey Information.
FIELDWIDTHS was enhanced to allow skipping characters before assigning a value to a field (see Defining Fields by Content).
The PROCINFO["argv"] array. See Built-in Variables That Convey Information.
The mktime() function now accepts an optional second argument (see Time Functions).
The typeof() function (see Getting Type Information).
gawk behaved with --posix. As of 2013, the standard restored historical behavior, and now default field splitting with --posix also allows newlines to separate fields.
print and printf. See Enabling Nonfatal Output.
PROCINFO[input-file, "RETRY"] (see Retrying Reads After Certain Input Errors).
awk Programs):
awk program too.
gawk):
The get_file() function to access open redirections.
The nonfatal() function for generating nonfatal error messages.
The igawk program and its manual page are no longer installed when gawk is built. See An Easy Way to Use Library Functions.
Version 5.0 added the following features:
The PROCINFO["platform"] array element, which allows you to write code that takes the operating system / platform into account.
Version 5.1 was created to release gawk with a correct major version number for the API. This was overlooked for version 5.0, unfortunately. It added the following features:
asort() and asorti() were changed to allow FUNCTAB and SYMTAB as the first argument if a second destination array is supplied (see String-Manipulation Functions).
$0 and the fields are now cleared before starting a BEGINFILE rule (see The BEGINFILE and ENDFILE Special Patterns).
Version 5.2 added the following features:
The mkbool() built-in function (see Generating Boolean Values).
The gawkbug script for reporting bugs (see Submitting Bug Reports).
PROCINFO["pma"] exists if the PMA allocator is compiled in (see Built-in Variables That Convey Information).
Version 5.3 added the following features:
PROCINFO["CSV"] exists if gawk was invoked with --csv (see Built-in Variables That Convey Information).
The do_csv API information variable (see Informational Variables).
gawk buffer output to pipes (see Speeding Up Pipe Output).
libsigsegv was removed from gawk. The value-add was never very much and it caused problems in some environments.