If you think that gawk
is too slow at doing a particular task,
you should investigate before sending in a bug report. Here are the steps
to follow:
Run gawk with the --profile option (see Command-Line Options) to see what
your program is doing. It may be that you have written it in an inefficient
manner. For example, you may be doing something for every record that could
be done just once per file. (Use a BEGINFILE rule; see The BEGINFILE and
ENDFILE Special Patterns.) Or you may be doing something for every file that
only needs to be done once per run of the program. (Use a BEGIN rule; see
The BEGIN and END Special Patterns.)
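As a sketch of the kind of inefficiency the awk-level profile can reveal, the following session writes two versions of a made-up program: one recomputes a per-file value for every record, the other computes it once in a BEGINFILE rule. The file names (sample.txt, slow.awk, fast.awk, slow.prof) and the program itself are hypothetical; only the --profile option and the BEGINFILE pattern come from gawk.

```shell
# Work in a scratch directory so nothing existing is overwritten.
cd "$(mktemp -d)" || exit 1

# Hypothetical sample data: three short records.
printf 'a\nb\nc\n' > sample.txt

# Slow version: the uppercase file name is recomputed for every record.
cat > slow.awk <<'EOF'
{ tag = toupper(FILENAME); print tag ": " $0 }
EOF

# Faster version: the same value is computed once per input file.
cat > fast.awk <<'EOF'
BEGINFILE { tag = toupper(FILENAME) }
{ print tag ": " $0 }
EOF

# Run the slow version under the profiler; the annotated program with
# execution counts is written to slow.prof. Skipped if gawk is absent.
if command -v gawk >/dev/null 2>&1; then
    gawk --profile=slow.prof -f slow.awk sample.txt > slow.out
    gawk -f fast.awk sample.txt > fast.out
    cmp -s slow.out fast.out && echo "both versions print the same output"
fi
```

In slow.prof, the count next to the toupper() assignment should match the number of records, which is the hint that the work belongs in a BEGINFILE rule instead.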
If profiling at the awk level doesn’t help, then you will need to compile
gawk itself for profiling at the C language level.
To do that, start with the latest released version of gawk. Unpack the
source code in a new directory, and configure it:
$ tar -xpzvf gawk-X.Y.Z.tar.gz
... Output omitted
$ cd gawk-X.Y.Z
$ ./configure
... Output omitted
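Edit the files Makefile and support/Makefile. Change every instance of -O2
or -O to -pg. This causes gawk to be compiled for profiling.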
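One possible way to switch a Makefile over to profiling flags is sketched below. It assumes a GCC-compatible compiler, where -pg enables gprof instrumentation, and Makefiles that carry plain -O2 or -O flags. The function name enable_pg is hypothetical, and the original file is kept with a .orig suffix so the change can be undone.

```shell
# Replace -O2 and bare -O with -pg in the named Makefile.
# A backup of the untouched file is saved as <name>.orig.
enable_pg() {
    sed -i.orig \
        -e 's/-O2/-pg/g' \
        -e 's/-O\([^A-Za-z0-9_]\)/-pg\1/g' \
        -e 's/-O$/-pg/' \
        "$1"
}
```

It would be run once per file that sets compiler flags, for example `enable_pg Makefile`, after ./configure has generated the Makefiles. Comparing the result against the .orig copy with diff is a reasonable check before building.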
Compile the program by running the make command:
$ make
... Output omitted
Run the freshly compiled gawk on a real program, using real data. Using an
artificial program to try to time one particular feature of gawk is useless;
real awk programs generally spend most of their time doing I/O, not
computing. If you want to prove that something is slow, it must be done
using a real program and real data.

Use a data file that is large enough for the statistical profiling to
measure where gawk spends its time. It should be at least 100 megabytes in
size.
$ ./gawk -f realprogram.awk realdata > /dev/null
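Before trusting a profiling run, it may be worth confirming that the data file meets the suggested size. A minimal sketch, in which the function name big_enough is hypothetical and 100 megabytes is taken to mean 100 * 1024 * 1024 bytes:

```shell
# Succeed only if the named file exists and holds at least 100 megabytes.
big_enough() {
    [ -f "$1" ] || return 1
    # tr strips the padding that some wc implementations print.
    size=$(wc -c < "$1" | tr -d ' ')
    [ "$size" -ge $((100 * 1024 * 1024)) ]
}
```

For example, `big_enough realdata || echo "realdata is missing or too small"` reports a problem before any time is spent on the run.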
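When done, you should have a file named gmon.out in the current directory.
Run gprof on it:
$ gprof gawk gmon.out > gprof.out
Submit a bug report explaining what you think is slow, and include the
gprof.out file with it. Preferably, you should also submit the program and
the data, or else indicate where to get the data if the file is large.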
If you are incapable or unwilling to do the steps listed above, then you will
just have to live with gawk
as it is.