gawk’s --csv option causes gawk to process CSV data (see Working With Comma Separated Value Files).
But what if you have regular data that you want to output in CSV format? This section provides functions for doing that.
The first function, tocsv(), takes an array of data fields as input. The array should be indexed starting from one. The optional second parameter is the separator to use. If none is supplied, the default is a comma.
The function takes care to quote fields that contain double quotes, newlines, or the separator character. It then builds up the final CSV record and returns it.
    # tocsv.awk --- convert data to CSV format

    function tocsv(fields, sep,     i, j, nfields, result)
    {
        if (length(fields) == 0)
            return ""

        if (sep == "")
            sep = ","
        delete nfields
        for (i = 1; i in fields; i++) {
            nfields[i] = fields[i]
            if (nfields[i] ~ /["\n]/ || index(nfields[i], sep) != 0) {
                gsub(/"/, "\"\"", nfields[i])       # double up the double quotes
                nfields[i] = "\"" nfields[i] "\""   # wrap in double quotes
            }
        }

        result = nfields[1]
        j = length(nfields)
        for (i = 2; i <= j; i++)
            result = result sep nfields[i]

        return result
    }
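As a quick illustration of the quoting rules, here is a minimal sketch of calling tocsv() directly. It assumes the function has been saved in a file named tocsv.awk that gawk's @include can find (the current directory or a directory in AWKPATH). The array name rec and its contents are made up for the example.

```awk
# Sketch only: assumes tocsv() is saved as tocsv.awk where
# gawk's @include can find it (current directory or AWKPATH).
@include "tocsv.awk"

BEGIN {
    # Build an array indexed from one; the values are hypothetical.
    rec[1] = "Smith, John"      # contains the default separator
    rec[2] = "said \"hi\""      # contains double quotes
    rec[3] = "plain"            # needs no quoting

    print tocsv(rec)            # default separator: comma
    print tocsv(rec, ";")       # explicit separator: semicolon
}
```

With the comma separator, the first two fields are quoted and the record comes out as `"Smith, John","said ""hi""",plain`. With a semicolon, the first field no longer contains the separator, so only the second field is quoted: `Smith, John;"said ""hi""";plain`.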
The next function, tocsv_rec(), is a wrapper around tocsv(). Its intended use is for when you want to convert the current input record to CSV format. The function itself simply copies the fields into an array to pass to tocsv(), which does the work. It accepts an optional separator character as its first parameter, which it simply passes on to tocsv().
    function tocsv_rec(sep,     i, fields)
    {
        delete fields
        for (i = 1; i <= NF; i++)
            fields[i] = $i

        return tocsv(fields, sep)
    }
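A typical use of tocsv_rec() might look like the following sketch, which rewrites colon-separated input as CSV. It assumes both functions are stored together in tocsv.awk and reachable through @include. The colon field separator is only an example input format.

```awk
# Sketch only: assumes tocsv() and tocsv_rec() both live in tocsv.awk.
@include "tocsv.awk"

BEGIN { FS = ":" }      # hypothetical colon-separated input

{ print tocsv_rec() }   # emit each input record as one CSV line
```

Given the input line `a:b,c:say "x"`, this prints `a,"b,c","say ""x"""`. The second field is quoted because it contains a comma, and the third because it contains double quotes, which are also doubled.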