awk
programs are commonly used to process log files
containing timestamp information, indicating when a
particular log record was written. Many programs log their timestamps
in the form returned by the time()
system call, which is the
number of seconds since a particular epoch. On POSIX-compliant systems,
it is the number of seconds since
1970-01-01 00:00:00 UTC, not counting leap
seconds.[56]
All known POSIX-compliant systems support timestamps from 0 through
2^31 − 1,
which is sufficient to represent times through
2038-01-19 03:14:07 UTC. Many systems support a wider range of timestamps,
including negative timestamps that represent times before the
epoch.
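As a quick check of the limits just quoted, the following sketch formats the two endpoints of that range. It relies on gawk's strftime() function, described later in this section, and passes a true utc-flag so that the result does not depend on the local time zone. The variable names are only illustrative.

```awk
BEGIN {
    fmt = "%Y-%m-%d %H:%M:%S"
    first = strftime(fmt, 0, 1)          # timestamp 0 is the POSIX epoch
    last  = strftime(fmt, 2^31 - 1, 1)   # largest value in the portable range
    print first                          # 1970-01-01 00:00:00
    print last                           # 2038-01-19 03:14:07
}
```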
In order to make it easier to process such log files and to produce
useful reports, gawk
provides the following functions for
working with timestamps. They are gawk
extensions; they are
not specified in the POSIX standard.[57]
However, recent versions
of mawk
(see Other Freely Available awk
Implementations) also support these functions.
Optional parameters are enclosed in square brackets ([ ]):
mktime(datespec [, utc-flag ])
Turn datespec into a timestamp in the same form
as is returned by systime(). It is similar to the function of the
same name in ISO C. The argument, datespec, is a string of the form
"YYYY MM DD HH MM SS [DST]".
The string consists of six or seven numbers representing, respectively,
the full year including century, the month from 1 to 12, the day of the month
from 1 to 31, the hour of the day from 0 to 23, the minute from 0 to
59, the second from 0 to 60,[58]
and an optional daylight-savings flag.
The values of these numbers need not be within the ranges specified;
for example, an hour of −1 means 1 hour before midnight.
The origin-zero Gregorian calendar is assumed, with year 0 preceding
year 1 and year −1 preceding year 0.
If utc-flag is present and is either nonzero or non-null, the time
is assumed to be in the UTC time zone; otherwise, the
time is assumed to be in the local time zone.
If the DST daylight-savings flag is positive, the time is assumed to be
daylight savings time; if zero, the time is assumed to be standard
time; and if negative (the default), mktime()
attempts to determine
whether daylight savings time is in effect for the specified time.
If datespec does not contain enough elements or if the resulting time
is out of range, mktime()
returns −1.
strftime([format [, timestamp [, utc-flag] ] ])
Format the time specified by timestamp
based on the contents of the format string and return the result.
It is similar to the function of the same name in ISO C.
If utc-flag is present and is either nonzero or non-null, the value
is formatted as UTC (Coordinated Universal Time, formerly GMT or Greenwich
Mean Time). Otherwise, the value is formatted for the local time zone.
The timestamp is in the same format as the value returned by the
systime()
function. If no timestamp argument is supplied,
gawk
uses the current time of day as the timestamp.
Without a format argument, strftime()
uses
the value of PROCINFO["strftime"]
as the format string
(see Predefined Variables).
The default string value is
"%a %b %e %H:%M:%S %Z %Y". This format string produces
output that is equivalent to that of the date
utility.
You can assign a new value to PROCINFO["strftime"]
to
change the default format; see the following list for the various format directives.
systime()
Return the current time as the number of seconds since the system epoch. On POSIX systems, this is the number of seconds since 1970-01-01 00:00:00 UTC, not counting leap seconds. It may be a different number on other systems.
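The rules given above for mktime() can be exercised with a few calls. This is only a sketch: it passes a true utc-flag, which assumes a gawk version recent enough to accept that second argument, so that the results do not depend on the local time zone. The variable names are illustrative.

```awk
BEGIN {
    fmt = "%Y-%m-%d %H:%M:%S"

    # An in-range datespec: midnight UTC at the epoch is timestamp 0
    epoch = mktime("1970 01 01 00 00 00", 1)

    # Out-of-range fields are normalized: an hour of -1 is
    # 1 hour before midnight
    early = strftime(fmt, mktime("2015 01 01 -1 00 00", 1), 1)

    # Fewer than six numbers is an error
    bad = mktime("2015 01 01")

    print epoch    # 0
    print early    # 2014-12-31 23:00:00
    print bad      # -1
}
```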
The systime()
function allows you to compare a timestamp from a
log file with the current time of day. In particular, it is easy to
determine how long ago a particular record was logged. It also allows
you to produce log records using the “seconds since the epoch” format.
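As a small illustration of the "how long ago" calculation, here is a sketch. The functions age() and describe() are hypothetical helpers, not part of gawk, and the logged timestamp is made up so that the result is predictable.

```awk
# age() --- seconds elapsed between a logged timestamp and "now"
# (both in seconds since the epoch)
function age(logged, now)
{
    return now - logged
}

# describe() --- render a nonnegative number of seconds as "Nh Nm Ns"
function describe(secs,    h, m)
{
    h = int(secs / 3600)
    m = int((secs % 3600) / 60)
    return sprintf("%dh %dm %ds", h, m, secs % 60)
}

BEGIN {
    now = systime()
    logged = now - 3725     # pretend the record was written 3725 seconds ago
    print describe(age(logged, now)) " ago"    # 1h 2m 5s ago
}
```

In a real program, the logged value would typically come from a field of the input record instead of being computed from the current time.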
The mktime()
function allows you to convert a textual representation
of a date and time into a timestamp. This makes it easy to do before/after
comparisons of dates and times, particularly when dealing with date and
time data coming from an external source, such as a log file.
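The following sketch shows one way to do such a comparison. It assumes log timestamps of the form "YYYY-MM-DD HH:MM:SS" expressed in UTC, and a gawk version whose mktime() accepts the utc-flag argument. The function to_epoch() is a hypothetical helper, not part of gawk.

```awk
# to_epoch() --- convert "YYYY-MM-DD HH:MM:SS" (assumed UTC) to a timestamp
function to_epoch(stamp,    spec)
{
    spec = stamp
    gsub(/[-:]/, " ", spec)     # "2014-09-22 13:05:00" -> "2014 09 22 13 05 00"
    return mktime(spec, 1)      # true utc-flag: interpret as UTC
}

BEGIN {
    a = to_epoch("2014-09-22 13:05:00")
    b = to_epoch("2014-09-23 01:00:00")
    if (a < b)
        print "first record is older by", b - a, "seconds"
}
```

Once both values are plain numbers, the before/after test is an ordinary numeric comparison, and the difference (42900 seconds here) is the elapsed time between the two records.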
The strftime()
function allows you to easily turn a timestamp
into human-readable information. It is similar in nature to the sprintf()
function
(see String-Manipulation Functions),
in that it copies nonformat specification characters verbatim to the
returned string, while substituting date and time values for format
specifications in the format string.
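For instance, the following sketch mixes literal text with a few numeric format specifications. The date is fixed and UTC is requested so that the result is the same everywhere. It assumes a gawk version whose mktime() accepts the utc-flag argument.

```awk
BEGIN {
    ts = mktime("2014 09 22 12 00 00", 1)    # noon UTC on a known date
    msg = strftime("Logged on %Y-%m-%d at %H:%M (day %j of the year)", ts, 1)
    print msg    # Logged on 2014-09-22 at 12:00 (day 265 of the year)
}
```

Everything in the format string that is not a format specification, including the parentheses, is copied to the result unchanged.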
strftime()
is guaranteed by the 1999 ISO C
standard[59]
to support the following date format specifications:
%a
The locale’s abbreviated weekday name.
%A
The locale’s full weekday name.
%b
The locale’s abbreviated month name.
%B
The locale’s full month name.
%c
The locale’s “appropriate” date and time representation.
(This is ‘%a %b %e %T %Y’ in the "C"
locale.)
%C
The century part of the current year. This is the year divided by 100 and truncated to the next lower integer.
%d
The day of the month as a decimal number (01–31).
%D
Equivalent to specifying ‘%m/%d/%y’.
%e
The day of the month, padded with a space if it is only one digit.
%F
Equivalent to specifying ‘%Y-%m-%d’. This is the ISO 8601 date format.
%g
The year modulo 100 of the ISO 8601 week number, as a decimal number (00–99). For example, January 1, 2012, is in week 52 of 2011. Thus, the year of its ISO 8601 week number is 2011, even though its year is 2012. Similarly, December 31, 2012, is in week 1 of 2013. Thus, the year of its ISO week number is 2013, even though its year is 2012.
%G
The full year of the ISO week number, as a decimal number.
%h
Equivalent to ‘%b’.
%H
The hour (24-hour clock) as a decimal number (00–23).
%I
The hour (12-hour clock) as a decimal number (01–12).
%j
The day of the year as a decimal number (001–366).
%m
The month as a decimal number (01–12).
%M
The minute as a decimal number (00–59).
%n
A newline character (ASCII LF).
%p
The locale’s equivalent of the AM/PM designations associated with a 12-hour clock.
%r
The locale’s 12-hour clock time.
(This is ‘%I:%M:%S %p’ in the "C"
locale.)
%R
Equivalent to specifying ‘%H:%M’.
%S
The second as a decimal number (00–60).
%t
A TAB character.
%T
Equivalent to specifying ‘%H:%M:%S’.
%u
The weekday as a decimal number (1–7). Monday is day one.
%U
The week number of the year (with the first Sunday as the first day of week one) as a decimal number (00–53).
%V
The week number of the year (with the first Monday as the first day of week one) as a decimal number (01–53). The method for determining the week number is as specified by ISO 8601. (To wit: if the week containing January 1 has four or more days in the new year, then it is week one; otherwise it is the last week [52 or 53] of the previous year and the next week is week one.)
%w
The weekday as a decimal number (0–6). Sunday is day zero.
%W
The week number of the year (with the first Monday as the first day of week one) as a decimal number (00–53).
%x
The locale’s “appropriate” date representation.
(This is ‘%m/%d/%y’ in the "C"
locale.)
%X
The locale’s “appropriate” time representation.
(This is ‘%T’ in the "C"
locale.)
%y
The year modulo 100 as a decimal number (00–99).
%Y
The full year as a decimal number (e.g., 2015).
%z
The time zone offset in a ‘+HHMM’ format (e.g., the format necessary to produce RFC 822/RFC 1036 date headers).
%Z
The time zone name or abbreviation; no characters if no time zone is determinable.
%Ec %EC %Ex %EX %Ey %EY %Od %Oe %OH
%OI %Om %OM %OS %Ou %OU %OV %Ow %OW %Oy
“Alternative representations” for the specifications
that use only the second letter (‘%c’, ‘%C’,
and so on).[60]
(These facilitate compliance with the POSIX date
utility.)
%%
A literal ‘%’.
If a conversion specifier is not one of those just listed, the behavior is undefined.[61]
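The ISO 8601 specifications ‘%G’ and ‘%V’ are easiest to understand by trying them on dates near a year boundary. The sketch below does so for the two dates used in the description of ‘%g’. The function iso_week() is a hypothetical helper, not part of gawk, and the sketch assumes both a strftime() that supports these specifications and a mktime() that accepts the utc-flag argument.

```awk
# iso_week() --- label a calendar date with its ISO 8601
# week-numbering year and week number
function iso_week(year, month, day,    ts)
{
    # Noon UTC keeps the date well away from any day boundary
    ts = mktime(sprintf("%d %d %d 12 00 00", year, month, day), 1)
    return strftime("%G-W%V", ts, 1)
}

BEGIN {
    print iso_week(2012, 1, 1)      # 2011-W52
    print iso_week(2012, 12, 31)    # 2013-W01
}
```

In both cases the ISO year printed by ‘%G’ differs from the calendar year that ‘%Y’ would print.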
For systems that are not yet fully standards-compliant,
gawk
supplies a copy of
strftime()
from the GNU C Library.
It supports all of the just-listed format specifications.
If that version is
used to compile gawk
(see Installing gawk),
then the following additional format specifications are available:
%k
The hour (24-hour clock) as a decimal number (0–23). Single-digit numbers are padded with a space.
%l
The hour (12-hour clock) as a decimal number (1–12). Single-digit numbers are padded with a space.
%s
The time as a decimal timestamp in seconds since the epoch.
Additionally, the alternative representations are recognized but their normal representations are used.
NOTE: Similar to printf(), some versions of strftime()
support the use of flags between the % and the format specification letter.
Check your local man page for strftime()
to see if it supports flags or not, and what they are. Be aware, however, that using any such flags is likely to make your script less portable to other systems.
The following example is an awk
implementation of the POSIX
date
utility. Normally, the date
utility prints the
current date and time of day in a well-known format. However, if you
provide an argument to it that begins with a ‘+’, date
copies nonformat specifier characters to the standard output and
interprets the current time according to the format specifiers in
the string. For example:
    $ date '+Today is %A, %B %d, %Y.'
    -| Today is Monday, September 22, 2014.
Here is the gawk
version of the date
utility.
It has a shell “wrapper” to handle the -u option,
which requires that date
run as if the time zone
is set to UTC:
    #! /bin/sh
    #
    # date --- approximate the POSIX 'date' command

    case $1 in
    -u)  TZ=UTC0     # use UTC
         export TZ
         shift ;;
    esac

    gawk 'BEGIN  {
        format = PROCINFO["strftime"]
        exitval = 0

        if (ARGC > 2)
            exitval = 1
        else if (ARGC == 2) {
            format = ARGV[1]
            if (format ~ /^\+/)
                format = substr(format, 2)   # remove leading +
        }
        print strftime(format)
        exit exitval
    }' "$@"
This script was written before the strftime()
function acquired
its third argument, utc-flag. Consider how you might modify the
program to work entirely in awk
and process the -u option
for printing the time in UTC.
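One possible approach, offered only as a sketch and not as the definitive answer, is to look for -u in ARGV and pass a true utc-flag to strftime(). The file name date.awk and the helper format_time() are hypothetical. The ‘--’ in the suggested invocation keeps gawk from trying to interpret -u as one of its own options.

```awk
# date.awk --- sketch of 'date' with -u handled inside gawk
# Possible invocation:  gawk -f date.awk -- -u '+%H:%M'

# format_time() --- format timestamp ts, as UTC when utc is true
# and in the local time zone otherwise
function format_time(format, ts, utc)
{
    return strftime(format, ts, utc)
}

BEGIN {
    format = PROCINFO["strftime"]
    exitval = 0
    utc = 0
    first = 1                       # index of the first operand in ARGV

    if (ARGC > 1 && ARGV[1] == "-u") {
        utc = 1                     # -u: print the time in UTC
        first++
    }

    if (ARGC - first > 1)           # more than one operand is an error
        exitval = 1
    else if (ARGC - first == 1) {
        format = ARGV[first]
        sub(/^\+/, "", format)      # remove leading +
    }

    print format_time(format, systime(), utc)
    if (exitval != 0)
        exit exitval
}
```

Note that with a true utc-flag, the time zone name printed for ‘%Z’ may differ from what the shell wrapper produces with TZ=UTC0.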
[56] See Glossary, especially the entries “Epoch” and “UTC.”
[57] The GNU date utility can
also do many of the things described here. Its use may be preferable
for simple time-related operations in shell scripts.
[58] Occasionally there are minutes in a year with a leap second, which is why the seconds can go up to 60.
[59] Unfortunately, not every system’s strftime() necessarily
supports all of the conversions listed here.
[60] If you don’t understand any of this, don’t worry about
it; these facilities are meant to make it easier to “internationalize”
programs.
Other internationalization features are described in
Internationalization with gawk.
[61] This is because ISO C leaves the
behavior of the C version of strftime() undefined and gawk
uses the system’s version of strftime() if it’s there.
Typically, the conversion specifier either does not appear in the
returned string or appears literally.