13.5 A Simple Internationalization Example

Now let’s look at a step-by-step example of how to internationalize and localize a simple awk program, using guide.awk as our original source:

BEGIN {
    TEXTDOMAIN = "guide"
    bindtextdomain(".")  # for testing
    print _"Don't Panic"
    print _"The Answer Is", 42
    print "Pardon me, Zaphod who?"
}

Run ‘gawk --gen-pot’ to create the .pot file:

$ gawk --gen-pot -f guide.awk > guide.pot

This produces:

#: guide.awk:4
msgid "Don't Panic"
msgstr ""

#: guide.awk:5
msgid "The Answer Is"
msgstr ""

This original portable object template file is saved and reused for each language into which the application is translated. The msgid is the original string and the msgstr is the translation.

NOTE: Strings not marked with a leading underscore do not appear in the guide.pot file.
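
As a rough illustration of what the marking means, the following sketch lists the string constants that carry a leading underscore. It is only an approximation: it assumes strings without embedded double quotes and a grep that supports -o (GNU and BSD grep do), and marked_strings is a hypothetical helper. ‘gawk --gen-pot’ remains the proper tool for generating the template:

```shell
# Hypothetical helper: print each string constant marked with a leading
# underscore in the awk source file given as $1.
# Assumption: no escaped double quotes inside the strings.
marked_strings() {
    grep -o '_"[^"]*"' "$1" | sed 's/^_//'
}
```

Run against guide.awk, this should list the two marked strings and skip the unmarked "Pardon me, Zaphod who?".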

Next, the messages must be translated. Here is a translation to a hypothetical dialect of English, called “Mellow” (96):

$ cp guide.pot guide-mellow.po
Add translations to guide-mellow.po ...

Following are the translations:

#: guide.awk:4
msgid "Don't Panic"
msgstr "Hey man, relax!"

#: guide.awk:5
msgid "The Answer Is"
msgstr "Like, the scoop is"
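
Before compiling the translations, it can be useful to confirm that no entry still has an empty msgstr. The following is a minimal sketch, assuming single-line entries like the ones shown here; untranslated is a hypothetical helper and not part of the gettext tools:

```shell
# Hypothetical helper: print the msgid line of every entry in the .po file
# $1 whose msgstr is still empty.
# Assumption: each msgid and msgstr fits on a single line.
untranslated() {
    awk '
        /^msgid "/ { id = $0 }        # remember the most recent msgid line
        /^msgstr ""$/ && id != "msgid \"\"" { print id }   # skip a header entry
    ' "$1"
}
```

For guide.pot this should print both msgid lines; for the finished guide-mellow.po it should print nothing.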

NOTE: The following instructions apply to GNU/Linux with the GNU C Library. Be aware that the actual steps may change over time, that the following description may not be accurate for all GNU/Linux distributions, and that things may work entirely differently on other operating systems.

The next step is to make the directory to hold the binary message object file and then to create the guide.mo file. The directory has the form locale/LC_MESSAGES, where locale is a locale name known to the C gettext routines.

How do we know which locale to use? It turns out that there are four different environment variables used by the C gettext routines. In order, they are $LANGUAGE, $LC_ALL, $LC_MESSAGES, and $LANG.(97) Thus, we check the value of $LANGUAGE:

$ echo $LANGUAGE
-| en_US.UTF-8
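
The lookup can be sketched as a small shell function. This is an approximation under stated assumptions, not a description of the C library's actual code: it assumes the precedence documented for GNU gettext ($LANGUAGE, then $LC_ALL, then $LC_MESSAGES, then $LANG) and that $LANGUAGE is ignored when the underlying locale is ‘C’. The name message_locale is hypothetical:

```shell
# Hypothetical helper: print the locale name that a gettext-style lookup
# would try first, under the assumptions stated above.
message_locale() {
    # The underlying locale is the first non-empty of these three.
    base=${LC_ALL:-${LC_MESSAGES:-${LANG:-C}}}
    case $base in
        C|POSIX)
            # Assumption: LANGUAGE is ignored in the C locale.
            printf '%s\n' "$base"
            return ;;
    esac
    if [ -n "${LANGUAGE:-}" ]; then
        # LANGUAGE is a colon-separated priority list; take its first entry.
        printf '%s\n' "${LANGUAGE%%:*}"
    else
        printf '%s\n' "$base"
    fi
}
```

If the function prints ‘C’ or ‘POSIX’, no translation should be expected, whatever $LANGUAGE contains.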

We next make the directories:

$ mkdir en_US.UTF-8 en_US.UTF-8/LC_MESSAGES

The msgfmt utility converts the human-readable .po file into a machine-readable .mo file. By default, msgfmt creates a file named messages.mo. Because gawk looks for a file named after the text domain, we use the -o option to write the output directly to guide.mo in the proper directory:

$ msgfmt guide-mellow.po -o en_US.UTF-8/LC_MESSAGES/guide.mo
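
The location of the message object file follows a fixed pattern, which a short helper can make explicit. This is a sketch only: mo_path is a hypothetical name, and the default directory of ‘.’ is an assumption that mirrors the bindtextdomain(".") call in guide.awk:

```shell
# Hypothetical helper: print the path where the .mo file for a text domain
# is expected, given the domain, the locale name, and the directory bound
# with bindtextdomain() (assumed to default to ".").
mo_path() {
    domain=$1
    locale=$2
    dir=${3:-.}
    printf '%s/%s/LC_MESSAGES/%s.mo\n' "$dir" "$locale" "$domain"
}
```

For example, ‘mo_path guide en_US.UTF-8’ prints ./en_US.UTF-8/LC_MESSAGES/guide.mo, the same file that the msgfmt command above creates.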

Finally, we run the program to test it:

$ gawk -f guide.awk
-| Hey man, relax!
-| Like, the scoop is 42
-| Pardon me, Zaphod who?

If the three replacement functions for dcgettext(), dcngettext(), and bindtextdomain() (see awk Portability Issues) are in a file named libintl.awk, then we can run guide.awk unchanged as follows:

$ gawk --posix -f guide.awk -f libintl.awk
-| Don't Panic
-| The Answer Is 42
-| Pardon me, Zaphod who?
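
For reference, a libintl.awk along these lines could be created as follows. This is a minimal sketch, assuming the stubs follow the replacements described in awk Portability Issues: each function ignores the text domain and returns its argument untranslated.

```shell
# Write a minimal libintl.awk into the current directory.
# Assumption: the stubs only need to return the original strings.
cat > libintl.awk <<'EOF'
function bindtextdomain(dir, domain)
{
    return dir
}

function dcgettext(string, domain, category)
{
    return string
}

function dcngettext(string1, string2, number, domain, category)
{
    return (number == 1 ? string1 : string2)
}
EOF
```

With these definitions, the leading underscore in guide.awk is treated as an ordinary (empty) variable concatenated with the string, which is why the untranslated text is printed.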

Footnotes

(96)

Perhaps it would be better if it were called “Hippy.” Ah, well.

(97)

Well, sort of. It seems that if $LC_ALL is set to ‘C’, then no translations are done. Go figure.