In the cold world outside of Guile, not all strings are treated in the same way. Out there, there are only bytes, and there are many ways of representing a string (a sequence of characters) as binary data (a sequence of bytes).
As a user, usually you don’t have to think about this very much. When you type on your keyboard, your system encodes your keystrokes as bytes according to the locale that you have configured on your computer. Guile uses the locale to decode those bytes back into characters – hopefully the same characters that you typed in.
All is not so clear when dealing with a system with multiple users, such as a web server. Your web server might get a request from one user for data encoded in the ISO-8859-1 character set, and then another request from a different user for UTF-8 data.
Guile provides an iconv module for converting between strings and sequences of bytes. See Bytevectors, for more on how Guile represents raw byte sequences. This module gets its name from the common UNIX command of the same name.
Note that often it is sufficient to just read and write strings from ports instead of using these functions. To do this, specify the port encoding using set-port-encoding!. See Ports, for more on ports and character encodings.
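As a minimal sketch of the port-based approach, the following decodes UTF-8 bytes by setting the encoding on an input port. The in-memory bytevector port and the helpers from (rnrs io ports) and (rnrs bytevectors) are choices made for this illustration only; a file or socket port would be handled the same way.

```scheme
(use-modules (rnrs bytevectors)   ; u8-list->bytevector
             (rnrs io ports))     ; open-bytevector-input-port, get-string-all

;; The five bytes of "café" encoded as UTF-8 (é is #xC3 #xA9).
(define port
  (open-bytevector-input-port
   (u8-list->bytevector '(99 97 102 195 169))))

;; Tell Guile how to decode the bytes read from this port.
(set-port-encoding! port "UTF-8")

;; Read everything as a string: four characters, the last one é (U+00E9).
(define text (get-string-all port))
```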
Unlike the rest of the procedures in this section, you have to load the iconv module before having access to these procedures:
(use-modules (ice-9 iconv))
Scheme Procedure: string->bytevector string encoding [conversion-strategy]
Encode string as a sequence of bytes.
The string will be encoded in the character set specified by the encoding string. If the string has characters that cannot be represented in the encoding, by default this procedure raises an encoding-error. Pass a conversion-strategy argument to specify other behaviors.
The return value is a bytevector. See Bytevectors, for more on bytevectors. See Ports, for more on character encodings and conversion strategies.
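A short sketch of string->bytevector, encoding the same string in two character sets. The sample string and variable names are illustrative; "\xe9" is Guile's two-digit hex escape for é, used here so the example does not depend on the source file's own encoding.

```scheme
(use-modules (ice-9 iconv)
             (rnrs bytevectors))  ; bytevector->u8-list

(define word "caf\xe9")           ; "café"

;; UTF-8 needs two bytes for é.
(define utf8-bytes (string->bytevector word "UTF-8"))
(bytevector->u8-list utf8-bytes)    ; => (99 97 102 195 169)

;; ISO-8859-1 needs only one.
(define latin1-bytes (string->bytevector word "ISO-8859-1"))
(bytevector->u8-list latin1-bytes)  ; => (99 97 102 233)

;; ASCII cannot represent é.  With the default strategy this call would
;; raise an encoding-error; passing 'substitute asks for a replacement
;; character instead.
(define ascii-bytes (string->bytevector word "ASCII" 'substitute))
```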
Scheme Procedure: bytevector->string bytevector encoding [conversion-strategy]
Decode bytevector into a string.
The bytes will be decoded from the character set specified by the encoding string. If the bytes do not form a valid encoding, by default this procedure raises a decoding-error. As with string->bytevector, pass the optional conversion-strategy argument to modify this behavior. See Ports, for more on character encodings and conversion strategies.
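The sketch below decodes one byte sequence under two different encodings, which illustrates why the encoding must be known in advance: the same bytes yield different strings. The byte values and variable names are chosen for this example.

```scheme
(use-modules (ice-9 iconv)
             (rnrs bytevectors))  ; u8-list->bytevector

;; "café" as UTF-8 bytes.
(define bytes (u8-list->bytevector '(99 97 102 195 169)))

;; Decoded with the right encoding: four characters, ending in é.
(define as-utf8 (bytevector->string bytes "UTF-8"))
(string-length as-utf8)    ; => 4

;; Decoded as ISO-8859-1: every byte becomes one character, so the
;; two-byte é turns into two unrelated characters.
(define as-latin1 (bytevector->string bytes "ISO-8859-1"))
(string-length as-latin1)  ; => 5
```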
Scheme Procedure: call-with-encoded-output-string encoding proc [conversion-strategy]
Like call-with-output-string, but instead of returning a string, returns an encoding of the string according to encoding, as a bytevector. This procedure can be more efficient than collecting a string and then converting it via string->bytevector.
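A minimal sketch of this pattern, assuming the procedure in question is call-with-encoded-output-string from (ice-9 iconv), called with the encoding name followed by a one-argument procedure that receives the output port. The text written to the port is illustrative.

```scheme
(use-modules (ice-9 iconv)
             (rnrs bytevectors))  ; bytevector->u8-list

;; Write text to a temporary port and get the result back as
;; UTF-8 encoded bytes, without building an intermediate string.
(define encoded
  (call-with-encoded-output-string "UTF-8"
    (lambda (port)
      (display "caf\xe9" port)    ; "café"
      (newline port))))

(bytevector->u8-list encoded)     ; => (99 97 102 195 169 10)
```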