There are three ways of representing strings in memory of a running program.
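As ‘char *’ strings, in the locale encoding.
As UTF-8, UTF-16, or UTF-32 strings.
As ‘wchar_t *’ strings, a.k.a. “wide strings”; see The wchar_t mess.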
Of course, a ‘char *’ string can, in some cases, be encoded in UTF-8.
Which data type to use depends on what you can guarantee about how a
string is encoded: If it is encoded in the locale encoding, or if you
don’t know how it’s encoded, use ‘char *’. If, on the other hand,
you can guarantee that it is UTF-8 encoded, then you can use the
UTF-8 string type, ‘uint8_t *’, for it.
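The rule above can be illustrated with a minimal sketch in plain C: bytes of unknown encoding stay in a ‘char *’, and a ‘uint8_t *’ is produced only after the bytes have been checked to be well-formed UTF-8. The names is_well_formed_utf8 and utf8_copy_if_valid are hypothetical helpers invented for this illustration; they are not part of GNU libunistring, which offers its own check, u8_check, in <unistr.h>. The hand-written checker is used here only to keep the sketch free of dependencies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not a libunistring function): returns true if the
   n bytes at s form well-formed UTF-8.  Rejects overlong forms,
   surrogates, and code points above U+10FFFF.  */
bool is_well_formed_utf8(const char *s, size_t n)
{
    size_t i = 0;
    while (i < n) {
        unsigned char c = (unsigned char) s[i];
        size_t len;
        uint32_t cp, min;

        if (c < 0x80) { i++; continue; }            /* ASCII byte */
        else if ((c & 0xE0) == 0xC0) { len = 2; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { len = 3; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { len = 4; cp = c & 0x07; min = 0x10000; }
        else return false;                          /* invalid lead byte */

        if (n - i < len) return false;              /* truncated sequence */
        for (size_t k = 1; k < len; k++) {
            unsigned char cc = (unsigned char) s[i + k];
            if ((cc & 0xC0) != 0x80) return false;  /* not a continuation */
            cp = (cp << 6) | (cc & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;
        i += len;
    }
    return true;
}

/* Hypothetical helper: returns a freshly allocated UTF-8 string holding
   the bytes of s if they are well-formed UTF-8, or NULL otherwise (also
   NULL if allocation fails).  The bytes are copied with memcpy, so no
   pointer cast between ‘char *’ and ‘uint8_t *’ is needed.  */
uint8_t *utf8_copy_if_valid(const char *s)
{
    assert(s != NULL);
    size_t n = strlen(s);
    if (!is_well_formed_utf8(s, n))
        return NULL;
    uint8_t *copy = malloc(n + 1);
    if (copy == NULL)
        return NULL;
    memcpy(copy, s, n + 1);     /* includes the terminating NUL */
    return copy;
}
```

The caller owns the returned buffer and releases it with free. A NULL result means the guarantee could not be established, so the string should continue to be handled as ‘char *’.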
The five types ‘char *’, ‘uint8_t *’, ‘uint16_t *’, ‘uint32_t *’,
and ‘wchar_t *’ are incompatible types at the C
level. Therefore, ‘gcc -Wall’ will produce a warning if, by mistake,
your code contains a mismatch between these types. In the context of
using GNU libunistring, even a warning about a mismatch between
‘char *’ and ‘uint8_t *’ is a sign of a bug in your code
that you should not try to silence through a cast.
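That the compiler treats these pointer types as distinct can be seen in a small sketch using the C11 _Generic selection, which chooses a result based on the static type of its argument. The enum string_kind and the macro STRING_KIND are hypothetical names made up for this illustration. The sketch assumes a platform, such as GNU/Linux, where wchar_t is a different type from uint16_t and uint32_t; where that is not the case, the selection would not compile.

```c
#include <assert.h>
#include <stdint.h>
#include <wchar.h>

/* Hypothetical classification of the five string pointer types.  */
enum string_kind {
    KIND_CHAR,   /* locale encoding, or encoding unknown */
    KIND_UTF8,
    KIND_UTF16,
    KIND_UTF32,
    KIND_WIDE
};

/* Hypothetical macro: maps the static type of p to a string_kind.
   _Generic requires the listed types to be mutually incompatible,
   so this only compiles because the five types really are distinct
   on the assumed platform.  */
#define STRING_KIND(p) _Generic((p),   \
    char *:     KIND_CHAR,             \
    uint8_t *:  KIND_UTF8,             \
    uint16_t *: KIND_UTF16,            \
    uint32_t *: KIND_UTF32,            \
    wchar_t *:  KIND_WIDE)

/* Small demonstration: the same two bytes are classified differently
   depending only on the declared pointer type.  */
void string_kind_demo(void)
{
    char bytes[] = "hi";
    uint8_t units[] = { 'h', 'i', 0 };
    char *as_char = bytes;
    uint8_t *as_utf8 = units;

    assert(STRING_KIND(as_char) == KIND_CHAR);
    assert(STRING_KIND(as_utf8) == KIND_UTF8);
}
```

A cast from ‘char *’ to ‘uint8_t *’ would change the classification without checking anything about the bytes, which is why a cast hides the problem instead of fixing it.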