General Escape Syntax (GNU Emacs Lisp Reference Manual)


2.4.3.2 General Escape Syntax

In addition to the specific escape sequences for certain important control characters, Emacs provides several kinds of escape syntax that you can use to specify non-ASCII text characters.

1. You can specify characters by their Unicode names, if any. ‘?\N{NAME}’ represents the Unicode character named NAME. Thus, ‘?\N{LATIN SMALL LETTER A WITH GRAVE}’ is equivalent to ‘?à’ and denotes the Unicode character U+00E0. To simplify entering multi-line strings, you can replace spaces in the names by non-empty sequences of whitespace (e.g., newlines).
2. You can specify characters by their Unicode values. ‘?\N{U+X}’ represents a character with Unicode code point X, where X is a hexadecimal number. Also, ‘?\uxxxx’ and ‘?\Uxxxxxxxx’ represent code points xxxx and xxxxxxxx, respectively, where each x is a single hexadecimal digit. For example, ‘?\N{U+E0}’, ‘?\u00e0’ and ‘?\U000000E0’ are all equivalent to ‘?à’ and to ‘?\N{LATIN SMALL LETTER A WITH GRAVE}’. The Unicode Standard defines code points only up to ‘U+10ffff’, so if you specify a code point higher than that, Emacs signals an error.
3. You can specify characters by their hexadecimal character codes. A hexadecimal escape sequence consists of a backslash, ‘x’, and the hexadecimal character code. Thus, ‘?\x41’ is the character A, ‘?\x1’ is the character C-a, and ‘?\xe0’ is the character à (a with grave accent). You can use one or more hex digits after ‘x’, so you can represent any character code in this way.
4. You can specify characters by their character code in octal. An octal escape sequence consists of a backslash followed by up to three octal digits; thus, ‘?\101’ for the character A, ‘?\001’ for the character C-a, and ‘?\002’ for the character C-b. Only characters up to octal code 777 can be specified this way.
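
The escape styles described above can be checked by evaluating a few comparisons. The following is a minimal sketch, assuming a reasonably recent Emacs that supports the ‘\N{...}’ syntax and a source file read as UTF-8; ‘my-reads-ok-p’ is a hypothetical helper introduced only for this illustration and is not part of Emacs.

```elisp
;; Characters are integers, so the different spellings can be compared
;; with `='.  Each comparison below is expected to evaluate to t.

;; Unicode name, \N{U+X}, \u and \U spellings of the same character.
(= ?\N{LATIN SMALL LETTER A WITH GRAVE} ?\N{U+E0} ?\u00e0 ?\U000000E0 ?à)

;; Hexadecimal escapes accept one or more hex digits.
(= ?\x41 ?A)
(= ?\x1 ?\C-a)
(= ?\xe0 ?à)

;; Octal escapes accept up to three octal digits.
(= ?\101 ?A)
(= ?\001 ?\C-a)
(= ?\002 ?\C-b)
(= ?\777 511)                        ; octal 777 is the upper limit

(defun my-reads-ok-p (text)
  "Return t if `read' accepts TEXT without signaling an error.
This is a hypothetical helper for this sketch, not part of Emacs."
  (condition-case nil
      (progn (read text) t)
    (error nil)))

(my-reads-ok-p "?\\N{U+10FFFF}")     ; highest Unicode code point
(my-reads-ok-p "?\\N{U+110000}")     ; beyond the Unicode range
```

The last two forms go through ‘read’ on a string rather than writing the out-of-range literal directly, so that the file itself still loads; the first is expected to return t and the second nil.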

These escape sequences may also be used in strings. See Non-ASCII Characters in Strings.
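
As a brief illustration of the same escapes inside string constants, the sketch below compares strings written with escapes against one containing a literal ‘à’. It deliberately sticks to the Unicode-style escapes; the sample word is arbitrary, and the behavior of hexadecimal and octal escapes in strings is left to the section referenced above.

```elisp
;; Each comparison is expected to evaluate to t.
(string= "voil\u00e0" "voilà")
(string= "voil\N{U+E0}" "voilà")
(string= "voil\N{LATIN SMALL LETTER A WITH GRAVE}" "voilà")

;; Inside \N{...}, a run of whitespace (here a newline plus indentation)
;; may stand in for a single space in the character name.
(string= "voil\N{LATIN SMALL LETTER A
          WITH GRAVE}" "voilà")
```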