Texinfo::Convert::Unicode - Representation as Unicode characters
  use Texinfo::Convert::Unicode qw(unicode_accent encoded_accents
                                   unicode_text);

  use Texinfo::Convert::Text qw(convert_to_text);

  my ($innermost_contents, $stack)
      = Texinfo::Convert::Utils::find_innermost_accent_contents($accent);

  my $formatted_accents = encoded_accents ($converter,
                         convert_to_text($innermost_contents), $stack, $encoding,
                         \&Texinfo::Text::ascii_accent_fallback);

  my $accent_text = unicode_accent('e', $accent_command);
The main purpose of the Texinfo Perl modules is to be used in texi2any to convert Texinfo to other formats. There is no promise of API stability.
Texinfo::Convert::Unicode provides methods dealing with Unicode representation and conversion of Unicode code points, to be used in converters.
When an encoding supported in Texinfo is given as an argument to a method of the module, the accented letters or characters returned by the method should be represented by Unicode code points only if Perl is known to be able to convert those code points to characters in that encoding's character set. Note that the actual conversion is done by Perl, not by the module.
Return the Unicode representation of a command with braces and no argument $command_name (like @bullet{}, @aa{} or @guilsinglleft{}), or undef if the Unicode representation cannot be converted to encoding $encoding.
Check that it is possible to output actual UTF-8 binary bytes corresponding to the Unicode code point string $arg (such as 201D). Perl gives a warning and will not output UTF-8 for Unicode non-characters such as U+10FFFF. If the optional $output_debug argument is set, a debugging warning is emitted if the conversion test fails. Returns 1 if the conversion is possible and can be attempted, 0 otherwise.
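The kind of check described above can be approximated without relying on Perl's warnings. The following is only a sketch under a stated assumption: that the relevant criterion is "a code point within the Unicode range that is neither a surrogate nor a non-character". The helper name is_plausible_code_point is hypothetical and is not part of the module, whose actual test depends on Perl's own behavior and may differ.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Texinfo::Convert::Unicode.
# Takes a hexadecimal code point string such as '201D' and returns 1 if
# the code point looks safe to output as UTF-8, 0 otherwise.
sub is_plausible_code_point {
    my ($arg) = @_;

    # Only accept a short string of hexadecimal digits.
    return 0 unless defined $arg and $arg =~ /^[0-9A-Fa-f]+$/;
    return 0 if length($arg) > 6;

    my $code = hex($arg);

    return 0 if $code > 0x10FFFF;                       # beyond Unicode
    return 0 if $code >= 0xD800 and $code <= 0xDFFF;    # surrogates
    return 0 if $code >= 0xFDD0 and $code <= 0xFDEF;    # non-characters
    return 0 if ($code & 0xFFFE) == 0xFFFE;             # U+xxFFFE, U+xxFFFF
    return 1;
}
```

With this sketch, is_plausible_code_point('201D') is true, while the non-character 10FFFF mentioned above is rejected.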
$encoding is the encoding the accented characters should be encoded to. If $encoding is not set, $result is set to undef. Nested accents and their content are passed with $text and $stack. $text is the text appearing within nested accent commands. $stack is an array reference holding the Texinfo tree elements of the nested accents. In general, $text is the formatted contents and $stack the stack returned by Texinfo::Convert::Utils::find_innermost_accent_contents. The function tries to convert as many accents as possible to $encoding, starting from the innermost accent.
$format_accent is a function reference that is used to format the accent commands if there is no encoded character available at some point of the conversion of the $stack. $converter is a converter object optionally used by $format_accent. It may be undef if $format_accent does not need a converter object.
If $set_case is positive, the result is upper-cased, while if it is negative, the result is lower-cased.
Return the string width, taking into account the fact that some characters have a zero width (like composing accents) while some have a width of 2 (most Chinese characters, for example).
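To make the width rule concrete, here is a minimal sketch based on Perl's built-in Unicode properties. It assumes that every mark counts as zero width and that characters with East Asian Width "Wide" or "Fullwidth" count as 2; the function name approximate_width is hypothetical, and the module's own classification may differ.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Texinfo::Convert::Unicode.
sub approximate_width {
    my ($string) = @_;
    my $width = 0;
    foreach my $char (split //, $string) {
        if ($char =~ /\p{Mark}/) {
            # Combining marks take no column of their own.
            next;
        } elsif ($char =~ /[\p{East_Asian_Width=Wide}\p{East_Asian_Width=Fullwidth}]/) {
            # Wide and fullwidth characters take two columns.
            $width += 2;
        } else {
            $width += 1;
        }
    }
    return $width;
}
```

For instance, "e" followed by U+0301 (combining acute accent) counts as one column, and each CJK ideograph counts as two.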
$text is the text appearing within an accent command. $accent_command should be a Texinfo tree element corresponding to an accent command taking an argument. The function returns the Unicode representation of the accented character.
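One way to picture the result is composition of a base letter with a combining diacritic. The sketch below uses the core Unicode::Normalize module for that; it is an illustration, not the module's implementation. As a simplification it takes the accent command name directly instead of a Texinfo tree element, and both compose_accent and the small mapping table are hypothetical names introduced here.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFC);

# Hypothetical table: a few Texinfo accent command names and the
# combining character each one corresponds to.
my %combining_for_accent = (
    "'" => 0x0301,    # acute
    '`' => 0x0300,    # grave
    '^' => 0x0302,    # circumflex
    '"' => 0x0308,    # diaeresis
    '~' => 0x0303,    # tilde
);

# Hypothetical helper, not part of Texinfo::Convert::Unicode.
sub compose_accent {
    my ($text, $accent_name) = @_;
    my $combining = $combining_for_accent{$accent_name};

    # Unknown accent: return the text unchanged.
    return $text unless defined $combining;

    # Append the combining character and compose to a single code
    # point when a precomposed form exists.
    return NFC($text . chr($combining));
}
```

Here compose_accent('e', "'") yields the single character U+00E9.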
Return true if the $unicode_point will be encoded in the encoding $encoding. The $unicode_point should be specified as a four-letter string representing a hexadecimal number, with letters in upper case (such as 201D). When the encoding does not cover the whole Unicode range, tables are used to determine whether the $unicode_point will be encoded. If the encoding is not supported in Texinfo, the result will always be false.
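The module relies on its own tables for this test. A different technique that answers a similar question is to ask Perl's core Encode module whether the character can be encoded at all. The sketch below does that; point_encodable_in is a hypothetical name, and its notion of a supported encoding (anything Encode knows) is broader than the set of encodings supported in Texinfo.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper, not part of Texinfo::Convert::Unicode.
sub point_encodable_in {
    my ($encoding, $unicode_point) = @_;

    # Unknown encoding: always false.
    return 0 unless defined Encode::find_encoding($encoding);

    my $char = chr(hex($unicode_point));

    # FB_CROAK makes encode() die on an unmappable character;
    # LEAVE_SRC keeps $char untouched.
    my $ok = eval {
        Encode::encode($encoding, $char,
                       Encode::FB_CROAK() | Encode::LEAVE_SRC());
        1;
    };
    return $ok ? 1 : 0;
}
```

With this sketch, 00E9 is encodable in iso-8859-1 while 201D is not; both are encodable in utf-8.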
Return $text with dashes and quotes (corresponding, for example, to --- or ') represented as Unicode code points. If $in_code is set, the text is considered to be in code style.
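A minimal sketch of such a substitution follows. It assumes the usual Texinfo input conventions (--- for an em dash, -- for an en dash, doubled and single quote characters for curly quotes) and assumes that code style leaves the text untouched; the module's exact handling of code style may differ. The name sketch_unicode_text is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Texinfo::Convert::Unicode.
sub sketch_unicode_text {
    my ($text, $in_code) = @_;

    # Assumption: in code style the text is returned as is.
    return $text if $in_code;

    # Longer sequences are replaced first so that '---' is not
    # consumed as '--' followed by '-'.
    $text =~ s/---/\x{2014}/g;    # em dash
    $text =~ s/--/\x{2013}/g;     # en dash
    $text =~ s/``/\x{201C}/g;     # left double quotation mark
    $text =~ s/''/\x{201D}/g;     # right double quotation mark
    $text =~ s/`/\x{2018}/g;      # left single quotation mark
    $text =~ s/'/\x{2019}/g;      # right single quotation mark
    return $text;
}
```

The order of the substitutions matters: handling the doubled forms before the single ones keeps `` and '' from being turned into two single quotation marks.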
Copyright 2010- Free Software Foundation, Inc. See the source file for all copyright years.
This library is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version.