GNU Source-highlight, given a source file, produces a document with syntax highlighting.
This is Edition 3.0 of the Source-highlight Library manual.
This file documents GNU Source-highlight Library version 3.0.
This manual is for GNU Source-highlight Library (version 3.0, 10 May 2009), which, given a source file, produces a document with syntax highlighting.
Copyright © 2005-2008 Lorenzo Bettini, http://www.lorenzobettini.it.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.1 or any later version published by the Free Software Foundation; with no Invariant Sections, with the Front-Cover Texts being “A GNU Manual,” and with the Back-Cover Texts as in (a) below. A copy of the license is included in the section entitled “GNU Free Documentation License.”
(a) The FSF's Back-Cover Text is: “You have freedom to copy and modify this GNU Manual, like GNU software. Copies published by the Free Software Foundation raise funds for GNU development.”
See the Introduction of the GNU Source-highlight manual for a wider introduction to GNU Source-highlight.
This file documents the library provided by GNU Source-highlight; its audience is therefore programmers who want to use Source-highlight features inside their own programs, not the users of the source-highlight program. The library has been part of GNU Source-highlight since version 3.0.
The main principles of GNU Source-highlight are taken for granted here, together with all the notions needed for writing language definition files, output definition files, and so on. For all these features we refer to the documentation of GNU Source-highlight.
The GNU Source-highlight library is part of GNU Source-highlight, so it is installed together with Source-highlight itself; see Installation in the GNU Source-highlight manual for further instructions on installing GNU Source-highlight. Here we detail only the parts concerning the library.
If you want to build and install the API documentation of the Source-highlight library, run configure with the option --with-doxygen; note that you need the program Doxygen, http://www.doxygen.org, to build the documentation.
The documentation and the examples will be installed in the following directories:
Library API documentation: prefix/share/doc/source-highlight/api
Library examples: prefix/share/doc/source-highlight/examples
You can use the GNU Source-highlight library in your programs by including its headers and linking to the file libsource-highlight.ext [1].
All the classes of the library are part of the namespace srchilite, and all the header files are in the subdirectory srchilite.
The easiest way to use the GNU Source-highlight library in your program is to rely on autotools, i.e., Automake, Autoconf, etc. In particular, the library is installed with a pkg-config [2] configuration file (metadata file), source-highlight.pc.
pkg-config is a tool that helps when compiling applications and libraries. It inserts the correct compiler options on the command line, so an application can use the Source-highlight library simply by running
g++ -o test test.cpp `pkg-config --libs --cflags source-highlight`
rather than hard-coding the location of the library. Moreover, this also provides the correct compiler flags and libraries used by the Source-highlight library itself, e.g., the Boost Regex library.
Note that pkg-config searches for .pc files in its standard directories. If you installed the library in a non-standard directory, you'll need to set the PKG_CONFIG_PATH environment variable accordingly.
For instance, if I install the library into /usr/local/lib, the .pc file will be installed into /usr/local/lib/pkgconfig, and then I'll need to call pkg-config as follows:
PKG_CONFIG_PATH=/usr/local/lib/pkgconfig \
  pkg-config --libs --cflags source-highlight
In your configure.ac you can use the autoconf macro provided by pkg-config; here is an example:
# Checks for libraries.
PKG_CHECK_MODULES(SRCHILITE, [source-highlight >= 3.0])
AC_SUBST(SRCHILITE_CFLAGS)
AC_SUBST(SRCHILITE_LIBS)
Then, you can use the variables SRCHILITE_CFLAGS and SRCHILITE_LIBS in your makefiles accordingly. For instance,
...
AM_CPPFLAGS = $(SRCHILITE_CFLAGS)
...
LDADD = $(SRCHILITE_LIBS)
...
Here we present the main classes of the Source-highlight library, together with some examples of use. For the documentation of all the classes (and of their methods) we refer to the generated API documentation (see Installation).
You will note that methods and constructors of the library classes often take neither a pointer nor a reference to a class, say MyClass, but an object of type MyClassPtr; these are shared pointers, in particular the ones provided by the Boost libraries (they are typedefs using, e.g., boost::shared_ptr<MyClass>). This avoids dangerous dangling pointers and possible memory leaks in the library.
If, on the contrary, a method or a constructor in a class of the library takes a standard pointer, say MyClass *, then that class will NEVER delete such a pointer. It is up to the actual owner of the MyClass object to delete it when it is no longer needed.
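The two ownership conventions above can be sketched with stand-in types. Everything in the following block is hypothetical (MyClass, SharingConsumer, BorrowingConsumer and demoOwnership are not part of srchilite), and std::shared_ptr is used in place of boost::shared_ptr only so that the sketch compiles without Boost.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-in for a library class; not part of srchilite.
struct MyClass {
    static int alive;            // number of live MyClass objects
    std::string name;
    explicit MyClass(const std::string &n) : name(n) { ++alive; }
    ~MyClass() { --alive; }
};
int MyClass::alive = 0;

// The library uses boost::shared_ptr; std::shared_ptr is an assumption
// made here so that the sketch has no external dependency.
typedef std::shared_ptr<MyClass> MyClassPtr;

// Convention 1: a consumer that takes a shared pointer shares ownership.
class SharingConsumer {
    MyClassPtr held;
public:
    explicit SharingConsumer(MyClassPtr p) : held(p) {}
    const std::string &name() const { return held->name; }
};

// Convention 2: a consumer that takes a raw pointer never deletes it.
class BorrowingConsumer {
    MyClass *borrowed;
public:
    explicit BorrowingConsumer(MyClass *p) : borrowed(p) {}
    const std::string &name() const { return borrowed->name; }
};

// Exercises both conventions; returns 0 when nothing leaked and no
// object was deleted behind its owner's back, -1 on a failed check.
int demoOwnership() {
    {
        MyClassPtr shared(new MyClass("shared"));
        SharingConsumer a(shared);
        shared.reset();          // the consumer still keeps the object alive
        if (a.name() != "shared") return -1;
    }                            // last owner gone: object destroyed here

    MyClass owned("owned");      // the caller owns this object
    {
        BorrowingConsumer b(&owned);
        if (b.name() != "owned") return -1;
    }                            // b is destroyed, "owned" is untouched
    return MyClass::alive - 1;   // subtract "owned", which is still in scope
}
```

With a shared pointer the caller may drop its own reference at any time; with a raw pointer the caller must keep the object alive for as long as the consumer uses it.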
The classes of the library can raise exceptions if errors are encountered (e.g., an input file cannot be opened, or a language definition file cannot be parsed); the exception classes can be found in the API documentation, and all of them inherit from the std::exception class.
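Since every exception class inherits from std::exception, a single handler is enough to report any failure. The following is a minimal sketch of that idea; HypotheticalParseError and parseDefinition are stand-ins invented for the illustration and are not names from the library.

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for one of the library's exception classes.
struct HypotheticalParseError : public std::runtime_error {
    explicit HypotheticalParseError(const std::string &msg)
        : std::runtime_error(msg) {}
};

// Hypothetical operation that fails the way a library call might.
void parseDefinition(const std::string &fileName) {
    if (fileName.empty())
        throw HypotheticalParseError("cannot parse language definition");
}

// Runs the operation and converts any failure into an error message;
// an empty string means success.
std::string tryParse(const std::string &fileName) {
    try {
        parseDefinition(fileName);
    } catch (const std::exception &e) {
        // catches every class derived from std::exception
        return e.what();
    }
    return "";
}
```

In a real program the try block would wrap the calls into the library, and more specific handlers could be placed before the generic one.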
The SourceHighlight class is the class of the library that basically implements all the functionalities used by the program source-highlight itself; thus it highlights an input file, generating an output file. It can be configured with many options: basically, it has get/set methods for all the command line options of source-highlight (see also Invoking source-highlight).
For instance, the following example (source-highlight-console-main.cpp) highlights an input file to the console; the colors are obtained through ANSI color escape sequences, so you need a console program that supports them:
#include <iostream>
#include "srchilite/sourcehighlight.h"
#include "srchilite/langmap.h"

using namespace std;

#ifndef DATADIR
#define DATADIR ""
#endif

int main(int argc, char *argv[]) {
    // we highlight to the console, through ANSI escape sequences
    srchilite::SourceHighlight sourceHighlight("esc.outlang");

    // make sure we find the .lang and .outlang files
    sourceHighlight.setDataDir(DATADIR);

    // by default we highlight C++ code
    string inputLang = "cpp.lang";

    if (argc > 1) {
        // we have a file name so we detect the input source language
        srchilite::LangMap langMap(DATADIR, "lang.map");
        string lang = langMap.getMappedFileNameFromFileName(argv[1]);
        if (lang != "") {
            inputLang = lang;
        } // otherwise we default to C++

        // output file name is empty => cout
        sourceHighlight.highlight(argv[1], "", inputLang);
    } else {
        // input file name is empty => cin
        sourceHighlight.highlight("", "", inputLang);
    }

    return 0;
}
Note that if a file name is passed at the command line, the program tries to detect the source language by using a LangMap class object, specifying the map file lang.map, which is the one mapping file extensions to language definition files (e.g., if the file name has extension .java it will use the corresponding java.lang). Otherwise we assume that we want to highlight a C++ file.
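The idea of mapping a file extension to a language definition file can be sketched as follows. This is only an illustration with stand-in code: makeSampleMap and langFileFor are hypothetical helpers, the two entries are sample data, and the LangMap class in the library reads its entries from the map file and may apply further rules.

```cpp
#include <map>
#include <string>

// Stand-in for the contents of a map file; these two entries are
// sample data chosen for the illustration.
std::map<std::string, std::string> makeSampleMap() {
    std::map<std::string, std::string> m;
    m["java"] = "java.lang";
    m["cpp"] = "cpp.lang";
    return m;
}

// Returns the language definition file for fileName, or "" when the
// extension is missing or unknown.
std::string langFileFor(const std::map<std::string, std::string> &m,
                        const std::string &fileName) {
    std::string::size_type dot = fileName.rfind('.');
    if (dot == std::string::npos || dot + 1 == fileName.size())
        return "";
    std::map<std::string, std::string>::const_iterator it =
        m.find(fileName.substr(dot + 1));
    return it == m.end() ? "" : it->second;
}
```

An empty result plays the same role as the empty string checked in the example program: the caller then falls back to a default language.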
All the highlighting is performed by the highlight method; since we don't specify an output file name, it will output the highlighted result directly to the console. In case we don't have an input file name either, the highlight method will read from the standard input. Since the highlighting takes place one line at a time, you can test the program this way: enter a line on the console and, when you press enter, the program will echo the same line highlighted.
Setting DATADIR is not even mandatory, provided you installed Source-highlight correctly, or you configured it using the source-highlight-settings program.
The formatting performed by the Source-highlight library, i.e., what actually happens when something needs to be highlighted, can be completely customized: the library detects (using regular expressions based on language definition files) that something must be highlighted as, say, a keyword, and you can then do whatever you want with this information. The default formatting strategy is to output the highlighted text in a specific output format, but you are free to do anything else.
This formatting abstraction is provided by the Formatter class, which basically declares only the abstract method format; this method takes as parameters the string to format and further (possibly empty) additional parameters, represented by the FormatterParams class. Note that the format method is not told how the passed string must be formatted (e.g., as a keyword, as a type, etc.); this information must be stored in the formatter from the start. Indeed, the mapping between a language element and a formatter is performed by the FormatterManager class. An object of this class must be created by specifying a default formatter object, which is used when the formatter manager is queried for a formatter for a language element that it is not able to handle (in this case it falls back to returning the default formatter).
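The fallback behavior of a formatter manager can be sketched with stand-in classes. SketchFormatter, SketchFormatterManager and formatAs are hypothetical names and not the library's classes; std::shared_ptr replaces boost::shared_ptr, and format returns a string here only to make the result easy to inspect.

```cpp
#include <map>
#include <memory>
#include <string>

// Stand-in formatter: it remembers its language element from
// construction, because format() is only given the text.
class SketchFormatter {
    std::string elem;
public:
    explicit SketchFormatter(const std::string &e) : elem(e) {}
    std::string format(const std::string &s) const {
        return elem + ": " + s;
    }
};
typedef std::shared_ptr<SketchFormatter> SketchFormatterPtr;

// Stand-in manager: maps element names to formatters and falls back to
// the default formatter for elements it does not know.
class SketchFormatterManager {
    SketchFormatterPtr defaultFormatter;
    std::map<std::string, SketchFormatterPtr> formatters;
public:
    explicit SketchFormatterManager(SketchFormatterPtr def)
        : defaultFormatter(def) {}
    void addFormatter(const std::string &elem, SketchFormatterPtr f) {
        formatters[elem] = f;
    }
    SketchFormatterPtr getFormatter(const std::string &elem) const {
        std::map<std::string, SketchFormatterPtr>::const_iterator it =
            formatters.find(elem);
        return it == formatters.end() ? defaultFormatter : it->second;
    }
};

// Formats text as elem using a manager that only knows "keyword".
std::string formatAs(const std::string &elem, const std::string &text) {
    SketchFormatterManager manager(
        SketchFormatterPtr(new SketchFormatter("normal")));
    manager.addFormatter("keyword",
        SketchFormatterPtr(new SketchFormatter("keyword")));
    return manager.getFormatter(elem)->format(text);
}
```

Here a request for "comment" is served by the default formatter because no formatter was registered for that element.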
For instance, this is a customized formatter (infoformatter.h) which, when requested to format a string, simply writes the string together with the kind of language element it represents and the position in the line (the start field of the FormatterParams class). Note that the language element is stored in a field of the class, and it is set at object creation time. We avoid writing anything if we are requested to format something as "normal", or if the string to format is empty.
#include <iostream>
#include <string>
#include <boost/shared_ptr.hpp>

#include "srchilite/formatter.h"
#include "srchilite/formatterparams.h"

class InfoFormatter: public srchilite::Formatter {
    /// the language element represented by this formatter
    std::string elem;

public:
    InfoFormatter(const std::string &elem_ = "normal") :
        elem(elem_) {
    }

    virtual void format(const std::string &s,
            const srchilite::FormatterParams *params = 0) {
        // do not print anything if normal or string to format is empty
        if (elem != "normal" && !s.empty()) {
            std::cout << elem << ": " << s;
            if (params)
                std::cout << ", start: " << params->start;
            std::cout << std::endl;
        }
    }
};

/// shared pointer for InfoFormatter
typedef boost::shared_ptr<InfoFormatter> InfoFormatterPtr;
For convenience we also declare a typedef for the shared pointer (since the formatter manager takes only shared pointers to formatters).
In order to customize the formatting, there are some more steps to take; in particular, you cannot use the SourceHighlight class anymore, and you need to use several other classes.
First of all, you need the LangDefManager class, which takes care of building the regular expressions starting from a language definition file; in order to do this it uses a HighlightRuleFactory class object. For the moment, only the implementation based on Boost regular expressions exists, so you can simply pass an object of the RegexRuleFactory class. Once you have an object of the LangDefManager class, you can use the getHighlightState method to build the automaton that performs the highlighting (in particular the initial state of such automaton, of the HighlightState class), and you should pass this to an object that can use the automaton to perform the highlighting. To do this, you can use the SourceHighlighter class, whose objects can be used to highlight a line of text through the highlightParagraph method.
You can then create a FormatterManager class object, populate it with your formatters, and set it in the SourceHighlighter class object. The following example (infoformatter-main.cpp) shows how to perform these steps; note that we can share the same formatter for different language elements:
#include <iostream>
#include "srchilite/langdefmanager.h"
#include "srchilite/regexrulefactory.h"
#include "srchilite/sourcehighlighter.h"
#include "srchilite/formattermanager.h"
#include "infoformatter.h"

using namespace std;

#ifndef DATADIR
#define DATADIR ""
#endif

int main() {
    srchilite::RegexRuleFactory ruleFactory;
    srchilite::LangDefManager langDefManager(&ruleFactory);

    // we highlight C++ code for simplicity
    srchilite::SourceHighlighter highlighter(langDefManager.getHighlightState(
            DATADIR, "cpp.lang"));

    srchilite::FormatterManager formatterManager(InfoFormatterPtr(
            new InfoFormatter));
    InfoFormatterPtr keywordFormatter(new InfoFormatter("keyword"));

    formatterManager.addFormatter("keyword", keywordFormatter);
    formatterManager.addFormatter("string", InfoFormatterPtr(new InfoFormatter(
            "string")));
    // for "type" we use the same formatter as for "keyword"
    formatterManager.addFormatter("type", keywordFormatter);
    formatterManager.addFormatter("comment", InfoFormatterPtr(
            new InfoFormatter("comment")));
    formatterManager.addFormatter("symbol", InfoFormatterPtr(new InfoFormatter(
            "symbol")));
    formatterManager.addFormatter("number", InfoFormatterPtr(new InfoFormatter(
            "number")));
    formatterManager.addFormatter("preproc", InfoFormatterPtr(
            new InfoFormatter("preproc")));
    highlighter.setFormatterManager(&formatterManager);

    // make sure it uses additional information
    srchilite::FormatterParams params;
    highlighter.setFormatterParams(&params);

    string line;
    // we now highlight a line a time
    while (getline(cin, line)) {
        // reset position counter within a line
        params.start = 0;
        highlighter.highlightParagraph(line);
    }

    return 0;
}
Note that, since we highlight one line at a time, we must reset the start field each time we start to examine a new line.
For simplicity this example highlights only C++ code and reads directly from the standard input and writes to the standard output. This is a run of the example reading from the standard input (so each time you insert a line you get the output of your formatters):
// this is a comment
comment: //, start: 0
comment: this is a comment, start: 2
#include <foobar.h>
preproc: #include, start: 0
string: <foobar.h>, start: 9
int abc = 100 + 5;
keyword: int, start: 0
symbol: =, start: 8
number: 100, start: 10
symbol: +, start: 14
number: 5, start: 16
symbol: ;, start: 17
During the highlighting (and regular expression matching) the library generates events that can be “listened to” by using a customized event listener. An event is represented by an object of the HighlightEvent class, which stores the HighlightToken class object and the type (a HighlightEventType enum) of the event.
A customized listener can be implemented by deriving from the HighlightEventListener class and by defining the virtual method notify, which, of course, takes a HighlightEvent class object as parameter.
For instance, source-highlight implements the debugging functionalities by using a customized listener, the DebugListener class, whose method implementation we report here as an example:
void DebugListener::notify(const HighlightEvent &event) {
    switch (event.type) {
    case HighlightEvent::FORMAT:
        // print information about the rule
        if (event.token.rule) {
            os << event.token.rule->getAdditionalInfo() << endl;
            os << "expression: \"" << event.token.rule->toString() << "\""
                    << endl;
        }

        // now format the matched strings
        for (MatchedElements::const_iterator it = event.token.matched.begin();
                it != event.token.matched.end(); ++it) {
            os << "formatting \"" << it->second << "\" as " << it->first
                    << endl;
        }
        step();
        break;
    case HighlightEvent::FORMATDEFAULT:
        os << "formatting \"" << event.token.matched.front().second
                << "\" as default" << endl;
        step();
        break;
    case HighlightEvent::ENTERSTATE:
        os << "entering state: " << event.token.rule->getNextState()->getId()
                << endl;
        break;
    case HighlightEvent::EXITSTATE:
        int level = event.token.rule->getExitLevel();
        os << "exiting state, level: ";
        if (level < 0)
            os << "all";
        else
            os << level;
        os << endl;
        break;
    }
}
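The listener mechanism itself can be sketched independently of the library. In the following block all the types are hypothetical stand-ins (SketchEvent, SketchListener, CountingListener and SketchSource are not srchilite classes); only the names of the four event types are taken from the listing above.

```cpp
#include <string>
#include <vector>

// Stand-in event: only the event type and the matched text are kept.
struct SketchEvent {
    enum Type { FORMAT, FORMATDEFAULT, ENTERSTATE, EXITSTATE };
    Type type;
    std::string text;
    SketchEvent(Type t, const std::string &s) : type(t), text(s) {}
};

// Stand-in listener interface with a single virtual notify method.
class SketchListener {
public:
    virtual ~SketchListener() {}
    virtual void notify(const SketchEvent &event) = 0;
};

// A listener that counts how many strings were formatted.
class CountingListener : public SketchListener {
public:
    int formatted;
    CountingListener() : formatted(0) {}
    virtual void notify(const SketchEvent &event) {
        if (event.type == SketchEvent::FORMAT ||
            event.type == SketchEvent::FORMATDEFAULT)
            ++formatted;
    }
};

// Stand-in event source that forwards every event to its listeners.
class SketchSource {
    std::vector<SketchListener *> listeners;  // not owned, never deleted
public:
    void addListener(SketchListener *l) { listeners.push_back(l); }
    void fire(const SketchEvent &event) {
        typedef std::vector<SketchListener *>::size_type Index;
        for (Index i = 0; i < listeners.size(); ++i)
            listeners[i]->notify(event);
    }
};

// Fires three events and returns how many were formatting events.
int countFormatEvents() {
    CountingListener counter;
    SketchSource source;
    source.addListener(&counter);
    source.fire(SketchEvent(SketchEvent::FORMAT, "int"));
    source.fire(SketchEvent(SketchEvent::ENTERSTATE, "/*"));
    source.fire(SketchEvent(SketchEvent::FORMATDEFAULT, " abc "));
    return counter.formatted;
}
```

A listener written this way only observes the highlighting; it does not change what gets formatted, which is why it suits debugging and statistics.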
If you find a bug in source-highlight, please send electronic mail to
bug-source-highlight at gnu dot org
Include the version number, which you can find by running ‘source-highlight --version’. Also include in your message the output that the program produced and the output you expected.
If you have other questions, comments or suggestions about source-highlight, contact the author via electronic mail (find the address at http://www.lorenzobettini.it). The author will try to help you out, although he may not have time to fix your problems.
The following mailing lists are available:
help-source-highlight at gnu dot org
for generic discussions about the program and for asking for help about it (open mailing list), http://mail.gnu.org/mailman/listinfo/help-source-highlight
info-source-highlight at gnu dot org
for receiving information about new releases and features (read-only mailing list), http://mail.gnu.org/mailman/listinfo/info-source-highlight.
If you want to subscribe to a mailing list just go to the URL and follow the instructions, or send me an e-mail and I'll subscribe you.
I'll describe new features in new releases also in my blog, at this URL:
http://tronprog.blogspot.com/search/label/source-highlight
--with-doxygen: Installation
DebugListener class: Events and Listeners
format method: Customizing Formatting
Formatter class: Customizing Formatting
FormatterManager class: Customizing Formatting
FormatterParams class: Customizing Formatting
getHighlightState method: Customizing Formatting
highlight method: SourceHighlight class
HighlightEvent class: Events and Listeners
HighlightEventListener class: Events and Listeners
HighlightEventType enum: Events and Listeners
highlightParagraph method: Customizing Formatting
HighlightRuleFactory class: Customizing Formatting
HighlightState class: Customizing Formatting
HighlightToken class: Events and Listeners
LangDefManager class: Customizing Formatting
LangMap class: SourceHighlight class
notify method: Events and Listeners
PKG_CONFIG_PATH: Using Automake and Autotools
RegexRuleFactory class: Customizing Formatting
source-highlight: Events and Listeners
source-highlight: SourceHighlight class
source-highlight-settings: SourceHighlight class
SourceHighlight class: Customizing Formatting
SourceHighlight class: SourceHighlight class
SourceHighlighter class: Customizing Formatting
start field: Customizing Formatting
std::exception class: Main Classes
[1] The extension of course depends on the library being shared or static, e.g., .so, .la, .a, and on the system.
[2] http://pkg-config.freedesktop.org.