This manual is for GNU libextractor (version 1.0.0, 5 September 2012).
GNU libextractor is a GNU package.
Copyright © 2007, 2010, 2012 Christian Grothoff
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled "GNU Free Documentation License".
GNU libextractor is GNU's library for extracting meta data from files. Meta data includes format information (such as mime type, image dimensions, color depth, recording frequency), content descriptions (such as document title or document description) and copyright information (such as license, author and contributors). Meta data extraction is an inherently uncertain business: a parse error can indicate a corrupt file, an incompatible file format version, an entirely different file format or a bug in the parser. Because of this uncertainty, GNU libextractor deliberately never reports errors. Unexpected file contents simply result in less, or possibly no, meta data being extracted.
GNU libextractor uses plugins to handle various file formats. Technically a plugin can support multiple file formats; however, most plugins only support one particular format. By default, GNU libextractor will use all plugins that are available and found in the plugin installation directory. Applications can request the use of only specific plugins or the exclusion of certain plugins.
GNU libextractor is distributed with the extract command, which is a command-line tool for extracting meta data. extract is given a list of filenames and prints the resulting meta data to the console. The extract source code also serves as an advanced example of how to use GNU libextractor.
This manual focuses on providing documentation for writing software with GNU libextractor. The only relevant parts for end-users are the chapter on compiling and installing GNU libextractor (See Preparation.). Also, the chapter on existing plugins may be of interest (See Existing Plugins.). Additional documentation for end-users can be found in the man page for extract (using man extract).
GNU libextractor is licensed under the GNU General Public License. Specifically, since version 0.7, GNU libextractor is licensed under GPLv3 or any later version.
This chapter first describes the general build instructions that should apply to all systems. Specific instructions for known problems for particular platforms are then described in individual sections afterwards.
Compiling GNU libextractor follows the standard GNU autotools build process using configure and make. For details on the GNU autotools build process, read the INSTALL file and query ./configure --help for additional options.
GNU libextractor has various dependencies, most of which are optional. Instead of specifying the names of the software packages, we will give the list in terms of the names of the respective Debian (unstable) packages that should be installed.
You absolutely need:
Recommended dependencies are:
For Subversion access and compilation one also needs:
Please notify us if we missed some dependencies (note that the list is supposed to only list direct dependencies, not transitive dependencies).
Once you have compiled and installed GNU libextractor, you should have a file extractor.h installed in your include/ directory. This file should be the starting point for your C and C++ development with GNU libextractor. The build process also installs the extract binary and man pages for extract and GNU libextractor. The extract man page documents the extract tool. The GNU libextractor man page gives a brief summary of the C API for GNU libextractor.
When you install GNU libextractor, various plugins will be installed in the lib/libextractor/ directory. The main library will be installed as lib/libextractor.so. Note that GNU libextractor will attempt to find the plugins relative to the path of the main library. Consequently, a package manager can move the library and its plugins to a different location later — as long as the relative path between the main library and the plugins is preserved. As a method of last resort, the user can specify an environment variable LIBEXTRACTOR_PREFIX. If GNU libextractor cannot locate a plugin, it will look in LIBEXTRACTOR_PREFIX/lib/libextractor/.
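The last-resort fallback described above comes down to simple path construction. The following is a minimal sketch of that step only, not the library's actual lookup code. The helper name sketch_plugin_dir is hypothetical; only the LIBEXTRACTOR_PREFIX variable and the lib/libextractor/ suffix are taken from the text.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (NOT part of the libextractor API): writes
   "<prefix>/lib/libextractor/" into buf.  If prefix is NULL, the value
   of the LIBEXTRACTOR_PREFIX environment variable is used instead.
   Returns 0 on success, -1 if no prefix is known or buf is too small. */
int
sketch_plugin_dir (const char *prefix, char *buf, size_t buf_size)
{
  int n;

  if (NULL == prefix)
    prefix = getenv ("LIBEXTRACTOR_PREFIX");
  if (NULL == prefix)
    return -1;                  /* no fallback location configured */
  n = snprintf (buf, buf_size, "%s/lib/libextractor/", prefix);
  if ( (n < 0) || ((size_t) n >= buf_size) )
    return -1;                  /* encoding error or path truncated */
  return 0;
}
```

Calling sketch_plugin_dir (NULL, buf, sizeof (buf)) after exporting LIBEXTRACTOR_PREFIX=/opt/lx would fill buf with /opt/lx/lib/libextractor/.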
Should work using the standard instructions without problems.
Should work using the standard instructions without problems.
OpenBSD 3.8 also doesn't have CODESET in langinfo.h. CODESET is used in GNU libextractor in about three places. This causes problems during compilation.
No reports so far.
Linking -lstdc++ with the provided libtool fails on Cygwin. This is a problem with libtool: unfortunately, there is no flag to tell libtool how to do its job on Cygwin, and it seems that setting the library check to 'pass_all' cannot be made the default. Patching libtool may help.
Note: this is a rather dated report and may no longer apply.
libextractor has two installation methods on Mac OS X: it can be installed as a Mac OS X framework or with the standard ./configure; make; make install shell commands. The framework package is self-contained, but currently omits some of the extractor plugins that can be compiled in if libextractor is installed with ./configure; make; make install (provided that the required dependencies exist.)
The binary framework is distributed as a disk image (Extractor-x.x.xx.dmg). Installation is done by opening the disk image and clicking Extractor.pkg inside it. The Mac OS X installer application will then run. The framework is installed to the root volume's /Library/Frameworks folder and installing will require admin privileges.
The framework can be uninstalled by dragging /Library/Frameworks/Extractor.framework to the Trash.
In the framework, the extract command line tool can be found at /Library/Frameworks/Extractor.framework/Versions/Current/bin/extract
The framework can be used in software projects as a framework or as a dynamic library.
When using the framework as a dynamic library in projects using autotools, one would most likely want to add "-I/Library/Frameworks/Extractor.framework/Versions/Current/include" to CPPFLAGS and "-L/Library/Frameworks/Extractor.framework/Versions/Current/lib" to LDFLAGS.
// hello.c
#include <Extractor/extractor.h>

int
main (int argc, char **argv)
{
  struct EXTRACTOR_PluginList *el;

  el = EXTRACTOR_plugin_add_defaults (EXTRACTOR_OPTION_DEFAULT_POLICY);
  // ...
  EXTRACTOR_plugin_remove_all (el);
  return 0;
}
You can then compile the example using
$ gcc -o hello hello.c -framework Extractor
// hello.c
#include <extractor.h>

int
main ()
{
  struct EXTRACTOR_PluginList *el;

  el = EXTRACTOR_plugin_add_defaults (EXTRACTOR_OPTION_DEFAULT_POLICY);
  // ...
  EXTRACTOR_plugin_remove_all (el);
  return 0;
}
You can then compile the example using
$ gcc -I/Library/Frameworks/Extractor.framework/Versions/Current/include \
      -o hello hello.c \
      -L/Library/Frameworks/Extractor.framework/Versions/Current/lib \
      -lextractor
Notice the difference in the #include line.
The suggested way to package GNU libextractor is to split it into roughly the following binary packages:
This would enable minimal installations (e.g. for embedded systems) to not include any plugins, as well as moderate-size installations (that do not trigger GTK and X11) for systems that have limited resources. Right now, the MP4 plugin is experimental, does nothing, and should thus never be included at all. The gstreamer plugin is experimental but largely works with the correct version of gstreamer; it can thus be packaged (especially if the dependency is available on the target system) but should probably not be part of libextractor-plugins-all.
The extract command takes a list of file names as arguments, extracts meta data from each of those files and prints the result to the console. By default, extract will use all available plugins and print all (non-binary) meta data that is found.
The set of plugins used by extract can be controlled using the “-l” and “-n” options. Use “-n” to not load all of the default plugins. Use “-l NAME” to specifically load a certain plugin. For example, specify “-n -l mime” to only use the MIME plugin.
Using the “-p” option the output of extract can be limited to only certain keyword types. Similarly, using the “-x” option, certain keyword types can be excluded. A list of all known keyword types can be obtained using the “-L” option.
The output format of extract can be influenced with the “-V” (more verbose, lists filenames), “-g” (grep-friendly, all meta data on a single line per file) and “-b” (bibTeX style) options.
$ extract test/test.jpg
comment - (C) 2001 by Christian Grothoff, using gimp 1.2 1
mimetype - image/jpeg

$ extract -V -x comment test/test.jpg
Keywords for file test/test.jpg:
mimetype - image/jpeg

$ extract -p comment test/test.jpg
comment - (C) 2001 by Christian Grothoff, using gimp 1.2 1

$ extract -nV -l png.so -p comment test/test.jpg test/test.png
Keywords for file test/test.jpg:
Keywords for file test/test.png:
comment - Testing keyword extraction
Each public symbol exported by GNU libextractor has the prefix EXTRACTOR_. All-caps names are used for constants. For the impatient, the minimal C code for using GNU libextractor (on the executing binary itself) looks like this:
#include <extractor.h>

int
main (int argc, char ** argv)
{
  struct EXTRACTOR_PluginList *plugins
    = EXTRACTOR_plugin_add_defaults (EXTRACTOR_OPTION_DEFAULT_POLICY);

  EXTRACTOR_extract (plugins, argv[1], NULL, 0,
                     &EXTRACTOR_meta_data_print, stdout);
  EXTRACTOR_plugin_remove_all (plugins);
  return 0;
}
The minimal API illustrated by this example is actually sufficient for many applications. The full external C API of GNU libextractor is described in the chapter on extracting meta data (See Extracting meta data.). Bindings for other languages are described in the chapter on language bindings (See Language bindings.). The API for writing new plugins is described in the chapter on writing new plugins (See Writing new Plugins.).
In order to extract meta data with GNU libextractor you first need to load the respective plugins and then call the extraction API with the plugins and the data to process. This section documents how to load and unload plugins, the various types and formats in which meta data is returned to the application and finally the extraction API itself.
Using GNU libextractor from a multi-threaded parent process requires some care. The problem is that on most platforms GNU libextractor starts sub-processes for the actual extraction work. This is useful to isolate the parent process from potential bugs; however, it can cause problems if the parent process is multi-threaded. The issue is that at the time of the fork, another thread of the application may hold a lock (e.g. in gettext or libc). That lock would then never be released in the child process (as the other thread is not present in the child process). As a result, the child process would deadlock when trying to acquire the lock and never terminate. This has actually been observed with a lock in GNU gettext that is triggered by the plugin startup code when it interacts with libltdl.
The problem can be solved by loading the plugins using the EXTRACTOR_OPTION_IN_PROCESS option, which will run GNU libextractor in-process and thus avoid the locking issue. In this case, all of the functions for loading and unloading plugins, including EXTRACTOR_plugin_add_defaults and EXTRACTOR_plugin_remove_all, are thread-safe and reentrant. However, using the same plugin list from multiple threads at the same time is not safe.
All plugin code is required to be reentrant and state-less, but due to the extensive use of 3rd party libraries this cannot be guaranteed.
A plugin list represents a set of GNU libextractor plugins. Most of the GNU libextractor API is concerned with either constructing a plugin list or using it to extract meta data. The internal representation of the plugin list is of no concern to users or plugin developers.
Unloads a particular plugin. The given name should be the short name of the plugin, for example “mime” for the mime-type extractor or “mpeg” for the MPEG extractor.
Loads a particular plugin. The plugin is added to the existing list, which can be NULL. The second argument specifies the name of the plugin (e.g. “ogg”). The third argument can be NULL and specifies plugin-specific options. Finally, the last argument specifies if the plugin should be executed out-of-process (EXTRACTOR_OPTION_DEFAULT_POLICY) or not.
Loads and unloads plugins based on a configuration string, modifying the existing list, which can be NULL. The string has the format “[-]NAME(OPTIONS){:[-]NAME(OPTIONS)}*”. Prefixing the plugin name with a “-” means that the plugin should be unloaded.
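To make the configuration string format concrete, the following sketch splits such a string into its entries. It is an illustration only and not the parser used by the library: the type struct SketchEntry, the function sketch_next_entry and the fixed buffer sizes are hypothetical. The sketch also assumes that the “(OPTIONS)” part may be omitted and that options contain no “)” character.

```c
#include <stddef.h>
#include <string.h>

/* One entry of a configuration string (illustrative type, NOT library API). */
struct SketchEntry
{
  int unload;          /* 1 if the name was prefixed with '-' */
  char name[32];       /* plugin short name */
  char options[64];    /* text between '(' and ')', or "" if absent */
};

/* Parses the entry starting at *pos and advances *pos past it (and past
   the ':' separator, if any).  Returns 1 if an entry was parsed, 0 at
   the end of the string or on malformed or oversized input. */
int
sketch_next_entry (const char **pos, struct SketchEntry *e)
{
  const char *p = *pos;
  size_t name_len;

  if ('\0' == *p)
    return 0;
  e->unload = ('-' == *p);
  if (e->unload)
    p++;
  name_len = strcspn (p, "(:");           /* name ends at '(' , ':' or NUL */
  if ( (0 == name_len) || (name_len >= sizeof (e->name)) )
    return 0;
  memcpy (e->name, p, name_len);
  e->name[name_len] = '\0';
  p += name_len;
  e->options[0] = '\0';
  if ('(' == *p)
  {
    const char *close = strchr (p, ')');
    size_t opt_len;

    if (NULL == close)
      return 0;                           /* unterminated option list */
    opt_len = (size_t) (close - p - 1);
    if (opt_len >= sizeof (e->options))
      return 0;
    memcpy (e->options, p + 1, opt_len);
    e->options[opt_len] = '\0';
    p = close + 1;
  }
  if (':' == *p)
    p++;                                  /* skip separator */
  else if ('\0' != *p)
    return 0;                             /* unexpected trailing character */
  *pos = p;
  return 1;
}
```

Applied repeatedly to “mime:-ogg(foo)”, the function would first yield the entry “mime” (to be loaded, no options) and then “ogg” (to be unloaded, options “foo”).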
Loads all of the plugins in the plugin directory. This function is what most GNU libextractor applications should use to set up the plugins.
enum EXTRACTOR_MetaType is a C enum which defines a list of over 100 different types of meta data. The total number can differ between different GNU libextractor releases; the maximum value for the current release can be obtained using the EXTRACTOR_metatype_get_max function. All values in this enumeration are of the form EXTRACTOR_METATYPE_XXX.
The function EXTRACTOR_metatype_to_string can be used to obtain a short English string ‘s’ describing the meta data type. The string can be translated into other languages using GNU gettext with the domain set to GNU libextractor (dgettext("libextractor", s)).
The function EXTRACTOR_metatype_to_description can be used to obtain a longer English string ‘s’ describing the meta data type. The description may be empty if the short description returned by EXTRACTOR_metatype_to_string is already comprehensive. The string can be translated into other languages using GNU gettext with the domain set to GNU libextractor (dgettext("libextractor", s)).
enum EXTRACTOR_MetaFormat is a C enum which defines on a high level how the extracted meta data is represented. Currently, the library uses three formats: UTF-8 strings, C strings and binary data. A fourth value, EXTRACTOR_METAFORMAT_UNKNOWN, is defined but not used. UTF-8 strings are 0-terminated strings that have been converted to UTF-8. The format code is EXTRACTOR_METAFORMAT_UTF8. Ideally, most text meta data will be of this format. Some file formats fail to specify the encoding used for the text. In this case, the text cannot be converted to UTF-8. However, the meta data is still known to be 0-terminated and presumably human-readable. In this case, the format code used is EXTRACTOR_METAFORMAT_C_STRING; however, this should not be understood to mean that the encoding is the same as that used by the C compiler. Finally, for binary data (mostly images), the format EXTRACTOR_METAFORMAT_BINARY is used.
Naturally this is not a precise description of the meta format. Plugins can provide a more precise description (if known) by providing the respective mime type of the meta data. For example, binary image meta data could be also tagged as “image/png” and normal text would typically be tagged as “text/plain”.
Type of a function that libextractor calls for each meta data item found.
- cls: closure (user-defined)
- plugin_name: name of the plugin that produced this value; special values can be used (e.g. '<zlib>' for zlib being used in the main libextractor library and yielding meta data)
- type: libextractor-type describing the meta data
- format: basic format information about data
- data_mime_type: mime-type of data (not of the original file); can be NULL (if mime-type is not known)
- data: actual meta-data found
- data_len: number of bytes in data
Return 0 to continue extracting, 1 to abort.
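The continue/abort protocol of the callback can be sketched without the library. In the following illustration, the typedef SketchProcessor only mirrors the parameter list given above (plain ints stand in for the library's enums; the real type is declared in extractor.h), and sketch_deliver is a toy driver that plays the role of the library. All names starting with sketch_ or Sketch are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for the callback type described above.  The real type lives
   in extractor.h; ints replace the enums to keep this self-contained. */
typedef int (*SketchProcessor) (void *cls,
                                const char *plugin_name,
                                int type,
                                int format,
                                const char *data_mime_type,
                                const char *data,
                                size_t data_len);

/* Closure state: stop once 'limit' items have been seen. */
struct SketchCounter
{
  unsigned int seen;
  unsigned int limit;
};

/* Example callback: counts items and returns 1 (abort) once the limit
   is reached, 0 (continue) otherwise. */
int
sketch_count_items (void *cls, const char *plugin_name, int type,
                    int format, const char *data_mime_type,
                    const char *data, size_t data_len)
{
  struct SketchCounter *c = cls;

  (void) plugin_name; (void) type; (void) format;
  (void) data_mime_type; (void) data; (void) data_len;
  c->seen++;
  return (c->seen >= c->limit) ? 1 : 0;
}

/* Toy driver standing in for the library: offers each value to 'proc'
   and stops as soon as 'proc' returns non-zero.  Returns the number of
   calls that were made. */
unsigned int
sketch_deliver (SketchProcessor proc, void *cls,
                const char **values, size_t n)
{
  unsigned int calls = 0;
  size_t i;

  for (i = 0; i < n; i++)
  {
    calls++;
    if (0 != proc (cls, "<sketch>", 0, 0, "text/plain",
                   values[i], strlen (values[i]) + 1))
      break;
  }
  return calls;
}
```

With three values and a limit of two, the driver makes two calls and then stops, because the second call returns 1.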
This is the main function for extracting keywords with GNU libextractor. The first argument is a plugin list which specifies the set of plugins that should be used for extracting meta data. The ‘filename’ argument is optional and can be used to specify the name of a file to process. If ‘filename’ is NULL, then the ‘data’ argument must point to the in-memory data to extract meta data from. If ‘filename’ is non-NULL, ‘data’ can be NULL. If ‘data’ is non-NULL, then ‘size’ is the size of ‘data’ in bytes. Otherwise ‘size’ should be zero. For each meta data item found, GNU libextractor will call the ‘proc’ function, passing ‘proc_cls’ as the first argument to ‘proc’. The other arguments to ‘proc’ depend on the specific meta data found.
Meta data extraction should never really fail: at worst, GNU libextractor should not call ‘proc’ with any meta data. By design, GNU libextractor should never crash or leak memory, even given corrupt files as input. Note, however, that running GNU libextractor on a corrupt file system (or on incorrectly mmapped files) can result in the operating system sending a SIGBUS (bus error) to the process. Although GNU libextractor runs plugins out-of-process, it first maps the file into memory and then attempts to decompress it. During decompression it is possible to encounter a SIGBUS. GNU libextractor will not attempt to catch this signal and your application is likely to crash. Note again that this should only happen if the file system is corrupt (not if individual files are corrupt). If this is not acceptable, you might want to consider running GNU libextractor itself also out-of-process (as done, for example, by doodle).
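The rules for the ‘filename’, ‘data’ and ‘size’ arguments can be summarized as a small consistency check. This is a sketch of the rules as stated above; the function sketch_extract_args_ok is hypothetical and the library itself is not known to expose such a check.

```c
#include <stddef.h>

/* Hypothetical pre-flight check (NOT part of the libextractor API).
   Returns 1 if the argument combination is consistent with the rules
   stated in the text, 0 otherwise. */
int
sketch_extract_args_ok (const char *filename, const void *data, size_t size)
{
  if ( (NULL == filename) && (NULL == data) )
    return 0;                   /* nothing to extract from */
  if ( (NULL == data) && (0 != size) )
    return 0;                   /* 'size' should be zero without 'data' */
  return 1;
}
```

An application could run such a check before handing user-supplied arguments to the extraction function.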
GNU libextractor works immediately with C and C++ code. Bindings for Java, Mono, Ruby, Perl, PHP and Python are available for download from the main GNU libextractor website. Documentation for these bindings (if available) is part of the downloads for the respective binding. In all cases, a full installation of the C library is required before the binding can be installed.
Compiling the GNU libextractor Java binding follows the usual process of running configure and make. The result will be a shared C library libextractor_java.so with the native code and a JAR file (installed to $PREFIX/share/java/libextractor.jar).
A minimal example for using GNU libextractor's Java binding would look like this:
import org.gnu.libextractor.*;
import java.util.ArrayList;

// The class name is arbitrary; any class with a main method will do.
public class ExtractExample {
  public static void main(String[] args) {
    Extractor ex = Extractor.getDefault();
    for (int i = 0; i < args.length; i++) {
      ArrayList keywords = ex.extract(args[i]);
      System.out.println("Keywords for " + args[i] + ":");
      for (int j = 0; j < keywords.size(); j++)
        System.out.println(keywords.get(j));
    }
  }
}
The GNU libextractor library and the libextractor_java.so JNI binding have to be in the library search path for this to work. Furthermore, the libextractor.jar file should be on the classpath.
Note that the API does not use Java 5 style generics in order to work with older versions of Java.
This binding is undocumented at this point.
This binding is undocumented at this point.
This binding is undocumented at this point.
This binding is undocumented at this point.
This binding is undocumented at this point.
This chapter describes various utility functions for GNU libextractor usage. All of the functions are reentrant.
The constant EXTRACTOR_VERSION is a hexadecimal representation of the version number of the installed libextractor header. The hexadecimal format is 0xAABBCCDD where AA is the major version (so far always 0), BB is the minor version, CC is the revision and DD the patch number. For example, for version 0.5.18, we would have AA=0, BB=5, CC=18 and DD=0. Minor releases such as 0.5.18a or significant changes in unreleased versions would be marked with DD=1 or higher.
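The 0xAABBCCDD layout can be unpacked with shifts and masks. The sketch below assumes that each component is stored as a plain byte value; the text does not spell this out, so treat it as an assumption. The type struct SketchVersion, the function sketch_decode_version and the sample value used afterwards are hypothetical.

```c
#include <stdint.h>

/* Components of a 0xAABBCCDD version value (illustrative type). */
struct SketchVersion
{
  unsigned int aa;   /* major version */
  unsigned int bb;   /* minor version */
  unsigned int cc;   /* revision */
  unsigned int dd;   /* patch number */
};

/* Splits a 0xAABBCCDD value into its four byte-sized components.
   Assumption: every component occupies exactly one byte. */
struct SketchVersion
sketch_decode_version (uint32_t v)
{
  struct SketchVersion r;

  r.aa = (v >> 24) & 0xFF;
  r.bb = (v >> 16) & 0xFF;
  r.cc = (v >> 8) & 0xFF;
  r.dd = v & 0xFF;
  return r;
}
```

For the made-up value 0x01020304 this yields AA=1, BB=2, CC=3 and DD=4. Because the components are ordered from most to least significant, two such values can also be compared directly as integers.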
The EXTRACTOR_meta_data_print is a simple function which prints the meta data found with libextractor to a file. The function is mostly useful for debugging and as an example for how to manipulate the keyword list and can be passed as the ‘proc’ argument to EXTRACTOR_extract. The file to print to should be passed as ‘proc_cls’ (which must be of type FILE *), for example stdout.
gzip and bzip2 compressed versions of these formats are also supported (as well as meta data embedded by gzip itself) if zlib or libbz2 are available.
Writing a new plugin for libextractor usually requires writing or interfacing with an actual parser for a specific format. How this can be accomplished depends on the format and cannot be specified in general. However, care should be taken for the code to be reentrant and highly fault-tolerant, especially with respect to malformed inputs.
Plugins should start by verifying that the header of the data matches the specific format and immediately return if that is not the case. Even if the header matches the expected file format, plugins must not assume that the remainder of the file is well formed.
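A defensive parsing style along these lines might look as follows. This is a generic sketch that is independent of the plugin API: the helpers sketch_has_magic and sketch_read_be32 are hypothetical and operate on a plain in-memory buffer. The point is that every access is preceded by a bounds check, so truncated or malformed input results in an early return instead of an out-of-bounds read.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if the buffer starts with the given magic bytes, else 0.
   The size check comes first so that short inputs are never read
   past their end. */
int
sketch_has_magic (const unsigned char *buf, size_t size,
                  const void *magic, size_t magic_len)
{
  return (size >= magic_len) && (0 == memcmp (buf, magic, magic_len));
}

/* Reads a big-endian 32-bit integer at offset 'off'.  Returns 1 and
   stores the value in *out on success; returns 0 if the four bytes
   would not lie entirely within the buffer. */
int
sketch_read_be32 (const unsigned char *buf, size_t size,
                  size_t off, uint32_t *out)
{
  if ( (off > size) || (size - off < 4) )
    return 0;                   /* written to avoid overflow in off + 4 */
  *out = ((uint32_t) buf[off] << 24)
         | ((uint32_t) buf[off + 1] << 16)
         | ((uint32_t) buf[off + 2] << 8)
         | (uint32_t) buf[off + 3];
  return 1;
}
```

A plugin would typically call the first helper once on the file header and then use checked readers like the second one for every length or offset field it encounters.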
The plugin library must be called libextractor_XXX.so, where XXX denotes the file format of the plugin. The library must export a method EXTRACTOR_XXX_extract_method, with the following signature:
void EXTRACTOR_XXX_extract_method (struct EXTRACTOR_ExtractContext *ec);
‘ec’ contains various information the plugin may need for its execution. Most importantly, it contains functions for reading (“read”) and seeking (“seek”) the input data and for returning extracted data (“proc”). The “config” member can contain additional configuration options. “proc” should be called on each meta data item found. If “proc” returns non-zero, processing should be aborted (if possible).
In order to test new plugins, the extract command can be run with the options “-ni” and “-l XXX”. This will run the plugin in-process (making it easier to debug) and without any of the other plugins.
The following example shows how a plugin can return the mime type of a file.
#include <string.h>
#include <sys/types.h>
#include <extractor.h>

void
EXTRACTOR_mymime_extract_method (struct EXTRACTOR_ExtractContext *ec)
{
  void *data;
  ssize_t data_size;

  if (-1 == (data_size = ec->read (ec->cls, &data, 4)))
    return; /* read error */
  if (data_size < 4)
    return; /* file too small */
  if (0 != memcmp (data, "\177ELF", 4))
    return; /* not ELF */
  if (0 != ec->proc (ec->cls,
                     "mymime",
                     EXTRACTOR_METATYPE_MIMETYPE,
                     EXTRACTOR_METAFORMAT_UTF8,
                     "text/plain",
                     "application/x-executable",
                     1 + strlen ("application/x-executable")))
    return; /* application asked us to abort */
  /* more calls to 'proc' here as needed */
}
Some plugins link against the libextractor_common library which provides common abstractions needed by many plugins. This section documents this internal API for plugin developers. Note that the headers for this library are (intentionally) not installed: we do not consider this API stable and it should hence only be used by plugins that are built and shipped with GNU libextractor. Third-party plugins should not use it.
convert_numeric.h defines various conversion functions for numbers (in particular, byte-order conversion for floating point numbers).
unzip.h defines an API for accessing compressed files.
pack.h provides an interpreter for unpacking structs of integer numbers from streams and converting from big or little endian to host byte order at the same time.
convert.h provides a function for character set conversion described below.
Various GNU libextractor plugins make use of the internal convert.h header which defines a function EXTRACTOR_common_convert_to_utf8 which can be used to easily convert text from any character set to UTF-8. This conversion is important since the linked list of keywords that is returned by GNU libextractor is expected to contain only UTF-8 strings. Naturally, proper conversion may not always be possible since some file formats fail to specify the character set. In that case, it is often better to not convert at all.
The arguments to EXTRACTOR_common_convert_to_utf8 are the input string (which does not have to be zero-terminated), the length of the input string, and the character set (which must be zero-terminated). Which character sets are supported depends on the platform; a list can generally be obtained using the iconv -l command. The return value from EXTRACTOR_common_convert_to_utf8 is a zero-terminated string in UTF-8 format. The responsibility to free the string is with the caller, so storing the string in the keyword list is acceptable.
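For readers who want to see what such a conversion involves, here is a rough sketch built on POSIX iconv. It is not the implementation from convert.h: the function sketch_convert_to_utf8 is a hypothetical stand-in that merely takes the same three arguments as described above, and its error handling (returning NULL on any failure) is an assumption of this sketch.

```c
#include <iconv.h>
#include <stdlib.h>

/* Hypothetical stand-in (NOT the function from convert.h): converts
   'len' bytes of 'input' from 'charset' to UTF-8.  Returns a newly
   allocated, zero-terminated string that the caller must free, or
   NULL if the character set is unknown or the conversion fails. */
char *
sketch_convert_to_utf8 (const char *input, size_t len, const char *charset)
{
  iconv_t cd;
  size_t out_size = 4 * len + 1;  /* generous bound for common encodings */
  char *out;
  char *in_ptr = (char *) input;  /* iconv does not modify the input */
  char *out_ptr;
  size_t in_left = len;
  size_t out_left = out_size - 1; /* reserve one byte for the terminator */

  cd = iconv_open ("UTF-8", charset);
  if ((iconv_t) -1 == cd)
    return NULL;                  /* unsupported character set */
  out = malloc (out_size);
  if (NULL == out)
  {
    iconv_close (cd);
    return NULL;
  }
  out_ptr = out;
  if ((size_t) -1 == iconv (cd, &in_ptr, &in_left, &out_ptr, &out_left))
  {
    free (out);                   /* invalid input or buffer too small */
    iconv_close (cd);
    return NULL;
  }
  *out_ptr = '\0';
  iconv_close (cd);
  return out;
}
```

For instance, the four ISO-8859-1 bytes “caf” followed by 0xE9 would be returned as the five UTF-8 bytes “caf” followed by 0xC3 0xA9. On some platforms iconv lives in a separate library and needs -liconv at link time.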
GNU libextractor uses the Mantis bug tracking system. If possible, please report bugs there. You can also e-mail the GNU libextractor mailing list at libextractor@gnu.org.
Copyright © 1989, 1991 Free Software Foundation, Inc. 59 Temple Place – Suite 330, Boston, MA 02111-1307, USA Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.
The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change free software—to make sure the software is free for all its users. This General Public License applies to most of the Free Software Foundation's software and to any other program whose authors commit to using it. (Some other Free Software Foundation software is covered by the GNU Library General Public License instead.) You can apply it to your programs, too.
When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for this service if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs; and that you know you can do these things.
To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights. These restrictions translate to certain responsibilities for you if you distribute copies of the software, or if you modify it.
For example, if you distribute copies of such a program, whether gratis or for a fee, you must give the recipients all the rights that you have. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights.
We protect your rights with two steps: (1) copyright the software, and (2) offer you this license which gives you legal permission to copy, distribute and/or modify the software.
Also, for each author's protection and ours, we want to make certain that everyone understands that there is no warranty for this free software. If the software is modified by someone else and passed on, we want its recipients to know that what they have is not the original, so that any problems introduced by others will not reflect on the original authors' reputations.
Finally, any free program is threatened constantly by software patents. We wish to avoid the danger that redistributors of a free program will individually obtain patent licenses, in effect making the program proprietary. To prevent this, we have made it clear that any patent must be licensed for everyone's free use or not licensed at all.
The precise terms and conditions for copying, distribution and modification follow.
Activities other than copying, distribution and modification are not covered by this License; they are outside its scope. The act of running the Program is not restricted, and the output from the Program is covered only if its contents constitute a work based on the Program (independent of having been made by running the Program). Whether that is true depends on what the Program does.
You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee.
These requirements apply to the modified work as a whole. If identifiable sections of that work are not derived from the Program, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works. But when you distribute the same sections as part of a whole which is a work based on the Program, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it.
Thus, it is not the intent of this section to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works based on the Program.
In addition, mere aggregation of another work not based on the Program with the Program (or with a work based on the Program) on a volume of a storage or distribution medium does not bring the other work under the scope of this License.
The source code for a work means the preferred form of the work for making modifications to it. For an executable work, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable. However, as a special exception, the source code distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs, unless that component itself accompanies the executable.
If distribution of executable or object code is made by offering access to copy from a designated place, then offering equivalent access to copy the source code from the same place counts as distribution of the source code, even though third parties are not compelled to copy the source along with the object code.
If any portion of this section is held invalid or unenforceable under any particular circumstance, the balance of the section is intended to apply and the section as a whole is intended to apply in other circumstances.
It is not the purpose of this section to induce you to infringe any patents or other property right claims or to contest validity of any such claims; this section has the sole purpose of protecting the integrity of the free software distribution system, which is implemented by public license practices. Many people have made generous contributions to the wide range of software distributed through that system in reliance on consistent application of that system; it is up to the author/donor to decide if he or she is willing to distribute software through any other system and a licensee cannot impose that choice.
This section is intended to make thoroughly clear what is believed to be a consequence of the rest of this License.
Each version is given a distinguishing version number. If the Program specifies a version number of this License which applies to it and “any later version”, you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of this License, you may choose any version ever published by the Free Software Foundation.
If you develop a new program, and you want it to be of the greatest possible use to the public, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms.
To do so, attach the following notices to the program. It is safest to attach them to the start of each source file to most effectively convey the exclusion of warranty; and each file should have at least the “copyright” line and a pointer to where the full notice is found.
one line to give the program's name and an idea of what it does. Copyright (C) 19yy name of author This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
Also add information on how to contact you by electronic and paper mail.
If the program is interactive, make it output a short notice like this when it starts in an interactive mode:
Gnomovision version 69, Copyright (C) 19yy name of author Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. This is free software, and you are welcome to redistribute it under certain conditions; type `show c' for details.
The hypothetical commands ‘show w’ and ‘show c’ should show the appropriate parts of the General Public License. Of course, the commands you use may be called something other than ‘show w’ and ‘show c’; they could even be mouse-clicks or menu items—whatever suits your program.
You should also get your employer (if you work as a programmer) or your school, if any, to sign a “copyright disclaimer” for the program, if necessary. Here is a sample; alter the names:
Yoyodyne, Inc., hereby disclaims all copyright interest in the program `Gnomovision' (which makes passes at compilers) written by James Hacker. signature of Ty Coon, 1 April 1989 Ty Coon, President of Vice
This General Public License does not permit incorporating your program into proprietary programs. If your program is a subroutine library, you may consider it more useful to permit linking proprietary applications with the library. If this is what you want to do, use the GNU Library General Public License instead of this License.
(: Extracting
EXTRACTOR_common_convert_to_utf8: Internal utility functions
EXTRACTOR_extract: Extracting
EXTRACTOR_meta_data_print: Meta data printing
EXTRACTOR_metatype_get_max: Meta types
EXTRACTOR_metatype_to_description: Meta types
EXTRACTOR_metatype_to_string: Meta types
EXTRACTOR_plugin_add: Plugin management
EXTRACTOR_plugin_add_config: Plugin management
EXTRACTOR_plugin_add_defaults: Plugin management
EXTRACTOR_plugin_remove: Plugin management
EXTRACTOR_plugin_remove_all: Plugin management
EXTRACTOR_VERSION: Utility Constants