An object-oriented SGML/XML parser toolkit and DSSSL engine.
Features summary:
* Includes nsgmls
* Provides access to all information about an SGML document
* Supports almost all optional SGML features
* Sophisticated entity manager
* Supports multi-byte character sets
* Object-oriented
* Written in C++ from scratch
* Fast
* Portable
* Production quality
* Free
Note: This port is a superset of the sp port. If you have sp
installed, it is recommended that you remove it before installing
jade.
John Fieber
jfieber@FreeBSD.org
This is a keyboard for typing complex Biblical Hebrew (including
cantillation marks) with Unicode fonts. It is written in the Keyman keyboard
language and was developed by the SIL Non-Roman Script Initiative (NRSI).
This port installs the keyboard so that it can be used through SCIM or
IBus KMFL IMEngine (textproc/scim-kmfl-imengine, textproc/ibus-kmfl).
The keyboard is provided under the terms of the MIT/X11 License.
http://scripts.sil.org/SILHebrUni_Documentation
This library supports full W3C XML Schema regular expressions, including
all Unicode character sets and blocks. It is implemented using the
technique of derivatives of regular expressions. The W3C syntax is
extended to support not only union of regular sets, but also
intersection, set difference, and exclusive or. Matching of
subexpressions is also supported. The library can be used for
constructing lightweight scanners and tokenizers. It is a standalone
library; no external regex libraries are used.
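
The derivative technique mentioned above can be illustrated with a minimal
sketch. This is not the library's code or API; the class and function names
(Empty, Eps, Chr, Seq, Star, Bool, nullable, deriv, matches) are hypothetical
and chosen only for this example. It shows why intersection, difference, and
exclusive or come almost for free: the derivative of a boolean combination is
the same combination of the derivatives.

```python
from dataclasses import dataclass


# Hypothetical AST for regular expressions (illustration only).
@dataclass(frozen=True)
class Empty:          # matches no string at all
    pass


@dataclass(frozen=True)
class Eps:            # matches only the empty string
    pass


@dataclass(frozen=True)
class Chr:            # matches one specific character
    c: str


@dataclass(frozen=True)
class Seq:            # concatenation: a followed by b
    a: object
    b: object


@dataclass(frozen=True)
class Star:           # zero or more repetitions of a
    a: object


@dataclass(frozen=True)
class Bool:           # boolean combination of two regular sets
    op: str           # one of "or", "and", "diff", "xor"
    a: object
    b: object


OPS = {
    "or": lambda x, y: x or y,
    "and": lambda x, y: x and y,
    "diff": lambda x, y: x and not y,
    "xor": lambda x, y: x != y,
}


def nullable(r):
    """Return True if r accepts the empty string."""
    if isinstance(r, (Eps, Star)):
        return True
    if isinstance(r, (Empty, Chr)):
        return False
    if isinstance(r, Seq):
        return nullable(r.a) and nullable(r.b)
    if isinstance(r, Bool):
        return OPS[r.op](nullable(r.a), nullable(r.b))
    raise TypeError(f"unknown node: {r!r}")


def deriv(r, ch):
    """Return the expression matching what may follow ch in a match of r."""
    if isinstance(r, (Empty, Eps)):
        return Empty()
    if isinstance(r, Chr):
        return Eps() if r.c == ch else Empty()
    if isinstance(r, Seq):
        left = Seq(deriv(r.a, ch), r.b)
        # If a can be empty, ch may also be consumed by b.
        return Bool("or", left, deriv(r.b, ch)) if nullable(r.a) else left
    if isinstance(r, Star):
        return Seq(deriv(r.a, ch), r)
    if isinstance(r, Bool):
        return Bool(r.op, deriv(r.a, ch), deriv(r.b, ch))
    raise TypeError(f"unknown node: {r!r}")


def matches(r, s):
    """Derive r by each character of s, then test for the empty string."""
    for ch in s:
        r = deriv(r, ch)
    return nullable(r)


# "a*" minus "(aa)*" leaves the strings with an odd number of a's.
a_star = Star(Chr("a"))
even_as = Star(Seq(Chr("a"), Chr("a")))
odd_as = Bool("diff", a_star, even_as)
```

The sketch does no simplification of intermediate expressions, so they grow
with the input length; a practical implementation would normalize them after
each step.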
Libxslt is the XSLT C library developed for the GNOME project. XSLT itself is
an XML language for defining transformations of XML. Libxslt is based on
libxml2, the XML C library developed for the GNOME project. It also implements
most of the EXSLT set of processor-portable extension functions and some of
Saxon's evaluate and expression extensions.
People can either embed the library in their application or use xsltproc, the
command-line processing tool.
SAC (Simple API for CSS) is an event-based API much like SAX for XML.
If you are familiar with the latter, you should have little trouble
getting used to SAC. More information on SAC can be found online at
http://www.w3.org/TR/SAC.
Because CSS has more constructs than XML, core SAC is more complex than
core SAX. However, if you need to parse a CSS style sheet, SAC is probably
still the easiest way to get it done.
DelimMatch allows you to match delimited substrings in a buffer. The
delimiters can be specified with any regular expression and the start
and end delimiters need not be the same. If the delimited text is
properly nested, entire nested groups are returned.
In addition, you may specify quoting and escaping characters that
contribute to the recognition of start and end delimiters.
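
The nested matching described above can be sketched as follows. This is an
illustration of the idea only, not DelimMatch's interface: the function name
match_delimited and its (prefix, match, remainder) return value are
assumptions made for this example, quoting and escaping are left out, and
both delimiter patterns are assumed to match at least one character.

```python
import re


def match_delimited(text, start_re, end_re):
    """Find the first properly nested delimited group in text.

    Hypothetical helper: returns (prefix, match, remainder), or None if
    there is no start delimiter or the group is never closed.
    """
    start = re.compile(start_re)
    end = re.compile(end_re)

    first = start.search(text)
    if first is None:
        return None

    depth = 1              # number of groups currently open
    pos = first.end()      # scan position just past the last delimiter
    while depth > 0:
        s = start.search(text, pos)
        e = end.search(text, pos)
        if e is None:
            return None    # unbalanced: ran out of end delimiters
        if s is not None and s.start() < e.start():
            depth += 1     # a nested group opens before the next close
            pos = s.end()
        else:
            depth -= 1     # the innermost open group closes here
            pos = e.end()

    return text[:first.start()], text[first.start():pos], text[pos:]


result = match_delimited("foo(bar(baz))qux", r"\(", r"\)")
```

When the start and end patterns match at the same position (for example,
identical quote characters), this sketch treats the delimiter as a close, so
such delimiters do not nest.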
-Anton
<tobez@FreeBSD.org>
You have two databases of person records that need to be synchronized
or matched up, but they use different keys--maybe one uses SSN and
the other uses employee id. The only fields you have to match on
are first and last name.
That's what this module is for.
Just feed the first and last names to the name_eq() function, and
it returns undef for no possible match, and a percentage of certainty
(rank) otherwise.
Seamus Venasse <svenasse@polaris.ca>
This class knows how to read two treebank formats, the Penn format
and the Chomsky Normal Form (CNF) format. These formats differ in
how they handle terminal nodes. The Penn format places pre-terminal
part of speech tags in the left-hand position of a
parenthesis-delimited pair, just like it does non-terminal nodes.
The CNF format attaches pre-terminal tags to the word with an
underscore.
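
The difference between the two notations can be illustrated with a small
sketch that rewrites a Penn-style string into an underscore-style one. This
is not the class's own code or interface: the function names parse_penn and
to_underscore are hypothetical, and the word_TAG order used for the
underscore form is an assumption, since the description above does not say
which side of the underscore the tag goes on.

```python
import re


def parse_penn(text):
    """Parse a Penn-style bracketing into nested (tag, children) tuples."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def node():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '('")
        tag = tokens[pos + 1]        # label sits in the left-hand position
        pos += 2
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])   # a bare word
                pos += 1
        pos += 1                     # consume the closing ')'
        return (tag, children)

    return node()


def to_underscore(tree):
    """Emit the tree with pre-terminals written as word_TAG (assumed order)."""
    tag, children = tree
    # A pre-terminal has exactly one child, and that child is a word.
    if len(children) == 1 and isinstance(children[0], str):
        return f"{children[0]}_{tag}"
    # Otherwise assume every child is itself a bracketed node.
    parts = [tag] + [to_underscore(child) for child in children]
    return "(" + " ".join(parts) + ")"


penn = "(S (NP (DT the) (NN dog)) (VP (VBD barked)))"
converted = to_underscore(parse_penn(penn))
```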
The POI project consists of APIs for manipulating various file formats based
upon Microsoft's OLE 2 Compound Document format using pure Java. In short, you
can read and write MS Excel files using Java. Soon, you'll be able to read and
write Word files using Java. POI is your Java Excel solution as well as your
Java Word solution. However, we have a complete API for porting other OLE 2
Compound Document formats and welcome others to participate.
Artha is a free cross-platform English thesaurus that works completely
off-line and is based on WordNet. Stable releases for download are
currently available for GNU/Linux and Microsoft Windows; it is tested
on major desktop environments such as GNOME, KDE, and Xfce, and on Microsoft
Windows XP, Vista, and 7. Artha is released under the GNU General Public
License version 2; hence you are free to copy and redistribute it.