XInclude task for Jakarta Ant. The Ant XInclude task allows Ant
build files to apply XInclude processing to XML files. It is
based on the XInclude processor developed by Elliotte Rusty
Harold. You need the xincluder.jar file for the XInclude task to
work properly. Please put this JAR in your CLASSPATH along with
the xinclude-task JAR file.
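To illustrate what XInclude processing does to an XML file, here is a minimal sketch in Python using the standard library's xml.etree.ElementInclude module. It is not the Ant task or the underlying Java processor. The file name chapter.xml, the in-memory DOCS table, and the helper names are assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical in-memory "files", so the sketch needs no disk access.
DOCS = {
    "chapter.xml": "<chapter><title>Intro</title></chapter>",
}

def loader(href, parse, encoding=None):
    """Resolve an xi:include href against the DOCS table."""
    text = DOCS[href]
    if parse == "xml":
        return ET.fromstring(text)
    return text  # parse="text" includes the raw string

def expand(source):
    """Parse source and replace each xi:include element with its target."""
    root = ET.fromstring(source)
    ElementInclude.include(root, loader=loader)
    return root

BOOK = (
    '<book xmlns:xi="http://www.w3.org/2001/XInclude">'
    '<xi:include href="chapter.xml"/>'
    "</book>"
)

result = expand(BOOK)
```

After expansion, the xi:include element in the book document has been replaced by the chapter element it pointed to.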
Antiword is a free MS Word reader. It converts the binary files from
Word 2, 6, 7, 97, 2000, 2002 and 2003 to plain text and to PostScript.
The POI project consists of APIs for manipulating various file formats based
upon Microsoft's OLE 2 Compound Document format using pure Java. In short, you
can read and write MS Excel files using Java. Soon, you'll be able to read and
write Word files using Java. POI is your Java Excel solution as well as your
Java Word solution. However, we have a complete API for porting other OLE 2
Compound Document formats and welcome others to participate.
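As a small illustration of what "based upon the OLE 2 Compound Document format" means in practice, the sketch below checks whether a byte string starts with the 8-byte signature that opens every OLE 2 compound file. It is written in Python and is not part of POI's API; the function name looks_like_ole2 is an assumption made for this example.

```python
# The fixed 8-byte signature at the start of an OLE 2 compound file.
OLE2_SIGNATURE = bytes.fromhex("D0CF11E0A1B11AE1")

def looks_like_ole2(data: bytes) -> bool:
    """Return True if data begins with the OLE 2 compound file signature."""
    return data[:8] == OLE2_SIGNATURE

# A fake header: the signature followed by padding up to a 512-byte sector.
fake_xls_header = OLE2_SIGNATURE + b"\x00" * 504

# ZIP-based files (for example the newer .xlsx format) start with "PK".
fake_zip_header = b"PK\x03\x04" + b"\x00" * 26
```

A check like this only identifies the container; telling an Excel workbook from a Word document requires reading the streams stored inside it.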
Solr is an open source enterprise search server based on the Lucene
Java search library, with XML/HTTP and JSON APIs, hit highlighting,
faceted search, caching, replication, a web administration interface
and many more features. It runs in a Java servlet container such as
Tomcat.
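A query against the HTTP API might be assembled as in the following Python sketch. It only builds the request URL and sends nothing. The parameters q, wt, rows, facet, facet.field and hl are standard Solr query parameters, but the host, port, the "docs" path segment, and the field names are assumptions for illustration; the actual base path depends on how Solr is deployed.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def build_select_url(base_url, query, facet_field=None, highlight=False, rows=10):
    """Build a URL for Solr's select handler that asks for a JSON response."""
    params = [("q", query), ("wt", "json"), ("rows", str(rows))]
    if facet_field:
        # Faceted search: count matching documents per value of this field.
        params += [("facet", "true"), ("facet.field", facet_field)]
    if highlight:
        # Hit highlighting: return snippets with the matches marked.
        params.append(("hl", "true"))
    return base_url.rstrip("/") + "/select?" + urlencode(params)

# Hypothetical deployment URL and field names.
url = build_select_url(
    "http://localhost:8983/solr/docs",
    "title:ant",
    facet_field="category",
    highlight=True,
)
```

The resulting URL can be fetched with any HTTP client, and the response body parsed as JSON.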
Apertium is an open-source machine translation platform, initially aimed
at related-language pairs but recently expanded to deal with more
divergent language pairs (such as English-Catalan). The platform
provides:
1. a language-independent machine translation engine
2. tools to manage the linguistic data necessary to build a machine
translation system for a given language pair and
3. linguistic data for a growing number of language pairs
Lingua::Stem::Fr uses a modified version of the Porter Stemming Algorithm to
return stemmed words.
Lingua::Stem::It applies the Porter Stemming Algorithm to its parameters,
returning the stemmed words.
ArCHMage is an extensible reader/decompiler for files in the CHM format (Microsoft
HTML Help, also known as Compiled HTML). ArCHMage is based on chmlib by Jed
Wing and is written in Python.
Artha is a free cross-platform English thesaurus that works completely
off-line and is based on WordNet. Stable releases for download are
currently available for GNU/Linux and Microsoft Windows; it is tested
on major desktop environments such as GNOME, KDE, and Xfce, and on Microsoft
Windows XP, Vista and 7. Artha is released under the GNU General Public
Licence version 2; hence you are free to copy/redistribute it.
The stem function takes a scalar as a parameter and stems the word according to
Martin Porter's Danish stemming algorithm.