Dadadodo analyses text files and generates Markov chains of word
frequencies; it can then generate random sentences based on that data.
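
The idea of a word-level Markov chain can be illustrated with a minimal
sketch. This is not Dadadodo's actual implementation, which the description
does not detail; the function names build_chain and generate are
hypothetical, and a first-order chain (one word of context) is assumed.

```python
import random
from collections import defaultdict

def build_chain(text):
    """Map each word to the list of words seen directly after it."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        # Repeated entries are kept on purpose: they encode the frequencies.
        chain[current].append(following)
    return chain

def generate(chain, start, max_words=12, rng=None):
    """Walk the chain from `start`, picking successors by observed frequency."""
    rng = rng or random.Random()
    word, out = start, [start]
    # Stop at the length limit or when the current word has no known successor.
    while len(out) < max_words and chain.get(word):
        word = rng.choice(chain[word])
        out.append(word)
    return " ".join(out)

chain = build_chain("the cat sat on the mat and the cat ran")
print(generate(chain, "the", rng=random.Random(0)))
```

Storing successors as a list with repeats keeps the sketch short; a real
tool would more likely store counts and might use longer contexts.
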
hhm is a program that makes ITS files; in the future it will also
make Compiled HTML Help (CHM) files. Both types of files are a kind of
compressed archive format used on Win98, Win2K and other Microsoft
operating systems to store documentation.
Go package which provides rudimentary functions for manipulating text
in paragraphs.
This dictionary client provides access to a dictionary server (as
defined in RFC 2229) from within Emacs or XEmacs.
It supports UTF-8 (currently available in Emacs 21) and allows
following links within the definitions.
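
As a rough illustration of what such a client has to handle, the sketch
below parses the reply to an RFC 2229 DEFINE command. It is not taken from
the Emacs client; parse_define_response is a hypothetical name, and the
input is assumed to be reply lines already decoded and stripped of their
CRLF terminators.

```python
import shlex

def parse_define_response(lines):
    """Collect (word, database, text) tuples from a DEFINE reply."""
    definitions = []
    it = iter(lines)
    for line in it:
        code = line[:3]
        if code == "151":
            # Status line looks like: 151 "cow" wn "WordNet 1.5"
            fields = shlex.split(line[4:])
            word, database = fields[0], fields[1]
            body = []
            for text_line in it:
                if text_line == ".":            # a lone period ends the text
                    break
                if text_line.startswith(".."):
                    text_line = text_line[1:]   # undo the doubled leading period
                body.append(text_line)
            definitions.append((word, database, "\n".join(body)))
        elif code in ("250", "552"):            # 250 ok / 552 no match
            break
    return definitions

sample = [
    "150 1 definitions retrieved",
    '151 "cow" wn "WordNet 1.5"',
    "cow",
    "  1. mature female of mammals",
    ".",
    "250 ok",
]
print(parse_define_response(sample))
```

The sample reply is made up for illustration; a real session would read
these lines from a socket connected to the dictionary server.
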
diffmark is an XML diff and merge package. It consists of a shared C++
library, libdiffmark, plus two programs wrapping the library into a
command-line interface: dm and dm-merge. dm takes 2 XML files and
prints their diff (also an XML document) on its standard output.
dm-merge takes the first document passed to dm together with the diff
that dm produced and reconstructs the second document.
doc-mode is an Emacs mode for editing documentation, specifically
designed for use with asciidoc (textproc/asciidoc).
"DocBook: The Definitive Guide"
by Norman Walsh and Leonard Muellner
with contributions from Bob Stayton
ISBN: 1-56592-580-7
This book is a gentle yet thorough introduction to the DocBook DTD (which is
used by, amongst others, the FreeBSD Documentation Project). A dead-tree
edition of the book is published by O'Reilly & Associates, Inc., but the text
is freely licensed under the GNU FDL.
The current edition purports to document DocBook v4.4 with the EBNF,
HTML Forms, MathML and SVG modules.
An unexpanded edition of version 2.0.17 is also available. In this version,
content models are shown with parameter entities rather than fully expanded.
html-pretty (or htmlpty on file systems with unpleasant filename
length restrictions) is a prettyprinter for HTML and SGML. It can
also assist in the conversion of ordinary text files in ASCII or
ISO8859-1 character sets to HTML.
dom4j is an easy to use, open source library for working with XML, XPath
and XSLT on the Java platform using the Java Collections Framework and
with full support for DOM, SAX and JAXP.
html2text is a command line utility, written in C++, that converts
HTML documents (HTML 3.2) into plain text (ISO 8859-1).
Each HTML document is loaded from a location indicated by a URI or
read from standard input, and formatted into a stream of plain text
characters that is written to standard output or to an output file.
The input URI may specify a remote site, from which the documents are
loaded with the Hypertext Transfer Protocol (HTTP). The program is
even able to preserve the original positions of table fields and
also accepts syntactically incorrect input, attempting to interpret it
"reasonably". The rendering is largely customisable through an RC
file.
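
The basic HTML-to-text step can be sketched as follows. This is not how
html2text itself works (it is a C++ program with table layout and RC-file
support); it is a minimal illustration using Python's standard
html.parser module, and the names TextExtractor and html_to_text are
hypothetical. Like html2text, the parser tolerates unclosed tags.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect text content, starting a new line at common block tags."""
    BLOCK_TAGS = {"p", "br", "div", "li", "tr", "h1", "h2", "h3"}

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)

def html_to_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    text = "".join(parser.parts)
    # Collapse runs of whitespace and drop empty lines.
    lines = [" ".join(line.split()) for line in text.splitlines()]
    return "\n".join(line for line in lines if line)

print(html_to_text("<h1>Title</h1><p>Hello <b>world</b><p>unclosed"))
```

The sketch ignores tables, character set conversion and the contents of
script or style elements, all of which a real converter has to deal with.
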