MultiMarkdown, or MMD, is a tool to help turn minimally marked-up plain
text into well-formatted documents, including HTML, PDF (by way of
LaTeX), OPML, or OpenDocument (specifically, Flat OpenDocument or
'.fodt', which can in turn be converted into RTF, Microsoft Word, or
virtually any other word-processing format).
MMD is a superset of the Markdown syntax, originally created by John
Gruber. It adds multiple syntax features (tables, footnotes, and
citations, to name a few), in addition to the various output formats
listed above (Markdown only creates HTML). Additionally, it builds in
'smart' typography for various languages (proper left- and right-sided
quotes, for example).
NOTE: To use the mmd2pdf script, you must install print/latexmk.
PXP is a validating XML parser for OCaml. It strictly complies
with the XML 1.0 standard.
The parser is simple to call: usually a single statement (function
call) is sufficient to parse an XML document and to represent it
as an object tree.
Once the document is parsed, it can be accessed using a class
interface. The interface allows arbitrary access including
transformations. One of the features of the document representation
is its polymorphic nature; it is simple to add custom methods to
the document classes. Furthermore, the parser can be configured
such that different XML elements are represented by objects created
from different classes. This is a very powerful feature, because
it simplifies the structure of programs processing XML documents.
odt2txt is a command-line tool which extracts the text out of OpenDocument Texts
produced by LibreOffice, OpenOffice, StarOffice, KOffice and others.
odt2txt can also extract text from some file formats similar to OpenDocument
Text, such as OpenOffice.org XML, which was used by OpenOffice.org version 1.x
and older StarOffice versions. To a lesser extent, odt2txt may be useful to
extract content from OpenDocument spreadsheets and OpenDocument presentations.
odt2txt:
- is small
- supports multiple output encodings
- adapts to your locale
- can substitute common characters which the output charset does not contain
  with ASCII look-alikes
- is written in C and has few dependencies
- is portable (runs on Linux, Mac OS X, Windows, *BSD, Cygwin, Solaris, HP-UX)
Surely the CPAN doesn't need yet another CSV parsing module.
Text::CSV_XS is the standard parser for CSV files. It is fast
as hell, but unfortunately it can be a bit verbose to use.
A number of other modules have attempted to put usability
wrappers around this venerable module, but they have all
focussed on parsing the entire file into memory at once.
This method is fine unless your CSV files start to get large.
Once that happens, the only existing option is to fall back
on the relatively slow and heavyweight XML::SAXDriver::CSV
module.
Parse::CSV fills this functionality gap. It provides a flexible
and light-weight streaming parser for large, extremely large,
or arbitrarily large CSV files.
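The streaming approach described above can be illustrated outside Perl. What
follows is a minimal Python sketch of the idea, not Parse::CSV's own API: it
reads one row at a time with the standard csv module, so only the current row
is held in memory however large the input grows. The function name iter_rows
and the sample data are assumptions made for illustration.

```python
import csv
import io
from typing import Dict, Iterator, TextIO


def iter_rows(handle: TextIO) -> Iterator[Dict[str, str]]:
    """Yield one record at a time; only the current row is held in memory."""
    reader = csv.DictReader(handle)  # the first line is taken as the header
    for row in reader:
        yield dict(row)


if __name__ == "__main__":
    # Hypothetical sample data standing in for a very large file.
    sample = io.StringIO("name,qty\nwidget,3\ngadget,5\n")
    total = sum(int(row["qty"]) for row in iter_rows(sample))
    print(total)
```

Because iter_rows is a generator, a caller can aggregate or filter rows
without ever building a list of the whole file.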
The Petal::Utils package contains commonly used Petal modifiers (or
plugins), and bundles them with an easy-to-use installation interface. By
default, a set of modifiers is installed into Petal when you use this
module. You can change which modifiers are installed by naming them after
the use statement:
  # use the default set:
  use Petal::Utils qw( :default );

  # use the date set of modifiers:
  use Petal::Utils qw( :date );

  # use only named modifiers, plus the debug set:
  use Petal::Utils qw( UpperCase Date :debug );

  # don't install any modifiers
  use Petal::Utils qw();

You'll find a list of plugin sets throughout this document. You can also
get a complete list by looking at the variable:
  %Petal::Utils::PLUGIN_SET;

For details on how the plugins are installed, see the "Advanced Petal"
section of the Petal documentation.
This module provides encoding to LaTeX escapes from utf8 using mapping
tables in Pod::LaTeX and HTML::Entities. This covers only a subset of the
Unicode character table (undef warnings will occur for non-mapped chars).
Mileage will vary when decoding (converting TeX to utf8), as TeX is in
essence a programming language, and this module does not implement TeX.
I use this module to encode author names in BibTeX and to do a rough job
at presenting LaTeX abstracts in HTML. Using decode, rather than seeing
$\sqrt{\Omega^2\zeta_n}$, you get something that looks like the formula.
The next logical step for this module is to integrate some level of TeX
grammar to improve the decoding, in particular to handle fractions and
font changes (which should probably be dropped).
Greeking is the use of random letters or marks to show the overall appearance
of a printed page without showing the actual text. Greeking is used to make
it easy to judge the overall appearance of a document without being distracted
by the meaning of the text.
This module quickly generates varying, meaningless text from any
source to create this illusion of content in systems.
It was created to give developers simulated content with which to fill
systems. Instead of static Lorem Ipsum text, it uses randomly generated
text and, optionally, varying word sources, so that repetitive and
monotonous patterns that do not represent real system usage are avoided.
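The idea of drawing filler text at random from a word source can be sketched
briefly. The following Python example is an illustration under stated
assumptions, not the module's implementation: the function name greek, its
parameters, and the choice to pick words uniformly at random are all
hypothetical.

```python
import random
from typing import List, Optional


def greek(source: str, words: int, seed: Optional[int] = None) -> str:
    """Return a sentence of `words` words drawn at random from `source`."""
    if words < 1:
        raise ValueError("words must be at least 1")
    # Build the word pool: split on whitespace, drop surrounding punctuation.
    pool: List[str] = [w.strip(".,;:!?\"'()").lower() for w in source.split()]
    pool = [w for w in pool if w]
    if not pool:
        raise ValueError("source text contains no words")
    rng = random.Random(seed)  # a fixed seed gives repeatable output
    sentence = " ".join(rng.choice(pool) for _ in range(words))
    return sentence[:1].upper() + sentence[1:] + "."


if __name__ == "__main__":
    print(greek("The quick brown fox jumps over the lazy dog.", 8))
```

Passing a different source text changes the vocabulary of the filler, which
is what keeps the output from settling into one familiar pattern.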
`chpp' is a preprocessor. Therefore, its main purpose is to modify
input text by including other input files and by macro expansion.
What distinguishes `chpp' from other text processors are mainly two
features:
* `chpp' is non-intrusive. This means that you can take your
favorite text and it is very unlikely that it will be changed when
piped through `chpp'. Due to this feature it is pretty easy to
start using `chpp' since you can just start writing your text and
need not concern yourself with `chpp' sitting in the background
changing it for no obvious reason.
* `chpp' is not just a package for performing simple macro expansion,
but can indeed be considered a full-fledged programming language.
Most importantly, it provides support for complex data structures,
namely lists and hashes (associative arrays), which can be nested
arbitrarily.
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in
English. The goal is for names with the same pronunciation to be encoded to the
same representation so that they can be matched despite minor differences in
spelling. Soundex is the most widely known of all phonetic algorithms and is
often used (incorrectly) as a synonym for "phonetic algorithm". Improvements to
Soundex are the basis for many modern phonetic algorithms. (Wikipedia, 2007)
Text::Soundex implements the original soundex algorithm developed by Robert
Russell and Margaret Odell, patented in 1918 and 1922, as well as a variation
called "American Soundex" used for US census data, and currently maintained by the
National Archives and Records Administration (NARA).
The soundex algorithm may be recognized from Donald Knuth's The Art of Computer
Programming. The algorithm described by Knuth is the NARA algorithm.
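For readers unfamiliar with the encoding, the commonly documented American
Soundex rules can be sketched in a few lines. The Python below is an
illustrative sketch, not Text::Soundex's code: it keeps the first letter,
maps later consonants to digits, collapses repeated codes (including codes
separated only by H or W), and pads the result to four characters. The
function name soundex and the table name CODES are choices made for this
example.

```python
# Digit assigned to each coded consonant; vowels, H, W and Y have no code.
CODES = {}
for digit, group in enumerate(["BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"], start=1):
    for letter in group:
        CODES[letter] = str(digit)


def soundex(name: str) -> str:
    """Encode `name` as a letter followed by three digits."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = [letters[0]]
    prev = CODES.get(letters[0], "")  # the first letter's code also counts
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate two letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            result.append(code)
            if len(result) == 4:
                break
        prev = code  # a vowel resets prev, so a repeated code is kept
    return "".join(result).ljust(4, "0")


if __name__ == "__main__":
    for sample in ("Robert", "Rupert", "Ashcraft", "Tymczak"):
        print(sample, soundex(sample))
```

Names that sound alike, such as Robert and Rupert, receive the same code,
which is what makes the scheme usable as an index key.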
XML::SAX::Base has a very simple task - to be a base class for PerlSAX drivers
and filters. Its default behaviour is to pass the input directly to the output
unchanged. It can be useful to use this module as a base class so you don't have
to, for example, implement the characters() callback.
The main advantages that it provides are easy dispatching of events the right
way (i.e. it takes care of checking for you that the handler has implemented that
method, or has defined an AUTOLOAD), and the guarantee that filters will pass
along events that they aren't implementing to handlers downstream that might
nevertheless be interested in them.
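The same pass-through pattern exists in other SAX implementations. As an
analogy only (this is Python's standard library, not PerlSAX), the sketch
below subclasses xml.sax.saxutils.XMLFilterBase, overrides a single callback,
and lets every other event flow unchanged to the downstream handler. The
class name UppercaseText and the helper run_filter are hypothetical.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator


class UppercaseText(XMLFilterBase):
    """Override one callback; all other events fall through to the base class."""

    def characters(self, content):
        # Modify the text, then hand it on to the downstream handler.
        super().characters(content.upper())


def run_filter(xml_text: str) -> str:
    """Parse `xml_text` through the filter and return the serialized result."""
    out = io.StringIO()
    filt = UppercaseText(xml.sax.make_parser())
    filt.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    filt.parse(io.BytesIO(xml_text.encode("utf-8")))
    return out.getvalue()


if __name__ == "__main__":
    print(run_filter("<a><b>hi</b></a>"))
```

The filter never mentions startElement or endElement, yet the element
structure reaches the output intact, because the base class forwards the
events the subclass does not implement.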