Lingua::EN::FindNumber provides a regular expression for finding numbers in
English text. It also provides functions for extracting and manipulating such
numbers.
"Named entities" is the NLP jargon for proper nouns which
represent people, places, organisations, and so on.
This module provides a very simple way of extracting these from a text.
If we run the "extract_entities" routine on a piece of news coverage of
recent UK political events, we should expect to see it return a list of
hash references looking like this:
    { entity => 'Mr Howard', class => 'person', scores => { ... }, },
    { entity => 'Ministry of Defence', class => 'organisation', ... },
    { entity => 'Oxfordshire', class => 'place', ... },
The additional "scores" hash reference breaks down how strongly the entity
matches each of the possible classes, on an open-ended scale.
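As a rough sketch of how such a list might be consumed, the following groups
entities by class and picks the strongest class from a "scores" hash. The
sample list is hand-written to match the shape shown above rather than produced
by "extract_entities"; the score values, the helper names entities_by_class and
top_scoring_class, and the idea that a higher score means a stronger match are
all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample shaped like the list described above. The entity names
# come from the text; the score values are invented for illustration only.
my @entities = (
    { entity => 'Mr Howard', class => 'person',
      scores => { person => 9, place => 2, organisation => 1 } },
    { entity => 'Ministry of Defence', class => 'organisation',
      scores => { person => 1, place => 3, organisation => 8 } },
    { entity => 'Oxfordshire', class => 'place',
      scores => { person => 1, place => 7, organisation => 2 } },
);

# Group entity names by their assigned class (hypothetical helper).
sub entities_by_class {
    my @list = @_;
    my %by_class;
    push @{ $by_class{ $_->{class} } }, $_->{entity} for @list;
    return \%by_class;
}

# Return the class with the highest score for one entity, assuming that
# higher means a stronger match. Ties are broken alphabetically.
sub top_scoring_class {
    my ($entity) = @_;
    my $scores = $entity->{scores};
    my ($best) = sort { $scores->{$b} <=> $scores->{$a} || $a cmp $b }
                 keys %$scores;
    return $best;
}

my $grouped = entities_by_class(@entities);
for my $class (sort keys %$grouped) {
    print "$class: ", join(', ', @{ $grouped->{$class} }), "\n";
}
```

With real output the same helpers should apply unchanged, provided each
"scores" hash is keyed by class name as assumed here.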
IDNA::Punycode is a module to encode Unicode strings into, and decode them
from, Punycode, an efficient encoding of Unicode for use with IDNA.
The purpose of the PPIx-Regexp package is to parse regular expressions
in a manner similar to the way the PPI package parses Perl.
Antiword is a free MS Word reader. It converts the binary files from
Word 2, 6, 7, 97, 2000, 2002 and 2003 to plain text and to PostScript.
The Perl-Critic-Tics distribution includes extra policies for Perl::Critic to
address a fairly random assortment of things that make me (rjbs) wince.
SVG.pm is a Perl extension to generate stand-alone or inline SVG
(Scalable Vector Graphics) images using the W3C SVG XML recommendation.
This module implements a parser to convert Pod documents into a simple
object model form known hereafter as the Pod Object Model. The object
model is generated as a hierarchical tree of nodes, each of which
represents a different element of the original document. The tree can
be walked manually and the nodes examined, printed or otherwise
manipulated. In addition, Pod::POM supports and provides view objects
which can automatically traverse the tree, or a section of it, and
generate an output representation in one form or another.
Search::Odeum is an interface to the Odeum API. Odeum is the inverted index
API that is part of the QDBM database library.
This distribution contains SGMLS.pm, a Perl 5 class library for parsing
the output from James Clark's SGMLS and NSGMLS parsers.