Converts HTML to text with tables intact.
HTML::FormatText::WithLinks takes HTML and turns it into plain text
but prints all the links in the HTML as footnotes. By default, it
attempts to mimic the format of the lynx text-based web browser's
--dump option.
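
The footnote idea can be illustrated with a minimal sketch. This is not
the module's Perl API; it is a Python approximation using only the standard
library, and the names FootnoteLinkFormatter and html_to_text_with_links
are hypothetical. It assumes a lynx-like layout in which each link text is
preceded by a numbered marker and the URLs are listed after the body.

```python
from html.parser import HTMLParser


class FootnoteLinkFormatter(HTMLParser):
    """Collect text and mark each <a href> with a numbered footnote marker."""

    def __init__(self):
        super().__init__()
        self.chunks = []  # pieces of body text, in document order
        self.links = []   # href values, in order of appearance

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)
                # Marker goes before the link text, as in a lynx dump.
                self.chunks.append(f"[{len(self.links)}]")

    def handle_data(self, data):
        self.chunks.append(data)

    def render(self):
        # Collapse runs of whitespace, then append the numbered URL list.
        body = " ".join("".join(self.chunks).split())
        notes = "\n".join(f"{i}. {url}" for i, url in enumerate(self.links, 1))
        return f"{body}\n\n{notes}" if notes else body


def html_to_text_with_links(html):
    formatter = FootnoteLinkFormatter()
    formatter.feed(html)
    formatter.close()
    return formatter.render()
```

The sketch ignores block-level layout (paragraphs, lists, tables), which a
real formatter has to handle.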
HTML::HTML5::Parser is substantially the same as the non-CPAN module
Whatpm::HTML. Changes include:
* Provides an XML::LibXML-like DOM interface. If you usually use
XML::LibXML's DOM parser, this should be a drop-in solution for tag
soup HTML.
* Constructs an XML::LibXML::Document as the result of parsing.
* Via bundling and modifications, removed external dependencies
on non-CPAN packages.
JavaScript::Minifier::XS is a JavaScript "minifier"; it's designed to remove
unnecessary whitespace and comments from JavaScript files, while also not
breaking the JavaScript.
This module takes as input an address or post box in free-format
text and attempts to parse it. If successful, the address is broken
down into components and useful functions can be performed.
Lingua::EN::FindNumber provides a regular expression for finding numbers in
English text. It also provides functions for extracting and manipulating such
numbers.
This module extends the functionality of Lingua::EN::Inflect with
three new functions available for export.
Inflect short English phrases.
You have two databases of person records that need to be synchronized
or matched up, but they use different keys--maybe one uses SSN and
the other uses employee ID. The only fields you have to match on
are first and last name.
That's what this module is for.
Just feed the first and last names to the name_eq() function, and
it returns undef for no possible match, and a percentage of certainty
(rank) otherwise.
Seamus Venasse <svenasse@polaris.ca>
HTML::Entities::Numbered is a content conversion filter for named HTML
entities (symbols, mathematical symbols, Greek letters, Latin letters,
etc.).
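
As an illustration of what converting named entities to numbered ones
means, here is a minimal sketch. It is not the module's Perl interface; it
is a Python approximation built on the standard library's HTML 4 entity
table, and the function name named_to_decimal is hypothetical. It assumes
that unknown names should be left untouched.

```python
import re
from html.entities import name2codepoint

# Matches a named entity reference such as &copy; or &alpha;
_NAMED_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def named_to_decimal(text):
    """Replace known named entities with decimal numeric references."""

    def replace(match):
        codepoint = name2codepoint.get(match.group(1))
        if codepoint is None:
            return match.group(0)  # unknown name: leave as-is
        return f"&#{codepoint};"

    return _NAMED_ENTITY.sub(replace, text)
```

Note that this sketch converts every name in the table, including markup
entities such as &amp;, which becomes &#38;.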