HTML::TagFilter is a subclass of HTML::Parser with a single purpose: it
removes unwanted HTML tags and attributes from a piece of text. It
can work at whatever granularity you need: you can specify permitted
tags, permitted attributes of each tag, and permitted values for each
attribute in as much detail as you like.
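The allow-list idea described above (permitted tags, then permitted attributes per tag, then permitted values per attribute) can be sketched as follows. This is a minimal illustration in Python built on the standard html.parser module; it is not HTML::TagFilter's own Perl interface, and the names ALLOWED, AllowListFilter, and filter_html are hypothetical.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allow-list: tag -> attribute -> permitted values.
# None means "any value is permitted" for that attribute.
ALLOWED = {
    "a": {"href": None},
    "p": {"align": {"left", "right", "center"}},
    "b": {},
}


class AllowListFilter(HTMLParser):
    """Re-emit only permitted tags, attributes, and attribute values."""

    def __init__(self, allowed):
        super().__init__(convert_charrefs=True)
        self.allowed = allowed
        self.out = []

    def handle_starttag(self, tag, attrs):
        rules = self.allowed.get(tag)
        if rules is None:
            return  # tag not permitted: drop the tag, keep its text content
        kept = []
        for name, value in attrs:
            if name not in rules:
                continue  # attribute not permitted on this tag
            permitted = rules[name]
            if permitted is None or value in permitted:
                kept.append(f' {name}="{escape(value or "", quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in self.allowed:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Text was unescaped by the parser, so escape it again on output.
        self.out.append(escape(data, quote=False))


def filter_html(html, allowed=ALLOWED):
    parser = AllowListFilter(allowed)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

In this sketch a disallowed tag is removed but the text inside it is kept; whether that is the right policy for elements such as script is a design choice the sketch does not settle.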
HTML::Tidy is an HTML checker in a handy dandy object. It's meant as a
replacement for HTML::Lint. If you're currently an HTML::Lint user looking to
migrate, see the section "Converting from HTML::Lint".
When working with text it is convenient and common to want to truncate
strings to make them fit a desired context. For example, you might have a menu
that is only 100px wide and prefer that its text not wrap, so you'd truncate
it at around 15-30 characters, depending on preference and typeface size.
This is trivial with plain text and substr, but with HTML it is somewhat
difficult because whitespace has fluid significance and open tags that
are not properly closed destroy well-formedness and can wreck an entire
layout.
HTML::Truncate attempts to account for those two problems by padding
truncation for spacing and entities and closing any tags that remain
open at the point of truncation.
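A rough sketch of that approach: count only visible characters, treat a run of whitespace or an entity as a single character, and close whatever tags are still open at the cut. It is written in Python with the standard html.parser module and is not HTML::Truncate's Perl interface; truncate_html, _Truncator, and the VOID set are hypothetical names, and the sketch assumes reasonably well-nested input.

```python
import re
from html import escape
from html.parser import HTMLParser

# Elements that never take a closing tag (partial list, for illustration).
VOID = {"br", "hr", "img", "input", "meta", "link"}


class _Truncator(HTMLParser):
    def __init__(self, limit):
        # convert_charrefs=True means "&amp;" reaches handle_data as one "&".
        super().__init__(convert_charrefs=True)
        self.remaining = limit
        self.open_tags = []
        self.out = []

    def handle_starttag(self, tag, attrs):
        if self.remaining <= 0:
            return  # past the cut: ignore everything that follows
        rendered = "".join(
            f' {name}="{escape(value or "", quote=True)}"' for name, value in attrs
        )
        self.out.append(f"<{tag}{rendered}>")
        if tag not in VOID:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if self.remaining <= 0 or tag not in self.open_tags:
            return
        # Close inner tags first so the output stays well nested.
        while self.open_tags:
            top = self.open_tags.pop()
            self.out.append(f"</{top}>")
            if top == tag:
                break

    def handle_data(self, data):
        if self.remaining <= 0:
            return
        text = re.sub(r"\s+", " ", data)  # a whitespace run counts once
        piece = text[: self.remaining]
        self.remaining -= len(piece)
        self.out.append(escape(piece, quote=False))


def truncate_html(html, limit):
    parser = _Truncator(limit)
    parser.feed(html)
    parser.close()
    # Close any tags still open at the point of truncation, innermost first.
    closers = "".join(f"</{tag}>" for tag in reversed(parser.open_tags))
    return "".join(parser.out) + closers
```

For example, truncating `<p>Hello <b>big   world</b> and more</p>` to 9 visible characters yields `<p>Hello <b>big</b></p>` in this sketch.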
Data::Phrasebook is a collection of modules for accessing phrasebooks
from various data sources.
Common uses of phrasebooks include handling error codes, accessing
databases via SQL queries, and storing written-language phrases. Examples are the
mime.types file and the hosts file, both of which use a simple
phrasebook design.
Hailo is a fast and lightweight Markov engine intended to replace AI::MegaHAL.
Hailo has a Mouse (or Moose) based core with pluggable storage, tokenizer, and
engine backends.
Hailo is similar to MegaHAL in functionality; the main differences (with the
default backends) are better scalability, drastically less memory usage, an
improved tokenizer, and tidier output.
With Hailo, you can create, modify, and query Hailo brains. To use Hailo in
event-driven POE applications, you can use the POE::Component::Hailo wrapper.
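To make "Markov engine" concrete, here is a deliberately tiny Python sketch of the underlying idea: learn which token follows each pair of tokens, then generate text by walking those transitions. It is not Hailo's code or API. TinyMarkov and its methods are hypothetical, the whitespace tokenizer is an assumption, and the "brain" is held in memory, whereas Hailo uses pluggable storage and tokenizer backends.

```python
import random
from collections import defaultdict


class TinyMarkov:
    """Order-N Markov chain over whitespace-separated tokens."""

    def __init__(self, order=2, seed=None):
        self.order = order
        # state (tuple of N tokens) -> list of tokens seen after that state
        self.transitions = defaultdict(list)
        self.rng = random.Random(seed)

    def learn(self, sentence):
        tokens = sentence.split()
        # None marks the start and end of a sentence.
        padded = [None] * self.order + tokens + [None]
        for i in range(len(padded) - self.order):
            state = tuple(padded[i : i + self.order])
            self.transitions[state].append(padded[i + self.order])

    def reply(self, max_tokens=50):
        state = (None,) * self.order
        words = []
        while len(words) < max_tokens:
            choices = self.transitions.get(state)
            if not choices:
                break
            nxt = self.rng.choice(choices)
            if nxt is None:  # learned end-of-sentence marker
                break
            words.append(nxt)
            state = state[1:] + (nxt,)
        return " ".join(words)
```

After learning a single sentence, every state has exactly one successor, so reply() reproduces that sentence; variety only appears once several sentences share token pairs.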
Hash::Merge merges two arbitrarily deep hashes into a single hash.
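The operation itself, a recursive merge of nested mappings, can be sketched in a few lines of Python. This illustrates the concept only and is not Hash::Merge's Perl interface; deep_merge is a hypothetical name, and letting the left-hand value win when both sides hold a non-mapping value for the same key is an assumption made for this example.

```python
import copy


def deep_merge(left, right):
    """Return a new dict combining two arbitrarily nested dicts."""
    merged = copy.deepcopy(left)  # never modify the inputs
    for key, value in right.items():
        if key not in merged:
            merged[key] = copy.deepcopy(value)
        elif isinstance(merged[key], dict) and isinstance(value, dict):
            # Both sides are mappings: merge them recursively.
            merged[key] = deep_merge(merged[key], value)
        # Otherwise there is a conflict; the left value wins (assumption).
    return merged
```

For instance, merging `{"a": {"x": 1}, "b": 2}` with `{"a": {"y": 3}, "b": 9, "c": 4}` gives `{"a": {"x": 1, "y": 3}, "b": 2, "c": 4}` under that left-wins rule.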
IO::CSVHeaderFile is a module that adds read/write CSV capabilities.
KinoSearch is a loose port of the Java search engine library Apache Lucene,
written in Perl and C. The archetypal application is website search, but it
can be put to many different uses.
KinoSearch1 is a fork of KinoSearch version 0.165 intended to provide stability
and backwards compatibility. For the latest features, see the main branch.
Features
* Extremely fast and scalable - can handle millions of documents
* Full support for 12 Indo-European languages
* Support for boolean operators AND, OR, and AND NOT; parenthetical
  groupings; and prepended +plus and -minus
* Algorithmic selection of relevant excerpts and highlighting of search terms
  within excerpts
* Highly customizable query and indexing APIs
* Phrase matching
* Stemming
* Stoplists
Kwalify is a parser, schema validator, and data binding tool for
YAML and JSON.
This package provides a Perl 5 implementation of Kwalify.
The LaTeX::Driver module encapsulates the details of invoking the
LaTeX programs to format a LaTeX document. Formatting with LaTeX
is complicated; there are potentially many programs to run and the
output of those programs must be monitored to determine whether
further processing is required.
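The "monitor the output, rerun if needed" loop can be sketched as below, in Python. It is an illustration of the idea, not LaTeX::Driver's implementation or interface. format_until_stable and run_once are hypothetical names; run_once stands in for "invoke the formatter once and return the text of its log", and the sketch checks only for the "Rerun to get" wording that LaTeX prints when cross-references may have changed.

```python
import re

# LaTeX asks for another pass with warnings such as
# "Label(s) may have changed. Rerun to get cross-references right."
RERUN_PATTERN = re.compile(r"Rerun to get")


def format_until_stable(run_once, max_runs=5):
    """Call run_once() until its log no longer asks for a rerun.

    run_once is a caller-supplied function (hypothetical) that performs
    one formatting pass and returns the log text. Returns the number of
    passes that were needed.
    """
    for run in range(1, max_runs + 1):
        log_text = run_once()
        if not RERUN_PATTERN.search(log_text):
            return run
    raise RuntimeError(f"log still requests a rerun after {max_runs} passes")
```

In real use run_once would typically start the formatter as a subprocess and read the resulting .log file; a full driver also has to decide when to run auxiliary programs such as bibliography or index processors, which this sketch leaves out.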