Text::Language::Guess guesses a document's language. Its implementation
is simple: using "Text::ExtractWords" and "Lingua::StopWords" from CPAN,
it counts how many of the known stopwords the document contains for
each language supported by "Lingua::StopWords".
Each word in the document recognized as a stopword of a particular
language scores one point for that language.
The "language_guess()" function takes a document as a parameter and
returns the abbreviation of the language that it is most likely written
in.
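The scoring scheme can be sketched in a few lines of Python. This is only an
illustration of the idea, not the module's code: the stopword lists below are
tiny, made-up samples (the real lists come from Lingua::StopWords), and the
function merely borrows the name "language_guess".

```python
import re
from collections import Counter

# Hypothetical, heavily abbreviated stopword lists for illustration only.
STOPWORDS = {
    "en": {"the", "and", "is", "of", "to", "in", "it"},
    "de": {"der", "die", "das", "und", "ist", "nicht", "ein"},
    "fr": {"le", "la", "les", "et", "est", "un", "une"},
}

def language_guess(text):
    """Return the code of the language with the most stopword hits, or None."""
    # Extract lowercase alphabetic words, ignoring digits and punctuation.
    words = re.findall(r"[^\W\d_]+", text.lower())
    scores = Counter()
    for word in words:
        for lang, stops in STOPWORDS.items():
            if word in stops:
                scores[lang] += 1  # one point per recognized stopword
    if not scores:
        return None  # no stopword of any known language was found
    return scores.most_common(1)[0][0]
```

With such short lists a tie between languages is possible; this sketch simply
returns whichever language was scored first in that case.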
This is an experimental distribution that attempts to intuit the underlying
indent "policy" for a text file (most likely a source code file).
Converts the EOL and EOF conventions in the passed string to a
canonical form, handling 'mixed' EOL conventions.
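As a rough illustration, the conversion might look like the Python sketch
below. It assumes that "EOF convention" means whether the last line is
terminated by an EOL; the function name and its parameters are hypothetical
and do not mirror the module's interface.

```python
import re

def canonicalize_eol(text, eol="\n", fix_last=True):
    """Rewrite every line ending in text to eol, even when styles are mixed."""
    # Match CRLF before bare CR or LF so a Windows line end counts only once.
    text = re.sub(r"\r\n|\r|\n", eol, text)
    # Assumed EOF handling: make sure a non-empty text ends with an EOL.
    if fix_last and text and not text.endswith(eol):
        text += eol
    return text
```

Doing the replacement in a single pass is what keeps mixed input safe: a
two-step replace could turn "\r\n" into two line ends.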
Generates random Latin-looking text.
Markdown is a text-to-HTML filter; it translates an easy-to-read and
easy-to-write structured text format into HTML. Markdown's text format
is most similar to that of plain text email, and supports features such
as headers, *emphasis*, code blocks, blockquotes, and links.
Markdown's syntax is designed not as a generic markup language, but
specifically to serve as a front-end to (X)HTML. You can use span-level
HTML tags anywhere in a Markdown document, and you can use block-level
HTML tags (like <div> and <table>) as well.
Text::FixedLength was made to be able to manipulate fixed length field
records. You can manipulate arrays of data, or files of data. This
module allows you to change between delimited and fixed length records.
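The conversion between the two record styles can be pictured with a short
Python sketch. The helper names, the field widths, and the choice to truncate
overlong fields are all assumptions made for the example, not the module's
behaviour.

```python
def delimited_to_fixed(line, widths, delimiter=","):
    """Pad (or truncate) each delimited field to its fixed width."""
    fields = line.split(delimiter)
    return "".join(f.ljust(w)[:w] for f, w in zip(fields, widths))

def fixed_to_delimited(record, widths, delimiter=","):
    """Slice a fixed-length record into fields and join them with delimiter."""
    fields, pos = [], 0
    for w in widths:
        fields.append(record[pos:pos + w].rstrip())  # drop the padding
        pos += w
    return delimiter.join(fields)
```

For example, with widths 3, 2 and 4 the line "ab,c,defgh" becomes the record
"ab c defg", and converting that record back yields "ab,c,defg".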
-Anton
<tobez@FreeBSD.org>
Text::NSP - The Ngram Statistic Package allows a user to count
sequences of Ngrams in large corpora of text, and measure their
association.
The module NSP.pm is a stub that doesn't have any real functionality.
The real work is done by five programs:
count.pl, statistic.pl, rank.pl, combig.pl, and kocos.pl.
These are not modules; they are run from the command line.
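To make "counting Ngrams and measuring their association" concrete, here is a
small Python sketch. It is not how the NSP programs are implemented: the
function names are hypothetical, and pointwise mutual information is used
only as one familiar example of an association measure.

```python
import math
from collections import Counter

def count_ngrams(tokens, n=2):
    """Count every run of n consecutive tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def pmi(bigram, bigram_counts):
    """Pointwise mutual information (base 2) of a bigram.

    Marginals are taken from the bigram table itself: how often the first
    word occurs in first position and the second word in second position.
    """
    x, y = bigram
    total = sum(bigram_counts.values())
    joint = bigram_counts[bigram]
    if joint == 0:
        return float("-inf")  # the pair was never observed together
    left = sum(c for (a, _), c in bigram_counts.items() if a == x)
    right = sum(c for (_, b), c in bigram_counts.items() if b == y)
    return math.log2(joint * total / (left * right))
```

A pair that occurs together more often than its parts would suggest receives
a positive score; independent words score near zero.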
+-------+      +-------------+
| BEGIN >---+  |             |
+-------+   +--> Do you need |
               | to make a   N------+
      +--------Y flowchart?  |      |
      |        |             |      |
      |        +-------------+      |
      |                             |
      |         +------------+      |
      |         |            |      |
+-----V-------+ | So use it. |      |
|             | |            |      |
| Then my     | +--^---V-----+      |
| module may  |    |   |            |
| help.       |    |   |            |
|             >----+   |            |
+-------------+        |            |
                       |      +-----V-------+
                       |      |             |
                       |      | Then go do  |
                       +------> something   |
                              | else.       |
                              |             |
                              +-------------+
-Anton
<tobez@FreeBSD.org>
This module provides functions that deal with formatting data with
Content-Type 'text/plain; format=flowed' as described in RFC2646
(http://www.rfc-editor.org/rfc/rfc2646.txt). In a nutshell,
format=flowed text solves the problem in plain text files where it
is not known which lines can be considered a logical paragraph,
enabling lines to be automatically flowed (wrapped and/or joined)
as appropriate when displaying.
In format=flowed, a soft newline is expressed as " \n", while hard
newlines are expressed as "\n". Soft newlines can be automatically
deleted or inserted as appropriate when the text is reformatted.
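The soft/hard newline distinction can be sketched in Python as follows. This
is a simplified illustration with hypothetical function names; it leaves out
the other parts of RFC2646 such as quoting and space-stuffing.

```python
import textwrap

def unflow(text):
    """Join lines that end in a soft newline (" \\n") into paragraphs."""
    paragraphs, current = [], ""
    for line in text.splitlines():
        if line.endswith(" "):
            current += line  # soft newline: the paragraph continues
        else:
            paragraphs.append(current + line)  # hard newline: paragraph ends
            current = ""
    if current:
        paragraphs.append(current.rstrip(" "))  # text ended on a soft newline
    return paragraphs

def flow(paragraph, width=72):
    """Wrap one paragraph, marking each inserted break as a soft newline."""
    # Reserve one column for the trailing space that marks a soft break.
    lines = textwrap.wrap(paragraph, width=width - 1,
                          break_long_words=False, break_on_hyphens=False)
    return " \n".join(lines)
```

Because only soft newlines are removed or inserted, a paragraph with single
spaces between words survives a flow() and unflow() round trip unchanged.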
The format routine will format under all circumstances, even if the width
is too small to contain the longest words. Text::Wrap will die under
these circumstances, although I am told this is fixed. If columns is set
to a small number and the words (together with any leading 'whitespace')
are longer than that, then there will be a single word on each line. This
lets you make a simple word list which could be indented or right
aligned.
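The one-word-per-line effect can be demonstrated with a short Python sketch.
It relies on Python's standard textwrap module rather than this module's own
routine, and format_narrow is a hypothetical helper name.

```python
import textwrap

def format_narrow(text, columns, indent=""):
    """Wrap text to columns without ever splitting or rejecting a long word."""
    return textwrap.wrap(text, width=columns,
                         initial_indent=indent, subsequent_indent=indent,
                         break_long_words=False, break_on_hyphens=False)

# A width smaller than every word yields one word per line.
word_list = format_narrow("alpha beta gamma", 3)

# The same list, right aligned in a six-character column.
right_aligned = [word.rjust(6) for word in word_list]
```

Passing an indent string instead produces the indented variant of the same
word list.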
-Anton
<tobez@FreeBSD.org>