Contains renderers and parsers for both XML and HTML 5 document fragments,
which share data structures so that it's easy to work with both. Document
fragments are bits of documents, which are not constrained by some of the
high-level structure rules (in particular, they may contain more than one
root element). Note that this is not a compliant HTML 5 parser. Rather,
it is a parser for HTML 5 compliant documents. It does not implement the
HTML 5 parsing algorithm, and should generally be expected to perform
correctly only on documents that you trust to conform to HTML 5. This is
not a suitable library for implementing web crawlers or other software
that will be exposed to documents from outside sources. The result is also
not the HTML 5 node structure, but rather something closer to the physical
structure. For example, omitted start tags are not inserted (and so, their
corresponding end tags must also be omitted).
`chpp' is a preprocessor: its main purpose is to modify input text by
including other input files and by expanding macros.
Two features mainly distinguish `chpp' from other text processors:
* `chpp' is non-intrusive. This means that you can take your
favorite text and it is very unlikely that it will be changed when
piped through `chpp'. Due to this feature it is pretty easy to
start using `chpp' since you can just start writing your text and
need not concern yourself with `chpp' sitting in the background
changing it for no obvious reason.
* `chpp' is not just a package for performing simple macro expansion,
but can indeed be considered a full-fledged programming language.
Most importantly, it provides support for complex data structures,
namely lists and hashes (associative arrays), which can be nested
arbitrarily.
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in
English. The goal is for names with the same pronunciation to be encoded to the
same representation so that they can be matched despite minor differences in
spelling. Soundex is the most widely known of all phonetic algorithms and is
often used (incorrectly) as a synonym for "phonetic algorithm". Improvements to
Soundex are the basis for many modern phonetic algorithms. (Wikipedia, 2007)
Text::Soundex implements the original soundex algorithm developed by Robert
Russell and Margaret Odell, patented in 1918 and 1922, as well as a variation
called "American Soundex" used for US census data and currently maintained by
the National Archives and Records Administration (NARA).
The soundex algorithm may be recognized from Donald Knuth's The Art of Computer
Programming. The algorithm described by Knuth is the NARA algorithm.
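As an illustration of the encoding idea, here is a minimal Python sketch of
the American (NARA) variant as it is commonly described. It is not the
Text::Soundex implementation, and the function name american_soundex is made
up for this example. It assumes plain ASCII names and returns an empty string
when the input contains no letters.

```python
# Sketch of American Soundex as commonly described; not Text::Soundex itself.
# Letter groups that share a digit (vowels, H, W and Y have no digit).
CODES = {}
for digit, letters in enumerate(("BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"), start=1):
    for letter in letters:
        CODES[letter] = str(digit)


def american_soundex(name):
    """Return a four-character code: first letter plus three digits."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""  # assumption: no letters means no code
    result = letters[0]
    prev = CODES.get(letters[0], "")
    for c in letters[1:]:
        if c in "HW":
            continue  # H and W do not separate letters with the same digit
        code = CODES.get(c, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated digit is kept
    return (result + "000")[:4]  # pad with zeros, keep four characters
```

With this sketch, "Robert" and "Rupert" both encode to R163, which is the
kind of match despite spelling differences that the algorithm aims for.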
XML::SAX::Base has a very simple task - to be a base class for PerlSAX drivers
and filters. Its default behaviour is to pass the input directly to the output
unchanged. It can be useful to use this module as a base class so you don't have
to, for example, implement the characters() callback.
The main advantages that it provides are easy dispatching of events the right
way (i.e. it checks for you that the handler has implemented that method, or
has defined an AUTOLOAD), and the guarantee that filters will pass along events
that they aren't implementing to handlers downstream that might nevertheless be
interested in them.
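The same pass-through idea can be sketched with Python's standard library,
whose xml.sax.saxutils.XMLFilterBase plays a comparable role for Python SAX
filters. This is an analogy only, not the Perl API: the class UppercaseText
and the helper run are hypothetical names for this example. The filter
overrides a single callback and leaves every other event to the base class,
which forwards it downstream unchanged.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator


class UppercaseText(XMLFilterBase):
    """Filter that changes character data and forwards all other events."""

    def characters(self, content):
        # Only this callback is overridden; startElement, endElement and
        # the rest are inherited and passed along untouched.
        super().characters(content.upper())


def run(xml_text):
    """Parse xml_text through the filter and return the serialized result."""
    out = io.StringIO()
    filt = UppercaseText(xml.sax.make_parser())
    filt.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    filt.parse(io.BytesIO(xml_text.encode("utf-8")))
    return out.getvalue()
```

Calling run("<a>hello<b>world</b></a>") yields a document whose element
structure is intact and whose text has been upper-cased.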
RiCal is a new Ruby library for parsing, generating, and using iCalendar
(RFC 2445) format data.
RiCal distinguishes itself from existing Ruby libraries by providing
support for:
* Timezone components in calendars. This means that RiCal parses VTIMEZONE
data and instantiates timezone objects which can be used to convert
times in the calendar to and from UTC time. In addition, RiCal allows
created calendars and components to use time zones understood by TZInfo
(from either the TZInfo gem or Rails ActiveSupport >= 2.2).
When a calendar with TZInfo time zones is exported, RFC 2445 conforming
VTIMEZONE components will be included, allowing other programs to process
the result.
* Enumeration of recurring occurrences. For example, if an Event has one
or more recurrence rules, then the occurrences of the event can be enumerated
as a series of Event occurrences.
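To make the idea of enumerating occurrences concrete, here is a small Python
sketch. It does not use RiCal or its API. The function occurrences and its
parameters are hypothetical, and it only models a rule with a fixed interval
in days and a fixed count, which is far simpler than the recurrence rules
RFC 2445 allows.

```python
from datetime import datetime, timedelta


def occurrences(start, interval_days, count):
    """Yield `count` start times, spaced `interval_days` apart.

    Roughly what a rule such as FREQ=WEEKLY;COUNT=3 describes when
    interval_days is 7; real recurrence rules have many more options.
    """
    for i in range(count):
        yield start + timedelta(days=interval_days * i)


# Three weekly occurrences of an event that first starts on 5 January 2009.
weekly = list(occurrences(datetime(2009, 1, 5, 9, 0), 7, 3))
```

Because the function is a generator, a caller can stop early, which matters
for rules that have no end date.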
hamsterdb is a lightweight embedded database engine. It has been
in development for more than three years and concentrates
on ease of use, high performance, stability and portability.
The hamsterdb API is simple and self-documenting. The interface
is similar to other widely-used database engines. Fast algorithms
and data structures guarantee high performance for all scenarios.
Hamsterdb has hundreds of unit tests with a test coverage of over
90%. Each release is tested with thousands of acceptance tests in
many different configurations, tested on up to six different
hardware architectures and operating systems. Written in plain
ANSI-C, hamsterdb runs on many architectures: Intel-compatible
(x86, x64), PowerPC, SPARC, ARM, RISC and others. Tested operating
systems include Microsoft Windows, Microsoft Windows CE, Linux,
SunOS and other Unices.
This module implements a very fast JSON encoder/decoder for Python.
JSON stands for JavaScript Object Notation and is a lightweight, text-based
data exchange format which is easy for humans to read/write and for machines
to parse/generate. JSON is completely language independent and has
implementations in most programming languages, making it ideal for
data exchange and storage.
The module is written in C and is up to 250 times faster than the other
Python JSON implementations that are written directly in Python.
This speed gain varies with the complexity of the data and the operation and
is in the range of 10-200 times for encoding operations and in the range of
100-250 times for decoding operations.
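For readers unfamiliar with the two operations being compared, the sketch
below shows an encode and decode round trip. The description above does not
name this module's functions, so the example uses Python's standard json
module as a stand-in. The variable names are illustrative only.

```python
import json

# A small Python value made of the types JSON can represent.
record = {"name": "example", "tags": ["a", "b"], "size": 3, "ok": True, "none": None}

# Encoding: Python value -> JSON text (sorted keys for stable output).
text = json.dumps(record, sort_keys=True)

# Decoding: JSON text -> Python value.
restored = json.loads(text)
```

Encoding followed by decoding returns an equal value here because the record
only uses types that map directly onto JSON.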
Foma is a compiler, programming language, and C library for constructing
finite-state automata and transducers for various uses. It has specific
support for many natural language processing applications such as producing
morphological analyzers. Although NLP applications are probably the main use
of foma, it is sufficiently generic to use for a large number of purposes.
The foma interface is similar to the Xerox xfst interface, and supports
most of the commands and the regular expression syntax in xfst.
Many grammars written for xfst compile out-of-the-box with foma.
The library contains efficient implementations of all classical
automata/transducer algorithms: determinization, minimization, epsilon-removal,
composition, boolean operations. Also, more advanced construction methods
are available: context restriction, quotients, first-order regular logic,
transducers from replacement rules, etc.
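As an illustration of one of the classical algorithms named above,
determinization, here is a minimal Python sketch of the textbook subset
construction. It is not foma's implementation or API. The functions
determinize and accepts and the dictionary-based automaton layout are
assumptions made for this example, and the input automaton is assumed to
have no epsilon moves.

```python
from collections import deque


def determinize(start, accepting, transitions, alphabet):
    """Subset construction for an NFA without epsilon moves.

    transitions maps (state, symbol) to a set of target states.
    Returns (dfa_start, dfa_accepting, dfa_transitions), where each
    DFA state is a frozenset of NFA states.
    """
    start_set = frozenset([start])
    dfa = {}
    seen = {start_set}
    queue = deque([start_set])
    while queue:
        current = queue.popleft()
        for symbol in alphabet:
            target = frozenset(
                t for s in current for t in transitions.get((s, symbol), ())
            )
            if not target:
                continue  # no move on this symbol: leave it undefined
            dfa[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    dfa_accepting = {s for s in seen if s & accepting}
    return start_set, dfa_accepting, dfa


def accepts(dfa_start, dfa_accepting, dfa, word):
    """Run the deterministic automaton on word."""
    state = dfa_start
    for symbol in word:
        state = dfa.get((state, symbol))
        if state is None:
            return False
    return state in dfa_accepting


# NFA over {a, b} accepting the strings that end in "ab".
nfa = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}}
d_start, d_accepting, d_trans = determinize(0, {2}, nfa, "ab")
```

For this input the construction reaches three deterministic states, each a
set of states of the original automaton.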
A collection of functions to create spatial weights matrix objects
from polygon contiguities, from point patterns by distance and
tessellations, for summarising these objects, and for permitting
their use in spatial data analysis, including regional aggregation
by minimum spanning tree; a collection of tests for spatial
autocorrelation, including global Moran's I, APLE, Geary's C,
Hubert/Mantel general cross product statistic, Empirical Bayes
estimates and Assunção/Reis Index, Getis/Ord G and multicoloured join
count statistics, local Moran's I and Getis/Ord G, saddlepoint
approximations and exact tests for global and local Moran's I; and
functions for estimating spatial simultaneous autoregressive (SAR)
lag and error models, impact measures for lag models, weighted and
unweighted SAR and CAR spatial regression models, semi-parametric
and Moran eigenvector spatial filtering, GM SAR error models, and
generalized spatial two stage least squares models.
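To show how a spatial weights matrix feeds into one of the tests listed
above, here is a short Python sketch of the standard global Moran's I
formula, I = (n / S0) * sum_ij w_ij (x_i - m)(x_j - m) / sum_i (x_i - m)^2,
where m is the mean of x and S0 is the sum of all weights. It is not this
package's code. The function morans_i and the nested-list weights layout are
assumptions for this example, and no significance test is computed.

```python
def morans_i(values, weights):
    """Global Moran's I for `values` under a square `weights` matrix.

    weights[i][j] is the spatial weight between regions i and j,
    for example 1 for contiguous regions and 0 otherwise.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    num = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * (num / den)


# Four regions in a row, each contiguous with its immediate neighbours.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
trend = morans_i([1, 2, 3, 4], chain)        # similar neighbours
alternating = morans_i([1, 4, 1, 4], chain)  # dissimilar neighbours
```

A smooth trend gives a positive value and an alternating pattern gives a
negative one, which is the sign behaviour the statistic is designed to show.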
microdc is a command-line based Direct Connect client written in C by Oskar
Liljeblad and designed to build and run on modern POSIX compatible systems.
It uses the GNU Readline library for user interaction. Despite the command-line
user interface, microdc is quite user friendly and simple to use.
microdc2 is a further improvement (fork) of microdc based on Oskar's code
version 0.11.0. After version 0.12.0 the project was renamed to microdc2 at
Oskar's request.
Features of microdc2 include:
- Nearly full support of the original Direct Connect protocol
- GNU Readline support for command line editing and history
- Sensible tab-completion of commands, user names, local files, remote
files, speed names, and connection names
- One process per connection for optimal transfer rates
- Small memory footprint