textproc/xmlhtml- (Score: 2.549062E-4)
XML parser and renderer with HTML 5 quirks mode
Contains renderers and parsers for both XML and HTML 5 document fragments, which share data structures so that it's easy to work with both. Document fragments are bits of documents, which are not constrained by some of the high-level structure rules (in particular, they may contain more than one root element). Note that this is not a compliant HTML 5 parser. Rather, it is a parser for HTML 5 compliant documents. It does not implement the HTML 5 parsing algorithm, and should generally be expected to perform correctly only on documents that you trust to conform to HTML 5. This is not a suitable library for implementing web crawlers or other software that will be exposed to documents from outside sources. The result is also not the HTML 5 node structure, but rather something closer to the physical structure. For example, omitted start tags are not inserted (and so, their corresponding end tags must also be omitted).
textproc/chpp-0.3.5 (Score: 2.549062E-4)
Non-intrusive full-featured text preprocessor
`chpp' is a preprocessor. Therefore, its main purpose is to modify input text by including other input files and by macro expansion. What distinguishes `chpp' from other text processors are mainly two features: * `chpp' is non-intrusive. This means that you can take your favorite text and it is very unlikely that it will be changed when piped through `chpp'. Due to this feature it is pretty easy to start using `chpp', since you can just start writing your text and need not concern yourself with `chpp' sitting in the background changing it for no obvious reason. * `chpp' is not just a package for performing simple macro expansion, but can indeed be considered a full-fledged programming language. Most importantly, it provides support for complex data structures, namely lists and hashes (associative arrays), which can be nested arbitrarily.
textproc/Text-Soundex-3.05 (Score: 2.549062E-4)
Implementation of the soundex algorithm
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for names with the same pronunciation to be encoded to the same representation so that they can be matched despite minor differences in spelling. Soundex is the most widely known of all phonetic algorithms and is often used (incorrectly) as a synonym for "phonetic algorithm". Improvements to Soundex are the basis for many modern phonetic algorithms. (Wikipedia, 2007) Text::Soundex implements the original soundex algorithm developed by Robert Russell and Margaret Odell, patented in 1918 and 1922, as well as a variation called "American Soundex" used for US census data, and currently maintained by the National Archives and Records Administration (NARA). The soundex algorithm may be recognized from Donald Knuth's The Art of Computer Programming. The algorithm described by Knuth is the NARA algorithm.
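The encoding idea can be illustrated with a minimal Python sketch of the commonly described American Soundex rules. This is not Text::Soundex's code or API; the function name `soundex` and the handling of non-letter characters are assumptions made for illustration.

```python
# Illustrative sketch only: not the Text::Soundex implementation.
# Letter groups that share a Soundex digit.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(name):
    """Return a 4-character code: first letter plus up to three digits."""
    # Assumption: anything that is not A-Z is simply ignored.
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    prev = CODES.get(letters[0], "")  # the first letter's code also counts
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = CODES.get(ch, "")  # vowels and Y have no code
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code is kept
        if len(result) == 4:
            break
    return (result + "000")[:4]  # pad short codes with zeros


print(soundex("Robert"), soundex("Rupert"))
```

Names that sound alike, such as "Robert" and "Rupert", map to the same code, which is what makes the scheme usable as an index key.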
textproc/XML-SAX-Base-1.08 (Score: 2.549062E-4)
Base class for SAX drivers and filters
XML::SAX::Base has a very simple task: to be a base class for PerlSAX drivers and filters. Its default behaviour is to pass the input directly to the output unchanged. Using this module as a base class means you don't have to implement, for example, the characters() callback yourself. Its main advantages are correct event dispatching (i.e., it checks for you that the handler has implemented the method or has defined an AUTOLOAD) and the guarantee that filters will pass along events they do not implement to downstream handlers that might nevertheless be interested in them.
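The pass-through and dispatch behaviour described above can be sketched conceptually in Python. This is not the XML::SAX::Base API: the class names, the event names (`start_element`, `characters`, `end_element`), and the `_forward` helper are all hypothetical and chosen only to show the pattern.

```python
# Conceptual sketch only: hypothetical names, not the XML::SAX::Base API.
class PassThroughFilter:
    """Forward every event to the downstream handler unchanged."""

    def __init__(self, handler):
        self.handler = handler

    def _forward(self, event, *args):
        # Dispatch only if the handler actually implements the event.
        method = getattr(self.handler, event, None)
        if callable(method):
            method(*args)

    def start_element(self, name):
        self._forward("start_element", name)

    def characters(self, text):
        self._forward("characters", text)

    def end_element(self, name):
        self._forward("end_element", name)


class UppercaseText(PassThroughFilter):
    """Override one event; all others still pass through via the base class."""

    def characters(self, text):
        self._forward("characters", text.upper())


class TextCollector:
    """A handler that only cares about character data."""

    def __init__(self):
        self.chunks = []

    def characters(self, text):
        self.chunks.append(text)


collector = TextCollector()
pipeline = UppercaseText(collector)
pipeline.start_element("greeting")  # collector has no such method: skipped
pipeline.characters("hello")
pipeline.end_element("greeting")
print(collector.chunks)
```

The filter subclass overrides a single callback, and the handler implements only the event it needs; the base class takes care of everything else.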
textproc/ri_cal-0.8.8 (Score: 2.549062E-4)
Library for parsing and generating iCalendar data
RiCal is a new Ruby library for parsing, generating, and using iCalendar (RFC 2445) format data. RiCal distinguishes itself from existing Ruby libraries by providing support for timezone components in calendars. This means that RiCal parses VTIMEZONE data and instantiates timezone objects which can be used to convert times in the calendar to and from UTC. In addition, RiCal allows created calendars and components to use time zones understood by TZInfo (from either the TZInfo gem or Rails ActiveSupport >= 2.2). When a calendar with TZInfo time zones is exported, RFC 2445 conforming VTIMEZONE components are included, allowing other programs to process the result. RiCal also supports enumeration of recurring occurrences: for example, if an Event has one or more recurrence rules, the occurrences of the event can be enumerated as a series of Event occurrences.
databases/hamsterdb-2.1.11 (Score: 2.533303E-4)
Lightweight Embedded Database Engine
hamsterdb is a lightweight embedded database engine. It has been in development for more than three years and concentrates on ease of use, high performance, stability and portability. The hamsterdb API is simple and self-documenting. The interface is similar to other widely-used database engines. Fast algorithms and data structures guarantee high performance for all scenarios. hamsterdb has hundreds of unit tests with a test coverage of over 90%. Each release is tested with thousands of acceptance tests in many different configurations, on up to six different hardware architectures and operating systems. Written in plain ANSI-C, hamsterdb runs on many architectures: Intel-compatible (x86, x64), PowerPC, SPARC, ARM, RISC and others. Tested operating systems include Microsoft Windows, Microsoft Windows CE, Linux, SunOS and other Unices.
devel/cjson-1.1.0 (Score: 2.533303E-4)
Fast JSON encoder/decoder for Python
This module implements a very fast JSON encoder/decoder for Python. JSON stands for JavaScript Object Notation and is a text-based lightweight data exchange format which is easy for humans to read/write and for machines to parse/generate. JSON is completely language independent and has multiple implementations in most programming languages, making it ideal for data exchange and storage. The module is written in C and is up to 250 times faster than the other Python JSON implementations that are written directly in Python. This speed gain varies with the complexity of the data and the operation, and is in the range of 10-200 times for encoding operations and in the range of 100-250 times for decoding operations.
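As a rough illustration of the encode/decode round trip that such a module performs, here is a minimal sketch using Python's standard library `json` module as a stand-in. It does not use or demonstrate cjson's own API, and the sample record is invented for the example.

```python
import json

# Invented sample data covering the basic JSON value types.
record = {
    "name": "example",
    "tags": ["text", "exchange"],
    "count": 3,
    "ratio": 0.5,
    "enabled": True,
    "missing": None,
}

encoded = json.dumps(record)   # Python object -> JSON text
decoded = json.loads(encoded)  # JSON text -> Python object

print(encoded)
print(decoded == record)
```

The encoded form is plain text, which is why the format works for exchange between programs written in different languages.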
math/foma-0.9.17 (Score: 2.533303E-4)
Toolkit for constructing finite-state automata and transducers
Foma is a compiler, programming language, and C library for constructing finite-state automata and transducers for various uses. It has specific support for many natural language processing applications such as producing morphological analyzers. Although NLP applications are probably the main use of foma, it is sufficiently generic to use for a large number of purposes. The foma interface is similar to the Xerox xfst interface, and supports most of the commands and the regular expression syntax in xfst. Many grammars written for xfst compile out-of-the-box with foma. The library contains efficient implementations of all classical automata/transducer algorithms: determinization, minimization, epsilon-removal, composition, boolean operations. Also, more advanced construction methods are available: context restriction, quotients, first-order regular logic, transducers from replacement rules, etc.
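To make one of the listed algorithms concrete, the following is a minimal Python sketch of determinization by subset construction. It is not foma's implementation or interface: the function names, the dictionary encoding of the automaton, and the example automaton are assumptions made for illustration, and epsilon transitions are assumed to have been removed already.

```python
from collections import deque


def determinize(nfa, start, accepting):
    """Subset construction.

    nfa maps (state, symbol) -> set of next states (no epsilon moves).
    Returns (dfa, dfa_start, dfa_accepting); DFA states are frozensets.
    """
    alphabet = sorted({symbol for (_, symbol) in nfa})
    start_set = frozenset([start])
    dfa, dfa_accepting = {}, set()
    seen, queue = {start_set}, deque([start_set])
    while queue:
        current = queue.popleft()
        if current & accepting:
            dfa_accepting.add(current)
        for symbol in alphabet:
            # Union of everything reachable from any member state.
            target = frozenset(
                t for s in current for t in nfa.get((s, symbol), ())
            )
            if not target:
                continue  # no transition: the DFA is left partial
            dfa[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return dfa, start_set, dfa_accepting


def accepts(dfa, start, accepting, word):
    """Run the (partial) DFA over word."""
    state = start
    for symbol in word:
        state = dfa.get((state, symbol))
        if state is None:
            return False
    return state in accepting


# Example NFA over {a, b} accepting strings that end in "ab".
nfa = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}}
dfa, dfa_start, dfa_accepting = determinize(nfa, 0, {2})
print(accepts(dfa, dfa_start, dfa_accepting, "aab"))
```

Each state of the resulting automaton is a set of states of the original one, so the deterministic machine tracks every path the nondeterministic machine could be on.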
math/spdep-0.6.6 (Score: 2.533303E-4)
Spatial dependence: weighting schemes, statistics, and models
A collection of functions to create spatial weights matrix objects from polygon contiguities, from point patterns by distance and tessellations, for summarising these objects, and for permitting their use in spatial data analysis, including regional aggregation by minimum spanning tree; a collection of tests for spatial autocorrelation, including global Moran's I, APLE, Geary's C, Hubert/Mantel general cross product statistic, Empirical Bayes estimates and Assunção/Reis Index, Getis/Ord G and multicoloured join count statistics, local Moran's I and Getis/Ord G, saddlepoint approximations and exact tests for global and local Moran's I; and functions for estimating spatial simultaneous autoregressive (SAR) lag and error models, impact measures for lag models, weighted and unweighted SAR and CAR spatial regression models, semi-parametric and Moran eigenvector spatial filtering, GM SAR error models, and generalized spatial two stage least squares models.
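As an illustration of how a spatial weights matrix feeds a test statistic, here is a minimal Python sketch of global Moran's I, computed as I = (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2, where z_i is the deviation of value i from the mean and S0 is the sum of all weights. This is not spdep's code (spdep is an R package); the function name, the plain list-of-lists weights format, and the binary neighbour weights in the example are assumptions.

```python
def morans_i(values, weights):
    """Global Moran's I for values and a square weights matrix."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]              # deviations from the mean
    s0 = sum(sum(row) for row in weights)       # sum of all weights
    cross = sum(
        weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n)
    )
    return (n / s0) * cross / sum(d * d for d in z)


# Four locations in a row; each is a neighbour of the adjacent ones
# (binary, unstandardised weights: an assumption for this example).
w = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

print(morans_i([1, 2, 3, 4], w))     # smooth trend: positive
print(morans_i([1, -1, 1, -1], w))   # alternating values: negative
```

A positive value indicates that neighbouring locations tend to have similar values, and a negative value indicates that they tend to differ.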
net-p2p/microdc2-0.15.6 (Score: 2.533303E-4)
Command-line based Direct Connect client
microdc is a command-line based Direct Connect client written in C by Oskar Liljeblad and designed to build and run on modern POSIX compatible systems. It uses the GNU Readline library for user interaction. Despite the command-line user interface, microdc is quite user friendly and simple to use. microdc2 is a further improvement (fork) of microdc based on Oskar's code version 0.11.0. After version 0.12.0 the project was renamed to microdc2 at Oskar's request. Features of microdc2 include: - Nearly full support of the original Direct Connect protocol - GNU Readline support for command line editing and history - Sensible tab-completion of commands, user names, local files, remote files, speed names, and connection names - One process per connection for optimal transfer rates - Small memory footprint