This is a small shell script intended to be used in portable Unix install
scripts for showing progress bars.
The overall goal is to write a minimally complex shell script (thus a program
that needs no compilation) that is as robust as possible to work on as many
Bourne shells and operating systems as possible, and that implements 'cat'
with an ASCII progress bar and some other nifty features.
This is pure Bourne shell code. (For sh, ash, ksh, zsh, bash, ...)
The script is mainly intended to be used in portable install scripts, where
you can use the body of the script.
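The idea of 'cat' with an ASCII progress bar can be illustrated with a minimal
sketch. It is written in Python rather than Bourne shell and is not the
script's own code; the names render_bar and cat_with_progress, the bar layout
and the chunk size are assumptions made for illustration.

```python
import sys


def render_bar(done, total, width=20):
    """Return an ASCII bar such as '[#####-----]  50%' (hypothetical layout)."""
    done = min(done, total)  # never draw past the end of the bar
    filled = width * done // total if total else width
    percent = 100 * done // total if total else 100
    return "[" + "#" * filled + "-" * (width - filled) + "] %3d%%" % percent


def cat_with_progress(src, dst, total, report=sys.stderr, chunk=4096):
    """Copy src to dst like 'cat', redrawing the bar on report after each chunk."""
    done = 0
    while True:
        data = src.read(chunk)
        if not data:
            break
        dst.write(data)
        done += len(data)
        # "\r" returns to the start of the line so the bar is redrawn in place.
        report.write("\r" + render_bar(done, total))
    report.write("\n")
    return done
```

The bar goes to a separate stream (standard error by default) so that the
copied data on the output stream stays untouched.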
The Digester package lets you configure an XML -> Java object mapping module,
which triggers certain actions called rules whenever a particular pattern of
nested XML elements is recognized. A rich set of predefined rules is available
for your use, or you can also create your own. Advanced features of Digester
include:
- Ability to plug in your own pattern matching engine, if the standard one is
not sufficient for your requirements.
- Optional namespace-aware processing, so that you can define rules that are
relevant only to a particular XML namespace.
- Encapsulation of Rules into RuleSets that can be easily and conveniently
reused in more than one application that requires the same type of
processing.
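The pattern-triggered rule idea can be sketched as follows. This is a Python
illustration built on the standard xml.sax module, not Digester's Java API;
the RuleHandler class, the digest function and the "a/b" pattern syntax are
assumptions made for this example.

```python
import xml.sax


class RuleHandler(xml.sax.ContentHandler):
    """Fire a rule whenever the stack of open elements matches its pattern."""

    def __init__(self, rules):
        super().__init__()
        self.rules = rules  # maps a pattern such as "config/server" to a callable
        self.path = []      # names of the currently open (nested) elements

    def startElement(self, name, attrs):
        self.path.append(name)
        rule = self.rules.get("/".join(self.path))
        if rule is not None:
            rule(dict(attrs.items()))  # hand the attributes to the rule

    def endElement(self, name):
        self.path.pop()


def digest(xml_text, rules):
    """Parse xml_text and trigger the matching rules (hypothetical helper)."""
    xml.sax.parseString(xml_text.encode("utf-8"), RuleHandler(rules))
```

A usage example: collect the name of every server element that sits directly
under config, ignoring server elements nested elsewhere.

```python
names = []
digest(
    '<config><server name="a"/><other><server name="x"/></other></config>',
    {"config/server": lambda attrs: names.append(attrs["name"])},
)
```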
gpp is a general-purpose preprocessor with customizable syntax, suitable for a
wide range of preprocessing tasks. Its independence from any programming
language makes it much more versatile than cpp, while its syntax is lighter
and more flexible than that of m4.
gpp is targeted at all common preprocessing tasks where cpp is not suitable
and where no very sophisticated features are needed. To process text files
and source code in a variety of languages equally efficiently, the syntax
used by gpp is fully customizable. The handling of
comments and strings is especially advanced.
The module is a probability-based, corpus-trained tagger that assigns
POS tags to English text based on a lookup dictionary and probability
values. The tagger determines appropriate tags based on conditional
probabilities - it looks at the preceding tag to figure out what the
appropriate tag is for the current word. Unknown words will be classified
according to word morphology or can be set to be treated as nouns or
other parts of speech.
The tagger also recursively extracts as many nouns and noun phrases as
it can, using a set of regular expressions.
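The tagging approach described above can be illustrated with a toy sketch. It
is written in Python, not Perl, and is not the module's implementation: the
probability tables, the suffix rules and the greedy left-to-right strategy are
all assumptions chosen to keep the example short.

```python
# Hypothetical toy data; a real tagger would derive these from a corpus.
LEXICON = {  # P(tag | word), the lookup dictionary
    "the": {"DET": 1.0},
    "can": {"MD": 0.7, "NN": 0.3},
    "rusts": {"VBZ": 0.6, "NNS": 0.4},
}
TRANSITIONS = {  # P(tag | preceding tag)
    ("START", "DET"): 0.6,
    ("DET", "NN"): 0.8,
    ("DET", "MD"): 0.01,
    ("NN", "VBZ"): 0.5,
    ("NN", "NNS"): 0.1,
}


def guess_unknown(word):
    """Classify an unknown word by its morphology, defaulting to noun."""
    if word.endswith("ing"):
        return {"VBG": 1.0}
    if word.endswith("ly"):
        return {"RB": 1.0}
    return {"NN": 1.0}


def tag(words):
    """Greedily pick, for each word, the tag that best fits the preceding tag."""
    prev = "START"
    tagged = []
    for word in words:
        candidates = LEXICON.get(word.lower()) or guess_unknown(word)
        best = max(
            candidates,
            key=lambda t: candidates[t] * TRANSITIONS.get((prev, t), 1e-4),
        )
        tagged.append((word, best))
        prev = best
    return tagged
```

With this toy data, "can" after "The" is tagged as a noun even though the
lexicon alone prefers the modal reading, because a modal rarely follows a
determiner.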
PDF::API2
There seems to be a growing plethora of Perl modules for creating and
manipulating PDF files.
This module is 'The Next Generation' of Text::PDF::API which initially
provided a nice API around the Text::PDF::* modules created by Martin Hosken.
FEATURES
. Works with more than one PDF file open at once
. Presents an object-oriented API to the user
. Supports the 14 base PDF Core Fonts
. Supports TrueType fonts
. Supports Adobe-Type1 Fonts (pfb/pfa/afm)
. Supports native embedding of bitmap images (jpeg, ppm, png, gif)
. Supports modification of existing PDFs
  and import/cloning of pages
AsmXml is a very fast XML parser and decoder for x86 platforms. It
achieves high speed by using the following features:
* Support of an XML subset only
* Written in pure assembler
* Optimized memory accesses
* Parsing and decoding at the same time
This parser is intended for applications that need intensive processing
of XML. This project will likely appeal to you if XML parsing is a
bottleneck in your data flow. It is especially designed for bulk loads
into databases.
This is not an all-purpose library; it is not designed to be used with
DOM, SAX, XPath and so on. Here, XML is just considered as an
interchange format, not as a working format.
This is a port of the glibc GNU regex engine into Perl. There are few
reasons you would need this. The few I can think of include:
0) You wish to use untrusted user expressions in such a way as to be
able to catch errors. Example: eval { alarm 2; m/((){1024}){1024}/ }
is an instant uncatchable segmentation fault. GNU's regexps will still
fail, but in a timeout way rather than an instant segfault way.
1) You wish to have POSIX compliance on ... something ... Perl's
regexps are slightly different -- arguably better, but different.
This module provides functions that deal with formatting data with
Content-Type 'text/plain; format=flowed' as described in RFC 2646
(http://www.rfc-editor.org/rfc/rfc2646.txt). In a nutshell,
format=flowed text solves the problem in plain text files where it
is not known which lines can be considered a logical paragraph,
enabling lines to be automatically flowed (wrapped and/or joined)
as appropriate when displaying.
In format=flowed, a soft newline is expressed as " \n", while hard
newlines are expressed as "\n". Soft newlines can be automatically
deleted or inserted as appropriate when the text is reformatted.
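The soft/hard newline convention can be illustrated with a minimal sketch. It
is written in Python and is not this module's API; the function names unflow
and flow are hypothetical, and the sketch deliberately ignores the quoting,
space-stuffing and signature-separator rules that RFC 2646 also defines.

```python
import textwrap


def unflow(text):
    """Join lines ending in a soft newline (" \\n") into logical paragraphs."""
    paragraphs = []
    current = ""
    for line in text.splitlines():
        if line.endswith(" "):
            # Soft newline: the trailing space says the paragraph continues.
            current += line
        else:
            # Hard newline: this line closes the paragraph.
            paragraphs.append(current + line)
            current = ""
    if current:
        # Input ended on a soft newline; keep what was collected.
        paragraphs.append(current.rstrip(" "))
    return paragraphs


def flow(paragraph, width=72):
    """Wrap one paragraph, inserting soft newlines between the wrapped lines."""
    return " \n".join(textwrap.wrap(paragraph, width))
```

Because only soft newlines are added or removed, a paragraph wrapped by flow at
one width can be recovered by unflow and wrapped again at another.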
This is a Perl extension to XML::Parser. It adds a new 'Style' to
XML::Parser, called 'Dom', that allows XML::Parser to build an
object-oriented data structure with a DOM Level 1 compliant interface.
The XML::XQL module implements the XQL (XML Query Language) proposal
submitted to the XSL Working Group in September 1998. The spec can
be found at
http://www.w3.org/TandS/QL/QL98/pp/xql.html
Most of the contents related to the XQL syntax can also be found
in the XML::XQL::Tutorial that comes with this distribution. Note
that XQL is not the same as XML-QL!
xmlwrapp is a modern-style C++ library for working with XML data. It provides
a simple, easy-to-use interface for the very powerful libxml2 XML parser.
Features:
* Tree parsing. XML data is parsed and a tree of xml::node objects is
created. Similar to the DOM.
* Event parsing. As XML data is parsed, protected member functions of an
event class are called. Similar to SAX.
* It is easy to construct an XML tree using xml::node objects. Any
xml::node may be inserted into an IOStream causing translation to XML
text data.
* Complete isolation from the backend parser due to the private
implementation (pimpl) idiom.
https://github.com/vslavik/xmlwrapp