A Python module to generate XML easily
ASV is a popular Python module to parse or write simple text file formats
such as comma-separated values (CSV), tab-separated values (TSV) and
colon-separated values. It can easily be extended to cope with other
related file formats.
This port installs both a Python module ("ASV"), and an executable
command-line script ("asv").
This release of ASV requires Python 2.0 or later, and is still to be regarded
as a beta version.
Paraphrasing the website:
Python-DSV is a Python module for importing and exporting DSV (delimiter
separated values) files. DSV is a generalization of CSV (comma separated
values). CSV is a common file format used by many programs to import and
export data.
Features:
- Pure Python
- Optional wxPython GUI
- Optional heuristics for determining file format
- Handles embedded quotes, delimiters and newlines
- Customizable error handling
- Simple to use
- Portable
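The idea of DSV as a generalization of CSV can be sketched with Python's standard csv module. This is only an illustration of the file format, not Python-DSV's own API (which is not shown here); the helper names write_dsv and read_dsv are hypothetical.

```python
import csv
import io


def write_dsv(rows, delimiter):
    """Serialize rows to delimiter-separated text (hypothetical helper)."""
    buf = io.StringIO()
    writer = csv.writer(
        buf,
        delimiter=delimiter,
        quotechar='"',
        quoting=csv.QUOTE_MINIMAL,  # quote only fields that need it
        lineterminator="\n",
    )
    writer.writerows(rows)
    return buf.getvalue()


def read_dsv(text, delimiter):
    """Parse delimiter-separated text back into a list of rows."""
    # newline="" leaves line endings untouched so quoted newlines survive.
    source = io.StringIO(text, newline="")
    return list(csv.reader(source, delimiter=delimiter, quotechar='"'))


rows = [
    ["name", "comment"],
    ["alice", 'said "hi"'],            # embedded quotes
    ["bob", "uses : as a separator"],  # embedded delimiter
    ["carol", "line one\nline two"],   # embedded newline
]

# The same data round-trips with any single-character delimiter.
for delimiter in (",", "\t", ":"):
    assert read_dsv(write_dsv(rows, delimiter), delimiter) == rows
```

Only the delimiter changes between the three formats; quoting is what lets a field contain the delimiter, a quote character, or a newline.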
PyStemmer provides access to efficient algorithms for calculating a
"stemmed" form of a word. This is a form with most of the common
morphological endings removed, hopefully representing a common
linguistic base form. This is most useful in building search engines
and information retrieval software; for example, a search with stemming
enabled should be able to find a document containing "cycling" given the
query "cycles".
PyStemmer provides algorithms for several (mainly European) languages,
by wrapping the libstemmer library from the Snowball project in a Python
module. It also provides access to the classic Porter stemming algorithm
for English: although this has been superseded by an improved algorithm,
the original algorithm may be of interest to information retrieval
researchers wishing to reproduce results of earlier experiments.
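How stemming helps a search engine can be sketched with a deliberately crude suffix stripper. This is a toy written for illustration, not the Porter or Snowball algorithm and not PyStemmer's API; the names toy_stem, build_index and search are hypothetical.

```python
def toy_stem(word):
    """Strip one common English ending (toy rule, NOT Porter/Snowball)."""
    word = word.lower()
    for suffix in ("ing", "es", "s", "e"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_index(documents):
    """Map each stemmed token to the set of document ids containing it."""
    index = {}
    for doc_id, text in documents.items():
        for token in text.split():
            stem = toy_stem(token.strip('.,;:!?"'))
            index.setdefault(stem, set()).add(doc_id)
    return index


def search(index, query):
    """Look up a single-word query by its stem."""
    return index.get(toy_stem(query), set())


documents = {
    1: "Cycling is popular in the city.",
    2: "The bus was late.",
}
index = build_index(documents)

# "cycles" and "cycling" share the stem "cycl", so document 1 is found.
matches = search(index, "cycles")
```

Both the indexed text and the query pass through the same stemmer, which is why the query "cycles" can match a document that only contains "cycling". A real stemmer applies far more careful rules than this sketch.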
You can think of pss as an enhanced grep designed to search
inside source code files. pss is very similar to the Perl ack
tool (see https://bitbucket.org/eliben/pss/wiki/PssAndAck).
pyExcelerator is a Python library that can generate Excel 97+ files and import
Excel 95+ files. It supports Unicode in Excel files, and can use a variety of
formatting features and printing options. It can dump Excel and OLE2 compound
files.
RSS2Gen is a Python library for generating RSS 2.0 feeds.
Libtre is an attempt to create a lightweight, robust, and efficient fully
POSIX-compliant regexp matching library. There is still some work left, but
the results so far are promising.
At the core of Libtre is a new algorithm for regular expression matching with
submatch addressing. The algorithm uses linear worst-case time in the length
of the text being searched, and quadratic worst-case time in the length of the
regular expression used. In other words, the time complexity of the algorithm
is O(M^2N), where M is the length of the regular expression and N is the length
of the text. The space used is also quadratic in the length of the regex, but
does not depend on the searched string. This quadratic behaviour occurs only
in pathological cases which are probably very rare in practice.
Python bindings for the LT XML API and toolkit.
PyWordNet is a Python interface to the WordNet database of word meanings
and lexical relationships. (A lexical relationship is a relationship
between words, such as synonym, antonym, hypernym ("poodle" -> "dog"),
and hyponym ("dog" -> "poodle").)