Tempita is a small templating language for text substitution.
This isn't meant to be the Next Big Thing in templating; it's
just a handy little templating language for when your project
outgrows string.Template or % substitution. It's small, it
embeds Python in strings, and it doesn't do much else.
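As a rough illustration of the point where string.Template runs out, the sketch below uses only the standard library: the template handles plain $name substitution, so the loop has to live in Python code. The names (row, people, report) are made up for this example, and the Tempita form in the comment is an untested sketch of its loop syntax, not output from a real run.

```python
from string import Template

# string.Template only substitutes $placeholders; it has no loops or conditionals.
row = Template("$name: $score")

# Hypothetical data, invented for this illustration.
people = [
    {"name": "Ada", "score": 90},
    {"name": "Lin", "score": 85},
]

# The repetition has to be written in Python, outside the template.
report = "\n".join(row.substitute(person) for person in people)
print(report)

# With Tempita the loop could sit inside the template text itself,
# roughly like this (assumed syntax, not executed here):
#
#   {{for person in people}}
#   {{person['name']}}: {{person['score']}}
#   {{endfor}}
```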
agate is a Python data analysis library that is optimized for humans
instead of machines. It is an alternative to numpy and pandas that
solves real-world problems with readable code.
agate was previously known as journalism.
Python interface for the XSL Transformations (XSLT) library being developed
for the GNOME project.
Utilities for the documentation of Python modules.
PyEnchant is a set of language bindings and some wrapper classes to make
the excellent Enchant spellchecker available as a Python module.
The bindings are generated using SWIG. It includes all the functionality
of Enchant with the flexibility of Python and a nice 'Pythonic'
object-oriented interface. It also aims to provide some higher-level
functionality than is available in the C API.
Genshi is a Python library that provides an integrated set of components
for parsing, generating, and processing HTML, XML or other textual content
for output generation on the web. The major feature is a template language,
which is heavily inspired by Kid.
Libtre is an attempt to create a lightweight, robust, and efficient fully
POSIX compliant regexp matching library. There is still some work left, but
the results so far are promising.
At the core of Libtre is a new algorithm for regular expression matching with
submatch addressing. The algorithm uses linear worst-case time in the length
of the text being searched, and quadratic worst-case time in the length of the
regular expression. In other words, the time complexity of the algorithm
is O(M^2 N), where M is the length of the regular expression and N is the length
of the text. The space used is also quadratic in the length of the regex, but
does not depend on the searched string. This quadratic behaviour occurs only
in pathological cases, which are probably very rare in practice.
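To make the stated bound concrete, here is a small arithmetic sketch. It does not implement or measure Libtre; worst_case_bound is a hypothetical helper that simply evaluates M^2 * N with the constant factor ignored, to show how the bound scales.

```python
def worst_case_bound(m: int, n: int) -> int:
    """Evaluate the quoted worst-case bound M^2 * N (constant factor ignored).

    m is the regex length and n is the text length, as in the description.
    This is an illustration of the formula, not a measurement of Libtre.
    """
    return m * m * n


base = worst_case_bound(10, 1000)

# Doubling the text length doubles the bound (linear in N).
double_text = worst_case_bound(10, 2000)

# Doubling the regex length quadruples the bound (quadratic in M).
double_regex = worst_case_bound(20, 1000)

print(base, double_text, double_regex)
```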
NLTK is a leading platform for building Python programs to work with human
language data. It provides easy-to-use interfaces to over 50 corpora and
lexical resources such as WordNet, along with a suite of text processing
libraries for classification, tokenization, stemming, tagging, parsing,
and semantic reasoning, and an active discussion forum.
Thanks to a hands-on guide introducing programming fundamentals alongside
topics in computational linguistics, NLTK is suitable for linguists,
engineers, students, educators, researchers, and industry users alike.
NLTK is available for Windows, Mac OS X, and Linux. Best of all, NLTK is
a free, open source, community-driven project.
NLTK has been called "a wonderful tool for teaching, and working in,
computational linguistics using Python" and "an amazing library to play
with natural language".
OpenPyxl is a Python library to read/write Excel 2007 xlsx/xlsm files.
Pygments is a syntax highlighting package written in Python.
It is a generic syntax highlighter for general use in all kinds of software
such as forum systems, wikis or other applications that need to prettify
source code. Highlights are:
* a wide range of common languages and markup formats is supported
* special attention is paid to details, increasing quality by a fair amount
* support for new languages and formats is added easily
* a number of output formats, presently HTML, LaTeX, RTF and ANSI sequences
* it is usable as a command-line tool and as a library
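Library use might look like the minimal sketch below, assuming Pygments is installed. It uses the highlight function with PythonLexer and two of the formatters mentioned above (HTML and ANSI sequences); the wrapper names highlight_to_html and highlight_to_ansi are hypothetical helpers for this example.

```python
def highlight_to_html(source: str) -> str:
    """Return source rendered as HTML markup (requires Pygments)."""
    from pygments import highlight
    from pygments.formatters import HtmlFormatter
    from pygments.lexers import PythonLexer

    return highlight(source, PythonLexer(), HtmlFormatter())


def highlight_to_ansi(source: str) -> str:
    """Return source with ANSI colour sequences for a terminal (requires Pygments)."""
    from pygments import highlight
    from pygments.formatters import TerminalFormatter
    from pygments.lexers import PythonLexer

    return highlight(source, PythonLexer(), TerminalFormatter())


if __name__ == "__main__":
    snippet = 'print("hello")\n'
    print(highlight_to_html(snippet))
    print(highlight_to_ansi(snippet))
```

The same lexer is reused with a different formatter for each output format, which is the separation the list above describes.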