Bowtie is an ultrafast, memory-efficient short read aligner. It aligns short
DNA sequences (reads) to the human genome at a rate of over 25 million 35-bp
reads per hour.
Chemeq is a basic standalone filter written in C++, flex, and bison.
It takes strings like: 2H2 + O2 ---> 2 H2O and can output pretty LaTeX code,
useful messages and much more. It aims to be embeddable in education tools.
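
To make the input format above concrete, here is a minimal Python sketch that
parses a string such as "2H2 + O2 ---> 2 H2O" and checks whether the atoms
balance. It is not Chemeq's implementation (Chemeq is written in C++ with flex
and bison); the function names parse_side and is_balanced are hypothetical,
and the sketch assumes simple formulas without charges, parentheses, or state
symbols.

```python
import re
from collections import Counter


def parse_side(side):
    """Return total atom counts for one side of an equation, e.g. '2H2 + O2'.

    Assumption: terms are separated by '+', each term is an optional integer
    coefficient followed by a formula made of element symbols and counts.
    """
    totals = Counter()
    for term in side.split("+"):
        term = term.strip()
        match = re.fullmatch(r"(\d*)\s*((?:[A-Z][a-z]?\d*)+)", term)
        if match is None:
            raise ValueError(f"cannot parse term: {term!r}")
        coefficient = int(match.group(1) or 1)
        # Each element symbol is optionally followed by its count in the formula.
        for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", match.group(2)):
            totals[element] += coefficient * int(count or 1)
    return totals


def is_balanced(equation):
    """Return True if both sides of 'left ---> right' have equal atom counts."""
    left, right = re.split(r"\s*-+>\s*", equation, maxsplit=1)
    return parse_side(left) == parse_side(right)


if __name__ == "__main__":
    print(is_balanced("2H2 + O2 ---> 2 H2O"))
```

A grammar-based parser, as flex and bison provide, scales better than regular
expressions once ions, nested groups, or hydrates need to be supported.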
gff2ps is a script program developed with the aim of converting
gff-formatted records into high quality one-dimensional plots in
PostScript. Such plots may be useful for comparing genomic structures
and for visualizing outputs from genome annotation programs.
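
As an illustration of the kind of record such a tool consumes, the following
Python sketch parses one tab-separated GFF line into named fields. It assumes
the classic nine-column GFF layout (seqname, source, feature, start, end,
score, strand, frame, group); GffRecord and parse_gff_line are hypothetical
names and are not part of gff2ps.

```python
from typing import NamedTuple, Optional


class GffRecord(NamedTuple):
    seqname: str
    source: str
    feature: str
    start: int            # 1-based, inclusive
    end: int              # 1-based, inclusive
    score: Optional[float]
    strand: str
    frame: str
    group: str


def parse_gff_line(line):
    """Parse one GFF line; return None for blank lines and '#' comments."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    fields = line.split("\t")
    if len(fields) < 8:
        raise ValueError(f"expected at least 8 columns, got {len(fields)}")
    seqname, source, feature, start, end, score, strand, frame = fields[:8]
    # The ninth column is optional in older GFF files.
    group = fields[8] if len(fields) > 8 else ""
    return GffRecord(
        seqname, source, feature, int(start), int(end),
        None if score == "." else float(score),
        strand, frame, group,
    )


if __name__ == "__main__":
    record = parse_gff_line("chr1\tgenscan\texon\t100\t250\t0.95\t+\t0\tgene1")
    # Feature length on the sequence axis, as a one-dimensional plot would draw it.
    print(record.feature, record.end - record.start + 1)
```

Because start and end are both inclusive, the length of a feature along the
sequence axis is end - start + 1.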
MUMmer is a modular system for the rapid whole genome alignment of finished
or draft sequence. This package provides an efficient suffix tree library,
seed-and-extend alignment, SNP detection, repeat detection, and
visualization tools.
HTSlib is an implementation of a unified C library for accessing common file
formats, such as SAM, CRAM, VCF, and BCF, used for high-throughput sequencing
data. It is the core library used by samtools and bcftools.
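
For readers unfamiliar with these formats, the sketch below splits one SAM
alignment line into its eleven mandatory tab-separated fields and tests two
bits of the FLAG field. It only illustrates the text layout of SAM; it does
not use or mirror the HTSlib C API, and SamRecord, parse_sam_line,
is_unmapped, and is_reverse are hypothetical names.

```python
from typing import List, NamedTuple


class SamRecord(NamedTuple):
    qname: str
    flag: int
    rname: str
    pos: int              # 1-based leftmost mapping position
    mapq: int
    cigar: str
    rnext: str
    pnext: int
    tlen: int
    seq: str
    qual: str
    tags: List[str]       # optional TAG:TYPE:VALUE fields, left unparsed


def parse_sam_line(line):
    """Parse one SAM alignment line (not a '@' header line)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError(f"expected at least 11 fields, got {len(fields)}")
    return SamRecord(
        fields[0], int(fields[1]), fields[2], int(fields[3]), int(fields[4]),
        fields[5], fields[6], int(fields[7]), int(fields[8]),
        fields[9], fields[10], fields[11:],
    )


def is_unmapped(record):
    """FLAG bit 0x4 marks a read that is unmapped."""
    return bool(record.flag & 0x4)


def is_reverse(record):
    """FLAG bit 0x10 marks a read aligned to the reverse strand."""
    return bool(record.flag & 0x10)


if __name__ == "__main__":
    line = "r001\t99\tref\t7\t30\t8M2I4M1D3M\t=\t37\t39\tTTAGATAAAGGATACTG\t*\tNM:i:1"
    record = parse_sam_line(line)
    print(record.qname, record.pos, is_unmapped(record), is_reverse(record))
```

Binary formats such as BAM, CRAM, and BCF cannot be read this way, which is
one reason to rely on a shared library rather than ad hoc parsing.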
Migrate estimates effective population sizes and past migration rates between
two or "n" populations assuming a migration matrix model with asymmetric
migration rates and different subpopulation sizes. The n-population migrate
can use sequence data, microsatellite data or electrophoretic data.
MUSCLE is multiple alignment software for protein and nucleotide sequences.
The name stands for multiple sequence comparison by log-expectation.
A range of options is provided, giving you the choice of optimizing
accuracy, speed, or some compromise between the two. Default parameters are
those that give the best average accuracy in the published tests. MUSCLE
can achieve both better average accuracy and better speed than CLUSTALW or
T-Coffee, depending on the chosen options.
Citation:
Edgar, R. C. (2004) MUSCLE: multiple sequence alignment with high accuracy
and high throughput. Nucleic Acids Research 32(5): 1792-1797.
Edgar, R. C. (2004) MUSCLE: a multiple sequence alignment method with
reduced time and space complexity. BMC Bioinformatics 5(1): 113.
The NAR paper gives only a brief overview of the algorithm and
implementation details. For a full discussion of the method and many of
the non-default options that it offers, please see the BMC paper.
The NCBI (National Center for Biotechnology Information) development toolkit
contains various libraries needed by NCBI applications, as well as a
software suite that includes, among other things, NCBI BLAST 2.0.
From the README:
The NCBI Software Development Toolkit was developed for the production and
distribution of GenBank, Entrez, BLAST, and related services by NCBI. We
make it freely available to the public without restriction to facilitate
the use of NCBI by the scientific community. However, please understand
that while we feel we have done a high quality job, this is not commercial
software.
The documentation lags considerably behind the software and we must make
any changes required by our data production needs. Nonetheless, many people
have found it a useful and stable basis for a number of tools and
applications.
The Bio::SCF module allows you to read and update (in a restricted
way) SCF chromatographic sequence files. It is an interface to
Roger Staden's io-lib. See the installation directions for further
instructions.
"The SEQIO package is a set of C functions which can read and write
biological sequence files formatted using various file formats and which
can be used to perform database searches on biological databases."
- from the README file