primegen is a small, fast library to generate prime numbers in
order. It generates the 50847534 primes up to 1000000000 in just 8
seconds on a Pentium II-350; it prints them in decimal in just 35
seconds.
primegen can generate primes up to 1000000000000000, although it
is not optimized for primes past 32 bits. It uses the Sieve of Atkin
instead of the traditional Sieve of Eratosthenes.
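
For readers unfamiliar with the Sieve of Atkin, the following is a
minimal sketch of its simplified textbook form in Python. It is not
primegen's implementation, which is written in C and heavily optimized;
the function name sieve_of_atkin is hypothetical.

```python
from math import isqrt


def sieve_of_atkin(limit):
    """Return all primes <= limit in order (simplified textbook form)."""
    if limit < 2:
        return []
    candidate = [False] * (limit + 1)
    root = isqrt(limit)
    # Toggle each n once per solution of the quadratic forms below;
    # n stays a candidate if it has an odd number of solutions.
    for x in range(1, root + 1):
        for y in range(1, root + 1):
            n = 4 * x * x + y * y
            if n <= limit and n % 12 in (1, 5):
                candidate[n] = not candidate[n]
            n = 3 * x * x + y * y
            if n <= limit and n % 12 == 7:
                candidate[n] = not candidate[n]
            n = 3 * x * x - y * y
            if x > y and n <= limit and n % 12 == 11:
                candidate[n] = not candidate[n]
    # Remove multiples of squares of primes, which the toggling
    # step above can leave marked.
    for n in range(5, root + 1):
        if candidate[n]:
            for multiple in range(n * n, limit + 1, n * n):
                candidate[multiple] = False
    primes = [p for p in (2, 3) if p <= limit]
    primes.extend(n for n in range(5, limit + 1) if candidate[n])
    return primes


print(sieve_of_atkin(50))
```
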
The Statistics::Basic Perl module provides a number of very basic
statistical parameters, including the mean, the median, and the standard
deviation. It is reportedly faster than a similar module,
Statistics::Descriptive.
Suppose you flip a coin 100 times, and it turns up heads 70 times. Is
the coin fair?
Suppose you roll a die 100 times, and it shows 30 sixes. Is the die
loaded?
In statistics, the chi-square test calculates "how random" a series of
numbers is. But it doesn't simply say "yes" or "no". Instead, it gives
you a confidence interval, which sets upper and lower bounds on the
likelihood that the variation in your data is due to chance. See the
examples below.
There's just one function in this module: chisquare(). Instead of
returning the bounds on the confidence interval in a tidy little
two-element array, it returns an English string. This was a deliberate
design choice---many people misinterpret chi-square results, and the
string helps clarify the meaning.
-Anton
<tobez@FreeBSD.org>
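
To make the coin question above concrete, here is a minimal Python
sketch of the statistic that a chi-square test is built on. It is an
illustration only and does not reproduce the module's chisquare()
function or its English output; the name chi_square_statistic is
hypothetical.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# 100 coin flips: 70 heads and 30 tails, against 50/50 for a fair coin.
coin = chi_square_statistic([70, 30], [50, 50])
print(coin)  # 16.0
```

With one degree of freedom, a statistic of 16.0 is above the usual
0.1% critical value of about 10.83, so a result this lopsided would
occur by chance less than 0.1% of the time with a fair coin.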
The "Statistics::Contingency" class helps you calculate several useful
statistical measures based on 2x2 "contingency tables". I use these measures
to help judge the results of automatic text categorization experiments, but
they are useful in other situations as well.
The general usage flow is to tally a whole bunch of results in the
"Statistics::Contingency" object, then query that object to obtain the
measures you are interested in. When all results have been collected, you
can get a report on accuracy, precision, recall, F1, and so on, with both
macro-averaging and micro-averaging over categories.
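
The difference between macro-averaging and micro-averaging can be
sketched in Python as follows. This assumes each category is tallied
as a (true positives, false positives, false negatives) triple; the
function names and the sample counts are hypothetical and are not
part of the Statistics::Contingency interface.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 for one 2x2 contingency table."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def macro_and_micro(tables):
    """tables maps category name -> (tp, fp, fn)."""
    per_category = [precision_recall_f1(*t) for t in tables.values()]
    # Macro: average the per-category measures, each category equal.
    macro = tuple(
        sum(m[i] for m in per_category) / len(per_category)
        for i in range(3)
    )
    # Micro: pool the raw counts first, then compute the measures once.
    pooled = [sum(t[i] for t in tables.values()) for i in range(3)]
    micro = precision_recall_f1(*pooled)
    return macro, micro


macro, micro = macro_and_micro({"sports": (8, 2, 2), "politics": (1, 1, 3)})
print("macro:", macro)
print("micro:", micro)
```

Macro-averaging gives small categories the same weight as large ones,
while micro-averaging is dominated by the categories with the most
documents.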
Statistics::Frequency is a simple class for counting elements, in other
words, their frequencies. The goal of Statistics::Frequency is simply to
provide a container for sets of elements and their respective frequencies.
The Statistics::Lite module is a lightweight, functional alternative
to larger, more complete, object-oriented statistics packages.
As such, it is likely to be better suited, in general, to smaller
data sets.
This package attempts to make it easier to write scripts that use
BigInts/BigFloats in a transparent way. They use the rewritten
versions of Math::BigInt and Math::BigFloat, Math::BigRat (for
bigrat) and optionally Math::BigInt::Lite.
Statistics::R lets Perl control the R (R-project) interpreter
across different architectures and operating systems.
Regression.pm is a multivariate linear regression package.
That is, it estimates the c coefficients for a line-fit of the type
y = c(0)*x(0) + c(1)*x(1) + c(2)*x(2) + ... + c(k)*x(k)
given a data set of N observations, each with k independent x variables
and one y variable. Naturally, N must be greater than k---and preferably
considerably greater. Any reasonable undergraduate statistics book will
explain what a regression is. Most of the time, the user will provide a
constant ('1') as x(0) for each observation in order to allow the
regression package to fit an intercept.
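
The fit described above can be sketched in Python by solving the
normal equations of ordinary least squares. This is an illustration
under that assumption and says nothing about how Regression.pm
computes its estimates internally; the function name
fit_linear_regression is hypothetical.

```python
def fit_linear_regression(rows, ys):
    """Least-squares coefficients c for y = sum(c[i] * x[i]).

    rows holds one list of x values per observation; include a
    constant 1 as the first x value to fit an intercept.
    """
    k = len(rows[0])
    # Augmented normal equations: (X^T X | X^T y).
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * y for r, y in zip(rows, ys))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda row: abs(a[row][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("x variables are linearly dependent")
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(k):
            if row != col:
                factor = a[row][col] / a[col][col]
                for j in range(col, k + 1):
                    a[row][j] -= factor * a[col][j]
    return [a[i][k] / a[i][i] for i in range(k)]


# Four observations lying exactly on y = 2 + 3*x, with x(0) = 1.
coefficients = fit_linear_regression(
    [[1, 0], [1, 1], [1, 2], [1, 3]], [2, 5, 8, 11]
)
print(coefficients)
```

The first coefficient is the intercept because every observation
carries the constant 1 in its first position.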
This is the Statistical T-Test module to compare 2 independent samples.
It takes 2 arrays of point measures, computes the confidence intervals
using the PointEstimation module (which is also included in this package),
and uses the T-statistic to test the null hypothesis. If the null hypothesis
is rejected, the difference will be given as the lower_clm and upper_clm of
the TTest object.
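
As a rough illustration of the procedure, the Python sketch below
runs a pooled-variance two-sample t-test. It assumes equal variances
in the two samples and takes the critical t value as a parameter
rather than computing it; the function name and the dictionary keys
are hypothetical and do not mirror the TTest object's interface.

```python
import math
from statistics import mean, variance


def pooled_t_test(sample1, sample2, t_critical):
    """Two-sample t-test assuming equal variances.

    t_critical is the two-sided critical value for the chosen
    significance level and n1 + n2 - 2 degrees of freedom.
    """
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    diff = mean(sample1) - mean(sample2)
    pooled_var = (
        (n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)
    ) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t_stat = diff / std_error
    return {
        "t": t_stat,
        "df": df,
        # Confidence limits on the difference of the two means.
        "lower": diff - t_critical * std_error,
        "upper": diff + t_critical * std_error,
        "reject_null": abs(t_stat) > t_critical,
    }


# 2.306 is the two-sided 95% critical value for 8 degrees of freedom.
result = pooled_t_test([5, 6, 7, 8, 9], [1, 2, 3, 4, 5], 2.306)
print(result)
```
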