package owl


Statistics: random number generators, PDF and CDF functions, and hypothesis tests.

The module includes basic statistical functions such as mean, variance, and skew. It also includes the following three submodules.

The Rnd module provides random number generators of various distributions.

The Pdf module provides a range of probability density/mass functions of different distributions.

The Cdf module provides cumulative distribution functions.

Please refer to GSL documentation for details.

Randomisation functions
val seed : int -> unit

seed x sets x as seed for the internal random number generator.

val shuffle : 'a array -> 'a array

shuffle x returns a new array containing the elements of x in shuffled order.

val choose : 'a array -> int -> 'a array

choose x n draws n samples from x without replacement.

val sample : 'a array -> int -> 'a array

sample x n draws n samples from x with replacement.
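The difference between choose and sample can be illustrated with a small sketch that uses only the OCaml standard library. This is not Owl's implementation: the names choose_sketch and sample_sketch are hypothetical, and the standard Random module stands in for Owl's internal generator.

```ocaml
(* Illustrative sketch only: not Owl's implementation. *)

(* Draw n elements without replacement via a partial Fisher-Yates
   shuffle on a copy of x. Requires 0 <= n <= Array.length x. *)
let choose_sketch x n =
  let len = Array.length x in
  if n < 0 || n > len then invalid_arg "choose_sketch: n out of range";
  let y = Array.copy x in
  for i = 0 to n - 1 do
    (* pick j uniformly from [i, len - 1] and swap it into position i *)
    let j = i + Random.int (len - i) in
    let tmp = y.(i) in
    y.(i) <- y.(j);
    y.(j) <- tmp
  done;
  Array.sub y 0 n

(* Draw n elements with replacement: each draw is independent, so the
   same element may appear more than once and n may exceed the length. *)
let sample_sketch x n =
  let len = Array.length x in
  if len = 0 then invalid_arg "sample_sketch: empty array";
  Array.init n (fun _ -> x.(Random.int len))
```

Without replacement, drawing as many elements as the array holds yields a permutation of it; with replacement, repeats are possible for any n.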

Basic statistical functions
val mean : ?w:float array -> float array -> float
val variance : ?w:float array -> ?mean:float -> float array -> float
val std : ?w:float array -> ?mean:float -> float array -> float

std x calculates the standard deviation of x.

val sem : ?w:float array -> ?mean:float -> float array -> float

sem x calculates the standard error of x, also referred to as standard error of the mean.
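The relationship between std and sem can be sketched as follows. The sketch covers only the unweighted case and assumes the n - 1 (sample) denominator for the variance; whether Owl uses exactly this convention is an assumption here, and the *_sketch names are hypothetical.

```ocaml
(* Illustrative sketch only: unweighted formulas, assuming the n - 1
   (sample) denominator. Not Owl's implementation. *)
let mean_sketch x =
  Array.fold_left ( +. ) 0. x /. float_of_int (Array.length x)

let std_sketch x =
  let m = mean_sketch x in
  (* sum of squared deviations from the mean *)
  let ss = Array.fold_left (fun acc v -> acc +. ((v -. m) ** 2.)) 0. x in
  sqrt (ss /. float_of_int (Array.length x - 1))

(* Standard error of the mean: standard deviation divided by sqrt n. *)
let sem_sketch x =
  std_sketch x /. sqrt (float_of_int (Array.length x))
```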

val absdev : ?w:float array -> ?mean:float -> float array -> float
val skew : ?w:float array -> ?mean:float -> ?sd:float -> float array -> float
val kurtosis : ?w:float array -> ?mean:float -> ?sd:float -> float array -> float

kurtosis x returns Pearson's kurtosis of x.

val central_moment : int -> float array -> float
val covariance : ?mean0:float -> ?mean1:float -> float array -> float array -> float
val correlation : float array -> float array -> float
val pearson_r : float array -> float array -> float
val kendall_tau : float array -> float array -> float
val spearman_rho : float array -> float array -> float
val autocorrelation : ?lag:int -> float array -> float
val median : float array -> float

median x returns the median of x.

val percentile : float array -> float -> float

percentile x p returns the p percentile of the data x. p is between 0. and 1. x does not need to be sorted.

val first_quartile : float array -> float

first_quartile x returns the first quartile of x, i.e., the 25th percentile.

val third_quartile : float array -> float

third_quartile x returns the third quartile of x, i.e., the 75th percentile.

val min : float array -> float
val max : float array -> float
val minmax : float array -> float * float
val min_i : float array -> float * int
val max_i : float array -> float * int
val minmax_i : float array -> float * int * float * int
val sort : ?inc:bool -> float array -> float array
val argsort : ?inc:bool -> float array -> int array
val rank : ?ties_strategy:[ `Average | `Min | `Max ] -> float array -> float array

Computes sample's ranks.

The ranking order is from the smallest one to the largest. For example rank [|54.; 74.; 55.; 86.; 56.|] returns [|1.; 4.; 2.; 5.; 3.|]. Note that the ranking starts with one!

ties_strategy controls which ranks are assigned to equal values:

  • `Average the mean of ranks is assigned to each value. Default.
  • `Min the minimum of ranks is assigned to each value.
  • `Max the maximum of ranks is assigned to each value.
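
The tie-handling rules above can be sketched with the standard library alone. This is an illustration of the described behaviour, not Owl's implementation; rank_sketch is a hypothetical name.

```ocaml
(* Illustrative sketch only: not Owl's implementation. *)
let rank_sketch ?(ties_strategy = `Average) x =
  let n = Array.length x in
  (* indices of x ordered by increasing value *)
  let idx = Array.init n (fun i -> i) in
  Array.sort (fun i j -> compare x.(i) x.(j)) idx;
  let r = Array.make n 0. in
  let i = ref 0 in
  while !i < n do
    (* find the end j of the block of equal values starting at position i *)
    let j = ref !i in
    while !j + 1 < n && x.(idx.(!j + 1)) = x.(idx.(!i)) do incr j done;
    (* one-based ranks spanned by the tie block *)
    let lo = float_of_int (!i + 1) and hi = float_of_int (!j + 1) in
    let v =
      match ties_strategy with
      | `Average -> (lo +. hi) /. 2.
      | `Min -> lo
      | `Max -> hi
    in
    for k = !i to !j do r.(idx.(k)) <- v done;
    i := !j + 1
  done;
  r
```

For the input [|10.; 20.; 20.; 30.|], the two tied values occupy ranks 2 and 3, so this sketch assigns them 2.5 under `Average, 2. under `Min, and 3. under `Max.
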
val histogram : float array -> int -> int array
val ecdf : float array -> float array * float array

ecdf x returns (x',f) which are the empirical cumulative distribution function f of x at points x'. x' is just x sorted in increasing order with duplicates removed.
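
A minimal sketch of how such a pair (x', f) can be built is shown here. It assumes the usual definition, in which f at a point is the fraction of observations less than or equal to that point; ecdf_sketch is a hypothetical name and this is not Owl's implementation.

```ocaml
(* Illustrative sketch only: not Owl's implementation. *)
let ecdf_sketch x =
  let n = Array.length x in
  let s = Array.copy x in
  Array.sort compare s;
  let xs = ref [] and fs = ref [] in
  Array.iteri
    (fun i v ->
      (* keep only the last occurrence of each distinct value, so that
         f is the fraction of observations <= that value *)
      if i = n - 1 || s.(i + 1) <> v then begin
        xs := v :: !xs;
        fs := (float_of_int (i + 1) /. float_of_int n) :: !fs
      end)
    s;
  (Array.of_list (List.rev !xs), Array.of_list (List.rev !fs))
```

For [|3.; 1.; 2.; 2.|] the sketch returns the points [|1.; 2.; 3.|] with values [|0.25; 0.75; 1.|].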

val z_score : mu:float -> sigma:float -> float array -> float array
val t_score : float array -> float array
val normlise_pdf : float array -> float array
MCMC: Markov Chain Monte Carlo
val metropolis_hastings : (float array -> float) -> float array -> int -> float array array

TODO: metropolis_hastings f p n is the Metropolis-Hastings MCMC algorithm. f is the pdf of p.

val gibbs_sampling : (float array -> int -> float) -> float array -> int -> float array array

TODO: gibbs_sampling f p n is the Gibbs sampler. f is a sampler based on the full conditional function of all variables.

Hypothesis tests
type tail =
  | BothSide
  | RightSide
  | LeftSide
    (* Types of alternative hypothesis: two-sided, right-sided, or left-sided. *)
val z_test : mu:float -> sigma:float -> ?alpha:float -> ?side:tail -> float array -> bool * float * float

z_test ~mu ~sigma ~alpha ~side x returns a test decision for the null hypothesis that the data x comes from a normal distribution with mean mu and a standard deviation sigma, using the z-test of alpha significance level. The alternative hypothesis is that the mean is not mu.

The result h,p,z: h is true if the test rejects the null hypothesis at the alpha significance level, and false otherwise. p is the p-value and z is the z-score.
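
The two-sided case of this computation can be sketched with the standard library. The sketch is not Owl's implementation: z_test_sketch is a hypothetical name, the default alpha of 0.05 is an assumption, and Float.erfc requires OCaml 4.13 or later.

```ocaml
(* Illustrative sketch only: two-sided case, not Owl's implementation. *)
let z_test_sketch ~mu ~sigma ?(alpha = 0.05) x =
  let n = float_of_int (Array.length x) in
  let mean = Array.fold_left ( +. ) 0. x /. n in
  (* z-score of the sample mean under the null hypothesis *)
  let z = (mean -. mu) /. (sigma /. sqrt n) in
  (* two-sided p-value: P(|Z| >= |z|) = erfc (|z| / sqrt 2) *)
  let p = Float.erfc (Float.abs z /. sqrt 2.) in
  (* reject the null hypothesis when p falls below alpha *)
  (p < alpha, p, z)
```

A one-sided variant would instead use only the upper or lower tail of the standard normal distribution when computing p.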

val t_test : mu:float -> ?alpha:float -> ?side:tail -> float array -> bool * float * float

t_test ~mu ~alpha ~side x returns a test decision of one-sample t-test which is a parametric test of the location parameter when the population standard deviation is unknown. mu is population mean, alpha is the significance level.

val t_test_paired : ?alpha:float -> ?side:tail -> float array -> float array -> bool * float * float

t_test_paired ~alpha ~side x y returns a test decision for the null hypothesis that the data in x – y comes from a normal distribution with mean equal to zero and unknown variance, using the paired-sample t-test.

val t_test_unpaired : ?alpha:float -> ?side:tail -> ?equal_var:bool -> float array -> float array -> bool * float * float

t_test_unpaired ~alpha ~side ~equal_var x y returns a test decision for the null hypothesis that the data in vectors x and y comes from independent random samples from normal distributions with equal means and equal but unknown variances, using the two-sample t-test. The alternative hypothesis is that the data in x and y comes from populations with unequal means.

equal_var indicates whether the two samples are assumed to have the same variance. If the two variances are not assumed to be equal, the test is referred to as Welch's t-test.
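
The effect of equal_var on the test statistic can be sketched as follows, using the textbook pooled-variance and Welch formulas. This is an illustration under those assumptions rather than Owl's implementation; the helper names are hypothetical, and the degrees of freedom and p-value are left out.

```ocaml
(* Illustrative sketch only: t statistic of the two-sample test,
   using textbook formulas. Not Owl's implementation. *)
let mean_var x =
  let n = float_of_int (Array.length x) in
  let m = Array.fold_left ( +. ) 0. x /. n in
  let ss = Array.fold_left (fun acc v -> acc +. ((v -. m) ** 2.)) 0. x in
  (m, ss /. (n -. 1.), n)

let t_stat_sketch ~equal_var x y =
  let m1, v1, n1 = mean_var x and m2, v2, n2 = mean_var y in
  let se2 =
    if equal_var then
      (* pooled variance, weighted by each sample's degrees of freedom *)
      let sp2 =
        (((n1 -. 1.) *. v1) +. ((n2 -. 1.) *. v2)) /. (n1 +. n2 -. 2.)
      in
      sp2 *. ((1. /. n1) +. (1. /. n2))
    else
      (* Welch: each sample contributes its own variance estimate *)
      (v1 /. n1) +. (v2 /. n2)
  in
  (m1 -. m2) /. sqrt se2
```

When the two samples have the same size, both branches give the same statistic; they differ once the sizes and variances are unequal.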

val var_test : ?alpha:float -> ?side:tail -> var:float -> float array -> bool * float * float

var_test ~alpha ~side ~var x returns a test decision for the null hypothesis that the data in x comes from a normal distribution with variance var, using the chi-square variance test. The alternative hypothesis is that x comes from a normal distribution with a different variance.

val jb_test : ?alpha:float -> float array -> bool * float * float

jb_test ~alpha x returns a test decision for the null hypothesis that the data x comes from a normal distribution with an unknown mean and variance, using the Jarque-Bera test.

val fisher_test : ?alpha:float -> ?side:tail -> int -> int -> int -> int -> bool * float * float

fisher_test ~alpha ~side a b c d performs Fisher's exact test for the contingency table |a, b| |c, d|. The result is h,p,z: h is true if the test rejects the null hypothesis at the alpha significance level, and false otherwise. p is the p-value and z is the prior odds ratio.

val runs_test : ?alpha:float -> ?side:tail -> ?v:float -> float array -> bool * float * float

runs_test ~alpha ~v x returns a test decision for the null hypothesis that the data x comes in random order, against the alternative that it does not, by running the Wald–Wolfowitz runs test. The test is based on the number of runs of consecutive values above or below the mean of x. ~v is the reference value; its default is the median of x.

val mannwhitneyu : ?alpha:float -> ?side:tail -> float array -> float array -> bool * float * float

mannwhitneyu ~alpha ~side x y computes the Mann-Whitney rank test on samples x and y. If each sample has fewer than 10 elements and there are no ties, the exact test is used (see Ying Kuen Cheung and Jerome H. Klotz (1997), The Mann Whitney Wilcoxon distribution using linked list, Statistica Sinica 7, 805-813); otherwise the asymptotic normal distribution is used.

val wilcoxon : ?alpha:float -> ?side:tail -> float array -> float array -> bool * float * float
Random numbers, PDF, and CDF
module Rnd : sig ... end

The Rnd module generates random variables from various distributions.

module Pdf : sig ... end

The Pdf module provides the probability density functions of various distributions.

module Cdf : sig ... end

For each random variable distribution, the module includes four corresponding functions (if well-defined).