package scipy

val get_py : string -> Py.Object.t

Get an attribute of this module as a Py.Object.t. This is useful to pass a Python function to another function.

module BrunnerMunzelResult : sig ... end
module CumfreqResult : sig ... end
module DescribeResult : sig ... end
module F_onewayResult : sig ... end
module FriedmanchisquareResult : sig ... end
module HistogramResult : sig ... end
module Jarque_beraResult : sig ... end
module KendalltauResult : sig ... end
module KruskalResult : sig ... end
module Ks_2sampResult : sig ... end
module KstestResult : sig ... end
module KurtosistestResult : sig ... end
module MGCResult : sig ... end
module MannwhitneyuResult : sig ... end
module MapWrapper : sig ... end
module ModeResult : sig ... end
module NormaltestResult : sig ... end
module PointbiserialrResult : sig ... end
module Power_divergenceResult : sig ... end
module RanksumsResult : sig ... end
module RelfreqResult : sig ... end
module RepeatedResults : sig ... end
module SigmaclipResult : sig ... end
module SkewtestResult : sig ... end
module SpearmanrResult : sig ... end
module Ttest_1sampResult : sig ... end
module Ttest_indResult : sig ... end
module Ttest_relResult : sig ... end
module WeightedTauResult : sig ... end
val array : ?dtype:Np.Dtype.t -> ?copy:bool -> ?order:[ `K | `A | `C | `F ] -> ?subok:bool -> ?ndmin:int -> object_:[> `Ndarray ] Np.Obj.t -> unit -> [ `ArrayLike | `Ndarray | `Object ] Np.Obj.t

array(object, dtype=None, *, copy=True, order='K', subok=False, ndmin=0)

Create an array.

Parameters
----------
object : array_like
    An array, any object exposing the array interface, an object whose
    __array__ method returns an array, or any (nested) sequence.
dtype : data-type, optional
    The desired data-type for the array. If not given, then the type will
    be determined as the minimum type required to hold the objects in the
    sequence.
copy : bool, optional
    If true (default), then the object is copied. Otherwise, a copy will
    only be made if __array__ returns a copy, if obj is a nested sequence,
    or if a copy is needed to satisfy any of the other requirements
    (`dtype`, `order`, etc.).
order : {'K', 'A', 'C', 'F'}, optional
    Specify the memory layout of the array. If object is not an array, the
    newly created array will be in C order (row major) unless 'F' is
    specified, in which case it will be in Fortran order (column major).
    If object is an array the following holds.

    ===== ========= ===================================================
    order no copy   copy=True
    ===== ========= ===================================================
    'K'   unchanged F & C order preserved, otherwise most similar order
    'A'   unchanged F order if input is F and not C, otherwise C order
    'C'   C order   C order
    'F'   F order   F order
    ===== ========= ===================================================

    When ``copy=False`` and a copy is made for other reasons, the result is
    the same as if ``copy=True``, with some exceptions for 'A', see the
    Notes section. The default order is 'K'.
subok : bool, optional
    If True, then sub-classes will be passed-through, otherwise the
    returned array will be forced to be a base-class array (default).
ndmin : int, optional
    Specifies the minimum number of dimensions that the resulting array
    should have. Ones will be prepended to the shape as needed to meet
    this requirement.

Returns
-------
out : ndarray
    An array object satisfying the specified requirements.

See Also
--------
empty_like : Return an empty array with shape and type of input.
ones_like : Return an array of ones with shape and type of input.
zeros_like : Return an array of zeros with shape and type of input.
full_like : Return a new array with shape of input filled with value.
empty : Return a new uninitialized array.
ones : Return a new array setting values to one.
zeros : Return a new array setting values to zero.
full : Return a new array of given shape filled with value.

Notes
-----
When order is 'A' and `object` is an array in neither 'C' nor 'F' order,
and a copy is forced by a change in dtype, then the order of the result is
not necessarily 'C' as expected. This is likely a bug.

Examples
--------
>>> np.array([1, 2, 3])
array([1, 2, 3])

Upcasting:

>>> np.array([1, 2, 3.0])
array([ 1.,  2.,  3.])

More than one dimension:

>>> np.array([[1, 2], [3, 4]])
array([[1, 2],
       [3, 4]])

Minimum dimensions 2:

>>> np.array([1, 2, 3], ndmin=2)
array([[1, 2, 3]])

Type provided:

>>> np.array([1, 2, 3], dtype=complex)
array([ 1.+0.j,  2.+0.j,  3.+0.j])

Data-type consisting of more than one element:

>>> x = np.array([(1,2),(3,4)], dtype=[('a','<i4'),('b','<i4')])
>>> x['a']
array([1, 3])

Creating an array from sub-classes:

>>> np.array(np.mat('1 2; 3 4'))
array([[1, 2],
       [3, 4]])

>>> np.array(np.mat('1 2; 3 4'), subok=True)
matrix([[1, 2],
        [3, 4]])

val asarray : ?dtype:Np.Dtype.t -> ?order:[ `C | `F ] -> a:[> `Ndarray ] Np.Obj.t -> unit -> [ `ArrayLike | `Ndarray | `Object ] Np.Obj.t

Convert the input to an array.

Parameters
----------
a : array_like
    Input data, in any form that can be converted to an array. This
    includes lists, lists of tuples, tuples, tuples of tuples, tuples of
    lists and ndarrays.
dtype : data-type, optional
    By default, the data-type is inferred from the input data.
order : {'C', 'F'}, optional
    Whether to use row-major (C-style) or column-major (Fortran-style)
    memory representation. Defaults to 'C'.

Returns
-------
out : ndarray
    Array interpretation of `a`. No copy is performed if the input is
    already an ndarray with matching dtype and order. If `a` is a subclass
    of ndarray, a base class ndarray is returned.

See Also
--------
asanyarray : Similar function which passes through subclasses.
ascontiguousarray : Convert input to a contiguous array.
asfarray : Convert input to a floating point ndarray.
asfortranarray : Convert input to an ndarray with column-major memory order.
asarray_chkfinite : Similar function which checks input for NaNs and Infs.
fromiter : Create an array from an iterator.
fromfunction : Construct an array by executing a function on grid positions.

Examples
--------
Convert a list into an array:

>>> a = [1, 2]
>>> np.asarray(a)
array([1, 2])

Existing arrays are not copied:

>>> a = np.array([1, 2])
>>> np.asarray(a) is a
True

If `dtype` is set, array is copied only if dtype does not match:

>>> a = np.array([1, 2], dtype=np.float32)
>>> np.asarray(a, dtype=np.float32) is a
True
>>> np.asarray(a, dtype=np.float64) is a
False

Contrary to `asanyarray`, ndarray subclasses are not passed through:

>>> issubclass(np.recarray, np.ndarray)
True
>>> a = np.array([(1.0, 2), (3.0, 4)], dtype='f4,i4').view(np.recarray)
>>> np.asarray(a) is a
False
>>> np.asanyarray(a) is a
True

val brunnermunzel : ?alternative:[ `Two_sided | `Less | `Greater ] -> ?distribution:[ `T | `Normal ] -> ?nan_policy:[ `Propagate | `Raise | `Omit ] -> x:Py.Object.t -> y:Py.Object.t -> unit -> float * float

Compute the Brunner-Munzel test on samples x and y.

The Brunner-Munzel test is a nonparametric test of the null hypothesis that when values are taken one by one from each group, the probabilities of getting large values in both groups are equal. Unlike the Wilcoxon-Mann-Whitney U test, this does not require the assumption of equivariance of the two groups. Note that this does not assume the distributions are the same. This test works on two independent samples, which may have different sizes.

Parameters
----------
x, y : array_like
    Array of samples, should be one-dimensional.
alternative : {'two-sided', 'less', 'greater'}, optional
    Defines the alternative hypothesis. The following options are
    available (default is 'two-sided'):

    * 'two-sided'
    * 'less': one-sided
    * 'greater': one-sided
distribution : {'t', 'normal'}, optional
    Defines how to get the p-value. The following options are available
    (default is 't'):

    * 't': get the p-value by t-distribution
    * 'normal': get the p-value by standard normal distribution.
nan_policy : {'propagate', 'raise', 'omit'}, optional
    Defines how to handle when input contains nan. The following options
    are available (default is 'propagate'):

    * 'propagate': returns nan
    * 'raise': throws an error
    * 'omit': performs the calculations ignoring nan values

Returns
-------
statistic : float
    The Brunner-Munzel W statistic.
pvalue : float
    p-value assuming a t distribution. One-sided or two-sided, depending
    on the choice of `alternative` and `distribution`.

See Also
--------
mannwhitneyu : Mann-Whitney rank test on two samples.

Notes
-----
Brunner and Munzel recommended estimating the p-value by the
t-distribution when the size of the data is 50 or less. If the size is
lower than 10, it would be better to use the permuted Brunner-Munzel test
(see [2]_).

References
----------
.. [1] Brunner, E. and Munzel, U. 'The nonparametric Behrens-Fisher
       problem: Asymptotic theory and a small-sample approximation'.
       Biometrical Journal. Vol. 42(2000): 17-25.
.. [2] Neubert, K. and Brunner, E. 'A studentized permutation test for the
       non-parametric Behrens-Fisher problem'. Computational Statistics
       and Data Analysis. Vol. 51(2007): 5192-5204.

Examples
--------
>>> from scipy import stats
>>> x1 = [1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 2, 4, 1, 1]
>>> x2 = [3, 3, 4, 3, 1, 2, 3, 1, 1, 5, 4]
>>> w, p_value = stats.brunnermunzel(x1, x2)
>>> w
3.1374674823029505
>>> p_value
0.0057862086661515377

val cdist : ?metric:[ `Callable of Py.Object.t | `S of string ] -> ?kwargs:(string * Py.Object.t) list -> xa:[> `Ndarray ] Np.Obj.t -> xb:[> `Ndarray ] Np.Obj.t -> Py.Object.t list -> [ `ArrayLike | `Ndarray | `Object ] Np.Obj.t

Compute distance between each pair of the two collections of inputs.

See Notes for common calling conventions.

Parameters
----------
XA : ndarray
    An :math:`m_A` by :math:`n` array of :math:`m_A` original observations
    in an :math:`n`-dimensional space. Inputs are converted to float type.
XB : ndarray
    An :math:`m_B` by :math:`n` array of :math:`m_B` original observations
    in an :math:`n`-dimensional space. Inputs are converted to float type.
metric : str or callable, optional
    The distance metric to use. If a string, the distance function can be
    'braycurtis', 'canberra', 'chebyshev', 'cityblock', 'correlation',
    'cosine', 'dice', 'euclidean', 'hamming', 'jaccard', 'jensenshannon',
    'kulsinski', 'mahalanobis', 'matching', 'minkowski', 'rogerstanimoto',
    'russellrao', 'seuclidean', 'sokalmichener', 'sokalsneath',
    'sqeuclidean', 'wminkowski', 'yule'.
*args : tuple. Deprecated.
    Additional arguments should be passed as keyword arguments.
**kwargs : dict, optional
    Extra arguments to `metric`: refer to each metric documentation for a
    list of all possible arguments.

    Some possible arguments:

    p : scalar
        The p-norm to apply for Minkowski, weighted and unweighted.
        Default: 2.

    w : ndarray
        The weight vector for metrics that support weights (e.g.,
        Minkowski).

    V : ndarray
        The variance vector for standardized Euclidean.
        Default: var(vstack([XA, XB]), axis=0, ddof=1)

    VI : ndarray
        The inverse of the covariance matrix for Mahalanobis.
        Default: inv(cov(vstack([XA, XB].T))).T

    out : ndarray
        The output array. If not None, the distance matrix Y is stored in
        this array. Note: metric independent, it will become a regular
        keyword arg in a future scipy version.

Returns
-------
Y : ndarray
    A :math:`m_A` by :math:`m_B` distance matrix is returned. For each
    :math:`i` and :math:`j`, the metric ``dist(u=XA[i], v=XB[j])`` is
    computed and stored in the :math:`ij` th entry.

Raises
------
ValueError
    An exception is thrown if `XA` and `XB` do not have the same number of
    columns.

Notes
-----
The following are common calling conventions:

1. ``Y = cdist(XA, XB, 'euclidean')``

Computes the distance between :math:`m` points using Euclidean distance (2-norm) as the distance metric between the points. The points are arranged as :math:`m` :math:`n`-dimensional row vectors in the matrix X.

2. ``Y = cdist(XA, XB, 'minkowski', p=2.)``

Computes the distances using the Minkowski distance :math:`||u-v||_p` (:math:`p`-norm) where :math:`p \geq 1`.

3. ``Y = cdist(XA, XB, 'cityblock')``

Computes the city block or Manhattan distance between the points.

4. ``Y = cdist(XA, XB, 'seuclidean', V=None)``

Computes the standardized Euclidean distance. The standardized Euclidean distance between two n-vectors ``u`` and ``v`` is

.. math::

   \sqrt{\sum {(u_i-v_i)^2 / V[x_i]}}

V is the variance vector; V[i] is the variance computed over all the i'th
components of the points. If not passed, it is automatically computed.

5. ``Y = cdist(XA, XB, 'sqeuclidean')``

Computes the squared Euclidean distance :math:`||u-v||_2^2` between the vectors.

6. ``Y = cdist(XA, XB, 'cosine')``

Computes the cosine distance between vectors u and v,

.. math::

   1 - \frac{u \cdot v}{||u||_2 ||v||_2}

where :math:`||*||_2` is the 2-norm of its argument ``*``, and
:math:`u \cdot v` is the dot product of :math:`u` and :math:`v`.

7. ``Y = cdist(XA, XB, 'correlation')``

Computes the correlation distance between vectors u and v. This is

.. math::

   1 - \frac{(u - \bar{u}) \cdot (v - \bar{v})}
            {||(u - \bar{u})||_2 ||(v - \bar{v})||_2}

where :math:`\bar{v}` is the mean of the elements of vector v,
and :math:`x \cdot y` is the dot product of :math:`x` and :math:`y`.


8. ``Y = cdist(XA, XB, 'hamming')``

   Computes the normalized Hamming distance, or the proportion of
   those vector elements between two n-vectors ``u`` and ``v``
   which disagree. To save memory, the matrix ``X`` can be of type
   boolean.

9. ``Y = cdist(XA, XB, 'jaccard')``

   Computes the Jaccard distance between the points. Given two
   vectors, ``u`` and ``v``, the Jaccard distance is the
   proportion of those elements ``u[i]`` and ``v[i]`` that
   disagree where at least one of them is non-zero.

10. ``Y = cdist(XA, XB, 'chebyshev')``

   Computes the Chebyshev distance between the points. The
   Chebyshev distance between two n-vectors ``u`` and ``v`` is the
   maximum norm-1 distance between their respective elements. More
   precisely, the distance is given by

   .. math::

      d(u,v) = \max_i { |u_i-v_i| }.

11. ``Y = cdist(XA, XB, 'canberra')``

   Computes the Canberra distance between the points. The
   Canberra distance between two points ``u`` and ``v`` is

   .. math::

     d(u,v) = \sum_i \frac{ |u_i-v_i| }
                          { |u_i|+|v_i| }.

12. ``Y = cdist(XA, XB, 'braycurtis')``

   Computes the Bray-Curtis distance between the points. The
   Bray-Curtis distance between two points ``u`` and ``v`` is


   .. math::

        d(u,v) = \frac{\sum_i (|u_i-v_i|)}
                      {\sum_i (|u_i+v_i|)}

13. ``Y = cdist(XA, XB, 'mahalanobis', VI=None)``

   Computes the Mahalanobis distance between the points. The
   Mahalanobis distance between two points ``u`` and ``v`` is
   :math:`\sqrt{(u-v)(1/V)(u-v)^T}` where :math:`(1/V)` (the ``VI``
   variable) is the inverse covariance. If ``VI`` is not None,
   ``VI`` will be used as the inverse covariance matrix.

14. ``Y = cdist(XA, XB, 'yule')``

   Computes the Yule distance between the boolean
   vectors. (see `yule` function documentation)

15. ``Y = cdist(XA, XB, 'matching')``

   Synonym for 'hamming'.

16. ``Y = cdist(XA, XB, 'dice')``

   Computes the Dice distance between the boolean vectors. (see
   `dice` function documentation)

17. ``Y = cdist(XA, XB, 'kulsinski')``

   Computes the Kulsinski distance between the boolean
   vectors. (see `kulsinski` function documentation)

18. ``Y = cdist(XA, XB, 'rogerstanimoto')``

   Computes the Rogers-Tanimoto distance between the boolean
   vectors. (see `rogerstanimoto` function documentation)

19. ``Y = cdist(XA, XB, 'russellrao')``

   Computes the Russell-Rao distance between the boolean
   vectors. (see `russellrao` function documentation)

20. ``Y = cdist(XA, XB, 'sokalmichener')``

   Computes the Sokal-Michener distance between the boolean
   vectors. (see `sokalmichener` function documentation)

21. ``Y = cdist(XA, XB, 'sokalsneath')``

   Computes the Sokal-Sneath distance between the vectors. (see
   `sokalsneath` function documentation)


22. ``Y = cdist(XA, XB, 'wminkowski', p=2., w=w)``

   Computes the weighted Minkowski distance between the
   vectors. (see `wminkowski` function documentation)

23. ``Y = cdist(XA, XB, f)``

   Computes the distance between all pairs of vectors in X
   using the user supplied 2-arity function f. For example,
   Euclidean distance between the vectors could be computed
   as follows::

     dm = cdist(XA, XB, lambda u, v: np.sqrt(((u-v)**2).sum()))

   Note that you should avoid passing a reference to one of
   the distance functions defined in this library. For example,::

     dm = cdist(XA, XB, sokalsneath)

   would calculate the pair-wise distances between the vectors in
   X using the Python function `sokalsneath`. This would result in
   sokalsneath being called :math:`{n \choose 2}` times, which
   is inefficient. Instead, the optimized C version is more
   efficient, and we call it using the following syntax::

     dm = cdist(XA, XB, 'sokalsneath')

Examples
--------
Find the Euclidean distances between four 2-D coordinates:

>>> from scipy.spatial import distance
>>> coords = [(35.0456, -85.2672),
...           (35.1174, -89.9711),
...           (35.9728, -83.9422),
...           (36.1667, -86.7833)]
>>> distance.cdist(coords, coords, 'euclidean')
array([[ 0.    ,  4.7044,  1.6172,  1.8856],
       [ 4.7044,  0.    ,  6.0893,  3.3561],
       [ 1.6172,  6.0893,  0.    ,  2.8477],
       [ 1.8856,  3.3561,  2.8477,  0.    ]])


Find the Manhattan distance from a 3-D point to the corners of the unit
cube:

>>> a = np.array([[0, 0, 0],
...               [0, 0, 1],
...               [0, 1, 0],
...               [0, 1, 1],
...               [1, 0, 0],
...               [1, 0, 1],
...               [1, 1, 0],
...               [1, 1, 1]])
>>> b = np.array([[ 0.1,  0.2,  0.4]])
>>> distance.cdist(a, b, 'cityblock')
array([[ 0.7],
       [ 0.9],
       [ 1.3],
       [ 1.5],
       [ 1.5],
       [ 1.7],
       [ 2.1],
       [ 2.3]])
val check_random_state : Py.Object.t -> Py.Object.t

Turn seed into a np.random.RandomState instance

If seed is None (or np.random), return the RandomState singleton used by np.random. If seed is an int, return a new RandomState instance seeded with seed. If seed is already a RandomState instance, return it. If seed is a new-style np.random.Generator, return it. Otherwise, raise ValueError.
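A minimal sketch of this dispatch (assuming the helper is importable from the private module `scipy._lib._util`, which may move between releases):

>>> import numpy as np
>>> from scipy._lib._util import check_random_state
>>> rs = check_random_state(42)    # int -> new RandomState seeded with 42
>>> check_random_state(rs) is rs   # RandomState -> returned unchanged
True
>>> isinstance(check_random_state(None), np.random.RandomState)
True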

val chisquare : ?f_exp:[> `Ndarray ] Np.Obj.t -> ?ddof:int -> ?axis:[ `I of int | `None ] -> f_obs:[> `Ndarray ] Np.Obj.t -> unit -> Py.Object.t * Py.Object.t

Calculate a one-way chi-square test.

The chi-square test tests the null hypothesis that the categorical data has the given frequencies.

Parameters
----------
f_obs : array_like
    Observed frequencies in each category.
f_exp : array_like, optional
    Expected frequencies in each category. By default the categories are
    assumed to be equally likely.
ddof : int, optional
    'Delta degrees of freedom': adjustment to the degrees of freedom for
    the p-value. The p-value is computed using a chi-squared distribution
    with ``k - 1 - ddof`` degrees of freedom, where `k` is the number of
    observed frequencies. The default value of `ddof` is 0.
axis : int or None, optional
    The axis of the broadcast result of `f_obs` and `f_exp` along which to
    apply the test. If axis is None, all values in `f_obs` are treated as
    a single data set. Default is 0.

Returns
-------
chisq : float or ndarray
    The chi-squared test statistic. The value is a float if `axis` is None
    or `f_obs` and `f_exp` are 1-D.
p : float or ndarray
    The p-value of the test. The value is a float if `ddof` and the return
    value `chisq` are scalars.

See Also
--------
scipy.stats.power_divergence

Notes
-----
This test is invalid when the observed or expected frequencies in each
category are too small. A typical rule is that all of the observed and
expected frequencies should be at least 5.

The default degrees of freedom, k-1, are for the case when no parameters of the distribution are estimated. If p parameters are estimated by efficient maximum likelihood then the correct degrees of freedom are k-1-p. If the parameters are estimated in a different way, then the dof can be between k-1-p and k-1. However, it is also possible that the asymptotic distribution is not chi-square, in which case this test is not appropriate.

References
----------
.. [1] Lowry, Richard. 'Concepts and Applications of Inferential
       Statistics'. Chapter 8.
       https://web.archive.org/web/20171022032306/http://vassarstats.net:80/textbook/ch8pt1.html
.. [2] 'Chi-squared test', https://en.wikipedia.org/wiki/Chi-squared_test

Examples
--------
When just `f_obs` is given, it is assumed that the expected frequencies
are uniform and given by the mean of the observed frequencies.

>>> from scipy.stats import chisquare
>>> chisquare([16, 18, 16, 14, 12, 12])
(2.0, 0.84914503608460956)

With `f_exp` the expected frequencies can be given.

>>> chisquare([16, 18, 16, 14, 12, 12], f_exp=[16, 16, 16, 16, 16, 8])
(3.5, 0.62338762774958223)

When `f_obs` is 2-D, by default the test is applied to each column.

>>> obs = np.array([[16, 18, 16, 14, 12, 12], [32, 24, 16, 28, 20, 24]]).T
>>> obs.shape
(6, 2)
>>> chisquare(obs)
(array([ 2.        ,  6.66666667]), array([ 0.84914504,  0.24663415]))

By setting ``axis=None``, the test is applied to all data in the array,
which is equivalent to applying the test to the flattened array.

>>> chisquare(obs, axis=None)
(23.31034482758621, 0.015975692534127565)
>>> chisquare(obs.ravel())
(23.31034482758621, 0.015975692534127565)

`ddof` is the change to make to the default degrees of freedom.

>>> chisquare([16, 18, 16, 14, 12, 12], ddof=1)
(2.0, 0.73575888234288467)

The calculation of the p-values is done by broadcasting the chi-squared
statistic with `ddof`.

>>> chisquare([16, 18, 16, 14, 12, 12], ddof=[0, 1, 2])
(2.0, array([ 0.84914504,  0.73575888,  0.5724067 ]))

`f_obs` and `f_exp` are also broadcast. In the following, `f_obs` has
shape (6,) and `f_exp` has shape (2, 6), so the result of broadcasting
`f_obs` and `f_exp` has shape (2, 6). To compute the desired chi-squared
statistics, we use ``axis=1``:

>>> chisquare([16, 18, 16, 14, 12, 12],
...           f_exp=[[16, 16, 16, 16, 16, 8], [8, 20, 20, 16, 12, 12]],
...           axis=1)
(array([ 3.5 ,  9.25]), array([ 0.62338763,  0.09949846]))

val combine_pvalues : ?method_:[ `Fisher | `Pearson | `Tippett | `Stouffer | `Mudholkar_george ] -> ?weights:[ `Ndarray of [> `Ndarray ] Np.Obj.t | `T1_D of Py.Object.t ] -> pvalues:[ `Ndarray of [> `Ndarray ] Np.Obj.t | `T1_D of Py.Object.t ] -> unit -> float * float

Combine p-values from independent tests bearing upon the same hypothesis.

Parameters
----------
pvalues : array_like, 1-D
    Array of p-values assumed to come from independent tests.
method : {'fisher', 'pearson', 'tippett', 'stouffer', 'mudholkar_george'}, optional
    Name of method to use to combine p-values. The following methods are
    available (default is 'fisher'):

    * 'fisher': Fisher's method (Fisher's combined probability test), the
      sum of the logarithm of the p-values
    * 'pearson': Pearson's method (similar to Fisher's but uses sum of the
      complement of the p-values inside the logarithms)
    * 'tippett': Tippett's method (minimum of p-values)
    * 'stouffer': Stouffer's Z-score method
    * 'mudholkar_george': the difference of Fisher's and Pearson's methods
      divided by 2
weights : array_like, 1-D, optional
    Optional array of weights used only for Stouffer's Z-score method.

Returns
-------
statistic : float
    The statistic calculated by the specified method.
pval : float
    The combined p-value.

Notes
-----
Fisher's method (also known as Fisher's combined probability test) [1]_
uses a chi-squared statistic to compute a combined p-value. The closely
related Stouffer's Z-score method [2]_ uses Z-scores rather than p-values.
The advantage of Stouffer's method is that it is straightforward to
introduce weights, which can make Stouffer's method more powerful than
Fisher's method when the p-values are from studies of different size
[6]_ [7]_. Pearson's method uses :math:`log(1-p_i)` inside the sum whereas
Fisher's method uses :math:`log(p_i)` [4]_. For Fisher's and Pearson's
methods, the sum of the logarithms is multiplied by -2 in the
implementation. This quantity has a chi-square distribution that
determines the p-value. The 'mudholkar_george' method is the difference of
the Fisher's and Pearson's test statistics, each of which includes the -2
factor [4]_. However, the 'mudholkar_george' method does not include these
-2 factors. The test statistic of 'mudholkar_george' is the sum of
logistic random variables, and equation 3.6 in [3]_ is used to approximate
the p-value based on Student's t-distribution.

Fisher's method may be extended to combine p-values from dependent tests [5]_. Extensions such as Brown's method and Kost's method are not currently implemented.

.. versionadded:: 0.15.0

References
----------
.. [1] https://en.wikipedia.org/wiki/Fisher%27s_method
.. [2] https://en.wikipedia.org/wiki/Fisher%27s_method#Relation_to_Stouffer.27s_Z-score_method
.. [3] George, E. O., and G. S. Mudholkar. 'On the convolution of logistic
       random variables.' Metrika 30.1 (1983): 1-13.
.. [4] Heard, N. and Rubin-Delanchey, P. 'Choosing between methods of
       combining p-values.' Biometrika 105.1 (2018): 239-246.
.. [5] Whitlock, M. C. 'Combining probability from independent tests: the
       weighted Z-method is superior to Fisher's approach.' Journal of
       Evolutionary Biology 18, no. 5 (2005): 1368-1373.
.. [6] Zaykin, Dmitri V. 'Optimally weighted Z-test is a powerful method
       for combining probabilities in meta-analysis.' Journal of
       Evolutionary Biology 24, no. 8 (2011): 1836-1841.
.. [7] https://en.wikipedia.org/wiki/Extensions_of_Fisher%27s_method
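A hedged usage sketch (the p-values below are illustrative only):

>>> from scipy.stats import combine_pvalues
>>> statistic, pval = combine_pvalues([0.01, 0.2, 0.3], method='fisher')
>>> statistic, pval = combine_pvalues([0.01, 0.2, 0.3], method='stouffer',
...                                   weights=[1, 2, 3])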

val cumfreq : ?numbins:int -> ?defaultreallimits:Py.Object.t -> ?weights:[> `Ndarray ] Np.Obj.t -> a:[> `Ndarray ] Np.Obj.t -> unit -> [ `ArrayLike | `Ndarray | `Object ] Np.Obj.t * float * float * int

Return a cumulative frequency histogram, using the histogram function.

A cumulative histogram is a mapping that counts the cumulative number of observations in all of the bins up to the specified bin.

Parameters
----------
a : array_like
    Input array.
numbins : int, optional
    The number of bins to use for the histogram. Default is 10.
defaultreallimits : tuple (lower, upper), optional
    The lower and upper values for the range of the histogram. If no
    value is given, a range slightly larger than the range of the values
    in `a` is used. Specifically ``(a.min() - s, a.max() + s)``, where
    ``s = (1/2)(a.max() - a.min()) / (numbins - 1)``.
weights : array_like, optional
    The weights for each value in `a`. Default is None, which gives each
    value a weight of 1.0.

Returns
-------
cumcount : ndarray
    Binned values of cumulative frequency.
lowerlimit : float
    Lower real limit.
binsize : float
    Width of each bin.
extrapoints : int
    Extra points.

Examples
--------
>>> import matplotlib.pyplot as plt
>>> from scipy import stats
>>> x = [1, 4, 2, 1, 3, 1]
>>> res = stats.cumfreq(x, numbins=4, defaultreallimits=(1.5, 5))
>>> res.cumcount
array([ 1.,  2.,  3.,  3.])
>>> res.extrapoints
3

Create a normal distribution with 1000 random values

>>> rng = np.random.RandomState(seed=12345)
>>> samples = stats.norm.rvs(size=1000, random_state=rng)

Calculate cumulative frequencies

>>> res = stats.cumfreq(samples, numbins=25)

Calculate space of values for x

>>> x = res.lowerlimit + np.linspace(0, res.binsize*res.cumcount.size,
...                                  res.cumcount.size)

Plot histogram and cumulative histogram

>>> fig = plt.figure(figsize=(10, 4))
>>> ax1 = fig.add_subplot(1, 2, 1)
>>> ax2 = fig.add_subplot(1, 2, 2)
>>> ax1.hist(samples, bins=25)
>>> ax1.set_title('Histogram')
>>> ax2.bar(x, res.cumcount, width=res.binsize)
>>> ax2.set_title('Cumulative histogram')
>>> ax2.set_xlim([x.min(), x.max()])

>>> plt.show()

val describe : ?axis:[ `I of int | `None ] -> ?ddof:int -> ?bias:bool -> ?nan_policy:[ `Propagate | `Raise | `Omit ] -> a:[> `Ndarray ] Np.Obj.t -> unit -> Py.Object.t * Py.Object.t * Py.Object.t * Py.Object.t * Py.Object.t * Py.Object.t

Compute several descriptive statistics of the passed array.

Parameters
----------
a : array_like
    Input data.
axis : int or None, optional
    Axis along which statistics are calculated. Default is 0. If None,
    compute over the whole array `a`.
ddof : int, optional
    Delta degrees of freedom (only for variance). Default is 1.
bias : bool, optional
    If False, then the skewness and kurtosis calculations are corrected
    for statistical bias.
nan_policy : {'propagate', 'raise', 'omit'}, optional
    Defines how to handle when input contains nan. The following options
    are available (default is 'propagate'):

    * 'propagate': returns nan
    * 'raise': throws an error
    * 'omit': performs the calculations ignoring nan values

Returns
-------
nobs : int or ndarray of ints
    Number of observations (length of data along `axis`). When 'omit' is
    chosen as nan_policy, each column is counted separately.
minmax : tuple of ndarrays or floats
    Minimum and maximum value of data array.
mean : ndarray or float
    Arithmetic mean of data along axis.
variance : ndarray or float
    Unbiased variance of the data along axis, denominator is number of
    observations minus one.
skewness : ndarray or float
    Skewness, based on moment calculations with denominator equal to the
    number of observations, i.e. no degrees of freedom correction.
kurtosis : ndarray or float
    Kurtosis (Fisher). The kurtosis is normalized so that it is zero for
    the normal distribution. No degrees of freedom are used.

See Also
--------
skew, kurtosis

Examples
--------
>>> from scipy import stats
>>> a = np.arange(10)
>>> stats.describe(a)
DescribeResult(nobs=10, minmax=(0, 9), mean=4.5,
               variance=9.166666666666666, skewness=0.0,
               kurtosis=-1.2242424242424244)
>>> b = [[1, 2], [3, 4]]
>>> stats.describe(b)
DescribeResult(nobs=2, minmax=(array([1, 2]), array([3, 4])),
               mean=array([2., 3.]), variance=array([2., 2.]),
               skewness=array([0., 0.]), kurtosis=array([-2., -2.]))

val energy_distance : ?u_weights:Py.Object.t -> ?v_weights:Py.Object.t -> u_values:Py.Object.t -> v_values:Py.Object.t -> unit -> float

Compute the energy distance between two 1D distributions.

.. versionadded:: 1.0.0

Parameters
----------
u_values, v_values : array_like
    Values observed in the (empirical) distribution.
u_weights, v_weights : array_like, optional
    Weight for each value. If unspecified, each value is assigned the
    same weight. `u_weights` (resp. `v_weights`) must have the same
    length as `u_values` (resp. `v_values`). If the weight sum differs
    from 1, it must still be positive and finite so that the weights can
    be normalized to sum to 1.

Returns
-------
distance : float
    The computed distance between the distributions.

Notes
-----
The energy distance between two distributions :math:`u` and :math:`v`,
whose respective CDFs are :math:`U` and :math:`V`, equals to:

.. math::

    D(u, v) = \left( 2\mathbb E|X - Y| - \mathbb E|X - X'| -
    \mathbb E|Y - Y'| \right)^{1/2}

where :math:`X` and :math:`X'` (resp. :math:`Y` and :math:`Y'`) are
independent random variables whose probability distribution is :math:`u`
(resp. :math:`v`).

As shown in [2]_, for one-dimensional real-valued variables, the energy
distance is linked to the non-distribution-free version of the
Cramer-von Mises distance:

.. math::

    D(u, v) = \sqrt{2} l_2(u, v) = \left( 2 \int_{-\infty}^{+\infty}
    (U-V)^2 \right)^{1/2}

Note that the common Cramer-von Mises criterion uses the distribution-free version of the distance. See [2]_ (section 2), for more details about both versions of the distance.

The input distributions can be empirical, therefore coming from samples whose values are effectively inputs of the function, or they can be seen as generalized functions, in which case they are weighted sums of Dirac delta functions located at the specified values.

References
----------
.. [1] 'Energy distance', https://en.wikipedia.org/wiki/Energy_distance
.. [2] Szekely 'E-statistics: The energy of statistical samples.' Bowling
       Green State University, Department of Mathematics and Statistics,
       Technical Report 02-16 (2002).
.. [3] Rizzo, Szekely 'Energy distance.' Wiley Interdisciplinary Reviews:
       Computational Statistics, 8(1):27-38 (2015).
.. [4] Bellemare, Danihelka, Dabney, Mohamed, Lakshminarayanan, Hoyer,
       Munos 'The Cramer Distance as a Solution to Biased Wasserstein
       Gradients' (2017). :arXiv:`1705.10743`.

Examples
--------
>>> from scipy.stats import energy_distance
>>> energy_distance([0], [2])
2.0000000000000004
>>> energy_distance([0, 8], [0, 8], [3, 1], [2, 2])
1.0000000000000002
>>> energy_distance([0.7, 7.4, 2.4, 6.8], [1.4, 8. ],
...                 [2.1, 4.2, 7.4, 8. ], [7.6, 8.8])
0.88003340976158217

val epps_singleton_2samp : ?t:[> `Ndarray ] Np.Obj.t -> x:Py.Object.t -> y:Py.Object.t -> unit -> float * float

Compute the Epps-Singleton (ES) test statistic.

Test the null hypothesis that two samples have the same underlying probability distribution.

Parameters
----------
x, y : array-like
    The two samples of observations to be tested. Input must not have
    more than one dimension. Samples can have different lengths.
t : array-like, optional
    The points (t1, ..., tn) where the empirical characteristic function
    is to be evaluated. It should be positive distinct numbers. The
    default value (0.4, 0.8) is proposed in [1]_. Input must not have
    more than one dimension.

Returns
-------
statistic : float
    The test statistic.
pvalue : float
    The associated p-value based on the asymptotic chi2-distribution.

See Also -------- ks_2samp, anderson_ksamp

Notes
-----
Testing whether two samples are generated by the same underlying
distribution is a classical question in statistics. A widely used test is
the Kolmogorov-Smirnov (KS) test which relies on the empirical
distribution function. Epps and Singleton introduce a test based on the
empirical characteristic function in [1]_.

One advantage of the ES test compared to the KS test is that it does not
assume a continuous distribution. In [1]_, the authors conclude that the
test also has a higher power than the KS test in many examples. They
recommend the use of the ES test for discrete samples as well as
continuous samples with at least 25 observations each, whereas
`anderson_ksamp` is recommended for smaller sample sizes in the
continuous case.

The p-value is computed from the asymptotic distribution of the test
statistic which follows a `chi2` distribution. If the sample size of both
`x` and `y` is below 25, the small sample correction proposed in [1]_ is
applied to the test statistic.

The default values of `t` are determined in [1]_ by considering various
distributions and finding good values that lead to a high power of the
test in general. Table III in [1]_ gives the optimal values for the
distributions tested in that study. The values of `t` are scaled by the
semi-interquartile range in the implementation, see [1]_.

References
----------
.. [1] T. W. Epps and K. J. Singleton, 'An omnibus test for the two-sample
       problem using the empirical characteristic function', Journal of
       Statistical Computation and Simulation 26, p. 177--203, 1986.
.. [2] S. J. Goerg and J. Kaiser, 'Nonparametric testing of distributions
       - the Epps-Singleton two-sample test using the empirical
       characteristic function', The Stata Journal 9(3), p. 454--465,
       2009.
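A hedged usage sketch (the samples below are illustrative, not taken from [1]_ or [2]_):

>>> import numpy as np
>>> from scipy.stats import epps_singleton_2samp
>>> np.random.seed(1234)
>>> x = np.random.normal(size=100)
>>> y = np.random.normal(size=120)
>>> statistic, pvalue = epps_singleton_2samp(x, y)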
val f_oneway : ?axis:int -> Py.Object.t list -> float * float

Perform one-way ANOVA.

The one-way ANOVA tests the null hypothesis that two or more groups have the same population mean. The test is applied to samples from two or more groups, possibly with differing sizes.

Parameters
----------
sample1, sample2, ... : array_like
    The sample measurements for each group. There must be at least two
    arguments. If the arrays are multidimensional, then all the
    dimensions of the array must be the same except for `axis`.
axis : int, optional
    Axis of the input arrays along which the test is applied. Default
    is 0.

Returns
-------
statistic : float
    The computed F statistic of the test.
pvalue : float
    The associated p-value from the F distribution.

Warns
-----
F_onewayConstantInputWarning
    Raised if each of the input arrays is constant array. In this case
    the F statistic is either infinite or isn't defined, so ``np.inf`` or
    ``np.nan`` is returned.

F_onewayBadInputSizesWarning
    Raised if the length of any input array is 0, or if all the input
    arrays have length 1. ``np.nan`` is returned for the F statistic and
    the p-value in these cases.

Notes
-----
The ANOVA test has important assumptions that must be satisfied in order
for the associated p-value to be valid.

1. The samples are independent.
2. Each sample is from a normally distributed population.
3. The population standard deviations of the groups are all equal. This
   property is known as homoscedasticity.

If these assumptions are not true for a given set of data, it may still be possible to use the Kruskal-Wallis H-test (`scipy.stats.kruskal`) although with some loss of power.

The length of each group must be at least one, and there must be at least one group with length greater than one. If these conditions are not satisfied, a warning is generated and (``np.nan``, ``np.nan``) is returned.

If each group contains constant values, and there exist at least two groups with different values, the function generates a warning and returns (``np.inf``, 0).

If all values in all groups are the same, the function generates a warning and returns (``np.nan``, ``np.nan``).

The algorithm is from Heiman [2]_, pp.394-7.

References
----------
.. [1] R. Lowry, 'Concepts and Applications of Inferential Statistics',
       Chapter 14, 2014, http://vassarstats.net/textbook/

.. [2] G.W. Heiman, 'Understanding research methods and statistics: An
       integrated introduction for psychology', Houghton, Mifflin and
       Company, 2001.

.. [3] G.H. McDonald, 'Handbook of Biological Statistics', One-way ANOVA.
       http://www.biostathandbook.com/onewayanova.html

Examples
--------
>>> from scipy.stats import f_oneway

Here are some data [3]_ on a shell measurement (the length of the anterior adductor muscle scar, standardized by dividing by length) in the mussel Mytilus trossulus from five locations: Tillamook, Oregon; Newport, Oregon; Petersburg, Alaska; Magadan, Russia; and Tvarminne, Finland, taken from a much larger data set used in McDonald et al. (1991).

>>> tillamook = [0.0571, 0.0813, 0.0831, 0.0976, 0.0817, 0.0859, 0.0735,
...              0.0659, 0.0923, 0.0836]
>>> newport = [0.0873, 0.0662, 0.0672, 0.0819, 0.0749, 0.0649, 0.0835,
...            0.0725]
>>> petersburg = [0.0974, 0.1352, 0.0817, 0.1016, 0.0968, 0.1064, 0.105]
>>> magadan = [0.1033, 0.0915, 0.0781, 0.0685, 0.0677, 0.0697, 0.0764,
...            0.0689]
>>> tvarminne = [0.0703, 0.1026, 0.0956, 0.0973, 0.1039, 0.1045]
>>> f_oneway(tillamook, newport, petersburg, magadan, tvarminne)
F_onewayResult(statistic=7.121019471642447, pvalue=0.0002812242314534544)

`f_oneway` accepts multidimensional input arrays. When the inputs are multidimensional and `axis` is not given, the test is performed along the first axis of the input arrays. For the following data, the test is performed three times, once for each column.

>>> a = np.array([[9.87, 9.03, 6.81],
...               [7.18, 8.35, 7.00],
...               [8.39, 7.58, 7.68],
...               [7.45, 6.33, 9.35],
...               [6.41, 7.10, 9.33],
...               [8.00, 8.24, 8.44]])
>>> b = np.array([[6.35, 7.30, 7.16],
...               [6.65, 6.68, 7.63],
...               [5.72, 7.73, 6.72],
...               [7.01, 9.19, 7.41],
...               [7.75, 7.87, 8.30],
...               [6.90, 7.97, 6.97]])
>>> c = np.array([[3.31, 8.77, 1.01],
...               [8.25, 3.24, 3.62],
...               [6.32, 8.81, 5.19],
...               [7.48, 8.83, 8.91],
...               [8.59, 6.01, 6.07],
...               [3.07, 9.72, 7.48]])
>>> F, p = f_oneway(a, b, c)
>>> F
array([1.75676344, 0.03701228, 3.76439349])
>>> p
array([0.20630784, 0.96375203, 0.04733157])

val find_repeats : [> `Ndarray ] Np.Obj.t -> [ `ArrayLike | `Ndarray | `Object ] Np.Obj.t * [ `ArrayLike | `Ndarray | `Object ] Np.Obj.t

Find repeats and repeat counts.

Parameters
----------
arr : array_like
    Input array. This is cast to float64.

Returns
-------
values : ndarray
    The unique values from the (flattened) input that are repeated.

counts : ndarray
    Number of times the corresponding 'value' is repeated.

Notes
-----
In numpy >= 1.9 `numpy.unique` provides similar functionality. The main
difference is that `find_repeats` only returns repeated values.
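For comparison, a short sketch of recovering the same information from `numpy.unique` by keeping only values whose count exceeds one:

>>> import numpy as np
>>> values, counts = np.unique([2, 1, 2, 3, 2, 2, 5], return_counts=True)
>>> values[counts > 1], counts[counts > 1]
(array([2]), array([4]))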

Examples
--------
>>> from scipy import stats
>>> stats.find_repeats([2, 1, 2, 3, 2, 2, 5])
RepeatedResults(values=array([2.]), counts=array([4]))

>>> stats.find_repeats([[10, 20, 1, 2], [5, 5, 4, 4]])
RepeatedResults(values=array([4., 5.]), counts=array([2, 2]))

val fisher_exact : ?alternative:[ `Two_sided | `Less | `Greater ] -> table:Py.Object.t -> unit -> float * float

Perform a Fisher exact test on a 2x2 contingency table.

Parameters
----------
table : array_like of ints
    A 2x2 contingency table. Elements should be non-negative integers.
alternative : {'two-sided', 'less', 'greater'}, optional
    Defines the alternative hypothesis. The following options are
    available (default is 'two-sided'):

    * 'two-sided'
    * 'less': one-sided
    * 'greater': one-sided

Returns
-------
oddsratio : float
    This is prior odds ratio and not a posterior estimate.
p_value : float
    P-value, the probability of obtaining a distribution at least as
    extreme as the one that was actually observed, assuming that the null
    hypothesis is true.

See Also
--------
chi2_contingency : Chi-square test of independence of variables in a
    contingency table.

Notes ----- The calculated odds ratio is different from the one R uses. This scipy implementation returns the (more common) 'unconditional Maximum Likelihood Estimate', while R uses the 'conditional Maximum Likelihood Estimate'.

For tables with large numbers, the (inexact) chi-square test implemented in the function `chi2_contingency` can also be used.

Examples -------- Say we spend a few days counting whales and sharks in the Atlantic and Indian oceans. In the Atlantic ocean we find 8 whales and 1 shark, in the Indian ocean 2 whales and 5 sharks. Then our contingency table is::

            Atlantic  Indian
    whales     8        2
    sharks     1        5

We use this table to find the p-value:

>>> import scipy.stats as stats
>>> oddsratio, pvalue = stats.fisher_exact([[8, 2], [1, 5]])
>>> pvalue
0.0349...

The probability that we would observe this or an even more imbalanced ratio by chance is about 3.5%. A commonly used significance level is 5%--if we adopt that, we can therefore conclude that our observed imbalance is statistically significant; whales prefer the Atlantic while sharks prefer the Indian ocean.

val float_factorial : Py.Object.t -> Py.Object.t

Compute the factorial and return as a float

Returns infinity when result is too large for a double
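A small illustrative sketch (assuming the helper is importable from the private module `scipy._lib._util`, which may move between releases):

>>> from scipy._lib._util import float_factorial
>>> float_factorial(4)
24.0
>>> float_factorial(200)   # 200! overflows a double
inf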

val friedmanchisquare : Py.Object.t list -> float * float

Compute the Friedman test for repeated measurements.

The Friedman test tests the null hypothesis that repeated measurements of the same individuals have the same distribution. It is often used to test for consistency among measurements obtained in different ways. For example, if two measurement techniques are used on the same set of individuals, the Friedman test can be used to determine if the two measurement techniques are consistent.

Parameters
----------
measurements1, measurements2, measurements3... : array_like
    Arrays of measurements. All of the arrays must have the same number
    of elements. At least 3 sets of measurements must be given.

Returns
-------
statistic : float
    The test statistic, correcting for ties.
pvalue : float
    The associated p-value assuming that the test statistic has a chi
    squared distribution.

Notes
-----
Due to the assumption that the test statistic has a chi squared
distribution, the p-value is only reliable for n > 10 and more than 6
repeated measurements.

References
----------
.. [1] https://en.wikipedia.org/wiki/Friedman_test
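A hedged usage sketch with three illustrative sets of repeated measurements (the data below is made up for demonstration):

>>> from scipy import stats
>>> before = [72, 96, 88, 92, 74, 76, 82]
>>> after_1h = [120, 120, 132, 120, 101, 96, 112]
>>> after_2h = [76, 95, 104, 96, 84, 72, 76]
>>> statistic, pvalue = stats.friedmanchisquare(before, after_1h, after_2h)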

val gcd : x:Py.Object.t -> y:Py.Object.t -> unit -> Py.Object.t

Greatest common divisor of x and y.
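A trivial sketch (assuming the usual positional integer semantics of the underlying Python `gcd`):

>>> gcd(12, 18)
6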

val gmean : ?axis:[ `I of int | `None ] -> ?dtype:Np.Dtype.t -> a:[> `Ndarray ] Np.Obj.t -> unit -> [ `ArrayLike | `Ndarray | `Object ] Np.Obj.t

Compute the geometric mean along the specified axis.

Return the geometric average of the array elements. That is: n-th root of (x1 * x2 * ... * xn)

Parameters
----------
a : array_like
    Input array or object that can be converted to an array.
axis : int or None, optional
    Axis along which the geometric mean is computed. Default is 0. If
    None, compute over the whole array `a`.
dtype : dtype, optional
    Type of the returned array and of the accumulator in which the
    elements are summed. If dtype is not specified, it defaults to the
    dtype of a, unless a has an integer dtype with a precision less than
    that of the default platform integer. In that case, the default
    platform integer is used.

Returns
-------
gmean : ndarray
    See `dtype` parameter above.

See Also
--------
numpy.mean : Arithmetic average
numpy.average : Weighted average
hmean : Harmonic mean

Notes ----- The geometric average is computed over a single dimension of the input array, axis=0 by default, or all values in the array if axis=None. float64 intermediate and return values are used for integer inputs.

Use masked arrays to ignore any non-finite values in the input or that arise in the calculations such as Not a Number and infinity because masked arrays automatically mask any non-finite values.

Examples
--------
>>> from scipy.stats import gmean
>>> gmean([1, 4])
2.0
>>> gmean([1, 2, 3, 4, 5, 6, 7])
3.3800151591412964

val gstd : ?axis:[ `I of int | `Tuple of Py.Object.t | `None ] -> ?ddof:int -> a:[> `Ndarray ] Np.Obj.t -> unit -> Py.Object.t

Calculate the geometric standard deviation of an array.

The geometric standard deviation describes the spread of a set of numbers where the geometric mean is preferred. It is a multiplicative factor, and so a dimensionless quantity.

It is defined as the exponent of the standard deviation of ``log(a)``. Mathematically the population geometric standard deviation can be evaluated as::

gstd = exp(std(log(a)))

.. versionadded:: 1.3.0

Parameters
----------
a : array_like
    An array like object containing the sample data.
axis : int, tuple or None, optional
    Axis along which to operate. Default is 0. If None, compute over the
    whole array `a`.
ddof : int, optional
    Degree of freedom correction in the calculation of the geometric
    standard deviation. Default is 1.

Returns
-------
ndarray or float
    An array of the geometric standard deviation. If `axis` is None or
    `a` is a 1d array a float is returned.

Notes
-----
As the calculation requires the use of logarithms the geometric standard
deviation only supports strictly positive values. Any non-positive or
infinite values will raise a `ValueError`. The geometric standard
deviation is sometimes confused with the exponent of the standard
deviation, ``exp(std(a))``. Instead the geometric standard deviation is
``exp(std(log(a)))``. The default value for `ddof` is different to the
default value (0) used by other ddof containing functions, such as
``np.std`` and ``np.nanstd``.

Examples
--------
Find the geometric standard deviation of a log-normally distributed
sample. Note that the standard deviation of the distribution is one, on a
log scale this evaluates to approximately ``exp(1)``.

>>> from scipy.stats import gstd
>>> np.random.seed(123)
>>> sample = np.random.lognormal(mean=0, sigma=1, size=1000)
>>> gstd(sample)
2.7217860664589946

Compute the geometric standard deviation of a multidimensional array and of a given axis.

>>> a = np.arange(1, 25).reshape(2, 3, 4)
>>> gstd(a, axis=None)
2.2944076136018947
>>> gstd(a, axis=2)
array([[1.82424757, 1.22436866, 1.13183117],
       [1.09348306, 1.07244798, 1.05914985]])
>>> gstd(a, axis=(1,2))
array([2.12939215, 1.22120169])

The geometric standard deviation further handles masked arrays.

>>> a = np.arange(1, 25).reshape(2, 3, 4)
>>> ma = np.ma.masked_where(a > 16, a)
>>> ma
masked_array(
  data=[[[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]],
        [[13, 14, 15, 16], [--, --, --, --], [--, --, --, --]]],
  mask=[[[False, False, False, False],
         [False, False, False, False],
         [False, False, False, False]],
        [[False, False, False, False],
         [ True,  True,  True,  True],
         [ True,  True,  True,  True]]],
  fill_value=999999)
>>> gstd(ma, axis=2)
masked_array(
  data=[[1.8242475707663655, 1.2243686572447428, 1.1318311657788478],
        [1.0934830582350938, --, --]],
  mask=[[False, False, False],
        [False,  True,  True]],
  fill_value=999999)

val hmean : ?axis:[ `I of int | `None ] -> ?dtype:Np.Dtype.t -> a:[> `Ndarray ] Np.Obj.t -> unit -> [ `ArrayLike | `Ndarray | `Object ] Np.Obj.t

Calculate the harmonic mean along the specified axis.

That is: n / (1/x1 + 1/x2 + ... + 1/xn)

Parameters
----------
a : array_like
    Input array, masked array or object that can be converted to an
    array.
axis : int or None, optional
    Axis along which the harmonic mean is computed. Default is 0. If
    None, compute over the whole array `a`.
dtype : dtype, optional
    Type of the returned array and of the accumulator in which the
    elements are summed. If `dtype` is not specified, it defaults to the
    dtype of `a`, unless `a` has an integer `dtype` with a precision less
    than that of the default platform integer. In that case, the default
    platform integer is used.

Returns
-------
hmean : ndarray
    See `dtype` parameter above.

See Also
--------
numpy.mean : Arithmetic average
numpy.average : Weighted average
gmean : Geometric mean

Notes ----- The harmonic mean is computed over a single dimension of the input array, axis=0 by default, or all values in the array if axis=None. float64 intermediate and return values are used for integer inputs.

Use masked arrays to ignore any non-finite values in the input or that arise in the calculations such as Not a Number and infinity.

Examples
--------
>>> from scipy.stats import hmean
>>> hmean([1, 4])
1.6000000000000001
>>> hmean([1, 2, 3, 4, 5, 6, 7])
2.6997245179063363

val iqr : ?axis:[ `I of int | `Sequence_of_int of Py.Object.t ] -> ?rng:Py.Object.t -> ?scale:float -> ?nan_policy:[ `Propagate | `Raise | `Omit ] -> ?interpolation:[ `Linear | `Lower | `Higher | `Midpoint | `Nearest ] -> ?keepdims:bool -> x:[> `Ndarray ] Np.Obj.t -> unit -> Py.Object.t

Compute the interquartile range of the data along the specified axis.

The interquartile range (IQR) is the difference between the 75th and 25th percentile of the data. It is a measure of the dispersion similar to standard deviation or variance, but is much more robust against outliers [2]_.

The ``rng`` parameter allows this function to compute other percentile ranges than the actual IQR. For example, setting ``rng=(0, 100)`` is equivalent to `numpy.ptp`.

The IQR of an empty array is `np.nan`.

.. versionadded:: 0.18.0

Parameters
----------
x : array_like
    Input array or object that can be converted to an array.
axis : int or sequence of int, optional
    Axis along which the range is computed. The default is to compute the
    IQR for the entire array.
rng : Two-element sequence containing floats in range of [0,100] optional
    Percentiles over which to compute the range. Each must be between 0
    and 100, inclusive. The default is the true IQR: `(25, 75)`. The
    order of the elements is not important.
scale : scalar or str, optional
    The numerical value of scale will be divided out of the final result.
    The following string values are recognized:

    * 'raw' : No scaling, just return the raw IQR.
      **Deprecated!** Use `scale=1` instead.
    * 'normal' : Scale by
      :math:`2 \sqrt{2} erf^{-1}(\frac{3}{4}) \approx 1.349`.

    The default is 1.0. The use of scale='raw' is deprecated. Array-like
    scale is also allowed, as long as it broadcasts correctly to the
    output such that ``out / scale`` is a valid operation. The output
    dimensions depend on the input array, `x`, the `axis` argument, and
    the `keepdims` flag.
nan_policy : {'propagate', 'raise', 'omit'}, optional
    Defines how to handle when input contains nan. The following options
    are available (default is 'propagate'):

    * 'propagate': returns nan
    * 'raise': throws an error
    * 'omit': performs the calculations ignoring nan values
interpolation : {'linear', 'lower', 'higher', 'midpoint', 'nearest'}, optional
    Specifies the interpolation method to use when the percentile
    boundaries lie between two data points `i` and `j`. The following
    options are available (default is 'linear'):

    * 'linear': `i + (j - i) * fraction`, where `fraction` is the
      fractional part of the index surrounded by `i` and `j`.
    * 'lower': `i`.
    * 'higher': `j`.
    * 'nearest': `i` or `j` whichever is nearest.
    * 'midpoint': `(i + j) / 2`.
keepdims : bool, optional
    If this is set to `True`, the reduced axes are left in the result as
    dimensions with size one. With this option, the result will broadcast
    correctly against the original array `x`.

Returns
-------
iqr : scalar or ndarray
    If ``axis=None``, a scalar is returned. If the input contains
    integers or floats of smaller precision than ``np.float64``, then the
    output data-type is ``np.float64``. Otherwise, the output data-type
    is the same as that of the input.

See Also -------- numpy.std, numpy.var

Notes
-----
This function is heavily dependent on the version of `numpy` that is
installed. Versions greater than 1.11.0b3 are highly recommended, as they
include a number of enhancements and fixes to `numpy.percentile` and
`numpy.nanpercentile` that affect the operation of this function. The
following modifications apply:

Below 1.10.0 : `nan_policy` is poorly defined.
    The default behavior of `numpy.percentile` is used for 'propagate'.
    This is a hybrid of 'omit' and 'propagate' that mostly yields a
    skewed version of 'omit' since NaNs are sorted to the end of the
    data. A warning is raised if there are NaNs in the data.
Below 1.9.0: `numpy.nanpercentile` does not exist.
    This means that `numpy.percentile` is used regardless of
    `nan_policy` and a warning is issued. See previous item for a
    description of the behavior.
Below 1.9.0: `keepdims` and `interpolation` are not supported.
    The keywords get ignored with a warning if supplied with non-default
    values. However, multiple axes are still supported.

References
----------
.. [1] 'Interquartile range' https://en.wikipedia.org/wiki/Interquartile_range
.. [2] 'Robust measures of scale' https://en.wikipedia.org/wiki/Robust_measures_of_scale
.. [3] 'Quantile' https://en.wikipedia.org/wiki/Quantile

Examples
--------
>>> from scipy.stats import iqr
>>> x = np.array([[10, 7, 4], [3, 2, 1]])
>>> x
array([[10,  7,  4],
       [ 3,  2,  1]])
>>> iqr(x)
4.0
>>> iqr(x, axis=0)
array([ 3.5,  2.5,  1.5])
>>> iqr(x, axis=1)
array([ 3.,  1.])
>>> iqr(x, axis=1, keepdims=True)
array([[ 3.],
       [ 1.]])

val itemfreq : ?kwds:(string * Py.Object.t) list -> Py.Object.t list -> Py.Object.t

`itemfreq` is deprecated! `itemfreq` is deprecated and will be removed in a future version. Use instead `np.unique(..., return_counts=True)`

Return a 2-D array of item frequencies.

Parameters
----------
a : (N,) array_like
    Input array.

Returns
-------
itemfreq : (K, 2) ndarray
    A 2-D frequency table. Column 1 contains sorted, unique values from
    `a`, column 2 contains their respective counts.

Examples
--------
>>> from scipy import stats
>>> a = np.array([1, 1, 5, 0, 1, 2, 2, 0, 1, 4])
>>> stats.itemfreq(a)
array([[ 0.,  2.],
       [ 1.,  4.],
       [ 2.,  2.],
       [ 4.,  1.],
       [ 5.,  1.]])
>>> np.bincount(a)
array([2, 4, 2, 0, 1, 1])

>>> stats.itemfreq(a/10.)
array([[ 0. ,  2. ],
       [ 0.1,  4. ],
       [ 0.2,  2. ],
       [ 0.4,  1. ],
       [ 0.5,  1. ]])

val jarque_bera : [> `Ndarray ] Np.Obj.t -> float * float

Perform the Jarque-Bera goodness of fit test on sample data.

The Jarque-Bera test tests whether the sample data has the skewness and kurtosis matching a normal distribution.

Note that this test only works for a large enough number of data samples (>2000) as the test statistic asymptotically has a Chi-squared distribution with 2 degrees of freedom.

Parameters
----------
x : array_like
    Observations of a random variable.

Returns
-------
jb_value : float
    The test statistic.
p : float
    The p-value for the hypothesis test.

References
----------
.. [1] Jarque, C. and Bera, A. (1980) 'Efficient tests for normality,
       homoscedasticity and serial independence of regression residuals',
       6 Econometric Letters 255-259.

Examples
--------
>>> from scipy import stats
>>> np.random.seed(987654321)
>>> x = np.random.normal(0, 1, 100000)
>>> jarque_bera_test = stats.jarque_bera(x)
>>> jarque_bera_test
Jarque_beraResult(statistic=4.716570798957913, pvalue=0.0945822550304295)
>>> jarque_bera_test.statistic
4.716570798957913
>>> jarque_bera_test.pvalue
0.0945822550304295

val kendalltau : ?initial_lexsort:bool -> ?nan_policy:[ `Propagate | `Raise | `Omit ] -> ?method_:[ `Auto | `Asymptotic | `Exact ] -> x:Py.Object.t -> y:Py.Object.t -> unit -> float * float

Calculate Kendall's tau, a correlation measure for ordinal data.

Kendall's tau is a measure of the correspondence between two rankings. Values close to 1 indicate strong agreement, values close to -1 indicate strong disagreement. This is the 1945 'tau-b' version of Kendall's tau [2]_, which can account for ties and which reduces to the 1938 'tau-a' version [1]_ in absence of ties.

Parameters ---------- x, y : array_like Arrays of rankings, of the same shape. If arrays are not 1-D, they will be flattened to 1-D. initial_lexsort : bool, optional Unused (deprecated). nan_policy : 'propagate', 'raise', 'omit', optional Defines how to handle when input contains nan. The following options are available (default is 'propagate'):

* 'propagate': returns nan * 'raise': throws an error * 'omit': performs the calculations ignoring nan values method : 'auto', 'asymptotic', 'exact', optional Defines which method is used to calculate the p-value [5]_. The following options are available (default is 'auto'):

* 'auto': selects the appropriate method based on a trade-off between speed and accuracy * 'asymptotic': uses a normal approximation valid for large samples * 'exact': computes the exact p-value, but can only be used if no ties are present

Returns ------- correlation : float The tau statistic. pvalue : float The two-sided p-value for a hypothesis test whose null hypothesis is an absence of association, tau = 0.

See Also -------- spearmanr : Calculates a Spearman rank-order correlation coefficient. theilslopes : Computes the Theil-Sen estimator for a set of points (x, y). weightedtau : Computes a weighted version of Kendall's tau.

Notes ----- The definition of Kendall's tau that is used is [2]_::

tau = (P - Q) / sqrt((P + Q + T) * (P + Q + U))

where P is the number of concordant pairs, Q the number of discordant pairs, T the number of ties only in `x`, and U the number of ties only in `y`. If a tie occurs for the same pair in both `x` and `y`, it is not added to either T or U.

References ---------- .. [1] Maurice G. Kendall, 'A New Measure of Rank Correlation', Biometrika Vol. 30, No. 1/2, pp. 81-93, 1938. .. [2] Maurice G. Kendall, 'The treatment of ties in ranking problems', Biometrika Vol. 33, No. 3, pp. 239-251, 1945. .. [3] Gottfried E. Noether, 'Elements of Nonparametric Statistics', John Wiley & Sons, 1967. .. [4] Peter M. Fenwick, 'A new data structure for cumulative frequency tables', Software: Practice and Experience, Vol. 24, No. 3, pp. 327-336, 1994. .. [5] Maurice G. Kendall, 'Rank Correlation Methods' (4th Edition), Charles Griffin & Co., 1970.

Examples -------- >>> from scipy import stats >>> x1 = [12, 2, 1, 12, 2] >>> x2 = [1, 4, 7, 1, 0] >>> tau, p_value = stats.kendalltau(x1, x2) >>> tau -0.47140452079103173 >>> p_value 0.2827454599327748
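To make the Notes formula concrete, here is a small brute-force sketch (a hand check, not how scipy computes it) that tallies P, Q, T and U over all pairs of `x1` and `x2` from above and reproduces `tau`:

>>> import math, itertools
>>> P = Q = T = U = 0
>>> for (xi, yi), (xj, yj) in itertools.combinations(zip(x1, x2), 2):
...     if xi == xj and yi == yj:
...         pass                             # tied in both x and y: counted in neither T nor U
...     elif xi == xj:
...         T += 1                           # tie only in x
...     elif yi == yj:
...         U += 1                           # tie only in y
...     elif (xi - xj) * (yi - yj) > 0:
...         P += 1                           # concordant pair
...     else:
...         Q += 1                           # discordant pair
>>> P, Q, T, U
(2, 6, 1, 0)
>>> round((P - Q) / math.sqrt((P + Q + T) * (P + Q + U)), 10)
-0.4714045208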

val kruskal : ?kwargs:(string * Py.Object.t) list -> Py.Object.t list -> float * float

Compute the Kruskal-Wallis H-test for independent samples.

The Kruskal-Wallis H-test tests the null hypothesis that the population medians of all of the groups are equal. It is a non-parametric version of ANOVA. The test works on 2 or more independent samples, which may have different sizes. Note that rejecting the null hypothesis does not indicate which of the groups differs. Post hoc comparisons between groups are required to determine which groups are different.

Parameters ---------- sample1, sample2, ... : array_like Two or more arrays with the sample measurements can be given as arguments. nan_policy : 'propagate', 'raise', 'omit', optional Defines how to handle when input contains nan. The following options are available (default is 'propagate'):

* 'propagate': returns nan * 'raise': throws an error * 'omit': performs the calculations ignoring nan values

Returns ------- statistic : float The Kruskal-Wallis H statistic, corrected for ties. pvalue : float The p-value for the test using the assumption that H has a chi square distribution.

See Also -------- f_oneway : 1-way ANOVA. mannwhitneyu : Mann-Whitney rank test on two samples. friedmanchisquare : Friedman test for repeated measurements.

Notes ----- Due to the assumption that H has a chi square distribution, the number of samples in each group must not be too small. A typical rule is that each sample must have at least 5 measurements.

References ---------- .. [1] W. H. Kruskal & W. W. Wallis, 'Use of Ranks in One-Criterion Variance Analysis', Journal of the American Statistical Association, Vol. 47, Issue 260, pp. 583-621, 1952. .. [2] https://en.wikipedia.org/wiki/Kruskal-Wallis_one-way_analysis_of_variance

Examples -------- >>> from scipy import stats >>> x = [1, 3, 5, 7, 9] >>> y = [2, 4, 6, 8, 10] >>> stats.kruskal(x, y) KruskalResult(statistic=0.2727272727272734, pvalue=0.6015081344405895)

>>> x = [1, 1, 1] >>> y = [2, 2, 2] >>> z = [2, 2] >>> stats.kruskal(x, y, z) KruskalResult(statistic=7.0, pvalue=0.0301973834223185)
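Because the p-value rests on the chi-square approximation described in the Notes (``k - 1`` degrees of freedom for ``k`` groups), it can be recovered from the H statistic directly; a hedged check against the three-group example above:

>>> from scipy.stats import chi2
>>> round(chi2.sf(7.0, df=2), 10)   # H = 7.0, k = 3 groups -> df = 2
0.0301973834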

val ks_1samp : ?args:Py.Object.t -> ?alternative:[ `Two_sided | `Less | `Greater ] -> ?mode:[ `Auto | `Exact | `Approx | `Asymp ] -> x:[> `Ndarray ] Np.Obj.t -> cdf:Py.Object.t -> unit -> float * float

Performs the Kolmogorov-Smirnov test for goodness of fit.

This performs a test of the distribution F(x) of an observed random variable against a given distribution G(x). Under the null hypothesis, the two distributions are identical, F(x)=G(x). The alternative hypothesis can be either 'two-sided' (default), 'less' or 'greater'. The KS test is only valid for continuous distributions.

Parameters ---------- x : array_like a 1-D array of observations of iid random variables. cdf : callable callable used to calculate the cdf. args : tuple, sequence, optional Distribution parameters, used with `cdf`. alternative : 'two-sided', 'less', 'greater', optional Defines the alternative hypothesis. The following options are available (default is 'two-sided'):

* 'two-sided' * 'less': one-sided, see explanation in Notes * 'greater': one-sided, see explanation in Notes mode : 'auto', 'exact', 'approx', 'asymp', optional Defines the distribution used for calculating the p-value. The following options are available (default is 'auto'):

* 'auto' : selects one of the other options. * 'exact' : uses the exact distribution of test statistic. * 'approx' : approximates the two-sided probability with twice the one-sided probability * 'asymp': uses asymptotic distribution of test statistic

Returns ------- statistic : float KS test statistic, either D, D+ or D- (depending on the value of 'alternative') pvalue : float One-tailed or two-tailed p-value.

See Also -------- ks_2samp, kstest

Notes ----- In the one-sided test, the alternative is that the empirical cumulative distribution function of the random variable is 'less' or 'greater' than the cumulative distribution function G(x) of the hypothesis, ``F(x)<=G(x)``, resp. ``F(x)>=G(x)``.

Examples -------- >>> from scipy import stats

>>> x = np.linspace(-15, 15, 9) >>> stats.ks_1samp(x, stats.norm.cdf) (0.44435602715924361, 0.038850142705171065)

>>> np.random.seed(987654321) # set random seed to get the same result >>> stats.ks_1samp(stats.norm.rvs(size=100), stats.norm.cdf) (0.058352892479417884, 0.8653960860778898)

*Test against one-sided alternative hypothesis*

Shift distribution to larger values, so that ``CDF(x) < norm.cdf(x)``:

>>> np.random.seed(987654321) >>> x = stats.norm.rvs(loc=0.2, size=100) >>> stats.ks_1samp(x, stats.norm.cdf, alternative='less') (0.12464329735846891, 0.040989164077641749)

Reject equal distribution against alternative hypothesis: less

>>> stats.ks_1samp(x, stats.norm.cdf, alternative='greater') (0.0072115233216311081, 0.98531158590396395)

Don't reject equal distribution against alternative hypothesis: greater

>>> stats.ks_1samp(x, stats.norm.cdf) (0.12464329735846891, 0.08197335233541582)

Don't reject equal distribution against alternative hypothesis: two-sided

*Testing t distributed random variables against normal distribution*

With 100 degrees of freedom the t distribution looks close to the normal distribution, and the K-S test does not reject the hypothesis that the sample came from the normal distribution:

>>> np.random.seed(987654321) >>> stats.ks_1samp(stats.t.rvs(100,size=100), stats.norm.cdf) (0.072018929165471257, 0.6505883498379312)

With 3 degrees of freedom the t distribution looks sufficiently different from the normal distribution, that we can reject the hypothesis that the sample came from the normal distribution at the 10% level:

>>> np.random.seed(987654321) >>> stats.ks_1samp(stats.t.rvs(3,size=100), stats.norm.cdf) (0.131016895759829, 0.058826222555312224)

val ks_2samp : ?alternative:[ `Two_sided | `Less | `Greater ] -> ?mode:[ `Auto | `Exact | `Asymp ] -> data1:Py.Object.t -> data2:Py.Object.t -> unit -> float * float

Compute the Kolmogorov-Smirnov statistic on 2 samples.

This is a two-sided test for the null hypothesis that 2 independent samples are drawn from the same continuous distribution. The alternative hypothesis can be either 'two-sided' (default), 'less' or 'greater'.

Parameters ---------- data1, data2 : array_like, 1-Dimensional Two arrays of sample observations assumed to be drawn from a continuous distribution, sample sizes can be different. alternative : 'two-sided', 'less', 'greater', optional Defines the alternative hypothesis. The following options are available (default is 'two-sided'):

* 'two-sided' * 'less': one-sided, see explanation in Notes * 'greater': one-sided, see explanation in Notes mode : 'auto', 'exact', 'asymp', optional Defines the method used for calculating the p-value. The following options are available (default is 'auto'):

* 'auto' : use 'exact' for small size arrays, 'asymp' for large * 'exact' : use exact distribution of test statistic * 'asymp' : use asymptotic distribution of test statistic

Returns ------- statistic : float KS statistic. pvalue : float Two-tailed p-value.

See Also -------- kstest, ks_1samp, epps_singleton_2samp, anderson_ksamp

Notes ----- This tests whether 2 samples are drawn from the same distribution. Note that, like in the case of the one-sample KS test, the distribution is assumed to be continuous.

In the one-sided test, the alternative is that the empirical cumulative distribution function F(x) of the data1 variable is 'less' or 'greater' than the empirical cumulative distribution function G(x) of the data2 variable, ``F(x)<=G(x)``, resp. ``F(x)>=G(x)``.

If the KS statistic is small or the p-value is high, then we cannot reject the hypothesis that the distributions of the two samples are the same.

If the mode is 'auto', the computation is exact if the sample sizes are less than 10000. For larger sizes, the computation uses the Kolmogorov-Smirnov distributions to compute an approximate value.

The 'two-sided' 'exact' computation computes the complementary probability and then subtracts from 1. As such, the minimum probability it can return is about 1e-16. While the algorithm itself is exact, numerical errors may accumulate for large sample sizes. It is most suited to situations in which one of the sample sizes is only a few thousand.

We generally follow Hodges' treatment of Drion/Gnedenko/Korolyuk [1]_.

References ---------- .. [1] Hodges, J.L. Jr., 'The Significance Probability of the Smirnov Two-Sample Test,' Arkiv för Matematik, 3, No. 43 (1958), 469-86.

Examples -------- >>> from scipy import stats >>> np.random.seed(12345678) #fix random seed to get the same result >>> n1 = 200 # size of first sample >>> n2 = 300 # size of second sample

For a different distribution, we can reject the null hypothesis since the pvalue is below 1%:

>>> rvs1 = stats.norm.rvs(size=n1, loc=0., scale=1) >>> rvs2 = stats.norm.rvs(size=n2, loc=0.5, scale=1.5) >>> stats.ks_2samp(rvs1, rvs2) (0.20833333333333334, 5.129279597781977e-05)

For a slightly different distribution, we cannot reject the null hypothesis at a 10% or lower alpha, since the p-value (about 0.147) is higher than 10%:

>>> rvs3 = stats.norm.rvs(size=n2, loc=0.01, scale=1.0) >>> stats.ks_2samp(rvs1, rvs3) (0.10333333333333333, 0.14691437867433876)

For an identical distribution, we cannot reject the null hypothesis since the p-value is high, 41%:

>>> rvs4 = stats.norm.rvs(size=n2, loc=0.0, scale=1.0) >>> stats.ks_2samp(rvs1, rvs4) (0.07999999999999996, 0.41126949729859719)

val kstest : ?args:Py.Object.t -> ?n:int -> ?alternative:[ `Two_sided | `Less | `Greater ] -> ?mode:[ `Auto | `Exact | `Approx | `Asymp ] -> rvs: [ `Ndarray of [> `Ndarray ] Np.Obj.t | `Callable of Py.Object.t | `S of string ] -> cdf: [ `Ndarray of [> `Ndarray ] Np.Obj.t | `Callable of Py.Object.t | `S of string ] -> unit -> float * float

Performs the (one sample or two samples) Kolmogorov-Smirnov test for goodness of fit.

The one-sample test performs a test of the distribution F(x) of an observed random variable against a given distribution G(x). Under the null hypothesis, the two distributions are identical, F(x)=G(x). The alternative hypothesis can be either 'two-sided' (default), 'less' or 'greater'. The KS test is only valid for continuous distributions. The two-sample test tests whether the two independent samples are drawn from the same continuous distribution.

Parameters ---------- rvs : str, array_like, or callable If an array, it should be a 1-D array of observations of random variables. If a callable, it should be a function to generate random variables; it is required to have a keyword argument `size`. If a string, it should be the name of a distribution in `scipy.stats`, which will be used to generate random variables. cdf : str, array_like or callable If array_like, it should be a 1-D array of observations of random variables, and the two-sample test is performed (and rvs must be array_like) If a callable, that callable is used to calculate the cdf. If a string, it should be the name of a distribution in `scipy.stats`, which will be used as the cdf function. args : tuple, sequence, optional Distribution parameters, used if `rvs` or `cdf` are strings or callables. N : int, optional Sample size if `rvs` is string or callable. Default is 20. alternative : 'two-sided', 'less', 'greater', optional Defines the alternative hypothesis. The following options are available (default is 'two-sided'):

* 'two-sided' * 'less': one-sided, see explanation in Notes * 'greater': one-sided, see explanation in Notes mode : 'auto', 'exact', 'approx', 'asymp', optional Defines the distribution used for calculating the p-value. The following options are available (default is 'auto'):

* 'auto' : selects one of the other options. * 'exact' : uses the exact distribution of test statistic. * 'approx' : approximates the two-sided probability with twice the one-sided probability * 'asymp': uses asymptotic distribution of test statistic

Returns ------- statistic : float KS test statistic, either D, D+ or D-. pvalue : float One-tailed or two-tailed p-value.

See Also -------- ks_2samp

Notes ----- In the one-sided test, the alternative is that the empirical cumulative distribution function of the random variable is 'less' or 'greater' than the cumulative distribution function G(x) of the hypothesis, ``F(x)<=G(x)``, resp. ``F(x)>=G(x)``.

Examples -------- >>> from scipy import stats

>>> x = np.linspace(-15, 15, 9) >>> stats.kstest(x, 'norm') (0.44435602715924361, 0.038850142705171065)

>>> np.random.seed(987654321) # set random seed to get the same result >>> stats.kstest(stats.norm.rvs(size=100), stats.norm.cdf) (0.058352892479417884, 0.8653960860778898)

The above lines are equivalent to:

>>> np.random.seed(987654321) >>> stats.kstest(stats.norm.rvs, 'norm', N=100) (0.058352892479417884, 0.8653960860778898)

*Test against one-sided alternative hypothesis*

Shift distribution to larger values, so that ``CDF(x) < norm.cdf(x)``:

>>> np.random.seed(987654321) >>> x = stats.norm.rvs(loc=0.2, size=100) >>> stats.kstest(x, 'norm', alternative='less') (0.12464329735846891, 0.040989164077641749)

Reject equal distribution against alternative hypothesis: less

>>> stats.kstest(x, 'norm', alternative='greater') (0.0072115233216311081, 0.98531158590396395)

Don't reject equal distribution against alternative hypothesis: greater

>>> stats.kstest(x, 'norm') (0.12464329735846891, 0.08197335233541582)

*Testing t distributed random variables against normal distribution*

With 100 degrees of freedom the t distribution looks close to the normal distribution, and the K-S test does not reject the hypothesis that the sample came from the normal distribution:

>>> np.random.seed(987654321) >>> stats.kstest(stats.t.rvs(100, size=100), 'norm') (0.072018929165471257, 0.6505883498379312)

With 3 degrees of freedom the t distribution looks sufficiently different from the normal distribution, that we can reject the hypothesis that the sample came from the normal distribution at the 10% level:

>>> np.random.seed(987654321) >>> stats.kstest(stats.t.rvs(3, size=100), 'norm') (0.131016895759829, 0.058826222555312224)

val kurtosis : ?axis:[ `I of int | `None ] -> ?fisher:bool -> ?bias:bool -> ?nan_policy:[ `Propagate | `Raise | `Omit ] -> a:[> `Ndarray ] Np.Obj.t -> unit -> [ `ArrayLike | `Ndarray | `Object ] Np.Obj.t

Compute the kurtosis (Fisher or Pearson) of a dataset.

Kurtosis is the fourth central moment divided by the square of the variance. If Fisher's definition is used, then 3.0 is subtracted from the result to give 0.0 for a normal distribution.

If bias is False, then the kurtosis is calculated using k statistics to eliminate bias coming from biased moment estimators.

Use `kurtosistest` to see if result is close enough to normal.

Parameters ---------- a : array Data for which the kurtosis is calculated. axis : int or None, optional Axis along which the kurtosis is calculated. Default is 0. If None, compute over the whole array `a`. fisher : bool, optional If True, Fisher's definition is used (normal ==> 0.0). If False, Pearson's definition is used (normal ==> 3.0). bias : bool, optional If False, then the calculations are corrected for statistical bias. nan_policy : 'propagate', 'raise', 'omit', optional Defines how to handle when input contains nan. 'propagate' returns nan, 'raise' throws an error, 'omit' performs the calculations ignoring nan values. Default is 'propagate'.

Returns ------- kurtosis : array The kurtosis of values along an axis. If all values are equal, return -3 for Fisher's definition and 0 for Pearson's definition.

References ---------- .. [1] Zwillinger, D. and Kokoska, S. (2000). CRC Standard Probability and Statistics Tables and Formulae. Chapman & Hall: New York. 2000.

Examples -------- In Fisher's definition, the kurtosis of the normal distribution is zero. In the following example, the kurtosis is close to zero, because it was calculated from the dataset, not from the continuous distribution.

>>> from scipy.stats import norm, kurtosis >>> data = norm.rvs(size=1000, random_state=3) >>> kurtosis(data) -0.06928694200380558
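As a quick hedged cross-check of the two definitions (relying only on the documented Fisher = Pearson - 3 relation):

>>> kurtosis(data, fisher=False) - kurtosis(data, fisher=True)   # Pearson minus Fisher
3.0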

The distribution with a higher kurtosis has a heavier tail. The zero valued kurtosis of the normal distribution in Fisher's definition can serve as a reference point.

>>> import matplotlib.pyplot as plt >>> import scipy.stats as stats >>> from scipy.stats import kurtosis

>>> x = np.linspace(-5, 5, 100) >>> ax = plt.subplot() >>> distnames = 'laplace', 'norm', 'uniform'

>>> for distname in distnames:
...     if distname == 'uniform':
...         dist = getattr(stats, distname)(loc=-2, scale=4)
...     else:
...         dist = getattr(stats, distname)
...     data = dist.rvs(size=1000)
...     kur = kurtosis(data, fisher=True)
...     y = dist.pdf(x)
...     ax.plot(x, y, label='{}, {}'.format(distname, round(kur, 3)))
...     ax.legend()

The Laplace distribution has a heavier tail than the normal distribution. The uniform distribution (which has negative kurtosis) has the thinnest tail.

val kurtosistest : ?axis:[ `I of int | `None ] -> ?nan_policy:[ `Propagate | `Raise | `Omit ] -> a:[> `Ndarray ] Np.Obj.t -> unit -> float * float

Test whether a dataset has normal kurtosis.

This function tests the null hypothesis that the kurtosis of the population from which the sample was drawn is that of the normal distribution: ``kurtosis = 3(n-1)/(n+1)``.

Parameters ---------- a : array Array of the sample data. axis : int or None, optional Axis along which to compute test. Default is 0. If None, compute over the whole array `a`. nan_policy : 'propagate', 'raise', 'omit', optional Defines how to handle when input contains nan. The following options are available (default is 'propagate'):

* 'propagate': returns nan * 'raise': throws an error * 'omit': performs the calculations ignoring nan values

Returns ------- statistic : float The computed z-score for this test. pvalue : float The two-sided p-value for the hypothesis test.

Notes ----- Valid only for n>20. This function uses the method described in [1]_.

References ---------- .. [1] see e.g. F. J. Anscombe, W. J. Glynn, 'Distribution of the kurtosis statistic b2 for normal samples', Biometrika, vol. 70, pp. 227-234, 1983.

Examples -------- >>> from scipy.stats import kurtosistest >>> kurtosistest(list(range(20))) KurtosistestResult(statistic=-1.7058104152122062, pvalue=0.08804338332528348)

>>> np.random.seed(28041990) >>> s = np.random.normal(0, 1, 1000) >>> kurtosistest(s) KurtosistestResult(statistic=1.2317590987707365, pvalue=0.21803908613450895)

val linregress : ?y:Py.Object.t -> x:Py.Object.t -> unit -> float * float * float * float * float

Calculate a linear least-squares regression for two sets of measurements.

Parameters ---------- x, y : array_like Two sets of measurements. Both arrays should have the same length. If only `x` is given (and ``y=None``), then it must be a two-dimensional array where one dimension has length 2. The two sets of measurements are then found by splitting the array along the length-2 dimension. In the case where ``y=None`` and `x` is a 2x2 array, ``linregress(x)`` is equivalent to ``linregress(x[0], x[1])``.

Returns ------- slope : float Slope of the regression line. intercept : float Intercept of the regression line. rvalue : float Correlation coefficient. pvalue : float Two-sided p-value for a hypothesis test whose null hypothesis is that the slope is zero, using Wald Test with t-distribution of the test statistic. stderr : float Standard error of the estimated gradient.

See also -------- :func:`scipy.optimize.curve_fit` : Use non-linear least squares to fit a function to data. :func:`scipy.optimize.leastsq` : Minimize the sum of squares of a set of equations.

Notes ----- Missing values are considered pair-wise: if a value is missing in `x`, the corresponding value in `y` is masked.

Examples -------- >>> import matplotlib.pyplot as plt >>> from scipy import stats

Generate some data:

>>> np.random.seed(12345678) >>> x = np.random.random(10) >>> y = 1.6*x + np.random.random(10)

Perform the linear regression:

>>> slope, intercept, r_value, p_value, std_err = stats.linregress(x, y) >>> print('slope: %f intercept: %f' % (slope, intercept)) slope: 1.944864 intercept: 0.268578

To get coefficient of determination (R-squared):

>>> print('R-squared: %f' % r_value**2) R-squared: 0.735498

Plot the data along with the fitted line:

>>> plt.plot(x, y, 'o', label='original data') >>> plt.plot(x, intercept + slope*x, 'r', label='fitted line') >>> plt.legend() >>> plt.show()

Example for the case where only x is provided as a 2x2 array:

>>> x = np.array([[0, 1], [0, 2]]) >>> r = stats.linregress(x) >>> r.slope, r.intercept (2.0, 0.0)

val mannwhitneyu : ?use_continuity:bool -> ?alternative:[ `Greater | `Less | `Two_sided ] -> x:Py.Object.t -> y:Py.Object.t -> unit -> float * float

Compute the Mann-Whitney rank test on samples x and y.

Parameters ---------- x, y : array_like Array of samples, should be one-dimensional. use_continuity : bool, optional Whether a continuity correction (1/2.) should be taken into account. Default is True. alternative : None, 'two-sided', 'less', 'greater', optional Defines the alternative hypothesis. The following options are available (default is None):

* None: computes p-value half the size of the 'two-sided' p-value and a different U statistic. The default behavior is not the same as using 'less' or 'greater'; it only exists for backward compatibility and is deprecated. * 'two-sided' * 'less': one-sided * 'greater': one-sided

Use of the None option is deprecated.

Returns ------- statistic : float The Mann-Whitney U statistic, equal to min(U for x, U for y) if `alternative` is equal to None (deprecated; exists for backward compatibility), and U for y otherwise. pvalue : float p-value assuming an asymptotic normal distribution. One-sided or two-sided, depending on the choice of `alternative`.

Notes ----- Use only when the number of observations in each sample is > 20 and you have 2 independent samples of ranks. Mann-Whitney U is significant if the u-obtained is LESS THAN or equal to the critical value of U.

This test corrects for ties and by default uses a continuity correction.

References ---------- .. [1] https://en.wikipedia.org/wiki/Mann-Whitney_U_test

.. [2] H.B. Mann and D.R. Whitney, 'On a Test of Whether one of Two Random Variables is Stochastically Larger than the Other,' The Annals of Mathematical Statistics, vol. 18, no. 1, pp. 50-60, 1947.
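The docstring ships no Examples section; a minimal hedged sketch with invented toy data (far below the recommended sample size, purely to show the call):

>>> from scipy.stats import mannwhitneyu
>>> x = [2, 4, 6, 8, 10]
>>> y = [1, 3, 5, 7, 9]
>>> u, p = mannwhitneyu(x, y, alternative='two-sided')

With `alternative` given explicitly, `u` is the statistic described under Returns and `p` comes from the asymptotic normal approximation with continuity correction; omitting `alternative` falls back to the deprecated behavior noted above.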

val median_abs_deviation : ?axis:[ `I of int | `None ] -> ?center:Py.Object.t -> ?scale:float -> ?nan_policy:[ `Propagate | `Raise | `Omit ] -> x:[> `Ndarray ] Np.Obj.t -> unit -> Py.Object.t

Compute the median absolute deviation of the data along the given axis.

The median absolute deviation (MAD, [1]_) computes the median over the absolute deviations from the median. It is a measure of dispersion similar to the standard deviation but more robust to outliers [2]_.

The MAD of an empty array is ``np.nan``.

.. versionadded:: 1.5.0

Parameters ---------- x : array_like Input array or object that can be converted to an array. axis : int or None, optional Axis along which the range is computed. Default is 0. If None, compute the MAD over the entire array. center : callable, optional A function that will return the central value. The default is to use np.median. Any user defined function used will need to have the function signature ``func(arr, axis)``. scale : scalar or str, optional The numerical value of scale will be divided out of the final result. The default is 1.0. The string 'normal' is also accepted, and results in `scale` being the inverse of the standard normal quantile function at 0.75, which is approximately 0.67449. Array-like scale is also allowed, as long as it broadcasts correctly to the output such that ``out / scale`` is a valid operation. The output dimensions depend on the input array, `x`, and the `axis` argument. nan_policy : 'propagate', 'raise', 'omit', optional Defines how to handle when input contains nan. The following options are available (default is 'propagate'):

* 'propagate': returns nan * 'raise': throws an error * 'omit': performs the calculations ignoring nan values

Returns ------- mad : scalar or ndarray If ``axis=None``, a scalar is returned. If the input contains integers or floats of smaller precision than ``np.float64``, then the output data-type is ``np.float64``. Otherwise, the output data-type is the same as that of the input.

See Also -------- numpy.std, numpy.var, numpy.median, scipy.stats.iqr, scipy.stats.tmean, scipy.stats.tstd, scipy.stats.tvar

Notes ----- The `center` argument only affects the calculation of the central value around which the MAD is calculated. That is, passing in ``center=np.mean`` will calculate the MAD around the mean - it will not calculate the *mean* absolute deviation.

The input array may contain `inf`, but if `center` returns `inf`, the corresponding MAD for that data will be `nan`.

References ---------- .. [1] 'Median absolute deviation', https://en.wikipedia.org/wiki/Median_absolute_deviation .. [2] 'Robust measures of scale', https://en.wikipedia.org/wiki/Robust_measures_of_scale

Examples -------- When comparing the behavior of `median_abs_deviation` with ``np.std``, the latter is affected when we change a single value of an array to have an outlier value while the MAD hardly changes:

>>> from scipy import stats >>> x = stats.norm.rvs(size=100, scale=1, random_state=123456) >>> x.std() 0.9973906394005013 >>> stats.median_abs_deviation(x) 0.82832610097857 >>> x[0] = 345.6 >>> x.std() 34.42304872314415 >>> stats.median_abs_deviation(x) 0.8323442311590675

Axis handling example:

>>> x = np.array([[10, 7, 4], [3, 2, 1]]) >>> x array([[10, 7, 4], [ 3, 2, 1]]) >>> stats.median_abs_deviation(x) array([3.5, 2.5, 1.5]) >>> stats.median_abs_deviation(x, axis=None) 2.0

Scale normal example:

>>> x = stats.norm.rvs(size=1000000, scale=2, random_state=123456) >>> stats.median_abs_deviation(x) 1.3487398527041636 >>> stats.median_abs_deviation(x, scale='normal') 1.9996446978061115
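Illustrating the `center` note above with a tiny hand-checkable sketch (data invented): passing ``center=np.mean`` yields the median of absolute deviations from the mean, not the mean absolute deviation.

>>> import numpy as np
>>> data = [1, 2, 3, 100]
>>> stats.median_abs_deviation(data)                   # deviations from the median, 2.5
1.0
>>> stats.median_abs_deviation(data, center=np.mean)   # deviations from the mean, 26.5
25.0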

val median_absolute_deviation : ?kwds:(string * Py.Object.t) list -> Py.Object.t list -> Py.Object.t

`median_absolute_deviation` is deprecated, use `median_abs_deviation` instead!

To preserve the existing default behavior, use `scipy.stats.median_abs_deviation(..., scale=1/1.4826)`. The value 1.4826 is not numerically precise for scaling with a normal distribution. For a numerically precise value, use `scipy.stats.median_abs_deviation(..., scale='normal')`.

Compute the median absolute deviation of the data along the given axis.

The median absolute deviation (MAD, [1]_) computes the median over the absolute deviations from the median. It is a measure of dispersion similar to the standard deviation but more robust to outliers [2]_.

The MAD of an empty array is ``np.nan``.

.. versionadded:: 1.3.0

Parameters ---------- x : array_like Input array or object that can be converted to an array. axis : int or None, optional Axis along which the range is computed. Default is 0. If None, compute the MAD over the entire array. center : callable, optional A function that will return the central value. The default is to use np.median. Any user defined function used will need to have the function signature ``func(arr, axis)``. scale : int, optional The scaling factor applied to the MAD. The default scale (1.4826) ensures consistency with the standard deviation for normally distributed data. nan_policy : 'propagate', 'raise', 'omit', optional Defines how to handle when input contains nan. The following options are available (default is 'propagate'):

* 'propagate': returns nan * 'raise': throws an error * 'omit': performs the calculations ignoring nan values

Returns ------- mad : scalar or ndarray If ``axis=None``, a scalar is returned. If the input contains integers or floats of smaller precision than ``np.float64``, then the output data-type is ``np.float64``. Otherwise, the output data-type is the same as that of the input.

See Also -------- numpy.std, numpy.var, numpy.median, scipy.stats.iqr, scipy.stats.tmean, scipy.stats.tstd, scipy.stats.tvar

Notes ----- The `center` argument only affects the calculation of the central value around which the MAD is calculated. That is, passing in ``center=np.mean`` will calculate the MAD around the mean - it will not calculate the *mean* absolute deviation.

References ---------- .. [1] 'Median absolute deviation', https://en.wikipedia.org/wiki/Median_absolute_deviation .. [2] 'Robust measures of scale', https://en.wikipedia.org/wiki/Robust_measures_of_scale

Examples -------- When comparing the behavior of `median_absolute_deviation` with ``np.std``, the latter is affected when we change a single value of an array to have an outlier value while the MAD hardly changes:

>>> from scipy import stats >>> x = stats.norm.rvs(size=100, scale=1, random_state=123456) >>> x.std() 0.9973906394005013 >>> stats.median_absolute_deviation(x) 1.2280762773108278 >>> x[0] = 345.6 >>> x.std() 34.42304872314415 >>> stats.median_absolute_deviation(x) 1.2340335571164334

Axis handling example:

>>> x = np.array([[10, 7, 4], [3, 2, 1]]) >>> x array([[10, 7, 4], [ 3, 2, 1]]) >>> stats.median_absolute_deviation(x) array([5.1891, 3.7065, 2.2239]) >>> stats.median_absolute_deviation(x, axis=None) 2.9652
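Per the deprecation note above, the old default can be reproduced with `median_abs_deviation` and ``scale=1/1.4826``; a hedged check reusing the array `x` from the previous example:

>>> round(stats.median_abs_deviation(x, axis=None, scale=1/1.4826), 4)
2.9652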

val mode : ?axis:[ `I of int | `None ] -> ?nan_policy:[ `Propagate | `Raise | `Omit ] -> a:[> `Ndarray ] Np.Obj.t -> unit -> [ `ArrayLike | `Ndarray | `Object ] Np.Obj.t * [ `ArrayLike | `Ndarray | `Object ] Np.Obj.t

Return an array of the modal (most common) value in the passed array.

If there is more than one such value, only the smallest is returned. The bin-count for the modal bins is also returned.

Parameters ---------- a : array_like n-dimensional array of which to find mode(s). axis : int or None, optional Axis along which to operate. Default is 0. If None, compute over the whole array `a`. nan_policy : 'propagate', 'raise', 'omit', optional Defines how to handle when input contains nan. The following options are available (default is 'propagate'):

* 'propagate': returns nan * 'raise': throws an error * 'omit': performs the calculations ignoring nan values

Returns ------- mode : ndarray Array of modal values. count : ndarray Array of counts for each mode.

Examples -------- >>> a = np.array([[6, 8, 3, 0], ... [3, 2, 1, 7], ... [8, 1, 8, 4], ... [5, 3, 0, 5], ... [4, 7, 5, 9]]) >>> from scipy import stats >>> stats.mode(a) ModeResult(mode=array([[3, 1, 0, 0]]), count=array([[1, 1, 1, 1]]))

To get mode of whole array, specify ``axis=None``:

>>> stats.mode(a, axis=None) ModeResult(mode=array([3]), count=array([3]))

val moment : ?moment:[ `I of int | `Array_like_of_ints of Py.Object.t ] -> ?axis:[ `I of int | `None ] -> ?nan_policy:[ `Propagate | `Raise | `Omit ] -> a:[> `Ndarray ] Np.Obj.t -> unit -> Py.Object.t

Calculate the nth moment about the mean for a sample.

A moment is a specific quantitative measure of the shape of a set of points. It is often used to calculate coefficients of skewness and kurtosis due to its close relationship with them.

Parameters ---------- a : array_like Input array. moment : int or array_like of ints, optional Order of central moment that is returned. Default is 1. axis : int or None, optional Axis along which the central moment is computed. Default is 0. If None, compute over the whole array `a`. nan_policy : 'propagate', 'raise', 'omit', optional Defines how to handle when input contains nan. The following options are available (default is 'propagate'):

* 'propagate': returns nan * 'raise': throws an error * 'omit': performs the calculations ignoring nan values

Returns ------- n-th central moment : ndarray or float The appropriate moment along the given axis or over all values if axis is None. The denominator for the moment calculation is the number of observations, no degrees of freedom correction is done.

See Also -------- kurtosis, skew, describe

Notes ----- The k-th central moment of a data sample is:

.. math::

m_k = \frac{1}{n} \sum_{i = 1}^n (x_i - \bar{x})^k

Where n is the number of samples and x-bar is the mean. This function uses exponentiation by squares [1]_ for efficiency.

References ---------- .. [1] https://eli.thegreenplace.net/2009/03/21/efficient-integer-exponentiation-algorithms

Examples -------- >>> from scipy.stats import moment >>> moment([1, 2, 3, 4, 5], moment=1) 0.0 >>> moment([1, 2, 3, 4, 5], moment=2) 2.0
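The second result can be checked directly against the formula above (denominator n, no degrees-of-freedom correction):

>>> import numpy as np
>>> a = np.array([1, 2, 3, 4, 5])
>>> np.mean((a - a.mean())**2)   # same as moment(a, moment=2)
2.0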

val multiscale_graphcorr : ?compute_distance:Py.Object.t -> ?reps:int -> ?workers:[ `I of int | `Map_like_callable of Py.Object.t ] -> ?is_twosamp:bool -> ?random_state:[ `Np_random_RandomState_instance of Py.Object.t | `I of int ] -> x:Py.Object.t -> y:Py.Object.t -> unit -> float * float * Py.Object.t

Computes the Multiscale Graph Correlation (MGC) test statistic.

Specifically, for each point, MGC finds the :math:`k`-nearest neighbors for one property (e.g. cloud density), and the :math:`l`-nearest neighbors for the other property (e.g. grass wetness) [1]_. This pair :math:`(k, l)` is called the 'scale'. A priori, however, it is not known which scales will be most informative. So, MGC computes all distance pairs, and then efficiently computes the distance correlations for all scales. The local correlations illustrate which scales are relatively informative about the relationship. The key, therefore, to successfully discover and decipher relationships between disparate data modalities is to adaptively determine which scales are the most informative, and the geometric implication for the most informative scales. Doing so not only provides an estimate of whether the modalities are related, but also provides insight into how the determination was made. This is especially important in high-dimensional data, where simple visualizations do not reveal relationships to the unaided human eye. Characterizations of this implementation in particular have been derived from and benchmarked within [2]_.

Parameters ---------- x, y : ndarray If ``x`` and ``y`` have shapes ``(n, p)`` and ``(n, q)`` where `n` is the number of samples and `p` and `q` are the number of dimensions, then the MGC independence test will be run. Alternatively, ``x`` and ``y`` can have shapes ``(n, n)`` if they are distance or similarity matrices, and ``compute_distance`` must be set to ``None``. If ``x`` and ``y`` have shapes ``(n, p)`` and ``(m, p)``, an unpaired two-sample MGC test will be run. compute_distance : callable, optional A function that computes the distance or similarity among the samples within each data matrix. Set to ``None`` if ``x`` and ``y`` are already distance matrices. The default uses the Euclidean norm metric. If you are calling a custom function, either create the distance matrix beforehand or create a function of the form ``compute_distance(x)`` where `x` is the data matrix for which pairwise distances are calculated. reps : int, optional The number of replications used to estimate the null when using the permutation test. The default is ``1000``. workers : int or map-like callable, optional If ``workers`` is an int the population is subdivided into ``workers`` sections and evaluated in parallel (uses ``multiprocessing.Pool <multiprocessing>``). Supply ``-1`` to use all cores available to the Process. Alternatively supply a map-like callable, such as ``multiprocessing.Pool.map`` for evaluating the p-value in parallel. This evaluation is carried out as ``workers(func, iterable)``. Requires that `func` be pickleable. The default is ``1``. is_twosamp : bool, optional If `True`, a two sample test will be run. If ``x`` and ``y`` have shapes ``(n, p)`` and ``(m, p)``, this option will be overridden and set to ``True``. Set to ``True`` if ``x`` and ``y`` both have shapes ``(n, p)`` and a two sample test is desired. The default is ``False``. Note that this will not run if inputs are distance matrices. random_state : int or np.random.RandomState instance, optional If already a RandomState instance, use it. If seed is an int, return a new RandomState instance seeded with seed. If None, use np.random.RandomState. Default is None.

Returns ------- stat : float The sample MGC test statistic within ``[-1, 1]``. pvalue : float The p-value obtained via permutation. mgc_dict : dict Contains additional useful returns with the following keys:

  • mgc_map : ndarray A 2D representation of the latent geometry of the relationship.
  • opt_scale : (int, int) The estimated optimal scale as a `(x, y)` pair.
  • null_dist : list The null distribution derived from the permuted matrices.

See Also -------- pearsonr : Pearson correlation coefficient and p-value for testing non-correlation. kendalltau : Calculates Kendall's tau. spearmanr : Calculates a Spearman rank-order correlation coefficient.

Notes ----- A description of the process of MGC and applications on neuroscience data can be found in [1]_. It is performed using the following steps:

#. Two distance matrices :math:`D^X` and :math:`D^Y` are computed and modified to be mean zero columnwise. This results in two :math:`n \times n` distance matrices :math:`A` and :math:`B` (the centering and unbiased modification) [3]_.

#. For all values :math:`k` and :math:`l` from :math:`1, ..., n`,

* The :math:`k`-nearest neighbor and :math:`l`-nearest neighbor graphs are calculated for each property. Here, :math:`G_k (i, j)` indicates the :math:`k`-smallest values of the :math:`i`-th row of :math:`A` and :math:`H_l (i, j)` indicates the :math:`l`-smallest values of the :math:`i`-th row of :math:`B`.

* Let :math:`\circ` denote the entry-wise matrix product; then local correlations are summed and normalized using the following statistic:

.. math::

c^{kl} = \frac{\sum_{ij} A G_k B H_l}{\sqrt{\sum_{ij} A^2 G_k \times \sum_{ij} B^2 H_l}}

#. The MGC test statistic is the smoothed optimal local correlation of :math:`\{ c^{kl} \}`. Denote the smoothing operation as :math:`R(\cdot)` (which essentially sets all isolated large correlations to 0 and leaves connected large correlations the same as before, see [3]_). MGC is,

.. math::

MGC_n (x, y) = \max_{(k, l)} R \left( c^{kl} \left( x_n, y_n \right) \right)

The test statistic returns a value between :math:`(-1, 1)` since it is normalized.

The p-value returned is calculated using a permutation test. This process is completed by first randomly permuting :math:`y` to estimate the null distribution and then calculating the probability of observing a test statistic, under the null, at least as extreme as the observed test statistic.

MGC requires at least 5 samples to run with reliable results. It can also handle high-dimensional data sets. In addition, by manipulating the input data matrices, the two-sample testing problem can be reduced to the independence testing problem [4]_. Given sample data :math:`U` and :math:`V` of sizes :math:`p \times n` and :math:`p \times m`, data matrices :math:`X` and :math:`Y` can be created as follows:

.. math::

X = [U | V] \in \mathcal{R}^{p \times (n + m)}

Y = [0_{1 \times n} | 1_{1 \times m}] \in \mathcal{R}^{(n + m)}

Then, the MGC statistic can be calculated as normal. This methodology can be extended to similar tests such as distance correlation [4]_.
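A hedged NumPy sketch of this two-sample-to-independence reduction (shapes picked arbitrarily; `multiscale_graphcorr` performs the equivalent construction internally when given unequal sample counts):

>>> import numpy as np
>>> rng = np.random.RandomState(0)
>>> u = rng.rand(3, 10)                               # U: p x n, first sample
>>> v = rng.rand(3, 15)                               # V: p x m, second sample
>>> X = np.hstack((u, v))                             # X = [U | V]
>>> Y = np.concatenate((np.zeros(10), np.ones(15)))   # 0-labels then 1-labels
>>> X.shape, Y.shape
((3, 25), (25,))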

.. versionadded:: 1.4.0

References ---------- .. 1 Vogelstein, J. T., Bridgeford, E. W., Wang, Q., Priebe, C. E., Maggioni, M., & Shen, C. (2019). Discovering and deciphering relationships across disparate data modalities. ELife. .. 2 Panda, S., Palaniappan, S., Xiong, J., Swaminathan, A., Ramachandran, S., Bridgeford, E. W., ... Vogelstein, J. T. (2019). mgcpy: A Comprehensive High Dimensional Independence Testing Python Package. ArXiv:1907.02088 Cs, Stat. .. 3 Shen, C., Priebe, C.E., & Vogelstein, J. T. (2019). From distance correlation to multiscale graph correlation. Journal of the American Statistical Association. .. 4 Shen, C. & Vogelstein, J. T. (2018). The Exact Equivalence of Distance and Kernel Methods for Hypothesis Testing. ArXiv:1806.05514 Cs, Stat.

Examples -------- >>> from scipy.stats import multiscale_graphcorr >>> x = np.arange(100) >>> y = x >>> stat, pvalue, _ = multiscale_graphcorr(x, y, workers=-1) >>> '%.1f, %.3f' % (stat, pvalue) '1.0, 0.001'

Alternatively,

>>> x = np.arange(100) >>> y = x >>> mgc = multiscale_graphcorr(x, y) >>> '%.1f, %.3f' % (mgc.stat, mgc.pvalue) '1.0, 0.001'

To run an unpaired two-sample test,

>>> x = np.arange(100) >>> y = np.arange(79) >>> mgc = multiscale_graphcorr(x, y, random_state=1) >>> '%.3f, %.2f' % (mgc.stat, mgc.pvalue) '0.033, 0.02'

or, if the shapes of the inputs are the same,

>>> x = np.arange(100) >>> y = x >>> mgc = multiscale_graphcorr(x, y, is_twosamp=True) >>> '%.3f, %.1f' % (mgc.stat, mgc.pvalue) '-0.008, 1.0'

val namedtuple : ?rename:Py.Object.t -> ?defaults:Py.Object.t -> ?module_:Py.Object.t -> typename:Py.Object.t -> field_names:Py.Object.t -> unit -> Py.Object.t

Returns a new subclass of tuple with named fields.

>>> Point = namedtuple('Point', ['x', 'y']) >>> Point.__doc__ # docstring for the new class 'Point(x, y)' >>> p = Point(11, y=22) # instantiate with positional args or keywords >>> p[0] + p[1] # indexable like a plain tuple 33 >>> x, y = p # unpack like a regular tuple >>> x, y (11, 22) >>> p.x + p.y # fields also accessible by name 33 >>> d = p._asdict() # convert to a dictionary >>> d['x'] 11 >>> Point(**d) # convert from a dictionary Point(x=11, y=22) >>> p._replace(x=100) # _replace() is like str.replace() but targets named fields Point(x=100, y=22)

val normaltest : ?axis:[ `I of int | `None ] -> ?nan_policy:[ `Propagate | `Raise | `Omit ] -> a:[> `Ndarray ] Np.Obj.t -> unit -> Py.Object.t * Py.Object.t

Test whether a sample differs from a normal distribution.

This function tests the null hypothesis that a sample comes from a normal distribution. It is based on D'Agostino and Pearson's [1]_, [2]_ test that combines skew and kurtosis to produce an omnibus test of normality.

Parameters ---------- a : array_like The array containing the sample to be tested. axis : int or None, optional Axis along which to compute test. Default is 0. If None, compute over the whole array `a`. nan_policy : 'propagate', 'raise', 'omit', optional Defines how to handle when input contains nan. The following options are available (default is 'propagate'):

* 'propagate': returns nan * 'raise': throws an error * 'omit': performs the calculations ignoring nan values

Returns ------- statistic : float or array ``s^2 + k^2``, where ``s`` is the z-score returned by `skewtest` and ``k`` is the z-score returned by `kurtosistest`. pvalue : float or array A 2-sided chi squared probability for the hypothesis test.

References ---------- .. [1] D'Agostino, R. B. (1971), 'An omnibus test of normality for moderate and large sample size', Biometrika, 58, 341-348

.. [2] D'Agostino, R. and Pearson, E. S. (1973), 'Tests for departure from normality', Biometrika, 60, 613-622

Examples -------- >>> from scipy import stats >>> pts = 1000 >>> np.random.seed(28041990) >>> a = np.random.normal(0, 1, size=pts) >>> b = np.random.normal(2, 1, size=pts) >>> x = np.concatenate((a, b)) >>> k2, p = stats.normaltest(x) >>> alpha = 1e-3 >>> print('p = {:g}'.format(p)) p = 3.27207e-11 >>> if p < alpha: # null hypothesis: x comes from a normal distribution ... print('The null hypothesis can be rejected') ... else: ... print('The null hypothesis cannot be rejected') The null hypothesis can be rejected
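Continuing that example, the statistic can be reassembled from `skewtest` and `kurtosistest` as described under Returns (a hedged check):

>>> s, _ = stats.skewtest(x)
>>> k, _ = stats.kurtosistest(x)
>>> np.isclose(k2, s**2 + k**2)
True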

val obrientransform : Py.Object.t list -> [ `ArrayLike | `Ndarray | `Object ] Np.Obj.t

Compute the O'Brien transform on input data (any number of arrays).

Used to test for homogeneity of variance prior to running one-way stats. Each array in ``*args`` is one level of a factor. If `f_oneway` is run on the transformed data and found significant, the variances are unequal. From Maxwell and Delaney [1]_, p.112.

Parameters ---------- args : tuple of array_like Any number of arrays.

Returns ------- obrientransform : ndarray Transformed data for use in an ANOVA. The first dimension of the result corresponds to the sequence of transformed arrays. If the arrays given are all 1-D of the same length, the return value is a 2-D array; otherwise it is a 1-D array of type object, with each element being an ndarray.

References ---------- .. [1] S. E. Maxwell and H. D. Delaney, 'Designing Experiments and Analyzing Data: A Model Comparison Perspective', Wadsworth, 1990.

Examples -------- We'll test the following data sets for differences in their variance.

>>> x = [10, 11, 13, 9, 7, 12, 12, 9, 10] >>> y = [13, 21, 5, 10, 8, 14, 10, 12, 7, 15]

Apply the O'Brien transform to the data.

>>> from scipy.stats import obrientransform >>> tx, ty = obrientransform(x, y)

Use `scipy.stats.f_oneway` to apply a one-way ANOVA test to the transformed data.

>>> from scipy.stats import f_oneway >>> F, p = f_oneway(tx, ty) >>> p 0.1314139477040335

If we require that ``p < 0.05`` for significance, we cannot conclude that the variances are different.

val pearsonr : x:[> `Ndarray ] Np.Obj.t -> y:[> `Ndarray ] Np.Obj.t -> unit -> float

Pearson correlation coefficient and p-value for testing non-correlation.

The Pearson correlation coefficient [1]_ measures the linear relationship between two datasets. The calculation of the p-value relies on the assumption that each dataset is normally distributed. (See Kowalski [3]_ for a discussion of the effects of non-normality of the input on the distribution of the correlation coefficient.) Like other correlation coefficients, this one varies between -1 and +1 with 0 implying no correlation. Correlations of -1 or +1 imply an exact linear relationship. Positive correlations imply that as x increases, so does y. Negative correlations imply that as x increases, y decreases.

The p-value roughly indicates the probability of an uncorrelated system producing datasets that have a Pearson correlation at least as extreme as the one computed from these datasets.

Parameters ---------- x : (N,) array_like Input array. y : (N,) array_like Input array.

Returns ------- r : float Pearson's correlation coefficient. p-value : float Two-tailed p-value.

Warns ----- PearsonRConstantInputWarning Raised if an input is a constant array. The correlation coefficient is not defined in this case, so ``np.nan`` is returned.

PearsonRNearConstantInputWarning Raised if an input is 'nearly' constant. The array ``x`` is considered nearly constant if ``norm(x - mean(x)) < 1e-13 * abs(mean(x))``. Numerical errors in the calculation ``x - mean(x)`` in this case might result in an inaccurate calculation of r.

See Also -------- spearmanr : Spearman rank-order correlation coefficient. kendalltau : Kendall's tau, a correlation measure for ordinal data.

Notes ----- The correlation coefficient is calculated as follows:

.. math::

r = \frac{\sum (x - m_x) (y - m_y)}{\sqrt{\sum (x - m_x)^2 \sum (y - m_y)^2}}

where :math:`m_x` is the mean of the vector :math:`x` and :math:`m_y` is the mean of the vector :math:`y`.

Under the assumption that x and y are drawn from independent normal distributions (so the population correlation coefficient is 0), the probability density function of the sample correlation coefficient r is ([1]_, [2]_)::

f(r) = (1 - r**2)**(n/2 - 2) / B(1/2, n/2 - 1)

where n is the number of samples, and B is the beta function. This is sometimes referred to as the exact distribution of r. This is the distribution that is used in `pearsonr` to compute the p-value. The distribution is a beta distribution on the interval [-1, 1], with equal shape parameters a = b = n/2 - 1. In terms of SciPy's implementation of the beta distribution, the distribution of r is::

dist = scipy.stats.beta(n/2 - 1, n/2 - 1, loc=-1, scale=2)

The p-value returned by `pearsonr` is a two-sided p-value. For a given sample with correlation coefficient r, the p-value is the probability that abs(r') of a random sample x' and y' drawn from the population with zero correlation would be greater than or equal to abs(r). In terms of the object ``dist`` shown above, the p-value for a given r and length n can be computed as::

p = 2*dist.cdf(-abs(r))

When n is 2, the above continuous distribution is not well-defined. One can interpret the limit of the beta distribution as the shape parameters a and b approach a = b = 0 as a discrete distribution with equal probability masses at r = 1 and r = -1. More directly, one can observe that, given the data x = [x1, x2] and y = [y1, y2], and assuming x1 != x2 and y1 != y2, the only possible values for r are 1 and -1. Because abs(r') for any sample x' and y' with length 2 will be 1, the two-sided p-value for a sample of length 2 is always 1.
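A hedged sketch verifying the ``2*dist.cdf(-abs(r))`` recipe on the data from the Examples below (n = 7):

>>> import numpy as np
>>> from scipy import stats
>>> r, p = stats.pearsonr([0, 0, 0, 1, 1, 1, 1], np.arange(7))
>>> n = 7
>>> dist = stats.beta(n/2 - 1, n/2 - 1, loc=-1, scale=2)
>>> np.isclose(p, 2*dist.cdf(-abs(r)))
True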

References ---------- .. [1] 'Pearson correlation coefficient', Wikipedia, https://en.wikipedia.org/wiki/Pearson_correlation_coefficient .. [2] Student, 'Probable error of a correlation coefficient', Biometrika, Volume 6, Issue 2-3, 1 September 1908, pp. 302-310. .. [3] C. J. Kowalski, 'On the Effects of Non-Normality on the Distribution of the Sample Product-Moment Correlation Coefficient', Journal of the Royal Statistical Society. Series C (Applied Statistics), Vol. 21, No. 1 (1972), pp. 1-12.

Examples -------- >>> from scipy import stats >>> a = np.array([0, 0, 0, 1, 1, 1, 1]) >>> b = np.arange(7) >>> stats.pearsonr(a, b) (0.8660254037844386, 0.011724811003954649)

>>> stats.pearsonr([1, 2, 3, 4, 5], [10, 9, 2.5, 6, 4]) (-0.7426106572325057, 0.1505558088534455)

val percentileofscore : ?kind:[ `Rank | `Weak | `Strict | `Mean ] -> a:[> `Ndarray ] Np.Obj.t -> score:[ `F of float | `I of int ] -> unit -> float

Compute the percentile rank of a score relative to a list of scores.

A `percentileofscore` of, for example, 80% means that 80% of the scores in `a` are below the given score. In the case of gaps or ties, the exact definition depends on the optional keyword, `kind`.

Parameters ---------- a : array_like Array of scores to which `score` is compared. score : int or float Score that is compared to the elements in `a`. kind : 'rank', 'weak', 'strict', 'mean', optional Specifies the interpretation of the resulting score. The following options are available (default is 'rank'):

* 'rank': Average percentage ranking of score. In case of multiple matches, average the percentage rankings of all matching scores. * 'weak': This kind corresponds to the definition of a cumulative distribution function. A percentileofscore of 80% means that 80% of values are less than or equal to the provided score. * 'strict': Similar to 'weak', except that only values that are strictly less than the given score are counted. * 'mean': The average of the 'weak' and 'strict' scores, often used in testing. See https://en.wikipedia.org/wiki/Percentile_rank

Returns ------- pcos : float Percentile-position of score (0-100) relative to `a`.

See Also -------- numpy.percentile

Examples -------- Three-quarters of the given values lie below a given score:

>>> from scipy import stats >>> stats.percentileofscore([1, 2, 3, 4], 3) 75.0

With multiple matches, note how the scores of the two matches, 0.6 and 0.8 respectively, are averaged:

>>> stats.percentileofscore([1, 2, 3, 3, 4], 3) 70.0

Only 2/5 values are strictly less than 3:

>>> stats.percentileofscore([1, 2, 3, 3, 4], 3, kind='strict') 40.0

But 4/5 values are less than or equal to 3:

>>> stats.percentileofscore([1, 2, 3, 3, 4], 3, kind='weak') 80.0

The average between the weak and the strict scores is:

>>> stats.percentileofscore([1, 2, 3, 3, 4], 3, kind='mean') 60.0

val pointbiserialr : x:Py.Object.t -> y:[> `Ndarray ] Np.Obj.t -> unit -> float * float

Calculate a point biserial correlation coefficient and its p-value.

The point biserial correlation is used to measure the relationship between a binary variable, x, and a continuous variable, y. Like other correlation coefficients, this one varies between -1 and +1 with 0 implying no correlation. Correlations of -1 or +1 imply a determinative relationship.

This function uses a shortcut formula but produces the same result as `pearsonr`.

Parameters ---------- x : array_like of bools Input array. y : array_like Input array.

Returns ------- correlation : float R value. pvalue : float Two-sided p-value.

Notes ----- `pointbiserialr` uses a t-test with ``n-1`` degrees of freedom. It is equivalent to `pearsonr`.

The value of the point-biserial correlation can be calculated from:

.. math::

r_{pb} = \frac{\overline{Y_1} - \overline{Y_0}}{s_y} \sqrt{\frac{N_0 N_1}{N(N - 1)}}

where :math:`\overline{Y_0}` and :math:`\overline{Y_1}` are the means of the metric observations coded 0 and 1 respectively; :math:`N_0` and :math:`N_1` are the numbers of observations coded 0 and 1 respectively; :math:`N` is the total number of observations and :math:`s_y` is the standard deviation of all the metric observations.

A value of :math:`r_{pb}` that is significantly different from zero is completely equivalent to a significant difference in means between the two groups. Thus, an independent groups t-test with :math:`N-2` degrees of freedom may be used to test whether :math:`r_{pb}` is nonzero. The relation between the t-statistic for comparing two independent groups and :math:`r_{pb}` is given by:

.. math::

t = \sqrt{N - 2} \frac{r_{pb}}{\sqrt{1 - r_{pb}^{2}}}

References ---------- .. [1] J. Lev, 'The Point Biserial Coefficient of Correlation', Ann. Math. Statist., Vol. 20, No. 1, pp. 125-126, 1949.

.. [2] R.F. Tate, 'Correlation Between a Discrete and a Continuous Variable. Point-Biserial Correlation.', Ann. Math. Statist., Vol. 25, No. 3, pp. 603-607, 1954.

.. [3] D. Kornbrot, 'Point Biserial Correlation', In Wiley StatsRef: Statistics Reference Online (eds N. Balakrishnan, et al.), 2014. https://doi.org/10.1002/9781118445112.stat06227

Examples -------- >>> from scipy import stats >>> a = np.array([0, 0, 0, 1, 1, 1, 1]) >>> b = np.arange(7) >>> stats.pointbiserialr(a, b) (0.8660254037844386, 0.011724811003954652) >>> stats.pearsonr(a, b) (0.86602540378443871, 0.011724811003954626) >>> np.corrcoef(a, b) array([[ 1. , 0.8660254], [ 0.8660254, 1. ]])
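The Notes formula can be checked by hand on this data (a hedged sketch; :math:`s_y` is taken here as the sample standard deviation, ``ddof=1``):

>>> y1, y0 = b[a == 1].mean(), b[a == 0].mean()    # group means: 4.5 and 1.0
>>> n1, n0, n = (a == 1).sum(), (a == 0).sum(), len(a)
>>> round(float((y1 - y0) / b.std(ddof=1) * np.sqrt(n1 * n0 / (n * (n - 1)))), 10)
0.8660254038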

val power_divergence : ?f_exp:[> `Ndarray ] Np.Obj.t -> ?ddof:int -> ?axis:[ `I of int | `None ] -> ?lambda_:[ `F of float | `S of string ] -> f_obs:[> `Ndarray ] Np.Obj.t -> unit -> Py.Object.t * Py.Object.t

Cressie-Read power divergence statistic and goodness of fit test.

This function tests the null hypothesis that the categorical data has the given frequencies, using the Cressie-Read power divergence statistic.

Parameters ---------- f_obs : array_like Observed frequencies in each category. f_exp : array_like, optional Expected frequencies in each category. By default the categories are assumed to be equally likely. ddof : int, optional 'Delta degrees of freedom': adjustment to the degrees of freedom for the p-value. The p-value is computed using a chi-squared distribution with ``k - 1 - ddof`` degrees of freedom, where `k` is the number of observed frequencies. The default value of `ddof` is 0. axis : int or None, optional The axis of the broadcast result of `f_obs` and `f_exp` along which to apply the test. If axis is None, all values in `f_obs` are treated as a single data set. Default is 0. lambda_ : float or str, optional The power in the Cressie-Read power divergence statistic. The default is 1. For convenience, `lambda_` may be assigned one of the following strings, in which case the corresponding numerical value is used::

String               Value   Description
'pearson'              1     Pearson's chi-squared statistic. In this case, the function is equivalent to `stats.chisquare`.
'log-likelihood'       0     Log-likelihood ratio. Also known as the G-test 3_.
'freeman-tukey'       -1/2   Freeman-Tukey statistic.
'mod-log-likelihood'  -1     Modified log-likelihood ratio.
'neyman'              -2     Neyman's statistic.
'cressie-read'         2/3   The power recommended in 5_.
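For instance, with ``lambda_='pearson'`` the statistic reduces to the familiar chi-squared sum; a quick numerical check of the table above:

>>> import numpy as np
>>> from scipy.stats import power_divergence
>>> obs = np.asarray([16, 18, 16, 14, 12, 12])
>>> exp = np.full(obs.size, obs.mean())  # categories equally likely by default
>>> stat, p = power_divergence(obs, lambda_='pearson')
>>> np.isclose(stat, ((obs - exp)**2 / exp).sum())
True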

Returns ------- statistic : float or ndarray The Cressie-Read power divergence test statistic. The value is a float if `axis` is None or if `f_obs` and `f_exp` are 1-D. pvalue : float or ndarray The p-value of the test. The value is a float if `ddof` and the return value `stat` are scalars.

See Also -------- chisquare

Notes ----- This test is invalid when the observed or expected frequencies in each category are too small. A typical rule is that all of the observed and expected frequencies should be at least 5.

When `lambda_` is less than zero, the formula for the statistic involves dividing by `f_obs`, so a warning or error may be generated if any value in `f_obs` is 0.

Similarly, a warning or error may be generated if any value in `f_exp` is zero when `lambda_` >= 0.

The default degrees of freedom, k-1, are for the case when no parameters of the distribution are estimated. If p parameters are estimated by efficient maximum likelihood then the correct degrees of freedom are k-1-p. If the parameters are estimated in a different way, then the dof can be between k-1-p and k-1. However, it is also possible that the asymptotic distribution is not a chisquare, in which case this test is not appropriate.

This function handles masked arrays. If an element of `f_obs` or `f_exp` is masked, then data at that position is ignored, and does not count towards the size of the data set.

.. versionadded:: 0.13.0

References ---------- .. 1 Lowry, Richard. 'Concepts and Applications of Inferential Statistics'. Chapter 8. https://web.archive.org/web/20171015035606/http://faculty.vassar.edu/lowry/ch8pt1.html .. 2 'Chi-squared test', https://en.wikipedia.org/wiki/Chi-squared_test .. 3 'G-test', https://en.wikipedia.org/wiki/G-test .. 4 Sokal, R. R. and Rohlf, F. J. 'Biometry: the principles and practice of statistics in biological research', New York: Freeman (1981) .. 5 Cressie, N. and Read, T. R. C., 'Multinomial Goodness-of-Fit Tests', J. Royal Stat. Soc. Series B, Vol. 46, No. 3 (1984), pp. 440-464.

Examples -------- (See `chisquare` for more examples.)

When just `f_obs` is given, it is assumed that the expected frequencies are uniform and given by the mean of the observed frequencies. Here we perform a G-test (i.e. use the log-likelihood ratio statistic):

>>> from scipy.stats import power_divergence >>> power_divergence([16, 18, 16, 14, 12, 12], lambda_='log-likelihood') (2.006573162632538, 0.84823476779463769)

The expected frequencies can be given with the `f_exp` argument:

>>> power_divergence([16, 18, 16, 14, 12, 12], ... f_exp=[16, 16, 16, 16, 16, 8], ... lambda_='log-likelihood') (3.3281031458963746, 0.6495419288047497)

When `f_obs` is 2-D, by default the test is applied to each column.

>>> obs = np.array([[16, 18, 16, 14, 12, 12], [32, 24, 16, 28, 20, 24]]).T >>> obs.shape (6, 2) >>> power_divergence(obs, lambda_='log-likelihood') (array([ 2.00657316, 6.77634498]), array([ 0.84823477, 0.23781225]))

By setting ``axis=None``, the test is applied to all data in the array, which is equivalent to applying the test to the flattened array.

>>> power_divergence(obs, axis=None) (23.31034482758621, 0.015975692534127565) >>> power_divergence(obs.ravel()) (23.31034482758621, 0.015975692534127565)

`ddof` is the change to make to the default degrees of freedom.

>>> power_divergence([16, 18, 16, 14, 12, 12], ddof=1) (2.0, 0.73575888234288467)

The calculation of the p-values is done by broadcasting the test statistic with `ddof`.

>>> power_divergence([16, 18, 16, 14, 12, 12], ddof=[0,1,2]) (2.0, array([ 0.84914504, 0.73575888, 0.5724067 ]))

`f_obs` and `f_exp` are also broadcast. In the following, `f_obs` has shape (6,) and `f_exp` has shape (2, 6), so the result of broadcasting `f_obs` and `f_exp` has shape (2, 6). To compute the desired chi-squared statistics, we must use ``axis=1``:

>>> power_divergence([16, 18, 16, 14, 12, 12], ... f_exp=[[16, 16, 16, 16, 16, 8], ... [8, 20, 20, 16, 12, 12]], ... axis=1) (array([ 3.5 , 9.25]), array([ 0.62338763, 0.09949846]))

val rankdata : ?method_:[ `Average | `Min | `Max | `Dense | `Ordinal ] -> ?axis:int -> a:[> `Ndarray ] Np.Obj.t -> unit -> [ `ArrayLike | `Ndarray | `Object ] Np.Obj.t

Assign ranks to data, dealing with ties appropriately.

By default (``axis=None``), the data array is first flattened, and a flat array of ranks is returned. Separately reshape the rank array to the shape of the data array if desired (see Examples).

Ranks begin at 1. The `method` argument controls how ranks are assigned to equal values. See 1_ for further discussion of ranking methods.

Parameters ---------- a : array_like The array of values to be ranked. method : 'average', 'min', 'max', 'dense', 'ordinal', optional The method used to assign ranks to tied elements. The following methods are available (default is 'average'):

* 'average': The average of the ranks that would have been assigned to all the tied values is assigned to each value. * 'min': The minimum of the ranks that would have been assigned to all the tied values is assigned to each value. (This is also referred to as 'competition' ranking.) * 'max': The maximum of the ranks that would have been assigned to all the tied values is assigned to each value. * 'dense': Like 'min', but the rank of the next highest element is assigned the rank immediately after those assigned to the tied elements. * 'ordinal': All values are given a distinct rank, corresponding to the order that the values occur in `a`. axis : None, int, optional Axis along which to perform the ranking. If ``None``, the data array is first flattened.
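As an illustration of what 'dense' ranking computes, the same ranks can be obtained from ``np.unique`` (a sketch of the definition, not of how `rankdata` is implemented):

>>> import numpy as np
>>> a = np.array([0, 2, 3, 2])
>>> np.unique(a, return_inverse=True)[1] + 1  # dense ranks, starting at 1
array([1, 2, 3, 2])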

Returns ------- ranks : ndarray An array of size equal to the size of `a`, containing rank scores.

References ---------- .. 1 'Ranking', https://en.wikipedia.org/wiki/Ranking

Examples -------- >>> from scipy.stats import rankdata >>> rankdata([0, 2, 3, 2]) array([ 1. , 2.5, 4. , 2.5]) >>> rankdata([0, 2, 3, 2], method='min') array([ 1, 2, 4, 2]) >>> rankdata([0, 2, 3, 2], method='max') array([ 1, 3, 4, 3]) >>> rankdata([0, 2, 3, 2], method='dense') array([ 1, 2, 3, 2]) >>> rankdata([0, 2, 3, 2], method='ordinal') array([ 1, 2, 4, 3]) >>> rankdata([[0, 2], [3, 2]]).reshape(2,2) array([[1. , 2.5], [4. , 2.5]]) >>> rankdata([[0, 2, 2], [3, 2, 5]], axis=1) array([[1. , 2.5, 2.5], [2. , 1. , 3. ]])

val ranksums : x:Py.Object.t -> y:Py.Object.t -> unit -> float * float

Compute the Wilcoxon rank-sum statistic for two samples.

The Wilcoxon rank-sum test tests the null hypothesis that two sets of measurements are drawn from the same distribution. The alternative hypothesis is that values in one sample are more likely to be larger than the values in the other sample.

This test should be used to compare two samples from continuous distributions. It does not handle ties between measurements in x and y. For tie-handling and an optional continuity correction see `scipy.stats.mannwhitneyu`.

Parameters ---------- x,y : array_like The data from the two samples.

Returns ------- statistic : float The test statistic under the large-sample approximation that the rank sum statistic is normally distributed. pvalue : float The two-sided p-value of the test.
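The large-sample statistic can be reproduced by hand from the combined ranks (a minimal sketch assuming no ties, following the normal approximation described above):

>>> import numpy as np
>>> from scipy import stats
>>> x = np.array([1.1, 2.3, 3.5])
>>> y = np.array([0.8, 4.2, 5.6, 7.1])
>>> n1, n2 = len(x), len(y)
>>> s = stats.rankdata(np.concatenate([x, y]))[:n1].sum()  # rank sum of x
>>> z = (s - n1 * (n1 + n2 + 1) / 2) / np.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
>>> np.isclose(z, stats.ranksums(x, y).statistic)
True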

References ---------- .. 1 https://en.wikipedia.org/wiki/Wilcoxon_rank-sum_test

Examples -------- We can test the hypothesis that two independent unequal-sized samples are drawn from the same distribution by computing the Wilcoxon rank-sum statistic.

>>> from scipy.stats import ranksums >>> sample1 = np.random.uniform(-1, 1, 200) >>> sample2 = np.random.uniform(-0.5, 1.5, 300) # a shifted distribution >>> ranksums(sample1, sample2) RanksumsResult(statistic=-7.887059, pvalue=3.09390448e-15) # may vary

The p-value of less than ``0.05`` indicates that this test rejects the hypothesis at the 5% significance level.

val relfreq : ?numbins:int -> ?defaultreallimits:Py.Object.t -> ?weights:[> `Ndarray ] Np.Obj.t -> a:[> `Ndarray ] Np.Obj.t -> unit -> [ `ArrayLike | `Ndarray | `Object ] Np.Obj.t * float * float * int

Return a relative frequency histogram, using the histogram function.

A relative frequency histogram is a mapping of the number of observations in each of the bins relative to the total number of observations.

Parameters ---------- a : array_like Input array. numbins : int, optional The number of bins to use for the histogram. Default is 10. defaultreallimits : tuple (lower, upper), optional The lower and upper values for the range of the histogram. If no value is given, a range slightly larger than the range of the values in a is used. Specifically ``(a.min() - s, a.max() + s)``, where ``s = (1/2)(a.max() - a.min()) / (numbins - 1)``. weights : array_like, optional The weights for each value in `a`. Default is None, which gives each value a weight of 1.0

Returns ------- frequency : ndarray Binned values of relative frequency. lowerlimit : float Lower real limit. binsize : float Width of each bin. extrapoints : int Extra points.

Examples -------- >>> import matplotlib.pyplot as plt >>> from scipy import stats >>> a = np.array([2, 4, 1, 2, 3, 2]) >>> res = stats.relfreq(a, numbins=4) >>> res.frequency array([ 0.16666667, 0.5 , 0.16666667, 0.16666667]) >>> np.sum(res.frequency) # relative frequencies should add up to 1 1.0

Create a normal distribution with 1000 random values

>>> rng = np.random.RandomState(seed=12345) >>> samples = stats.norm.rvs(size=1000, random_state=rng)

Calculate relative frequencies

>>> res = stats.relfreq(samples, numbins=25)

Calculate space of values for x

>>> x = res.lowerlimit + np.linspace(0, res.binsize*res.frequency.size, ... res.frequency.size)

Plot relative frequency histogram

>>> fig = plt.figure(figsize=(5, 4)) >>> ax = fig.add_subplot(1, 1, 1) >>> ax.bar(x, res.frequency, width=res.binsize) >>> ax.set_title('Relative frequency histogram') >>> ax.set_xlim(x.min(), x.max())

>>> plt.show()

val rng_integers : ?high:[ `I of int | `Array_like_of_ints of Py.Object.t ] -> ?size:Py.Object.t -> ?dtype:[ `S of string | `Dtype of Np.Dtype.t ] -> ?endpoint:bool -> gen:[ `PyObject of Py.Object.t | `None ] -> low:[ `I of int | `Array_like_of_ints of Py.Object.t ] -> unit -> Py.Object.t

Return random integers from low (inclusive) to high (exclusive), or if endpoint=True, low (inclusive) to high (inclusive). Replaces `RandomState.randint` (with endpoint=False) and `RandomState.random_integers` (with endpoint=True).

Return random integers from the 'discrete uniform' distribution of the specified dtype. If high is None (the default), then results are from 0 to low.

Parameters ---------- gen: None, np.random.RandomState, np.random.Generator Random number generator. If None, then the np.random.RandomState singleton is used. low: int or array-like of ints Lowest (signed) integers to be drawn from the distribution (unless high=None, in which case this parameter is 0 and this value is used for high). high: int or array-like of ints If provided, one above the largest (signed) integer to be drawn from the distribution (see above for behavior if high=None). If array-like, must contain integer values. size: None Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned. dtype: str, dtype, optional Desired dtype of the result. All dtypes are determined by their name, i.e., 'int64', 'int', etc, so byteorder is not available and a specific precision may have different C types depending on the platform. The default value is np.int_. endpoint: bool, optional If True, sample from the interval ``[low, high]`` instead of the default ``[low, high)``. Defaults to False. Returns ------- out: int or ndarray of ints size-shaped array of random integers from the appropriate distribution, or a single such random int if size not provided.
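Since this docstring carries no examples, here is a sketch of the two NumPy behaviours it unifies (seed and values are for illustration only):

>>> import numpy as np
>>> rs = np.random.RandomState(0)
>>> rs.randint(0, 10, size=4)  # RandomState.randint: half-open interval [0, 10)
array([5, 0, 3, 3])
>>> gen = np.random.default_rng(0)
>>> draws = gen.integers(0, 9, size=4, endpoint=True)  # closed interval [0, 9]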

val rvs_ratio_uniforms : ?size:int list -> ?c:float -> ?random_state:[ `I of int | `PyObject of Py.Object.t ] -> pdf:Py.Object.t -> umax:float -> vmin:float -> vmax:float -> unit -> [ `ArrayLike | `Ndarray | `Object ] Np.Obj.t

Generate random samples from a probability density function using the ratio-of-uniforms method.

Parameters ---------- pdf : callable A function with signature `pdf(x)` that is proportional to the probability density function of the distribution. umax : float The upper bound of the bounding rectangle in the u-direction. vmin : float The lower bound of the bounding rectangle in the v-direction. vmax : float The upper bound of the bounding rectangle in the v-direction. size : int or tuple of ints, optional Defining number of random variates (default is 1). c : float, optional. Shift parameter of ratio-of-uniforms method, see Notes. Default is 0. random_state : None, int, `~np.random.RandomState`, `~np.random.Generator`, optional If `random_state` is `None` the `~np.random.RandomState` singleton is used. If `random_state` is an int, a new ``RandomState`` instance is used, seeded with random_state. If `random_state` is already a ``RandomState`` or ``Generator`` instance, then that object is used. Default is None.

Returns ------- rvs : ndarray The random variates distributed according to the probability distribution defined by the pdf.

Notes ----- Given a univariate probability density function `pdf` and a constant `c`, define the set ``A = {(u, v) : 0 < u <= sqrt(pdf(v/u + c))}``. If `(U, V)` is a random vector uniformly distributed over `A`, then `V/U + c` follows a distribution according to `pdf`.

The above result (see 1_, 2_) can be used to sample random variables using only the pdf, i.e. no inversion of the cdf is required. Typical choices of `c` are zero or the mode of `pdf`. The set `A` is a subset of the rectangle ``R = [0, umax] x [vmin, vmax]`` where

  • ``umax = sup sqrt(pdf(x))``
  • ``vmin = inf (x - c) sqrt(pdf(x))``
  • ``vmax = sup (x - c) sqrt(pdf(x))``

In particular, these values are finite if `pdf` is bounded and ``x**2 * pdf(x)`` is bounded (i.e. subquadratic tails). One can generate `(U, V)` uniformly on `R` and return `V/U + c` if `(U, V)` are also in `A` which can be directly verified.
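A minimal sketch of this accept/reject step (``rou_sample`` is a hypothetical helper for illustration, not part of scipy):

>>> import numpy as np
>>> def rou_sample(pdf, umax, vmin, vmax, c=0.0, rng=None):
...     # Draw (U, V) uniformly on R = [0, umax] x [vmin, vmax] and accept
...     # when (U, V) lies in A, i.e. when u <= sqrt(pdf(v/u + c)).
...     rng = np.random.default_rng() if rng is None else rng
...     while True:
...         u = rng.uniform(0, umax)
...         v = rng.uniform(vmin, vmax)
...         if u > 0 and u <= np.sqrt(pdf(v / u + c)):
...             return v / u + c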

The algorithm is not changed if one replaces `pdf` by k * `pdf` for any constant k > 0. Thus, it is often convenient to work with a function that is proportional to the probability density function by dropping unnecessary normalization factors.

Intuitively, the method works well if `A` fills up most of the enclosing rectangle such that the probability is high that `(U, V)` lies in `A` whenever it lies in `R` as the number of required iterations becomes too large otherwise. To be more precise, note that the expected number of iterations to draw `(U, V)` uniformly distributed on `R` such that `(U, V)` is also in `A` is given by the ratio ``area(R) / area(A) = 2 * umax * (vmax - vmin) / area(pdf)``, where `area(pdf)` is the integral of `pdf` (which is equal to one if the probability density function is used but can take on other values if a function proportional to the density is used). The equality holds since the area of `A` is equal to 0.5 * area(pdf) (Theorem 7.1 in 1_). If the sampling fails to generate a single random variate after 50000 iterations (i.e. not a single draw is in `A`), an exception is raised.

If the bounding rectangle is not correctly specified (i.e. if it does not contain `A`), the algorithm samples from a distribution different from the one given by `pdf`. It is therefore recommended to perform a test such as `~scipy.stats.kstest` as a check.

References ---------- .. 1 L. Devroye, 'Non-Uniform Random Variate Generation', Springer-Verlag, 1986.

.. 2 W. Hoermann and J. Leydold, 'Generating generalized inverse Gaussian random variates', Statistics and Computing, 24(4), p. 547--557, 2014.

.. 3 A.J. Kinderman and J.F. Monahan, 'Computer Generation of Random Variables Using the Ratio of Uniform Deviates', ACM Transactions on Mathematical Software, 3(3), p. 257--260, 1977.

Examples -------- >>> from scipy import stats

Simulate normally distributed random variables. It is easy to compute the bounding rectangle explicitly in that case. For simplicity, we drop the normalization factor of the density.

>>> f = lambda x: np.exp(-x**2 / 2) >>> v_bound = np.sqrt(f(np.sqrt(2))) * np.sqrt(2) >>> umax, vmin, vmax = np.sqrt(f(0)), -v_bound, v_bound >>> np.random.seed(12345) >>> rvs = stats.rvs_ratio_uniforms(f, umax, vmin, vmax, size=2500)

The K-S test confirms that the random variates are indeed normally distributed (normality is not rejected at 5% significance level):

>>> stats.kstest(rvs, 'norm')[1] 0.33783681428365553

The exponential distribution provides another example where the bounding rectangle can be determined explicitly.

>>> np.random.seed(12345) >>> rvs = stats.rvs_ratio_uniforms(lambda x: np.exp(-x), umax=1, ... vmin=0, vmax=2*np.exp(-1), size=1000) >>> stats.kstest(rvs, 'expon')[1] 0.928454552559516

val scoreatpercentile : ?limit:Py.Object.t -> ?interpolation_method:[ `Fraction | `Lower | `Higher ] -> ?axis:int -> a:[> `Ndarray ] Np.Obj.t -> per:[> `Ndarray ] Np.Obj.t -> unit -> Py.Object.t

Calculate the score at a given percentile of the input sequence.

For example, the score at `per=50` is the median. If the desired quantile lies between two data points, we interpolate between them, according to the value of `interpolation_method`. If the parameter `limit` is provided, it should be a tuple (lower, upper) of two values.

Parameters ---------- a : array_like A 1-D array of values from which to extract score. per : array_like Percentile(s) at which to extract score. Values should be in range [0, 100]. limit : tuple, optional Tuple of two scalars, the lower and upper limits within which to compute the percentile. Values of `a` outside this (closed) interval will be ignored. interpolation_method : 'fraction', 'lower', 'higher', optional Specifies the interpolation method to use, when the desired quantile lies between two data points `i` and `j`. The following options are available (default is 'fraction'):

* 'fraction': ``i + (j - i) * fraction`` where ``fraction`` is the fractional part of the index surrounded by ``i`` and ``j`` * 'lower': ``i`` * 'higher': ``j``

axis : int, optional Axis along which the percentiles are computed. Default is None. If None, compute over the whole array `a`.

Returns ------- score : float or ndarray Score at percentile(s).

See Also -------- percentileofscore, numpy.percentile

Notes ----- This function will become obsolete in the future. For NumPy 1.9 and higher, `numpy.percentile` provides all the functionality that `scoreatpercentile` provides, and it is significantly faster. Therefore it is recommended to use `numpy.percentile` with NumPy >= 1.9.
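A quick check that the default 'fraction' interpolation agrees with `numpy.percentile`'s default (linear) interpolation:

>>> import numpy as np
>>> from scipy import stats
>>> a = np.arange(100)
>>> stats.scoreatpercentile(a, 50) == np.percentile(a, 50)
True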

Examples -------- >>> from scipy import stats >>> a = np.arange(100) >>> stats.scoreatpercentile(a, 50) 49.5

val sem : ?axis:[ `I of int | `None ] -> ?ddof:int -> ?nan_policy:[ `Propagate | `Raise | `Omit ] -> a:[> `Ndarray ] Np.Obj.t -> unit -> Py.Object.t

Compute standard error of the mean.

Calculate the standard error of the mean (or standard error of measurement) of the values in the input array.

Parameters ---------- a : array_like An array containing the values for which the standard error is returned. axis : int or None, optional Axis along which to operate. Default is 0. If None, compute over the whole array `a`. ddof : int, optional Delta degrees-of-freedom. How many degrees of freedom to adjust for bias in limited samples relative to the population estimate of variance. Defaults to 1. nan_policy : 'propagate', 'raise', 'omit', optional Defines how to handle when input contains nan. The following options are available (default is 'propagate'):

* 'propagate': returns nan * 'raise': throws an error * 'omit': performs the calculations ignoring nan values

Returns ------- s : ndarray or float The standard error of the mean in the sample(s), along the input axis.

Notes ----- The default value for `ddof` is different from the default (0) used by other ddof-containing routines, such as np.std and np.nanstd.
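Equivalently, with the default ``ddof=1``, `sem` is the sample standard deviation divided by the square root of the number of observations along the axis (a quick check):

>>> import numpy as np
>>> from scipy import stats
>>> a = np.arange(20).reshape(5, 4)
>>> np.allclose(stats.sem(a), np.std(a, axis=0, ddof=1) / np.sqrt(a.shape[0]))
True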

Examples -------- Find standard error along the first axis:

>>> from scipy import stats >>> a = np.arange(20).reshape(5,4) >>> stats.sem(a) array([ 2.8284, 2.8284, 2.8284, 2.8284])

Find standard error across the whole array, using n degrees of freedom:

>>> stats.sem(a, axis=None, ddof=0) 1.2893796958227628

val siegelslopes : ?x:[> `Ndarray ] Np.Obj.t -> ?method_:[ `Hierarchical | `Separate ] -> y:[> `Ndarray ] Np.Obj.t -> unit -> float * float

Computes the Siegel estimator for a set of points (x, y).

`siegelslopes` implements a method for robust linear regression using repeated medians (see 1_) to fit a line to the points (x, y). The method is robust to outliers with an asymptotic breakdown point of 50%.

Parameters ---------- y : array_like Dependent variable. x : array_like or None, optional Independent variable. If None, use ``arange(len(y))`` instead. method : 'hierarchical', 'separate' If 'hierarchical', estimate the intercept using the estimated slope ``medslope`` (default option). If 'separate', estimate the intercept independent of the estimated slope. See Notes for details.

Returns ------- medslope : float Estimate of the slope of the regression line. medintercept : float Estimate of the intercept of the regression line.

See also -------- theilslopes : a similar technique without repeated medians

Notes ----- With ``n = len(y)``, compute ``m_j`` as the median of the slopes from the point ``(xj, yj)`` to all other `n-1` points. ``medslope`` is then the median of all slopes ``m_j``. Two ways are given to estimate the intercept in 1_ which can be chosen via the parameter ``method``. The hierarchical approach uses the estimated slope ``medslope`` and computes ``medintercept`` as the median of ``y - medslope*x``. The other approach estimates the intercept separately as follows: for each point ``(xj, yj)``, compute the intercepts of all the `n-1` lines through the remaining points and take the median ``i_j``. ``medintercept`` is the median of the ``i_j``.

The implementation computes `n` times the median of a vector of size `n` which can be slow for large vectors. There are more efficient algorithms (see 2_) which are not implemented here.
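The repeated-median slope described above can be written directly in terms of pairwise slopes; a sketch for distinct `x`, verified against `siegelslopes` itself:

>>> import numpy as np
>>> from scipy import stats
>>> x = np.arange(6, dtype=float)
>>> y = np.array([0.0, 2.1, 3.9, 6.2, 8.0, 30.0])  # last point is an outlier
>>> dx = x[:, None] - x[None, :]
>>> slopes = (y[:, None] - y[None, :]) / np.where(dx == 0, np.nan, dx)
>>> m_j = np.nanmedian(slopes, axis=1)  # median slope through each point
>>> np.isclose(np.median(m_j), stats.siegelslopes(y, x)[0])
True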

References ---------- .. 1 A. Siegel, 'Robust Regression Using Repeated Medians', Biometrika, Vol. 69, pp. 242-244, 1982.

.. 2 A. Stein and M. Werman, 'Finding the repeated median regression line', Proceedings of the Third Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 409-413, 1992.

Examples -------- >>> from scipy import stats >>> import matplotlib.pyplot as plt

>>> x = np.linspace(-5, 5, num=150) >>> y = x + np.random.normal(size=x.size) >>> y[11:15] += 10 # add outliers >>> y[-5:] -= 7

Compute the slope and intercept. For comparison, also compute the least-squares fit with `linregress`:

>>> res = stats.siegelslopes(y, x) >>> lsq_res = stats.linregress(x, y)

Plot the results. The Siegel regression line is shown in red. The green line shows the least-squares fit for comparison.

>>> fig = plt.figure() >>> ax = fig.add_subplot(111) >>> ax.plot(x, y, 'b.') >>> ax.plot(x, res[1] + res[0] * x, 'r-') >>> ax.plot(x, lsq_res[1] + lsq_res[0] * x, 'g-') >>> plt.show()

val sigmaclip : ?low:float -> ?high:float -> a:[> `Ndarray ] Np.Obj.t -> unit -> [ `ArrayLike | `Ndarray | `Object ] Np.Obj.t * float * float

Perform iterative sigma-clipping of array elements.

Starting from the full sample, all elements outside the critical range are removed, i.e. all elements of the input array `c` that satisfy either of the following conditions::

c < mean(c) - std(c)*low
c > mean(c) + std(c)*high

The iteration continues with the updated sample until no elements are outside the (updated) range.
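The iteration reads naturally as a loop; a minimal sketch of the procedure described above (using the population standard deviation, as `sigmaclip` does), checked against `sigmaclip` on the second example below:

>>> import numpy as np
>>> from scipy.stats import sigmaclip
>>> def sigma_clip(a, low=4.0, high=4.0):
...     c = np.asarray(a).ravel()
...     while True:
...         m, s = c.mean(), c.std()
...         keep = (c >= m - s * low) & (c <= m + s * high)
...         if keep.all():
...             return c
...         c = c[keep]
>>> a = np.concatenate((np.linspace(9.5, 10.5, 11), np.linspace(-100, -50, 3)))
>>> (sigma_clip(a, 1.8, 1.8) == sigmaclip(a, 1.8, 1.8).clipped).all()
True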

Parameters ---------- a : array_like Data array, will be raveled if not 1-D. low : float, optional Lower bound factor of sigma clipping. Default is 4. high : float, optional Upper bound factor of sigma clipping. Default is 4.

Returns ------- clipped : ndarray Input array with clipped elements removed. lower : float Lower threshold value used for clipping. upper : float Upper threshold value used for clipping.

Examples -------- >>> from scipy.stats import sigmaclip >>> a = np.concatenate((np.linspace(9.5, 10.5, 31), ... np.linspace(0, 20, 5))) >>> fact = 1.5 >>> c, low, upp = sigmaclip(a, fact, fact) >>> c array([ 9.96666667, 10. , 10.03333333, 10. ]) >>> c.var(), c.std() (0.00055555555555555165, 0.023570226039551501) >>> low, c.mean() - fact*c.std(), c.min() (9.9646446609406727, 9.9646446609406727, 9.9666666666666668) >>> upp, c.mean() + fact*c.std(), c.max() (10.035355339059327, 10.035355339059327, 10.033333333333333)

>>> a = np.concatenate((np.linspace(9.5, 10.5, 11), ... np.linspace(-100, -50, 3))) >>> c, low, upp = sigmaclip(a, 1.8, 1.8) >>> (c == np.linspace(9.5, 10.5, 11)).all() True

val skew : ?axis:[ `I of int | `None ] -> ?bias:bool -> ?nan_policy:[ `Propagate | `Raise | `Omit ] -> a:[> `Ndarray ] Np.Obj.t -> unit -> [ `ArrayLike | `Ndarray | `Object ] Np.Obj.t

Compute the sample skewness of a data set.

For normally distributed data, the skewness should be about zero. For unimodal continuous distributions, a skewness value greater than zero means that there is more weight in the right tail of the distribution. The function `skewtest` can be used to determine if the skewness value is close enough to zero, statistically speaking.

Parameters ---------- a : ndarray Input array. axis : int or None, optional Axis along which skewness is calculated. Default is 0. If None, compute over the whole array `a`. bias : bool, optional If False, then the calculations are corrected for statistical bias. nan_policy : 'propagate', 'raise', 'omit', optional Defines how to handle when input contains nan. The following options are available (default is 'propagate'):

* 'propagate': returns nan * 'raise': throws an error * 'omit': performs the calculations ignoring nan values

Returns ------- skewness : ndarray The skewness of values along an axis, returning 0 where all values are equal.

Notes ----- The sample skewness is computed as the Fisher-Pearson coefficient of skewness, i.e.

.. math::

    g_1=\frac{m_3}{m_2^{3/2}}

where

.. math::

    m_i=\frac{1}{N}\sum_{n=1}^N(x[n]-\bar{x})^i

is the biased sample :math:`i\texttt{th}` central moment, and :math:`\bar{x}` is the sample mean. If ``bias`` is False, the calculations are corrected for bias and the value computed is the adjusted Fisher-Pearson standardized moment coefficient, i.e.

.. math::

    G_1=\frac{k_3}{k_2^{3/2}} = \frac{\sqrt{N(N-1)}}{N-2}\frac{m_3}{m_2^{3/2}}.
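The biased estimator :math:`g_1` can be checked directly against the central moments (a quick numerical sketch):

>>> import numpy as np
>>> from scipy import stats
>>> x = np.array([2, 8, 0, 4, 1, 9, 9, 0], dtype=float)
>>> m2 = ((x - x.mean())**2).mean()  # biased second central moment
>>> m3 = ((x - x.mean())**3).mean()  # biased third central moment
>>> np.isclose(m3 / m2**1.5, stats.skew(x))
True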

References ---------- .. 1 Zwillinger, D. and Kokoska, S. (2000). CRC Standard Probability and Statistics Tables and Formulae. Chapman & Hall: New York. 2000. Section 2.2.24.1

Examples -------- >>> from scipy.stats import skew >>> skew([1, 2, 3, 4, 5]) 0.0 >>> skew([2, 8, 0, 4, 1, 9, 9, 0]) 0.2650554122698573

val skewtest : ?axis:[ `I of int | `None ] -> ?nan_policy:[ `Propagate | `Raise | `Omit ] -> a:[> `Ndarray ] Np.Obj.t -> unit -> float * float

Test whether the skew is different from the normal distribution.

This function tests the null hypothesis that the skewness of the population that the sample was drawn from is the same as that of a corresponding normal distribution.

Parameters ---------- a : array The data to be tested. axis : int or None, optional Axis along which statistics are calculated. Default is 0. If None, compute over the whole array `a`. nan_policy : 'propagate', 'raise', 'omit', optional Defines how to handle when input contains nan. The following options are available (default is 'propagate'):

* 'propagate': returns nan * 'raise': throws an error * 'omit': performs the calculations ignoring nan values

Returns ------- statistic : float The computed z-score for this test. pvalue : float Two-sided p-value for the hypothesis test.

Notes ----- The sample size must be at least 8.

References ---------- .. 1 R. B. D'Agostino, A. J. Belanger and R. B. D'Agostino Jr., 'A suggestion for using powerful and informative tests of normality', American Statistician 44, pp. 316-321, 1990.

Examples -------- >>> from scipy.stats import skewtest >>> skewtest([1, 2, 3, 4, 5, 6, 7, 8]) SkewtestResult(statistic=1.0108048609177787, pvalue=0.3121098361421897) >>> skewtest([2, 8, 0, 4, 1, 9, 9, 0]) SkewtestResult(statistic=0.44626385374196975, pvalue=0.6554066631275459) >>> skewtest([1, 2, 3, 4, 5, 6, 7, 8000]) SkewtestResult(statistic=3.571773510360407, pvalue=0.0003545719905823133) >>> skewtest([100, 100, 100, 100, 100, 100, 100, 101]) SkewtestResult(statistic=3.5717766638478072, pvalue=0.000354567720281634)

val spearmanr : ?b:Py.Object.t -> ?axis:[ `I of int | `None ] -> ?nan_policy:[ `Propagate | `Raise | `Omit ] -> a:Py.Object.t -> unit -> Py.Object.t * float

Calculate a Spearman correlation coefficient with associated p-value.

The Spearman rank-order correlation coefficient is a nonparametric measure of the monotonicity of the relationship between two datasets. Unlike the Pearson correlation, the Spearman correlation does not assume that both datasets are normally distributed. Like other correlation coefficients, this one varies between -1 and +1 with 0 implying no correlation. Correlations of -1 or +1 imply an exact monotonic relationship. Positive correlations imply that as x increases, so does y. Negative correlations imply that as x increases, y decreases.

The p-value roughly indicates the probability of an uncorrelated system producing datasets that have a Spearman correlation at least as extreme as the one computed from these datasets. The p-values are not entirely reliable but are probably reasonable for datasets larger than 500 or so.
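Equivalently, the Spearman coefficient is the Pearson correlation computed on the (average) ranks of the data, which can be checked directly:

>>> import numpy as np
>>> from scipy import stats
>>> x = [1, 2, 3, 4, 5]
>>> y = [5, 6, 7, 8, 7]
>>> r = stats.pearsonr(stats.rankdata(x), stats.rankdata(y))[0]
>>> np.isclose(r, stats.spearmanr(x, y)[0])
True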

Parameters ---------- a, b : 1D or 2D array_like, b is optional One or two 1-D or 2-D arrays containing multiple variables and observations. When these are 1-D, each represents a vector of observations of a single variable. For the behavior in the 2-D case, see under ``axis``, below. Both arrays need to have the same length in the ``axis`` dimension. axis : int or None, optional If axis=0 (default), then each column represents a variable, with observations in the rows. If axis=1, the relationship is transposed: each row represents a variable, while the columns contain observations. If axis=None, then both arrays will be raveled. nan_policy : 'propagate', 'raise', 'omit', optional Defines how to handle when input contains nan. The following options are available (default is 'propagate'):

* 'propagate': returns nan * 'raise': throws an error * 'omit': performs the calculations ignoring nan values

Returns ------- correlation : float or ndarray (2-D square) Spearman correlation matrix or correlation coefficient (if only 2 variables are given as parameters). Correlation matrix is square with length equal to total number of variables (columns or rows) in ``a`` and ``b`` combined. pvalue : float The two-sided p-value for a hypothesis test whose null hypothesis is that two sets of data are uncorrelated, has same dimension as rho.

References ---------- .. 1 Zwillinger, D. and Kokoska, S. (2000). CRC Standard Probability and Statistics Tables and Formulae. Chapman & Hall: New York. 2000. Section 14.7

Examples -------- >>> from scipy import stats >>> stats.spearmanr([1,2,3,4,5], [5,6,7,8,7]) (0.82078268166812329, 0.088587005313543798) >>> np.random.seed(1234321) >>> x2n = np.random.randn(100, 2) >>> y2n = np.random.randn(100, 2) >>> stats.spearmanr(x2n) (0.059969996999699973, 0.55338590803773591) >>> stats.spearmanr(x2n[:,0], x2n[:,1]) (0.059969996999699973, 0.55338590803773591) >>> rho, pval = stats.spearmanr(x2n, y2n) >>> rho array([[ 1. , 0.05997 , 0.18569457, 0.06258626], [ 0.05997 , 1. , 0.110003 , 0.02534653], [ 0.18569457, 0.110003 , 1. , 0.03488749], [ 0.06258626, 0.02534653, 0.03488749, 1. ]]) >>> pval array([[ 0. , 0.55338591, 0.06435364, 0.53617935], [ 0.55338591, 0. , 0.27592895, 0.80234077], [ 0.06435364, 0.27592895, 0. , 0.73039992], [ 0.53617935, 0.80234077, 0.73039992, 0. ]]) >>> rho, pval = stats.spearmanr(x2n.T, y2n.T, axis=1) >>> rho array([[ 1. , 0.05997 , 0.18569457, 0.06258626], [ 0.05997 , 1. , 0.110003 , 0.02534653], [ 0.18569457, 0.110003 , 1. , 0.03488749], [ 0.06258626, 0.02534653, 0.03488749, 1. ]]) >>> stats.spearmanr(x2n, y2n, axis=None) (0.10816770419260482, 0.1273562188027364) >>> stats.spearmanr(x2n.ravel(), y2n.ravel()) (0.10816770419260482, 0.1273562188027364)

>>> xint = np.random.randint(10, size=(100, 2)) >>> stats.spearmanr(xint) (0.052760927029710199, 0.60213045837062351)

val theilslopes : ?x:[> `Ndarray ] Np.Obj.t -> ?alpha:float -> y:[> `Ndarray ] Np.Obj.t -> unit -> float * float * float * float

Computes the Theil-Sen estimator for a set of points (x, y).

`theilslopes` implements a method for robust linear regression. It computes the slope as the median of all slopes between paired values.

Parameters ---------- y : array_like Dependent variable. x : array_like or None, optional Independent variable. If None, use ``arange(len(y))`` instead. alpha : float, optional Confidence degree between 0 and 1. Default is 95% confidence. Note that `alpha` is symmetric around 0.5, i.e. both 0.1 and 0.9 are interpreted as 'find the 90% confidence interval'.

Returns ------- medslope : float Theil slope. medintercept : float Intercept of the Theil line, as ``median(y) - medslope*median(x)``. lo_slope : float Lower bound of the confidence interval on `medslope`. up_slope : float Upper bound of the confidence interval on `medslope`.

See also -------- siegelslopes : a similar technique using repeated medians

Notes ----- The implementation of `theilslopes` follows 1_. The intercept is not defined in 1_, and here it is defined as ``median(y) - medslope*median(x)``, which is given in 3_. Other definitions of the intercept exist in the literature. A confidence interval for the intercept is not given as this question is not addressed in 1_.
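For distinct `x`, the slope described above is simply the median of all pairwise slopes; a quick numerical sketch, verified against `theilslopes` itself:

>>> import numpy as np
>>> from scipy import stats
>>> x = np.arange(10, dtype=float)
>>> y = 2 * x + np.array([0, .1, -.2, .3, 0, -.1, .2, 0, .1, -.3])
>>> i, j = np.triu_indices(len(x), k=1)  # all pairs with j > i
>>> slope = np.median((y[j] - y[i]) / (x[j] - x[i]))
>>> np.isclose(slope, stats.theilslopes(y, x)[0])
True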

References ---------- .. 1 P.K. Sen, 'Estimates of the regression coefficient based on Kendall's tau', J. Am. Stat. Assoc., Vol. 63, pp. 1379-1389, 1968. .. 2 H. Theil, 'A rank-invariant method of linear and polynomial regression analysis I, II and III', Nederl. Akad. Wetensch., Proc. 53:, pp. 386-392, pp. 521-525, pp. 1397-1412, 1950. .. 3 W.L. Conover, 'Practical nonparametric statistics', 2nd ed., John Wiley and Sons, New York, pp. 493.

Examples -------- >>> from scipy import stats >>> import matplotlib.pyplot as plt

>>> x = np.linspace(-5, 5, num=150) >>> y = x + np.random.normal(size=x.size) >>> y[11:15] += 10 # add outliers >>> y[-5:] -= 7

Compute the slope, intercept and 90% confidence interval. For comparison, also compute the least-squares fit with `linregress`:

>>> res = stats.theilslopes(y, x, 0.90) >>> lsq_res = stats.linregress(x, y)

Plot the results. The Theil-Sen regression line is shown in red, with the dashed red lines illustrating the confidence interval of the slope (note that the dashed red lines are not the confidence interval of the regression as the confidence interval of the intercept is not included). The green line shows the least-squares fit for comparison.

>>> fig = plt.figure() >>> ax = fig.add_subplot(111) >>> ax.plot(x, y, 'b.') >>> ax.plot(x, res[1] + res[0] * x, 'r-') >>> ax.plot(x, res[1] + res[2] * x, 'r--') >>> ax.plot(x, res[1] + res[3] * x, 'r--') >>> ax.plot(x, lsq_res[1] + lsq_res[0] * x, 'g-') >>> plt.show()

val tiecorrect : [> `Ndarray ] Np.Obj.t -> float

Tie correction factor for Mann-Whitney U and Kruskal-Wallis H tests.

Parameters ---------- rankvals : array_like A 1-D sequence of ranks. Typically this will be the array returned by `~scipy.stats.rankdata`.

Returns ------- factor : float Correction factor for U or H.
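The factor follows the usual formula ``1 - sum(t**3 - t) / (n**3 - n)`` over tie groups of size `t` (see 1_); a quick check on the first example below:

>>> import numpy as np
>>> ranks = np.array([1, 2.5, 2.5, 4])
>>> _, t = np.unique(ranks, return_counts=True)  # sizes of the tie groups
>>> n = ranks.size
>>> 1 - (t**3 - t).sum() / (n**3 - n)
0.9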

See Also -------- rankdata : Assign ranks to the data mannwhitneyu : Mann-Whitney rank test kruskal : Kruskal-Wallis H test

References ---------- .. 1 Siegel, S. (1956) Nonparametric Statistics for the Behavioral Sciences. New York: McGraw-Hill.

Examples -------- >>> from scipy.stats import tiecorrect, rankdata >>> tiecorrect([1, 2.5, 2.5, 4]) 0.9 >>> ranks = rankdata([1, 3, 2, 4, 5, 7, 2, 8, 4]) >>> ranks array([ 1. , 4. , 2.5, 5.5, 7. , 8. , 2.5, 9. , 5.5]) >>> tiecorrect(ranks) 0.9833333333333333

val tmax : ?upperlimit:float -> ?axis:[ `I of int | `None ] -> ?inclusive:bool -> ?nan_policy:[ `Propagate | `Raise | `Omit ] -> a:[> `Ndarray ] Np.Obj.t -> unit -> Py.Object.t

Compute the trimmed maximum.

This function computes the maximum value of an array along a given axis, while ignoring values larger than a specified upper limit.

Parameters ---------- a : array_like Array of values. upperlimit : None or float, optional Values in the input array greater than the given limit will be ignored. When upperlimit is None, then all values are used. The default value is None. axis : int or None, optional Axis along which to operate. Default is 0. If None, compute over the whole array `a`. inclusive : True, False, optional This flag determines whether values exactly equal to the upper limit are included. The default value is True. nan_policy : 'propagate', 'raise', 'omit', optional Defines how to handle when input contains nan. The following options are available (default is 'propagate'):

* 'propagate': returns nan * 'raise': throws an error * 'omit': performs the calculations ignoring nan values

Returns ------- tmax : float, int or ndarray Trimmed maximum.

Examples -------- >>> from scipy import stats >>> x = np.arange(20) >>> stats.tmax(x) 19

>>> stats.tmax(x, 13) 13

>>> stats.tmax(x, 13, inclusive=False) 12

val tmean : ?limits:Py.Object.t -> ?inclusive:Py.Object.t -> ?axis:int -> a:[> `Ndarray ] Np.Obj.t -> unit -> float

Compute the trimmed mean.

This function finds the arithmetic mean of given values, ignoring values outside the given `limits`.

Parameters ---------- a : array_like Array of values. limits : None or (lower limit, upper limit), optional Values in the input array less than the lower limit or greater than the upper limit will be ignored. When limits is None (default), then all values are used. Either of the limit values in the tuple can also be None representing a half-open interval. inclusive : (bool, bool), optional A tuple consisting of the (lower flag, upper flag). These flags determine whether values exactly equal to the lower or upper limits are included. The default value is (True, True). axis : int or None, optional Axis along which to compute test. Default is None.

Returns ------- tmean : float Trimmed mean.

See Also -------- trim_mean : Returns mean after trimming a proportion from both tails.

Examples -------- >>> from scipy import stats >>> x = np.arange(20) >>> stats.tmean(x) 9.5 >>> stats.tmean(x, (3,17)) 10.0

val tmin : ?lowerlimit:float -> ?axis:[ `I of int | `None ] -> ?inclusive:bool -> ?nan_policy:[ `Propagate | `Raise | `Omit ] -> a:[> `Ndarray ] Np.Obj.t -> unit -> Py.Object.t

Compute the trimmed minimum.

This function finds the minimum value of an array `a` along the specified axis, but only considering values greater than a specified lower limit.

Parameters ---------- a : array_like Array of values. lowerlimit : None or float, optional Values in the input array less than the given limit will be ignored. When lowerlimit is None, then all values are used. The default value is None. axis : int or None, optional Axis along which to operate. Default is 0. If None, compute over the whole array `a`. inclusive : True, False, optional This flag determines whether values exactly equal to the lower limit are included. The default value is True. nan_policy : 'propagate', 'raise', 'omit', optional Defines how to handle when input contains nan. The following options are available (default is 'propagate'):

* 'propagate': returns nan * 'raise': throws an error * 'omit': performs the calculations ignoring nan values

Returns ------- tmin : float, int or ndarray Trimmed minimum.

Examples -------- >>> from scipy import stats >>> x = np.arange(20) >>> stats.tmin(x) 0

>>> stats.tmin(x, 13) 13

>>> stats.tmin(x, 13, inclusive=False) 14

val trim1 : ?tail:[ `Left | `Right ] -> ?axis:[ `I of int | `None ] -> a:[> `Ndarray ] Np.Obj.t -> proportiontocut:float -> unit -> [ `ArrayLike | `Ndarray | `Object ] Np.Obj.t

Slice off a proportion from ONE end of the passed array distribution.

If `proportiontocut` = 0.1, slices off 'leftmost' or 'rightmost' 10% of scores. The lowest or highest values are trimmed (depending on the tail). Slice off less if proportion results in a non-integer slice index (i.e. conservatively slices off `proportiontocut` ).

Parameters ---------- a : array_like Input array. proportiontocut : float Fraction to cut off of 'left' or 'right' of distribution. tail : 'left', 'right', optional Defaults to 'right'. axis : int or None, optional Axis along which to trim data. Default is 0. If None, compute over the whole array `a`.

Returns ------- trim1 : ndarray Trimmed version of array `a`. The order of the trimmed content is undefined.

val trim_mean : ?axis:[ `I of int | `None ] -> a:[> `Ndarray ] Np.Obj.t -> proportiontocut:float -> unit -> [ `ArrayLike | `Ndarray | `Object ] Np.Obj.t

Return mean of array after trimming distribution from both tails.

If `proportiontocut` = 0.1, slices off 'leftmost' and 'rightmost' 10% of scores. The input is sorted before slicing. Slices off less if proportion results in a non-integer slice index (i.e., conservatively slices off `proportiontocut` ).

Parameters ---------- a : array_like Input array. proportiontocut : float Fraction to cut off of both tails of the distribution. axis : int or None, optional Axis along which the trimmed means are computed. Default is 0. If None, compute over the whole array `a`.

Returns ------- trim_mean : ndarray Mean of trimmed array.

See Also -------- trimboth tmean : Compute the trimmed mean ignoring values outside given `limits`.

Examples -------- >>> from scipy import stats >>> x = np.arange(20) >>> stats.trim_mean(x, 0.1) 9.5 >>> x2 = x.reshape(5, 4) >>> x2 array([[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11], [12, 13, 14, 15], [16, 17, 18, 19]]) >>> stats.trim_mean(x2, 0.25) array([ 8., 9., 10., 11.]) >>> stats.trim_mean(x2, 0.25, axis=1) array([ 1.5, 5.5, 9.5, 13.5, 17.5])

val trimboth : ?axis:[ `I of int | `None ] -> a:[> `Ndarray ] Np.Obj.t -> proportiontocut:float -> unit -> [ `ArrayLike | `Ndarray | `Object ] Np.Obj.t

Slice off a proportion of items from both ends of an array.

Slice off the passed proportion of items from both ends of the passed array (i.e., with `proportiontocut` = 0.1, slices leftmost 10% **and** rightmost 10% of scores). The trimmed values are the lowest and highest ones. Slice off less if proportion results in a non-integer slice index (i.e. conservatively slices off `proportiontocut`).

Parameters ---------- a : array_like Data to trim. proportiontocut : float Proportion (in range 0-1) of total data set to trim off each end. axis : int or None, optional Axis along which to trim data. Default is 0. If None, compute over the whole array `a`.

Returns ------- out : ndarray Trimmed version of array `a`. The order of the trimmed content is undefined.

See Also -------- trim_mean

Examples -------- >>> from scipy import stats >>> a = np.arange(20) >>> b = stats.trimboth(a, 0.1) >>> b.shape (16,)

val tsem : ?limits:Py.Object.t -> ?inclusive:Py.Object.t -> ?axis:[ `I of int | `None ] -> ?ddof:int -> a:[> `Ndarray ] Np.Obj.t -> unit -> float

Compute the trimmed standard error of the mean.

This function finds the standard error of the mean for given values, ignoring values outside the given `limits`.

Parameters ---------- a : array_like Array of values. limits : None or (lower limit, upper limit), optional Values in the input array less than the lower limit or greater than the upper limit will be ignored. When limits is None, then all values are used. Either of the limit values in the tuple can also be None representing a half-open interval. The default value is None. inclusive : (bool, bool), optional A tuple consisting of the (lower flag, upper flag). These flags determine whether values exactly equal to the lower or upper limits are included. The default value is (True, True). axis : int or None, optional Axis along which to operate. Default is 0. If None, compute over the whole array `a`. ddof : int, optional Delta degrees of freedom. Default is 1.

Returns ------- tsem : float Trimmed standard error of the mean.

Notes ----- `tsem` uses unbiased sample standard deviation, i.e. it uses a correction factor ``n / (n - 1)``.
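With no limits given, this reduces to `sem`; a quick check of the correction-factor note above:

>>> import numpy as np
>>> from scipy import stats
>>> x = np.arange(20)
>>> np.isclose(stats.tsem(x), np.std(x, ddof=1) / np.sqrt(x.size))
True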

Examples -------- >>> from scipy import stats >>> x = np.arange(20) >>> stats.tsem(x) 1.3228756555322954 >>> stats.tsem(x, (3,17)) 1.1547005383792515

val tstd : ?limits:Py.Object.t -> ?inclusive:Py.Object.t -> ?axis:[ `I of int | `None ] -> ?ddof:int -> a:[> `Ndarray ] Np.Obj.t -> unit -> float

Compute the trimmed sample standard deviation.

This function finds the sample standard deviation of given values, ignoring values outside the given `limits`.

Parameters ---------- a : array_like Array of values. limits : None or (lower limit, upper limit), optional Values in the input array less than the lower limit or greater than the upper limit will be ignored. When limits is None, then all values are used. Either of the limit values in the tuple can also be None representing a half-open interval. The default value is None. inclusive : (bool, bool), optional A tuple consisting of the (lower flag, upper flag). These flags determine whether values exactly equal to the lower or upper limits are included. The default value is (True, True). axis : int or None, optional Axis along which to operate. Default is 0. If None, compute over the whole array `a`. ddof : int, optional Delta degrees of freedom. Default is 1.

Returns ------- tstd : float Trimmed sample standard deviation.

Notes ----- `tstd` computes the unbiased sample standard deviation, i.e. it uses a correction factor ``n / (n - 1)``.

Examples -------- >>> from scipy import stats >>> x = np.arange(20) >>> stats.tstd(x) 5.9160797830996161 >>> stats.tstd(x, (3,17)) 4.4721359549995796

val ttest_1samp : ?axis:[ `I of int | `None ] -> ?nan_policy:[ `Propagate | `Raise | `Omit ] -> a:[> `Ndarray ] Np.Obj.t -> popmean:[ `F of float | `Ndarray of [> `Ndarray ] Np.Obj.t ] -> unit -> Py.Object.t * Py.Object.t

Calculate the T-test for the mean of ONE group of scores.

This is a two-sided test for the null hypothesis that the expected value (mean) of a sample of independent observations `a` is equal to the given population mean, `popmean`.

Parameters ---------- a : array_like Sample observation. popmean : float or array_like Expected value in null hypothesis. If array_like, then it must have the same shape as `a` excluding the axis dimension. axis : int or None, optional Axis along which to compute test. If None, compute over the whole array `a`. nan_policy : 'propagate', 'raise', 'omit', optional Defines how to handle when input contains nan. The following options are available (default is 'propagate'):

* 'propagate': returns nan * 'raise': throws an error * 'omit': performs the calculations ignoring nan values

Returns ------- statistic : float or array t-statistic. pvalue : float or array Two-sided p-value.

Examples -------- >>> from scipy import stats

>>> np.random.seed(7654567) # fix seed to get the same result >>> rvs = stats.norm.rvs(loc=5, scale=10, size=(50,2))

Test whether the mean of the random sample is equal to the true mean, and to a different mean. We reject the null hypothesis in the second case and don't reject it in the first case.

>>> stats.ttest_1samp(rvs,5.0) (array([-0.68014479, -0.04323899]), array([ 0.49961383, 0.96568674])) >>> stats.ttest_1samp(rvs,0.0) (array([ 2.77025808, 4.11038784]), array([ 0.00789095, 0.00014999]))

Examples using axis and non-scalar dimension for population mean.

>>> stats.ttest_1samp(rvs,[5.0,0.0]) (array([-0.68014479, 4.11038784]), array([ 4.99613833e-01, 1.49986458e-04])) >>> stats.ttest_1samp(rvs.T,[5.0,0.0],axis=1) (array([-0.68014479, 4.11038784]), array([ 4.99613833e-01, 1.49986458e-04])) >>> stats.ttest_1samp(rvs,[[5.0],[0.0]]) (array([[-0.68014479, -0.04323899], [ 2.77025808, 4.11038784]]), array([[ 4.99613833e-01, 9.65686743e-01], [ 7.89094663e-03, 1.49986458e-04]]))

val ttest_ind : ?axis:[ `I of int | `None ] -> ?equal_var:bool -> ?nan_policy:[ `Propagate | `Raise | `Omit ] -> a:Py.Object.t -> b:Py.Object.t -> unit -> Py.Object.t * Py.Object.t

Calculate the T-test for the means of *two independent* samples of scores.

This is a two-sided test for the null hypothesis that 2 independent samples have identical average (expected) values. This test assumes that the populations have identical variances by default.

Parameters ---------- a, b : array_like The arrays must have the same shape, except in the dimension corresponding to `axis` (the first, by default). axis : int or None, optional Axis along which to compute test. If None, compute over the whole arrays, `a`, and `b`. equal_var : bool, optional If True (default), perform a standard independent 2 sample test that assumes equal population variances 1_. If False, perform Welch's t-test, which does not assume equal population variance 2_.

.. versionadded:: 0.11.0 nan_policy : 'propagate', 'raise', 'omit', optional Defines how to handle when input contains nan. The following options are available (default is 'propagate'):

* 'propagate': returns nan * 'raise': throws an error * 'omit': performs the calculations ignoring nan values

Returns ------- statistic : float or array The calculated t-statistic. pvalue : float or array The two-tailed p-value.

Notes ----- We can use this test if we observe two independent samples from the same or different populations, e.g. exam scores of boys and girls or of two ethnic groups. The test measures whether the average (expected) value differs significantly across samples. If we observe a large p-value, for example larger than 0.05 or 0.1, then we cannot reject the null hypothesis of identical average scores. If the p-value is smaller than the threshold, e.g. 1%, 5% or 10%, then we reject the null hypothesis of equal averages.

References ---------- .. 1 https://en.wikipedia.org/wiki/T-test#Independent_two-sample_t-test

.. 2 https://en.wikipedia.org/wiki/Welch%27s_t-test

Examples -------- >>> from scipy import stats >>> np.random.seed(12345678)

Test with sample with identical means:

>>> rvs1 = stats.norm.rvs(loc=5,scale=10,size=500) >>> rvs2 = stats.norm.rvs(loc=5,scale=10,size=500) >>> stats.ttest_ind(rvs1,rvs2) (0.26833823296239279, 0.78849443369564776) >>> stats.ttest_ind(rvs1,rvs2, equal_var = False) (0.26833823296239279, 0.78849452749500748)

`ttest_ind` underestimates p for unequal variances:

>>> rvs3 = stats.norm.rvs(loc=5, scale=20, size=500) >>> stats.ttest_ind(rvs1, rvs3) (-0.46580283298287162, 0.64145827413436174) >>> stats.ttest_ind(rvs1, rvs3, equal_var = False) (-0.46580283298287162, 0.64149646246569292)

When n1 != n2, the equal variance t-statistic is no longer equal to the unequal variance t-statistic:

>>> rvs4 = stats.norm.rvs(loc=5, scale=20, size=100) >>> stats.ttest_ind(rvs1, rvs4) (-0.99882539442782481, 0.3182832709103896) >>> stats.ttest_ind(rvs1, rvs4, equal_var = False) (-0.69712570584654099, 0.48716927725402048)

T-test with different means, variance, and n:

>>> rvs5 = stats.norm.rvs(loc=8, scale=20, size=100) >>> stats.ttest_ind(rvs1, rvs5) (-1.4679669854490653, 0.14263895620529152) >>> stats.ttest_ind(rvs1, rvs5, equal_var = False) (-0.94365973617132992, 0.34744170334794122)

val ttest_ind_from_stats : ?equal_var:bool -> mean1:[> `Ndarray ] Np.Obj.t -> std1:[> `Ndarray ] Np.Obj.t -> nobs1:[> `Ndarray ] Np.Obj.t -> mean2:[> `Ndarray ] Np.Obj.t -> std2:[> `Ndarray ] Np.Obj.t -> nobs2:[> `Ndarray ] Np.Obj.t -> unit -> Py.Object.t * Py.Object.t

T-test for means of two independent samples from descriptive statistics.

This is a two-sided test for the null hypothesis that two independent samples have identical average (expected) values.

Parameters ---------- mean1 : array_like The mean(s) of sample 1. std1 : array_like The standard deviation(s) of sample 1. nobs1 : array_like The number(s) of observations of sample 1. mean2 : array_like The mean(s) of sample 2. std2 : array_like The standard deviation(s) of sample 2. nobs2 : array_like The number(s) of observations of sample 2. equal_var : bool, optional If True (default), perform a standard independent 2 sample test that assumes equal population variances 1_. If False, perform Welch's t-test, which does not assume equal population variance 2_.

Returns ------- statistic : float or array The calculated t-statistics. pvalue : float or array The two-tailed p-value.

See Also -------- scipy.stats.ttest_ind

Notes ----- .. versionadded:: 0.16.0

References ---------- .. 1 https://en.wikipedia.org/wiki/T-test#Independent_two-sample_t-test

.. 2 https://en.wikipedia.org/wiki/Welch%27s_t-test

Examples -------- Suppose we have the summary data for two samples, as follows::

              Sample   Sample
        Size   Mean   Variance
Sample 1  13   15.0     87.5
Sample 2  11   12.0     39.0

Apply the t-test to this data (with the assumption that the population variances are equal):

>>> from scipy.stats import ttest_ind_from_stats >>> ttest_ind_from_stats(mean1=15.0, std1=np.sqrt(87.5), nobs1=13, ... mean2=12.0, std2=np.sqrt(39.0), nobs2=11) Ttest_indResult(statistic=0.9051358093310269, pvalue=0.3751996797581487)
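The equal-variance statistic can be reproduced by hand with the usual pooled-variance formula (a sketch under the equal-variance assumption, checked against the function itself):

>>> import numpy as np
>>> from scipy.stats import ttest_ind_from_stats
>>> m1, s1, n1 = 15.0, np.sqrt(87.5), 13
>>> m2, s2, n2 = 12.0, np.sqrt(39.0), 11
>>> sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)  # pooled variance
>>> t = (m1 - m2) / np.sqrt(sp2 * (1 / n1 + 1 / n2))
>>> np.isclose(t, ttest_ind_from_stats(m1, s1, n1, m2, s2, n2).statistic)
True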

For comparison, here is the data from which those summary statistics were taken. With this data, we can compute the same result using `scipy.stats.ttest_ind`:

>>> a = np.array([1, 3, 4, 6, 11, 13, 15, 19, 22, 24, 25, 26, 26]) >>> b = np.array([2, 4, 6, 9, 11, 13, 14, 15, 18, 19, 21]) >>> from scipy.stats import ttest_ind >>> ttest_ind(a, b) Ttest_indResult(statistic=0.905135809331027, pvalue=0.3751996797581486)

Suppose we instead have binary data and would like to apply a t-test to compare the proportion of 1s in two independent groups::

                 Number of    Sample    Sample
           Size    ones        Mean    Variance
Sample 1    150     30          0.2      0.16
Sample 2    200     45         0.225    0.174375

The sample mean :math:`\hat{p}` is the proportion of ones in the sample and the variance for a binary observation is estimated by :math:`\hat{p}(1-\hat{p})`.

>>> ttest_ind_from_stats(mean1=0.2, std1=np.sqrt(0.16), nobs1=150, ... mean2=0.225, std2=np.sqrt(0.17437), nobs2=200) Ttest_indResult(statistic=-0.564327545549774, pvalue=0.5728947691244874)

For comparison, we could compute the t statistic and p-value using arrays of 0s and 1s and `scipy.stats.ttest_ind`, as above.

>>> group1 = np.array([1]*30 + [0]*(150-30)) >>> group2 = np.array([1]*45 + [0]*(200-45)) >>> ttest_ind(group1, group2) Ttest_indResult(statistic=-0.5627179589855622, pvalue=0.573989277115258)

val ttest_rel : ?axis:[ `I of int | `None ] -> ?nan_policy:[ `Propagate | `Raise | `Omit ] -> a:Py.Object.t -> b:Py.Object.t -> unit -> Py.Object.t * Py.Object.t

Calculate the t-test on TWO RELATED samples of scores, a and b.

This is a two-sided test for the null hypothesis that 2 related or repeated samples have identical average (expected) values.

Parameters ---------- a, b : array_like The arrays must have the same shape. axis : int or None, optional Axis along which to compute test. If None, compute over the whole arrays, `a`, and `b`. nan_policy : 'propagate', 'raise', 'omit', optional Defines how to handle when input contains nan. The following options are available (default is 'propagate'):

* 'propagate': returns nan * 'raise': throws an error * 'omit': performs the calculations ignoring nan values

Returns ------- statistic : float or array t-statistic. pvalue : float or array Two-sided p-value.

Notes ----- Examples for use are scores of the same set of students in different exams, or repeated sampling from the same units. The test measures whether the average score differs significantly across samples (e.g. exams). If we observe a large p-value, for example greater than 0.05 or 0.1, then we cannot reject the null hypothesis of identical average scores. If the p-value is smaller than the threshold, e.g. 1%, 5% or 10%, then we reject the null hypothesis of equal averages. Small p-values are associated with large t-statistics.

References ---------- https://en.wikipedia.org/wiki/T-test#Dependent_t-test_for_paired_samples

Examples -------- >>> from scipy import stats >>> np.random.seed(12345678) # fix random seed to get same numbers

>>> rvs1 = stats.norm.rvs(loc=5,scale=10,size=500) >>> rvs2 = (stats.norm.rvs(loc=5,scale=10,size=500) + ... stats.norm.rvs(scale=0.2,size=500)) >>> stats.ttest_rel(rvs1,rvs2) (0.24101764965300962, 0.80964043445811562) >>> rvs3 = (stats.norm.rvs(loc=8,scale=10,size=500) + ... stats.norm.rvs(scale=0.2,size=500)) >>> stats.ttest_rel(rvs1,rvs3) (-3.9995108708727933, 7.3082402191726459e-005)

val tvar : ?limits:Py.Object.t -> ?inclusive:Py.Object.t -> ?axis:[ `I of int | `None ] -> ?ddof:int -> a:[> `Ndarray ] Np.Obj.t -> unit -> float

Compute the trimmed variance.

This function computes the sample variance of an array of values, while ignoring values which are outside of given `limits`.

Parameters ---------- a : array_like Array of values. limits : None or (lower limit, upper limit), optional Values in the input array less than the lower limit or greater than the upper limit will be ignored. When limits is None, then all values are used. Either of the limit values in the tuple can also be None representing a half-open interval. The default value is None. inclusive : (bool, bool), optional A tuple consisting of the (lower flag, upper flag). These flags determine whether values exactly equal to the lower or upper limits are included. The default value is (True, True). axis : int or None, optional Axis along which to operate. Default is 0. If None, compute over the whole array `a`. ddof : int, optional Delta degrees of freedom. Default is 1.

Returns ------- tvar : float Trimmed variance.

Notes ----- `tvar` computes the unbiased sample variance, i.e. it uses a correction factor ``n / (n - 1)``.

Examples --------

>>> from scipy import stats
>>> x = np.arange(20)
>>> stats.tvar(x)
35.0
>>> stats.tvar(x, (3, 17))
20.0
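The trimmed value can be reproduced by hand, which also illustrates the ``n / (n - 1)`` correction mentioned above (a sketch, not part of the original docstring):

>>> trimmed = x[(x >= 3) & (x <= 17)]  # values kept by limits=(3, 17), both bounds inclusive
>>> np.var(trimmed, ddof=1)            # unbiased sample variance, matching stats.tvar(x, (3, 17))
20.0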

val variation : ?axis:[ `I of int | `None ] -> ?nan_policy:[ `Propagate | `Raise | `Omit ] -> a:[> `Ndarray ] Np.Obj.t -> unit -> [ `ArrayLike | `Ndarray | `Object ] Np.Obj.t

Compute the coefficient of variation.

The coefficient of variation is the ratio of the biased standard deviation to the mean.

Parameters ---------- a : array_like Input array. axis : int or None, optional Axis along which to calculate the coefficient of variation. Default is 0. If None, compute over the whole array `a`. nan_policy : 'propagate', 'raise', 'omit', optional Defines how to handle when input contains nan. The following options are available (default is 'propagate'):

* 'propagate': returns nan * 'raise': throws an error * 'omit': performs the calculations ignoring nan values

Returns ------- variation : ndarray The calculated variation along the requested axis.

References ---------- .. [1] Zwillinger, D. and Kokoska, S. (2000). CRC Standard Probability and Statistics Tables and Formulae. Chapman & Hall: New York. 2000.

Examples --------

>>> from scipy.stats import variation
>>> variation([1, 2, 3, 4, 5])
0.47140452079103173
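Because the coefficient of variation is simply the biased (``ddof=0``) standard deviation divided by the mean, the result is easy to cross-check (a sketch, not part of the original docstring):

>>> a = np.array([1, 2, 3, 4, 5])
>>> np.isclose(np.std(a) / np.mean(a), variation(a))
True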

val wasserstein_distance : ?u_weights:Py.Object.t -> ?v_weights:Py.Object.t -> u_values:Py.Object.t -> v_values:Py.Object.t -> unit -> float

Compute the first Wasserstein distance between two 1D distributions.

This distance is also known as the earth mover's distance, since it can be seen as the minimum amount of 'work' required to transform :math:`u` into :math:`v`, where 'work' is measured as the amount of distribution weight that must be moved, multiplied by the distance it has to be moved.

.. versionadded:: 1.0.0

Parameters ---------- u_values, v_values : array_like Values observed in the (empirical) distribution. u_weights, v_weights : array_like, optional Weight for each value. If unspecified, each value is assigned the same weight. `u_weights` (resp. `v_weights`) must have the same length as `u_values` (resp. `v_values`). If the weight sum differs from 1, it must still be positive and finite so that the weights can be normalized to sum to 1.

Returns ------- distance : float The computed distance between the distributions.

Notes ----- The first Wasserstein distance between the distributions :math:`u` and :math:`v` is:

.. math::

    l_1 (u, v) = \inf_{\pi \in \Gamma (u, v)} \int_{\mathbb{R} \times \mathbb{R}} |x-y| \, \mathrm{d} \pi (x, y)

where :math:`\Gamma (u, v)` is the set of (probability) distributions on :math:`\mathbb{R} \times \mathbb{R}` whose marginals are :math:`u` and :math:`v` on the first and second factors respectively.

If :math:`U` and :math:`V` are the respective CDFs of :math:`u` and :math:`v`, this distance also equals to:

.. math::

    l_1(u, v) = \int_{-\infty}^{+\infty} |U-V|

See [2]_ for a proof of the equivalence of both definitions.

The input distributions can be empirical, therefore coming from samples whose values are effectively inputs of the function, or they can be seen as generalized functions, in which case they are weighted sums of Dirac delta functions located at the specified values.

References ----------
.. [1] 'Wasserstein metric', https://en.wikipedia.org/wiki/Wasserstein_metric
.. [2] Ramdas, Garcia, Cuturi, 'On Wasserstein Two Sample Testing and Related Families of Nonparametric Tests' (2015). :arXiv:`1509.02237`.

Examples --------

>>> from scipy.stats import wasserstein_distance
>>> wasserstein_distance([0, 1, 3], [5, 6, 8])
5.0
>>> wasserstein_distance([0, 1], [0, 1], [3, 1], [2, 2])
0.25
>>> wasserstein_distance([3.4, 3.9, 7.5, 7.8], [4.5, 1.4],
...                      [1.4, 0.9, 3.1, 7.2], [3.2, 3.5])
4.0781331438047861
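For two unweighted samples of equal size, the distance reduces to the mean absolute difference between the sorted samples, so the first result above can be verified by hand (a quick sketch, not part of the original docstring):

>>> u, v = np.sort([0, 1, 3]), np.sort([5, 6, 8])
>>> np.mean(np.abs(u - v))  # (|0-5| + |1-6| + |3-8|) / 3
5.0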

val weightedtau : ?rank:[ `Array_like_of_ints of Py.Object.t | `Bool of bool ] -> ?weigher:Py.Object.t -> ?additive:bool -> x:Py.Object.t -> y:Py.Object.t -> unit -> float * float

Compute a weighted version of Kendall's :math:`\tau`.

The weighted :math:`\tau` is a weighted version of Kendall's :math:`\tau` in which exchanges of high weight are more influential than exchanges of low weight. The default parameters compute the additive hyperbolic version of the index, :math:`\tau_\mathrm{h}`, which has been shown to provide the best balance between important and unimportant elements [1]_.

The weighting is defined by means of a rank array, which assigns a nonnegative rank to each element, and a weigher function, which assigns a weight based on the rank to each element. The weight of an exchange is then the sum or the product of the weights of the ranks of the exchanged elements. The default parameters compute :math:`\tau_\mathrm{h}`: an exchange between elements with rank :math:`r` and :math:`s` (starting from zero) has weight :math:`1/(r+1) + 1/(s+1)`, as sketched below.
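To make the default weighting concrete (a small sketch of the definition above, not scipy code):

>>> weigher = lambda r: 1.0 / (r + 1)  # default hyperbolic weigher
>>> weigher(0) + weigher(2)            # additive weight of an exchange between ranks 0 and 2
1.3333333333333333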

Specifying a rank array is meaningful only if you have in mind an external criterion of importance. If, as is usually the case, you do not have a specific rank in mind, the weighted :math:`\tau` is defined by averaging the values obtained using the decreasing lexicographical rank by (`x`, `y`) and by (`y`, `x`). This is the behavior with default parameters.

Note that if you are computing the weighted :math:`\tau` on arrays of ranks, rather than of scores (i.e., a larger value implies a lower rank) you must negate the ranks, so that elements of higher rank are associated with a larger value.

Parameters ---------- x, y : array_like Arrays of scores, of the same shape. If arrays are not 1-D, they will be flattened to 1-D. rank : array_like of ints or bool, optional A nonnegative rank assigned to each element. If it is None, the decreasing lexicographical rank by (`x`, `y`) will be used: elements of higher rank will be those with larger `x`-values, using `y`-values to break ties (in particular, swapping `x` and `y` will give a different result). If it is False, the element indices will be used directly as ranks. The default is True, in which case this function returns the average of the values obtained using the decreasing lexicographical rank by (`x`, `y`) and by (`y`, `x`). weigher : callable, optional The weigher function. Must map nonnegative integers (zero representing the most important element) to a nonnegative weight. The default, None, provides hyperbolic weighing, that is, rank :math:`r` is mapped to weight :math:`1/(r+1)`. additive : bool, optional If True, the weight of an exchange is computed by adding the weights of the ranks of the exchanged elements; otherwise, the weights are multiplied. The default is True.

Returns ------- correlation : float The weighted :math:`\tau` correlation index. pvalue : float Presently ``np.nan``, as the null distribution of the statistic is unknown (even in the additive hyperbolic case).

See Also -------- kendalltau : Calculates Kendall's tau. spearmanr : Calculates a Spearman rank-order correlation coefficient. theilslopes : Computes the Theil-Sen estimator for a set of points (x, y).

Notes ----- This function uses an :math:`O(n \log n)`, mergesort-based algorithm [1]_ that is a weighted extension of Knight's algorithm for Kendall's :math:`\tau` [2]_. It can compute Shieh's weighted :math:`\tau` [3]_ between rankings without ties (i.e., permutations) by setting `additive` and `rank` to False, as the definition given in [1]_ is a generalization of Shieh's.

NaNs are considered the smallest possible score.

.. versionadded:: 0.19.0

References ----------
.. [1] Sebastiano Vigna, 'A weighted correlation index for rankings with ties', Proceedings of the 24th international conference on World Wide Web, pp. 1166-1176, ACM, 2015.
.. [2] W.R. Knight, 'A Computer Method for Calculating Kendall's Tau with Ungrouped Data', Journal of the American Statistical Association, Vol. 61, No. 314, Part 1, pp. 436-439, 1966.
.. [3] Grace S. Shieh. 'A weighted Kendall's tau statistic', Statistics & Probability Letters, Vol. 39, No. 1, pp. 17-24, 1998.

Examples --------

>>> from scipy import stats
>>> x = [12, 2, 1, 12, 2]
>>> y = [1, 4, 7, 1, 0]
>>> tau, p_value = stats.weightedtau(x, y)
>>> tau
-0.56694968153682723
>>> p_value
nan
>>> tau, p_value = stats.weightedtau(x, y, additive=False)
>>> tau
-0.62205716951801038

NaNs are considered the smallest possible score:

>>> x = [12, 2, 1, 12, 2]
>>> y = [1, 4, 7, 1, np.nan]
>>> tau, _ = stats.weightedtau(x, y)
>>> tau
-0.56694968153682723

This is exactly Kendall's tau:

>>> x = [12, 2, 1, 12, 2]
>>> y = [1, 4, 7, 1, 0]
>>> tau, _ = stats.weightedtau(x, y, weigher=lambda x: 1)
>>> tau
-0.47140452079103173

>>> x = [12, 2, 1, 12, 2]
>>> y = [1, 4, 7, 1, 0]
>>> stats.weightedtau(x, y, rank=None)
WeightedTauResult(correlation=-0.4157652301037516, pvalue=nan)
>>> stats.weightedtau(y, x, rank=None)
WeightedTauResult(correlation=-0.7181341329699028, pvalue=nan)

val zmap : ?axis:[ `I of int | `None ] -> ?ddof:int -> scores:[> `Ndarray ] Np.Obj.t -> compare:[> `Ndarray ] Np.Obj.t -> unit -> [ `ArrayLike | `Ndarray | `Object ] Np.Obj.t

Calculate the relative z-scores.

Return an array of z-scores, i.e., scores that are standardized to zero mean and unit variance, where mean and variance are calculated from the comparison array.

Parameters ---------- scores : array_like The input for which z-scores are calculated. compare : array_like The input from which the mean and standard deviation of the normalization are taken; assumed to have the same dimension as `scores`. axis : int or None, optional Axis over which mean and variance of `compare` are calculated. Default is 0. If None, compute over the whole array `scores`. ddof : int, optional Degrees of freedom correction in the calculation of the standard deviation. Default is 0.

Returns ------- zscore : array_like Z-scores, in the same shape as `scores`.

Notes ----- This function preserves ndarray subclasses, and works also with matrices and masked arrays (it uses `asanyarray` instead of `asarray` for parameters).

Examples --------

>>> from scipy.stats import zmap
>>> a = [0.5, 2.0, 2.5, 3]
>>> b = [0, 1, 2, 3, 4]
>>> zmap(a, b)
array([-1.06066017,  0.        ,  0.35355339,  0.70710678])
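The same numbers follow from the definition: `a` is standardized by the mean and the biased (``ddof=0``) standard deviation of `b` (a sketch, not part of the original docstring):

>>> b = np.array([0, 1, 2, 3, 4])
>>> (np.array([0.5, 2.0, 2.5, 3]) - b.mean()) / b.std()
array([-1.06066017,  0.        ,  0.35355339,  0.70710678])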

val zscore : ?axis:[ `I of int | `None ] -> ?ddof:int -> ?nan_policy:[ `Propagate | `Raise | `Omit ] -> a:[> `Ndarray ] Np.Obj.t -> unit -> [ `ArrayLike | `Ndarray | `Object ] Np.Obj.t

Compute the z score.

Compute the z score of each value in the sample, relative to the sample mean and standard deviation.

Parameters ---------- a : array_like An array like object containing the sample data. axis : int or None, optional Axis along which to operate. Default is 0. If None, compute over the whole array `a`. ddof : int, optional Degrees of freedom correction in the calculation of the standard deviation. Default is 0. nan_policy : 'propagate', 'raise', 'omit', optional Defines how to handle when input contains nan. 'propagate' returns nan, 'raise' throws an error, 'omit' performs the calculations ignoring nan values. Default is 'propagate'.

Returns ------- zscore : array_like The z-scores, standardized by mean and standard deviation of input array `a`.

Notes ----- This function preserves ndarray subclasses, and works also with matrices and masked arrays (it uses `asanyarray` instead of `asarray` for parameters).

Examples --------

>>> a = np.array([ 0.7972,  0.0767,  0.4383,  0.7866,  0.8091,
...                0.1954,  0.6307,  0.6599,  0.1065,  0.0508])
>>> from scipy import stats
>>> stats.zscore(a)
array([ 1.1273, -1.247 , -0.0552,  1.0923,  1.1664, -0.8559,  0.5786,
        0.6748, -1.1488, -1.3324])

Computing along a specified axis, using n-1 degrees of freedom (``ddof=1``) to calculate the standard deviation:

>>> b = np.array([[ 0.3148,  0.0478,  0.6243,  0.4608],
...               [ 0.7149,  0.0775,  0.6072,  0.9656],
...               [ 0.6341,  0.1403,  0.9759,  0.4064],
...               [ 0.5918,  0.6948,  0.904 ,  0.3721],
...               [ 0.0921,  0.2481,  0.1188,  0.1366]])
>>> stats.zscore(b, axis=1, ddof=1)
array([[-0.19264823, -1.28415119,  1.07259584,  0.40420358],
       [ 0.33048416, -1.37380874,  0.04251374,  1.00081084],
       [ 0.26796377, -1.12598418,  1.23283094, -0.37481053],
       [-0.22095197,  0.24468594,  1.19042819, -1.21416216],
       [-0.82780366,  1.4457416 , -0.43867764, -0.1792603 ]])
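The row-wise z-scores can be cross-checked against a direct computation with broadcasting (a sketch, assuming the array `b` from the example above):

>>> mu = b.mean(axis=1, keepdims=True)
>>> sigma = b.std(axis=1, ddof=1, keepdims=True)
>>> np.allclose((b - mu) / sigma, stats.zscore(b, axis=1, ddof=1))
True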