
# The Ideal Statistics Module

## Abstract

In this document, we design a hypothetical "ideal" statistics module for Scilab.

## Introduction

The goal of this document is to design a hypothetical "ideal" statistics module for Scilab. In the first part, we analyse the limitations of the current statistics features provided by Scilab, by Stixbox and by other toolboxes. In the second part, we present the ideal statistics module and its features.

The goal of this document is not to provide an analysis of the current features in this field (see Documents and tutorials for Probabilities and Statistics in Scilab on this topic).

## Issues with existing tools

### Issues with Scilab

Here are the current functions in the statistics section.

Central Tendency:

• geomean — geometric mean
• harmean — harmonic mean
• mean — mean (row mean, column mean) of vector/matrix entries
• meanf — weighted mean of a vector or a matrix
• trimmean — trimmed mean of a vector or a matrix

Measures of Dispersion:

• iqr — interquartile range
• mad — mean absolute deviation
• strange — range

Measures of Shape:

• cmoment — central moments of all orders
• moment — non central moments of all orders
• perctl — computation of percentiles
• quart — computation of quartiles

Data with Missing Values:

• nancumsum — cumulative sum of the values of a matrix
• nand2mean — difference of the means of two independent samples
• nanmax — max (ignoring NaNs)
• nanmean — mean (ignoring NaNs)
• nanmeanf — mean (ignoring NaNs) with a given frequency
• nanmedian — median of the values of a numerical vector or matrix
• nanmin — min (ignoring NaNs)
• nanstdev — standard deviation (ignoring NaNs)
• nansum — sum of values (ignoring NaNs)
• thrownan — eliminates NaN values
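The NaN-aware functions in this list share one pattern: drop the missing values, then apply the ordinary statistic. Here is a minimal sketch of that pattern, written in Python rather than Scilab so that it is self-contained. The helper names `throw_nan` and `nan_mean` are hypothetical and only mirror the roles of thrownan and nanmean; they are not the Scilab implementations.

```python
import math

def throw_nan(values):
    """Return the entries of a sequence that are not NaN."""
    return [v for v in values if not math.isnan(v)]

def nan_mean(values):
    """Mean of the non-NaN entries; NaN when no entry is left."""
    kept = throw_nan(values)
    if not kept:
        return math.nan
    return sum(kept) / len(kept)

data = [1.0, math.nan, 3.0, 5.0]
print(nan_mean(data))
```

Returning NaN for an all-NaN input is one possible convention; raising an error would be another.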

Descriptive Statistics

• center — center
• wcenter — center and weight
• correl — correlation of two variables
• covar — covariance of two variables
• median — median
• msd — mean squared deviation
• mvvacov — computes variance-covariance matrix
• stdevf — standard deviation
• st_deviation — standard deviation
• variance — variance of the values of a vector or matrix
• variancef — variance of the values of a vector or matrix, weighted by frequencies

Summaries

• tabul — frequency of values of a matrix or vector
• nfreq — frequency of the values in a vector or matrix

Sampling

• sample — Sampling with replacement
• samplef — sample with replacement from a population and frequencies of its values
• samwr — Sampling without replacement
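The difference between the two sampling modes can be sketched as follows, in Python rather than Scilab. The function names are hypothetical and do not reproduce the calling sequences of sample or samwr.

```python
import random

def sample_with_replacement(population, n, rng):
    """Draw n items; the same item may be drawn several times."""
    return [rng.choice(population) for _ in range(n)]

def sample_without_replacement(population, n, rng):
    """Draw n distinct positions of the population (n <= len(population))."""
    return rng.sample(population, n)

rng = random.Random(0)
population = list(range(10))
print(sample_with_replacement(population, 15, rng))
print(sample_without_replacement(population, 5, rng))
```

Only the first mode accepts a sample size larger than the population.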

Principal Component Analysis

• pca — Computes principal components analysis with standardized variables
• princomp — Principal components analysis
• show_pca — Visualization of principal components analysis results

Hypothesis Testing

• ftest — Fisher ratio
• ftuneq — Fisher ratio for samples of unequal size.

Regression

• regress — regression coefficients of two variables

Distribution functions:

• cdfbet — cumulative distribution function Beta distribution
• cdfbin — cumulative distribution function Binomial distribution
• cdfchi — cumulative distribution function chi-square distribution
• cdfchn — cumulative distribution function non-central chi-square distribution
• cdff — cumulative distribution function F distribution
• cdffnc — cumulative distribution function non-central f-distribution
• cdfgam — cumulative distribution function gamma distribution
• cdfnbn — cumulative distribution function negative binomial distribution
• cdfnor — cumulative distribution function normal distribution
• cdfpoi — cumulative distribution function poisson distribution
• cdft — cumulative distribution function Student's T distribution

Random number generators:

• grand
• rand

Scilab provides several probability and statistics features, including three kinds of distribution functions:

• CDF: cumulative distribution function (e.g. the cdfnor function),
• iCDF: inverse cumulative distribution function (e.g. the cdfnor function, called in inverse mode),
• RNG: random number generator (the grand function).
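The relationship between these three kinds of functions can be illustrated on the normal distribution. The sketch below uses Python's standard `statistics.NormalDist` class as a stand-in, not cdfnor or grand: the iCDF inverts the CDF, and the empirical fraction of random draws below a point approximates the CDF at that point.

```python
from statistics import NormalDist

dist = NormalDist(mu=0.0, sigma=1.0)

# CDF: probability that a draw is below x
p = dist.cdf(1.5)

# iCDF (quantile): recovers x from p
x = dist.inv_cdf(p)

# RNG: the fraction of draws below 1.5 should be close to p
draws = dist.samples(1000, seed=42)
fraction = sum(1 for d in draws if d <= 1.5) / len(draws)

print(p, x, fraction)
```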

We can see the statistics-related bugs in bugzilla at:

In fact, a detailed analysis shows that the existing features could easily be enhanced on the following points.

See the Statistics category in the bug reports for a complete list of the bugs:

The bottom line is that distfun outperforms Scilab on CDF, iCDF (quantile) and RNG functions.

### Issues with Stixbox

Here are the functions provided by Stixbox.

Datasets:

• getdata — Returns a dataset.

Graphics:

• stixbox_graphics — Demos of the graphics.
• bubblechart — Plot a bubble chart
• bubblematrix — Plot a bubble chart matrix
• histo — Plot a histogram
• identify — Identify points on a plot with mouse clicks
• plotmatrix — Plot an X vs Y scatter plot matrix
• plotsym — Plot with symbols
• qqnorm — Normal probability paper
• qqplot — Create a QQ-plot
• stairs — Stairstep graph

Logistic Regression:

• lodds — Log odds function
• loddsinv — Inverse of the log odds function
• logitfit — Fit a logistic regression model

Miscellaneous:

• betaln — Logarithm of beta function
• corrcoef — Correlation coefficient
• cov — Covariance matrix
• ksdensity — Kernel smoothing density estimate
• quantile — Empirical quantile

Polynomials:

• polyfit — Polynomial curve fitting
• polyval — Polynomial evaluation

Regression:

• cmpmod — Compare linear submodel versus larger one
• lsfit — Fit a multiple regression normal model
• lsselect — Select a predictor subset for regression
• regres — Multiple linear regression
• regresprint — Print linear regression

Resampling Techniques:

• stixbox_resamplingT — How to use extra-arguments in T.
• ciboot — Bootstrap confidence intervals
• covboot — Bootstrap estimate of the variance of a parameter estimate.
• covjack — Jackknife estimate of the variance of a parameter estimate.
• rboot — Simulate a bootstrap resample
• stdboot — Bootstrap estimate of the parameter standard deviation.
• stdjack — Jackknife estimate of the standard deviation of a parameter estimate.

Tests:

• ciquant — Nonparametric confidence interval for quantile
• kstwo — Kolmogorov-Smirnov statistic from two samples
• test1b — Bootstrap test that mean equals zero
• test1n — Test that mean equals zero (Normal)
• test1r — Test for median equals 0 using rank test
• test2n — Tests two normal samples with equal variance
• test2r — Test location equality of two samples using rank test

There are many issues with Stixbox.

• Some help pages are almost empty (for example, the cmpmod function: http://forge.scilab.org/index.php/p/stixbox/issues/1082/)

• Some unit tests are missing
• The arguments of some functions are unchecked (e.g. number of input / output arguments, type, size, content).
• The functions are not localized.
• Some functions are duplicated in Scilab.
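To make the argument-checking point concrete, here is a sketch of the kind of checks (type, size, content) that a statistics function could perform before computing anything. It is written in Python, not Scilab. The function `empirical_quantile` is hypothetical, and its nearest-rank rule is a simple convention chosen for the example, not the rule used by the Stixbox quantile function.

```python
import math

def empirical_quantile(x, p):
    """Nearest-rank empirical quantile, with explicit argument checks."""
    # Type checks
    if not isinstance(x, (list, tuple)):
        raise TypeError("x must be a list or a tuple of numbers")
    if not all(isinstance(v, (int, float)) for v in x):
        raise TypeError("x must contain only numbers")
    if not isinstance(p, (int, float)):
        raise TypeError("p must be a number")
    # Size check
    if len(x) == 0:
        raise ValueError("x must not be empty")
    # Content check
    if not 0 <= p <= 1:
        raise ValueError("p must be in [0, 1]")
    # Computation: rank ceil(p * n), with a minimum rank of 1
    ordered = sorted(x)
    rank = max(1, math.ceil(p * len(ordered)))
    return ordered[rank - 1]

print(empirical_quantile([3, 1, 2], 0.5))
```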

The issues of the Stixbox are reported at:

### Issues with regtools

Regtools is a toolbox which is packaged on Atoms:

The regtools module provides the following functions:

• linregr : an interactive user interface for linear regression analysis, including plot facilities and the most relevant statistical information at the solution.
• nlinregr : an interactive user interface for performing non linear (weighted) regression analysis. Also here plot facilities and statistical information are available. Both functions can be called in silent command line mode.
• nlinlsq : non linear (weighted) regression analysis function - called by nlinregr(). nlinlsq() uses the scilab function optim() for solving the regression problem. Supports both analytical and numerical derivatives.
• qqplot - quantile-quantile plots.

A review has been done at:

There are several issues with Regtools.

• The project is not developed on the Forge, so that we cannot report bugs easily.
• On the design, regtools provides the qqplot function, which should be in a statistical graphics toolbox and not in regtools.
• There is no unit test.
• The functions mix the computations, the message printing and the GUI.
• The function names are too short and may conflict with other toolboxes.
• The tests do not use the nistdataset module (http://forge.scilab.org/index.php/p/nistdataset/), while they could.

• The nlinlsq function uses optim, while it should use lsqrsolve (although leastsq may be a good choice too). See Non linear optimization for parameter fitting example for details.
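The remark about functions that mix computations, message printing and GUI suggests the following separation, sketched here in Python rather than Scilab. The names `LinearFit`, `fit_line` and `format_fit` are hypothetical and are not part of regtools: the computation returns a plain result that can be unit tested, and the report is produced by a separate function.

```python
from dataclasses import dataclass

@dataclass
class LinearFit:
    slope: float
    intercept: float

def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept (no printing)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return LinearFit(slope=slope, intercept=mean_y - slope * mean_x)

def format_fit(fit):
    """Build the report text; the caller decides where to display it."""
    return f"y = {fit.slope:.3f} * x + {fit.intercept:.3f}"

fit = fit_line([0.0, 1.0, 2.0], [1.0, 3.0, 5.0])
print(format_fit(fit))
```

With this layout, a GUI or a plot function would only consume the `LinearFit` result and would not need to be loaded for the computation itself.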

### Issues with CASCI

The CASCI toolbox includes various functions for probability & statistics that are used by P. Castagliola's lab at Université de Nantes.

The toolbox is developed on the Forge:

The CASCI toolbox provides the following functions:

• intbinomial - binomial confidence interval
• intexponential - exponential confidence interval
• intnormalm - normal confidence interval for µ
• intnormals - normal confidence interval for σ
• intpoisson - Poisson confidence interval
• cdfbeta - beta type 1 cdf
• cdfbeta2 - beta type 2 cdf
• cdfbinomial - binomial cdf
• cdfchi2 - χ² (central and non-central) cdf
• cdfcv - sample coefficient of variation cdf
• cdfdphase - discrete Phase-Type cdf
• cdfexponential - exponential cdf
• cdffisher - Fisher (central and non-central) cdf
• cdffoldednormal - folded normal cdf
• cdfgamma - gamma cdf
• cdfgev - generalized Extreme Value cdf
• cdfhypergeometric - hypergeometric cdf
• cdfjohnson - Johnson’s cdf
• cdflognormal - lognormal cdf
• cdfmedian - normal sample median cdf
• cdfnormal - normal cdf
• cdfpareto - Pareto cdf
• cdfpascal - Pascal cdf
• cdfpoisson - Poisson cdf
• cdfrnge - normal range cdf
• cdfstandev - normal sample standard-deviation cdf
• cdfstudent - Student (central and non-central) cdf
• cdfweibull - Weibull cdf
• boxbehnken - Box-Behnken designs
• boxcoxlinear - Box-Cox linearity transformation
• centralcomposite - central composite designs
• coded2natural - coded to natural variables
• doxpand - design expansion
• doxptim - design optimisation
• factorial2 - two levels full and fractional factorial designs
• mulreg - multilinear regression analysis
• mulregdisp - multilinear regression analysis results display
• mulregplot - multilinear regression analysis results plot
• natural2coded - natural to coded variables
• plackettburman - Plackett-Burman designs
• simpdex - simplex designs
• fitbeta - beta type 1 parameters estimation
• fitgamma - gamma parameters estimation
• fitgev - generalized Extreme Value parameters estimation
• fitjohnson - Johnson parameters estimation
• fitlognormal - lognormal parameters estimation
• fitweibull - Weibull parameters estimation
• idfbeta - beta type 1 idf
• idfbeta2 - beta type 2 idf
• idfchi2 - χ² (central and non-central) idf
• idfcv - sample coefficient of variation idf
• idfexponential - exponential idf
• idffisher - Fisher (central and non-central) idf
• idfgamma - gamma idf
• idfgev - generalized Extreme Value idf
• idfjohnson - Johnson’s idf
• idflognormal - lognormal idf
• idfmedian - normal sample median idf
• idfnormal - normal idf
• idfpareto - Pareto idf
• idfstandev - normal sample standard-deviation idf
• idfstudent - Student (central and non-central) idf
• idfweibull - Weibull idf
• allcombination - matrix element combinations
• allpermutation - matrix element permutations
• arrangement - number Ap of arrangements
• combination - number Cn of combinations
• confhyper - confluent hypergeometric function
• depth - non parametric multivariate depth
• hausdorff - Hausdorff (median) distance between polylines
• lowess - LOcally WEighted Scatterplot Smoothing
• momdphase - first moments of a Discrete Phase-Type distribution
• nearestneighbors - find the k nearest neighbors
• savitzkygolay - Savitzky-Golay smoothing filter
• simplex - simplex computation
• simplexolve - solve a system of non-linear equations
• torczon - Torczon’s multidirectional nonlinear optimization algorithm
• vandercorput - Van der Corput’s sequence
• boxplot - Box plot
• qplot - quantile plot
• qqplot - quantile-quantile plot
• pdfbeta - beta type 1 pdf
• pdfbeta2 - beta type 2 pdf
• pdfbinomial - binomial pdf
• pdfchi2 - χ² (central and non-central) pdf
• pdfcp - Cp pdf
• pdfcpk - Cpk pdf
• pdfcpm - Cpm pdf
• pdfcpmk - Cpmk pdf
• pdfcpuv - Vännman’s Cp(u, v) pdf
• pdfcv - sample coefficient of variation pdf
• pdfdphase - discrete phase-type pdf
• pdfexponential - exponential pdf
• pdffisher - Fisher (central and non-central) pdf
• pdffoldednormal - folded normal pdf
• pdfgamma - gamma pdf
• pdfgev - generalized Extreme Value pdf
• pdfhypergeometric - hypergeometric pdf
• pdfkernel - kernel smoothed pdf
• pdfjohnson - Johnson’s pdf
• pdflognormal - lognormal pdf
• pdfmedian - normal sample median pdf
• pdfmultinormal - multinormal pdf
• pdfnormal - normal pdf
• pdfpareto - Pareto pdf
• pdfpascal - Pascal pdf
• pdfpoisson - Poisson pdf
• pdfrnge - normal range pdf
• pdfstandev - normal sample standard-deviation pdf
• pdfstudent - Student pdf
• pdfweibull - Weibull pdf
• rndbeta - beta type 1 random number generator
• rndbeta2 - beta type 2 random number generator
• rndbinomial - binomial random number generator
• rndexponential - exponential random number generator
• rndfoldednormal - folded normal random number generator
• rndgamma - gamma random number generator
• rndgev - generalized Extreme Value random number generator
• rndjohnson - Johnson’s random number generator
• rndlognormal - lognormal random number generator
• rndmultinormal - multinormal random number generator
• rndnormal - normal random number generator
• rndpareto - Pareto random number generator
• rndpascal - Pascal random number generator
• rndpoisson - Poisson random number generator
• rndstandev - normal sample standard-deviation random number generator
• rndweibull - Weibull random number generator
• autocorrelation - autocorrelation coefficient
• bootstrap - bootstrap sampling
• correlation - correlation matrix
• crosscorrelation - crosscorrelation coefficient
• kurtosis - kurtosis coefficient
• quantile - quantile
• rnge - range
• skewness - skewness coefficient
• standev - standard deviation
• totalmedian - total median coefficients
• varcovar - variance-covariance matrix
• arlmean - ARL of the mean control chart
• arlmeanRR - ARL of the Run Rules mean control chart
• arlmedian - ARL of the median control chart
• arlmedianRR - ARL of the Run Rules median control chart
• arlrnge - ARL of the range control chart
• arlstandev - ARL of the standard-deviation control chart
• arlstandevRR - ARL of the Run Rules standard-deviation control chart
• cp - capability index Cp estimation and confidence interval
• cpk - capability index Cpk estimation and confidence interval
• krnge - range coefficients KR(n)
• kstandev - standard-deviation coefficients KS(n, r)
• mcpshahriari - Shahriari’s multivariate capability index Cp
• mcptaam - Taam’s multivariate capability index Cp
• andersondarling - Anderson-Darling’s normality test
• bartlett - Bartlett’s test
• grubbs - Grubbs test
• kendall - Kendall’s test
• levene - Levene’s test
• mardia - Mardia’s test
• spearman - Spearman’s test
• tstbinomial1 - binomial one sample p test
• tstbinomial2 - binomial two samples p test
• tstexponential - exponential ? test
• tstnormalm1 - normal one sample µ test
• tstnormalm2 - normal two samples µ test
• tstnormals1 - normal one sample σ test
• tstnormals2 - normal two samples σ test
• tstsku - normal skewness and kurtosis test
• waldwolfowitz - Wald-Wolfowitz’s run test
• wilcoxon1 - Wilcoxon’s one sample (paired) test
• wilcoxon2 - Wilcoxon’s two samples test

There are several issues with CASCI:

### Other tools

During the 2012 International Open Source Software Contest, a statistical toolbox was created:

• Authors: Wang Shuaili, Wu Wei, Li Weicai, from Ecole Centrale de Pékin, Beihang University

TODO : review this toolbox

## Existing statistical tools outside of Scilab

### Matlab

Matlab has a solid set of statistical functions.

The following page is the entry point for the statistical toolbox:

The following page presents the list of functions for hypothesis tests:

### R

The statistical features of R are so extensive that Scilab will probably never reach that level of specialization. Still, R is an excellent reference to look at.

The following page presents the list of distributions in R:

### Octave

The following modules and toolboxes exist for Octave:

## The ideal statistics API

### Architecture of the modules

Rather than designing a single module, we may think of several separate modules with clear and orthogonal goals. This restricts the work, increases the chances of reuse and limits the dependencies. For example, it allows us to separate the graphics issues from the computational issues. Mixing the two is one of the main difficulties with several modules (e.g. Metanet), and it has been a strong obstacle to their maintenance over the years.

Here are the modules that we could create.

• scidoe : design of experiments. Only the functions to create the designs should be here. There should be no function to build models (e.g. linear models).
• distfun : distribution functions. Only the functions to manage the PDF, CDF, iCDF, RNG and stats should be here. There should be no statistics functions in this module. This might be a problem, because testing these functions may require statistics functions, for example the cov function.
• datgra : graphics for data analysis. Only the functions to create the graphics should be here.
• regmod : regression analysis module. Only the functions to produce regression analysis should be here.

### Designs of experiments : scidoe

The scidoe toolbox was created for this purpose:

The goal of this toolbox is to provide design of experiments techniques, along with functions for model building.

This project is part of the GSOC 2012, managed by Maria Christopoulou.

Here is the list of functions which are available.

Factorial Design

• X = scidoe_fullfact(levels) // A full factorial design
• X = scidoe_ff2n(n) // A two-level full factorial design with n factors
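A full factorial design enumerates every combination of the factor levels. Here is a minimal sketch in Python, not Scilab, with the hypothetical name `full_factorial`. The coding of the levels as 1..n and the row ordering (first factor varying fastest) are assumptions made for the example, not documented properties of scidoe_fullfact.

```python
from itertools import product

def full_factorial(levels):
    """Return all level combinations, one row per experiment.

    levels[j] is the number of levels of factor j; levels are coded 1..levels[j].
    """
    ranges = [range(1, n + 1) for n in levels]
    # itertools.product varies its last argument fastest, so the factors are
    # reversed on input and each row is reversed back on output.
    return [list(reversed(row)) for row in product(*reversed(ranges))]

design = full_factorial([2, 3])
for row in design:
    print(row)
```

A two-level design with n factors is then the special case `full_factorial([2] * n)`, which has 2^n rows.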

Response Surface Designs

• X = scidoe_bbdesign(n) // Box-Behnken design

Goals

Here is the list of functions which *will* be created at the end of the project.

Latin Hypercube Designs

• X = scidoe_lhsdesign(n,p) // LHS Design (without improvement).
• X = scidoe_lhsdesign(n,p,'criterion',criterion) // Optimized LHS Design based on a criterion. Criterion can be 'maximin','correlation' or 'centered'.
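The basic Latin hypercube design (without improvement) can be sketched as follows, in Python rather than Scilab. The function name `lhs_design` is hypothetical. The sketch assumes a design on the unit hypercube, with each dimension cut into n equal strata and exactly one point per stratum; the optimized criteria ('maximin', 'correlation', 'centered') are not covered.

```python
import random

def lhs_design(n, p, rng):
    """Return n points in [0, 1)^p with one point per stratum in each dimension."""
    design = [[0.0] * p for _ in range(n)]
    for j in range(p):
        # Random assignment of the n strata to the n points for dimension j
        strata = list(range(n))
        rng.shuffle(strata)
        for i in range(n):
            # Uniform position inside stratum [k/n, (k+1)/n)
            design[i][j] = (strata[i] + rng.random()) / n
    return design

rng = random.Random(0)
for point in lhs_design(5, 2, rng):
    print(point)
```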

Factorial Design

• X = scidoe_fracfact(generators) // A fractional factorial design
• X = scidoe_star(nb_var) // Star Design of Experiments

Response Surface Designs

• X = scidoe_ccdesign(n) // Central composite design

Optimal Designs

• X = scidoe_optdesign(n) // Optimal design (a-optimal)
• X = scidoe_optdesign(n,'criterion',criterion) // Optimal Design based on a criterion. Criterion can be 'a' for A-Optimal, 'd' for D-Optimal, 'g' for G-Optimal and 'o' for O-Optimal Design.
• X = mtlb_doptdesign(n) // Matlab compatible D-optimal Design

Supersaturated Designs

• X = scidoe_comp_ssd(M_doe, model) // Supersaturated Design ('a-optimal').
• X = scidoe_comp_ssd(M_doe, model,'criterion',criterion) // Supersaturated Design based on a criterion. Criterion can be 'a-optimal', 'd-optimal', 'average-khi2', 'maximum-khi2', 'r-value'(correlation coefficient).

Model Building

• X = scidoe_poly_model(mod_type, nb_var, order) // Produces a polynomial model
• X = scidoe_model_select(nb_var, model_old, measures, Log,criterion) // Produces a new polynomial model using forward or backward selection. Default is 'forward'.
• X = scidoe_plot_model(meas_learn, estim_learn, meas_valid, estim_valid) // Plots regression line and residuals distribution
• X = scidoe_build_regression_matrix(H,model,build) // Regression matrix of a model
• X = scidoe_var_regression_matrix(H, x, model, sigma) // Regression matrix of the variance of a model
• X = scidoe_lars(X, y, method, stop, useGram, Gram, _trace) // Least Angle Regression or Lasso Regression
• X = scidoe_rsquared(Y,Y_model) // R2 Computation
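As an illustration of the last item, the usual definition of the coefficient of determination is R² = 1 - SS_res / SS_tot. The Python sketch below implements that textbook formula; whether scidoe_rsquared uses exactly this definition is an assumption.

```python
def r_squared(y, y_model):
    """Coefficient of determination of the predictions y_model for the data y."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_model))  # residual sum of squares
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)                # total sum of squares
    return 1.0 - ss_res / ss_tot

print(r_squared([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]))
```

A perfect model gives 1, and a model that always predicts the mean of the data gives 0.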

General Functions

• X = scidoe_unnorm_doe_matrix(H, min_levels, max_levels) // Adjusts high and low values of a design to specified maximum and minimum values
• scidoe_comp_WD2_crit.sci // Wrap-around L2 discrepancy criterion
• scidoe_comp_CL2_crit(Data).sci // Centered L2 discrepancy criterion
• scidoe_crossvalidate.sci // K-fold cross validation
• scidoe_cvplot.sci // Plots cross validation results
• scidoe_prbs.sci // A pseudo random binary signal generator
• scidoe_merge.sci // Merges two samples
• scidoe_diff.sci // Computes the difference of two samples
• scidoe_scramble.sci // Permutes a sample
• scidoe_standardize.sci // Center and normalize a sample
• scidoe_normalize.sci // Normalises a sample

### Distribution functions : distfun

The goal of the distribution function toolbox is to provide the following functions:

• PDF: probability density function,
• CDF: cumulative distribution function,
• iCDF: inverse cumulative distribution function,
• RNG: random number generator.

These functions are the goal of the distfun project, which is developed on the Forge:

and available on Atoms:

This project is part of the GSOC 2012, managed by Prateek Papriwal.

For each distribution x, we provide five functions :

• distfun_xcdf — x CDF
• distfun_xinv — x Inverse CDF
• distfun_xpdf — x PDF
• distfun_xrnd — x random numbers
• distfun_xstat — x mean and variance
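To illustrate how the five functions of one distribution fit together, here is a sketch for the exponential distribution, in Python rather than Scilab. The function names and the parametrization by the mean `mu` are assumptions made for the example and may differ from the actual distfun calling sequences.

```python
import math
import random

def exp_pdf(x, mu):
    """Probability density function."""
    return math.exp(-x / mu) / mu if x >= 0 else 0.0

def exp_cdf(x, mu):
    """Cumulative distribution function."""
    return -math.expm1(-x / mu) if x >= 0 else 0.0

def exp_inv(p, mu):
    """Inverse CDF (quantile), for 0 <= p < 1."""
    return -mu * math.log1p(-p)

def exp_rnd(mu, rng):
    """One random number, by inversion of a uniform draw in [0, 1)."""
    return exp_inv(rng.random(), mu)

def exp_stat(mu):
    """Mean and variance."""
    return mu, mu ** 2

mu = 3.0
print(exp_pdf(2.0, mu), exp_cdf(2.0, mu), exp_inv(exp_cdf(2.0, mu), mu))
print(exp_rnd(mu, random.Random(0)), exp_stat(mu))
```

The consistency relations between these functions (the quantile inverts the CDF, the sample mean approaches the mean returned by the stat function) are natural candidates for unit tests.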

Distributions available :

• Beta (with x=beta)
• Exponential (with x=exp)
• Gamma (with x=gam)
• Geometric (with x=geo)
• LogNormal (with x=logn)

• Normal (with x=norm)
• Uniform (with x=unif)

Support

• distfun_erfcinv — Inverse erfc function
• distfun_getpath — Returns path of current module

Random Number Generator

• rng_overview — An overview of the Random Number Generators of the Distfun toolbox.
• distfun_genget — Get the current random number generator
• distfun_genset — Set the current random number generator
• distfun_seedget — Get the current state of the current random number generator
• distfun_seedset — Set the current state of the current random number generator
• distfun_streamget — Get the current stream
• distfun_streaminit — Initializes the current substream
• distfun_streamset — Set the current stream
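The purpose of a seedget / seedset pair is to save the state of the generator and restore it later, so that a sequence of random numbers can be replayed. The sketch below shows the idea with Python's standard `random.Random` class, whose `getstate` and `setstate` methods play that role; it is a stand-in and not the distfun API.

```python
import random

rng = random.Random(12345)

# Save the current state, then draw three numbers
state = rng.getstate()
first = [rng.random() for _ in range(3)]

# Restore the saved state: the same three numbers are produced again
rng.setstate(state)
replay = [rng.random() for _ in range(3)]

print(first == replay)
```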

Still, the work is not finished and there are many distributions which are still missing in the distfun module.

### Datasets

In this section, we present functions to manage datasets.

This section is a *DRAFT*.

• getdata : Famous datasets

#### Datasets distributed alongside R : rdataset

rdataset is a collection of 597 datasets that were originally distributed alongside the statistical software environment "R" and some of its add-on packages.

Datasets which are available in R can be used in Scilab with rdataset. The toolbox needs around 50 MB. The datasets can be used to test statistical functions in Scilab and compare the results with the output of R. As the datasets are included in R, no data has to be loaded manually.

This project is developed on the Forge:

For example the dataset survey from the library MASS can be loaded using:

### Statistical visualization : statvis

In this section, we present functions to produce statistical graphics.

This section is a *DRAFT*.

• statvis_identify : Identify points on a plot by clicking with the mouse (draft from Stixbox)
• statvis_plotsym : Plot with symbols (draft from Stixbox)
• statvis_qqnorm : Normal probability paper (draft from Stixbox)
• statvis_qqplot : Plot empirical quantile vs empirical quantile (draft from Stixbox, from Nan-Toolbox)
• statvis_boxplot : Draw a box-and-whiskers plot for data provided as column vectors (draft from Stixbox)
• statvis_cdfplot : plots empirical cumulative distribution function (draft from Stixbox)
• statvis_normplot : Produce a normal probability plot for each column of X (draft from Stixbox)
• statvis_plotmatrix : Scatter plot matrix - http://www.mathworks.fr/help/techdoc/ref/plotmatrix.html (draft from Stixbox, from Nan-Toolbox)

• statvis_cdfplot : http://www.mathworks.fr/fr/help/stats/cdfplot.html. (draft = nan_cdfplot from Nan-Toolbox)

• statvis_gscatter : http://www.mathworks.fr/fr/help/stats/gscatter.html (draft = nan_gscatter from Nan-Toolbox)

• statvis_boxplot
• statvis_normplot
• statvis_andrewsplot
• statvis_hist : http://www.mathworks.fr/fr/help/matlab/ref/hist.html (draft = histo from Stixbox, and nan_hist from Nan-Toolbox)

• statvis_ecdfhist
• statvis_fscatter3
• statvis_gplotmatrix
• statvis_parallelcoords
• statvis_errorb
• statvis_errorbar
• statvis_nhist
• statvis_bubblechart — Plot a bubble chart
• statvis_bubblematrix — Plot a bubble chart matrix
• statvis_inthisto : Discrete histogram (draft is distfun_inthisto in distfun)

### Descriptive statistics

In this section, we present functions to produce descriptive statistics.

This section is a *DRAFT*.

Location Measures:

• mean
• median
• rms : root mean square

Dispersion Measures:

• standardDeviation
• meandev : MeanDeviation
• mediandev : MedianDeviation
• mad : estimates the Mean Absolute deviation
• iqr : calculates the interquartile range

Shape Measures:

• Skewness
• kurtosis : estimates the kurtosis
• moment
• variance

General Measures:

• quantile : Empirical quantile (percentile).
• sem : calculates the standard error of the mean

Dependence:

• cov : Covariance matrix. The Stixbox/cov function does this.
• corrcoef: calculates the correlation matrix from pairwise correlations. The Stixbox/corrcoef function does this. (gives back the confidence interval of estimated parameters and the R^2, F and p values)
• fss : feature subset selection and feature ranking
• parcoef : partial correlation

Resampling:

• ciboot : Various bootstrap confidence intervals
• covboot : Bootstrap estimate of the variance of a parameter estimate
• covjack : Jackknife estimate of the variance of a parameter estimate
• rboot : Simulate a bootstrap resample from a sample
• stdboot : Bootstrap estimate of the parameter standard deviation
• stdjack : Jackknife estimate of the standard deviation of a parameter estimate

Miscellaneous:

• zscore : removes the mean and normalizes the data to a variance of 1
• zScoreMedian : removes the median and standardizes by the 1.483*median absolute deviation
• kappa : estimates Cohen's kappa coefficient and related statistics
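The two standardization functions listed under Miscellaneous can be sketched as follows, in Python rather than Scilab. The function names are hypothetical, and the use of the sample standard deviation (n - 1 normalization) in `zscore` is an assumption.

```python
import statistics

def zscore(x):
    """Remove the mean and scale to unit (sample) variance."""
    m = statistics.mean(x)
    s = statistics.stdev(x)  # sample standard deviation, n - 1 normalization
    return [(v - m) / s for v in x]

def zscore_median(x):
    """Remove the median and scale by 1.483 times the median absolute deviation."""
    med = statistics.median(x)
    mad = statistics.median([abs(v - med) for v in x])
    return [(v - med) / (1.483 * mad) for v in x]

data = [1.0, 2.0, 3.0, 4.0, 5.0]
print(zscore(data))
print(zscore_median(data))
```

The median-based variant is less sensitive to outliers, since neither the median nor the median absolute deviation is dominated by extreme values.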

### Hypothesis testing : hypt

In this section, we present functions which compute tests, confidence intervals and model estimation.

One of the principles of this toolbox is to be compatible with MATLAB, especially the Hypothesis Testing sub-module:

This section is a *DRAFT*.

High Priority List:

Low Priority List:

Excluded:

The following functions will not be part of the hypt toolbox:

• lsfit : Fit a multiple regression normal model. This function is more appropriate in the "regression" toolbox.
• lsselect : Select a predictor subset for regression. This function is more appropriate in the "regression" toolbox.

## TODO

Design:

• Find a toolbox name for "Descriptive statistics"
• Find a toolbox name for "Hypothesis testing" (maybe hypt?)
• Rename the functions to match Matlab
• Insert a prefix to avoid name conflicts
• Indicate potential implementations

Development:

• Create the associated missing projects on the Forge
• Write unit tests (R could be used as a reference)
• Create demos
• Create help pages for each function

## Authors

• 2012-2013, Michaël Baudin
• 2012, Holger Nahrstaedt

2022-09-08 09:27