Design of experiments
Introduction
Design of experiments (DoE) is a set of tools dedicated to producing a good set of experiments. These experiments (a set of tuning parameters of an engine, for example) are then applied to a system and the output of the system is measured. The experiments and the corresponding measurements are then used to build a statistical model of the system:
- we can use a polynomial to model the input/output relationship
- we can use neural networks
- we can use Kriging and other techniques
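As an illustration of the polynomial case, the sketch below fits the model y = b0 + b1*x1 + b2*x2 + b12*x1*x2 to a two-level full factorial design in coded units (-1/+1). It is written in Python for illustration only; the function names and the measured values are hypothetical and are not part of scidoe. Because the model columns of such a design are orthogonal, each least-squares coefficient reduces to a signed average of the outputs.

```python
# Coded two-level design for two factors (each row is one experiment).
design = [(-1, -1), (1, -1), (-1, 1), (1, 1)]
# Hypothetical measured outputs, one per experiment.
y = [20.0, 30.0, 40.0, 52.0]


def fit_two_factor_model(design, y):
    """Least-squares coefficients of y = b0 + b1*x1 + b2*x2 + b12*x1*x2.

    Assumes a full two-level design in coded units (-1/+1): the model
    columns are then orthogonal, so each coefficient is a signed average.
    """
    n = len(y)
    columns = {
        "b0": [1 for _ in design],
        "b1": [x1 for x1, _ in design],
        "b2": [x2 for _, x2 in design],
        "b12": [x1 * x2 for x1, x2 in design],
    }
    return {
        name: sum(c * yi for c, yi in zip(col, y)) / n
        for name, col in columns.items()
    }


def predict(coef, x1, x2):
    """Evaluate the fitted model at the coded point (x1, x2)."""
    return coef["b0"] + coef["b1"] * x1 + coef["b2"] * x2 + coef["b12"] * x1 * x2


coef = fit_two_factor_model(design, y)
print(coef)  # {'b0': 35.5, 'b1': 5.5, 'b2': 10.5, 'b12': 0.5}
```

With four experiments and four coefficients the model is saturated, so it reproduces the measurements exactly; a real study would add replicates or centre points to estimate the noise.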
Design of experiments is widely used in industry (in the automotive industry, for example, DoE is used to tune a diesel engine, to find the best size of a bar, etc.).
GSOC 2012
During GSoC 2012, Maria Christopoulou chose this project:
https://scilab.gitlab.io/legacy_wiki/Contributor-DoE-GSOC2012
This work was mentored by Michael Baudin and Yann Collette.
We chose to update a former project of Yann Collette by adding new features, improved implementations, new unit tests and new help pages. The goal was to provide the same functions as in Matlab. The first version (0.1) of the "scidoe" project was released on 6 August 2012 on ATOMS, with 14 functions:
- scidoe_fullfact: A full factorial design
- scidoe_ff2n: A full factorial design with 2 levels
- scidoe_bbdesign: Box-Behnken design
- scidoe_simplelinreg : univariate linear regression
- scidoe_multilinreg : multivariate linear regression
- scidoe_regress : multiple linear regression
- scidoe_regressprint : print regression summary
- scidoe_yates: Yates Algorithm
- scidoe_ryates: Inverse Yates Algorithm
- scidoe_compare : The default comparison function used in the sort-merge.
- scidoe_plotcube : Plots a d-dimensional cube in [-1,1].
- scidoe_sort : A flexible sorting function.
- scidoe_sortdesign : Sort the experiments of a design of experiments
- scidoe_string : Sort the experiments of a design of experiments
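To make the factorial designs in this list concrete, here is a minimal Python sketch of a full factorial design and of its two-level special case. The function names, the level numbering (starting at 1) and the run ordering (first factor varies fastest) are assumptions made for this example; they are not taken from the scidoe sources and may differ from what scidoe_fullfact and scidoe_ff2n return.

```python
from itertools import product


def full_factorial(levels):
    """All level combinations; levels[k] is the number of levels of factor k.

    Levels are numbered from 1 and the first factor varies fastest
    (both are conventions chosen for this sketch).
    """
    ranges = [range(1, n + 1) for n in reversed(levels)]
    return [tuple(reversed(combo)) for combo in product(*ranges)]


def two_level_factorial(n_factors):
    """Two-level full factorial for n_factors factors, coded as -1 / +1."""
    return [
        tuple(2 * level - 3 for level in run)  # level 1 -> -1, level 2 -> +1
        for run in full_factorial([2] * n_factors)
    ]


print(full_factorial([2, 3]))
print(two_level_factorial(2))
```

A design with levels [2, 3] has 2 * 3 = 6 runs, and a two-level design with n factors has 2^n runs, which is why fractional designs become attractive as n grows.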
The second version of the "scidoe" toolbox (v0.2) was released on 15 February 2013, with 5 new functions:
- scidoe_ccdesign: A Central Composite Design of Experiments
- scidoe_fracfact: Fractional Factorial Design
- scidoe_lhsdesign: Latin Hypercube Sampling
- scidoe_star: Produces a star point design of experiments
- scidoe_pdist: Pairwise point distances of a matrix
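The following Python sketch illustrates two of the ideas in this list: Latin hypercube sampling and pairwise point distances. It is a basic random Latin hypercube on the unit cube, with Euclidean distances between all pairs of points; the function names are hypothetical, and the options offered by scidoe_lhsdesign and scidoe_pdist (criteria, distance types) are not reproduced here.

```python
import math
import random


def lhs_design(n_points, n_dims, rng=None):
    """Random Latin hypercube sample of n_points in [0, 1)^n_dims.

    Each axis is cut into n_points equal strata; every stratum receives
    exactly one point, at a uniformly random position inside it.
    """
    if rng is None:
        rng = random.Random()
    columns = []
    for _ in range(n_dims):
        strata = list(range(n_points))
        rng.shuffle(strata)  # independent random permutation per dimension
        columns.append([(s + rng.random()) / n_points for s in strata])
    return [tuple(col[i] for col in columns) for i in range(n_points)]


def pairwise_distances(points):
    """Euclidean distances between all pairs (i < j), in row-major order."""
    return [
        math.dist(p, q)
        for i, p in enumerate(points)
        for q in points[i + 1:]
    ]


points = lhs_design(5, 2, random.Random(0))
print(points)
print(min(pairwise_distances(points)))  # smallest gap between two points
```

The minimum pairwise distance is a common space-filling measure: among several candidate hypercubes, one can keep the design that maximizes it.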
What do we need in 2013
The following functions have been identified for a future project:
scidoe_pdist: A C-based implementation is required, see http://forge.scilab.org/index.php/p/scidoe/issues/879/
Optimal Designs
- scidoe_optdesign: Optimal design (a-optimal)
- scidoe_optdesign: Optimal Design based on a criterion.
- mtlb_doptdesign: Matlab compatible D-optimal Design
Supersaturated Designs
- scidoe_comp_ssd: Supersaturated Design ('a-optimal').
- scidoe_comp_ssd: Supersaturated Design based on a criterion.
Model Building
- scidoe_poly_model: Produces a polynomial model
- scidoe_model_select: Produces a new polynomial model using forward or backward selection.
- scidoe_plot_model: Plots regression line and residuals distribution
- scidoe_build_regression_matrix: Regression matrix of a model
- scidoe_var_regression_matrix: Regression matrix of the variance of a model
- scidoe_lars: Least Angle Regression or Lasso Regression
- scidoe_rsquared: R2 Computation
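As a hedged illustration of the model-building functions above, the Python sketch below fits a straight line by ordinary least squares and computes the coefficient of determination R2 = 1 - SS_res / SS_tot. The function names and the data are hypothetical and do not reflect the planned scidoe interfaces.

```python
def simple_linear_regression(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(y, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, y_pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical data: roughly linear with some scatter.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.9, 5.2, 6.8, 9.1, 11.0]
intercept, slope = simple_linear_regression(x, y)
y_pred = [intercept + slope * xi for xi in x]
print(intercept, slope, r_squared(y, y_pred))
```

An R2 close to 1 only says that the model explains the observed data well; it does not measure predictive quality, which is what cross validation is for.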
General Functions
- scidoe_unnorm_doe_matrix: Adjusts high and low values of a design to specified maximum and minimum values
- scidoe_comp_WD2_crit: Wrap-around L2 discrepancy criterion
- scidoe_comp_CL2_crit: Centered L2 discrepancy criterion
- scidoe_crossvalidate: K-fold cross validation
- scidoe_cvplot: Plots cross validation results
- scidoe_prbs: A pseudo random binary signal generator
- scidoe_merge: Merges two samples
- scidoe_diff: Computes the difference of two samples
- scidoe_scramble: Permutes a sample
- scidoe_standardize: Center and normalize a sample
- scidoe_normalize: Normalizes a sample
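The first item of this list (adjusting a design to specified minimum and maximum values) amounts to a linear change of variables. Here is a minimal Python sketch, assuming that the design is expressed in coded units in [-1, 1]; the function name and the factor ranges are hypothetical.

```python
def unnormalize(design, lower, upper):
    """Map coded values in [-1, 1] to [lower[k], upper[k]] for each factor k."""
    return [
        tuple(
            lo + (x + 1) * (hi - lo) / 2  # -1 -> lo, 0 -> midpoint, +1 -> hi
            for x, lo, hi in zip(run, lower, upper)
        )
        for run in design
    ]


# Two factors with hypothetical physical ranges [10, 20] and [100, 300].
coded = [(-1, -1), (1, 1), (0, 0)]
print(unnormalize(coded, lower=[10, 100], upper=[20, 300]))
# [(10.0, 100.0), (20.0, 300.0), (15.0, 200.0)]
```

Keeping the design in coded units and converting only when the experiments are run makes the same design reusable for factors with different physical ranges.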
Related Potential Projects
These projects are not DOE, but may be used in the context of DOE.
- Several linear regression methods (mostly the LASSO method, a penalized regression).
- The bootstrap method, to compute the sensitivity of a statistic to the data.
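The bootstrap idea can be sketched in a few lines of Python: resample the data with replacement many times, recompute the statistic on each resample, and use the spread of the results as an estimate of its variability. This is the basic nonparametric bootstrap; the function name, the default number of resamples and the data are assumptions made for the example.

```python
import random
import statistics


def bootstrap_std_error(sample, statistic, n_resamples=1000, rng=None):
    """Bootstrap estimate of the standard error of statistic(sample)."""
    if rng is None:
        rng = random.Random()
    n = len(sample)
    estimates = []
    for _ in range(n_resamples):
        # Draw n observations with replacement from the original sample.
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistic(resample))
    return statistics.stdev(estimates)


# Hypothetical measurements.
data = [2.1, 2.5, 1.9, 3.2, 2.8, 2.4, 3.0, 2.2]
se_mean = bootstrap_std_error(data, statistics.mean, 2000, random.Random(42))
print(se_mean)
```

For the mean, the result can be compared with the classical formula s / sqrt(n); the value of the bootstrap is that the same procedure works for statistics with no simple formula, such as a median or a regression coefficient.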
Sources of inspiration
There are a lot of implementations available on this topic.
Design of experiments tools are available in Matlab:
http://www.mathworks.com/help/toolbox/stats/fracfactgen.html
http://www.mathworks.fr/fr/help/stats/d-optimal-designs-1.html (D-optimal designs)
These are Scilab examples of implementations which could be adapted, documented and packaged easily:
http://lolimot.cvs.sourceforge.net/viewvc/lolimot/scilab/doe
http://forge.scilab.org/index.php/p/nisp/source/tree/HEAD/macros/nisp_buildlhs.sci
http://forge.scilab.org/index.php/p/nisp/source/tree/HEAD/src/src/cpp/nisp_gva.cpp
Some packages by Burkardt in Fortran or C++:
These are sources of inspiration:
the DAKOTA project
The R project is also a good source of ideas (for implementations of designs of experiments and for LASSO regression code). The page
http://cran.r-project.org/web/views/ExperimentalDesign.html
gives an overview of the experimental designs available in R.
There are good C++ implementations of cutting-edge statistical methods in the ROOT framework.
Some R packages provide designs of experiments:
DiceDesign: Designs of Computer Experiments, http://cran.r-project.org/web/packages/DiceDesign/index.html, http://cran.r-project.org/web/packages/DiceDesign/DiceDesign.pdf
FrF2: Fractional Factorial designs with 2-level factors, http://cran.r-project.org/web/packages/FrF2/index.html
DoE.base: Full factorials, orthogonal arrays and base utilities for DoE packages, http://cran.r-project.org/web/packages/DoE.base/index.html
DoE.wrapper: Wrapper package for design of experiments functionality, http://cran.r-project.org/web/packages/DoE.wrapper/index.html
Authors
- 2012 - 2013 - Michaël Baudin
- 2011 - DIGITEO - Michaël Baudin
- DIGITEO - Yann Collette