Prateek Papriwal - Developing Accurate Probability Distribution Functions
Description
This is the report of Prateek Papriwal for GSoC 2012 on the project "Distribution functions".
Scilab is free, open-source software for numerical computation that provides a powerful computing environment for engineering and scientific applications. The list of distribution functions currently implemented in Scilab is very small compared to that of Matlab. My proposal is to add more Matlab-like PDFs, CDFs, inverse CDFs and random number generators. These additions would extend the functionality of Scilab's distribution functions toolbox.
Deliverables
The distributions already included in Scilab are Beta, Exponential, Gamma, LogNormal, Normal and Uniform. The current toolbox comprises the apifun module, the assert module, the content of the help pages and the content of the tests.
The following distributions (with their PDFs, CDFs, inverse CDFs and random number generators) have been implemented in Matlab but not in Scilab: Binomial, Chi-square, Copula, Hypergeometric, Rayleigh, Weibull, Multinomial, Extreme Value, F, Student's t and Geometric.
My primary aim, however, is to implement the Binomial, Geometric, Hypergeometric, Chi-square, Weibull, F and Student's t distributions. Each of these additions is independent of the others: the macros (.sci), unit tests (.tst) and help pages (.xml) of one distribution do not depend on those of another. Adding these distributions would improve the functionality of the statistics module.
For each distribution function we will have:
-> the probability distribution function (PDF)
-> the cumulative distribution function (CDF)
-> the inverse CDF
-> the random number generator
-> the statistics (mean and variance)
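As an illustration of these five pieces, here is a minimal Python sketch for the geometric distribution. It is not the Scilab code of the distfun module: the function names (`geo_pdf`, `geo_cdf`, `geo_inv`, `geo_rnd`, `geo_stat`) are hypothetical, and it assumes the convention in which X counts the failures before the first success (X = 0, 1, 2, ...) with 0 < p < 1.

```python
import math
import random


def _check_p(p):
    # Assumption of this sketch: success probability strictly inside (0, 1).
    if not (0.0 < p < 1.0):
        raise ValueError("p must satisfy 0 < p < 1")


def geo_pdf(x, p):
    """P(X = x): probability of x failures before the first success."""
    _check_p(p)
    if x < 0 or x != int(x):
        return 0.0
    return p * (1.0 - p) ** x


def geo_cdf(x, p):
    """P(X <= x) = 1 - (1 - p)^(floor(x) + 1)."""
    _check_p(p)
    if x < 0:
        return 0.0
    # expm1/log1p avoid cancellation when p is very small.
    return -math.expm1((math.floor(x) + 1) * math.log1p(-p))


def geo_inv(u, p):
    """Smallest integer x >= 0 such that geo_cdf(x, p) >= u."""
    _check_p(p)
    if not (0.0 <= u <= 1.0):
        raise ValueError("u must satisfy 0 <= u <= 1")
    if u == 0.0:
        return 0
    if u == 1.0:
        return math.inf
    x = max(math.ceil(math.log1p(-u) / math.log1p(-p)) - 1, 0)
    # Guard against rounding in the closed-form estimate.
    while x > 0 and geo_cdf(x - 1, p) >= u:
        x -= 1
    while geo_cdf(x, p) < u:
        x += 1
    return x


def geo_rnd(p, rng=random):
    """One random variate by inversion of a uniform number in [0, 1)."""
    return geo_inv(rng.random(), p)


def geo_stat(p):
    """Return (mean, variance) = ((1 - p) / p, (1 - p) / p^2)."""
    _check_p(p)
    return (1.0 - p) / p, (1.0 - p) / p ** 2
```

The inverse CDF is written as a closed-form estimate followed by a small correction loop, so that it stays consistent with `geo_cdf` even when the logarithm ratio lands very close to an integer.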
Timeline
I would adopt the strategy of "Test Driven Development" for the implementation:
--> Write a draft of the unit tests.
--> Write a draft of the help.
--> Code (macros (.sci), C sources in src).
--> Update the tests accordingly (the accuracy tests check that the results are accurate to 13 to 15 significant digits).
--> Update the help accordingly.
--> Recode.
--> Documentation.
This strategy will be followed for the implementation of each distribution function.
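The accuracy criterion in the tests can be made concrete by counting the significant digits shared by a computed value and a reference value. The following Python sketch is one possible way to do this; the helper names `computed_digits` and `check_accuracy` are hypothetical and are not taken from the Scilab assert module.

```python
import math

# Upper bound on the digits of an IEEE double: -log10(2^-53), about 15.95.
MAX_DIGITS = -math.log10(2.0 ** -53)


def computed_digits(computed, expected):
    """Estimate the number of correct significant decimal digits.

    Uses d = -log10(|computed - expected| / |expected|), clipped to
    the interval [0, MAX_DIGITS].
    """
    if math.isnan(computed) or math.isnan(expected):
        return 0.0
    if computed == expected:
        return MAX_DIGITS
    if expected == 0.0:
        # Relative error is undefined; report no correct digits.
        return 0.0
    rel_err = abs(computed - expected) / abs(expected)
    return max(0.0, min(MAX_DIGITS, -math.log10(rel_err)))


def check_accuracy(computed, expected, min_digits=13):
    """Raise AssertionError if fewer than min_digits digits are correct."""
    digits = computed_digits(computed, expected)
    if digits < min_digits:
        raise AssertionError(
            "only %.1f correct digits, expected at least %d"
            % (digits, min_digits)
        )
    return digits


# Example: geometric PDF at x = 3, p = 0.1, whose exact value is 0.0729.
check_accuracy(0.1 * 0.9 ** 3, 0.0729, min_digits=13)
```

A test written this way fails as soon as a change in the implementation costs more than a couple of digits, which is the point of fixing the threshold at 13 before the code is written.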
Apifun Module -- The goal of this toolbox is to check the input arguments of macros (.sci). For example, it checks whether the number of input arguments provided by the user is consistent with the number of expected arguments.
Assert Module -- The goal of this toolbox is to provide functions that make testing easier. The functions of the assert module are designed to be used in Scilab unit test files (.tst files).
Structure of Toolbox -- The toolbox has the following components: benchmark, demos, doc, etc, help, macros, sci_gateway, src, tests, builder.sce, changelog.txt, license.txt, readme.txt.
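The kind of argument checking described for the apifun module can be sketched in a few lines of Python. This only illustrates the idea: `check_arg_count`, `check_probability` and `geo_mean` are hypothetical names, and the error messages are not those of apifun.

```python
def check_arg_count(func_name, n_given, allowed):
    """Fail if the number of arguments given is not one of the allowed counts."""
    if n_given not in allowed:
        raise TypeError(
            "%s: wrong number of input arguments: %d given, expected %s"
            % (func_name, n_given, sorted(allowed))
        )


def check_probability(func_name, value, arg_name, position):
    """Fail if the argument is not a real number with 0 < value <= 1."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError(
            "%s: argument #%d (%s) must be a real number"
            % (func_name, position, arg_name)
        )
    if not (0.0 < value <= 1.0):
        raise ValueError(
            "%s: argument #%d (%s) must be in (0, 1], got %r"
            % (func_name, position, arg_name, value)
        )


def geo_mean(*args):
    """Mean of the geometric distribution, with checked arguments."""
    check_arg_count("geo_mean", len(args), {1})
    (p,) = args
    check_probability("geo_mean", p, "p", 1)
    return (1.0 - p) / p
```

Keeping the checks in shared helpers means every macro reports wrong arguments in the same way, and the numerical code that follows can assume valid inputs.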
Task List
| Week | Tasks | Description | Status | Results |
| 8th-13th May | ATOMS functioning | Got acquainted with several functionalities of ATOMS while installing and loading several modules | Done | Bug[1] found in distfun module |
| | distfun module on forge | Went through the coding style of the existing functions in the distfun module | Done | |
| | Binomial distribution | Strengthened my theoretical knowledge of the binomial distribution | Done | |
| 14th-20th May | Version control system | Installed the svn client TortoiseSVN | Done | |
| | svn checkout | Checked out the distfun module with svn into my local directory | Done | |
| | Visual Studio | Installed Visual Studio for building the distfun module from the sources; compiled a debug version of the distfun module | Done | Compiling environment established |
| | Linking Scilab to Visual Studio | To enable debugging functionality | Done | |
| 21st-27th May | Patching Bug[1] / Bug 11127 | Corrected the compiling errors present in src/cdflib.c, src/gwsupport.c, src/genrand.c, src/unifrng.c | | Script builder.sce ran, but loader.sce still generates an error (Bug 11127) |
| | Geometric distribution implementation | Implemented the geometric probability density, cumulative density, mean and variance, and inverse CDF | Done | |
| 28th May-3rd June | Geometric distribution implementation | Geometric random generator; unit tests implemented and their refs created; dataset written in .csv format | Done | All the tests passed |
| | Documentation | Documentation of the geometric distribution functions | Done | |
| 4th-10th June | Addition of LaTeX and examples | Added LaTeX and some more examples to the documentation of the macros | Done | |
| | Help pages | Created .xml files (help pages) with the help of the help_from_sci() function | Done | |
| | Benchmarks | Created the geom.r file with the R software to check the accuracy of the geometric distribution | Done | |
| | Bug 11127 | Bug fixed. The distfun module now runs smoothly on Linux as well | Done | The error in the loader.sce script is also fixed |
| 11th-17th June | Small updates in geometric distribution | Updated the examples and hence the help pages; updated readme.txt as well | Done | |
| | Added new version | Added version 0.4-1 of the distfun module on ATOMS | Done | |
| | Bug 11127 | In version 0.4-1, bug 11127 does not exist. | Done | Bug 11127 solved |
| | Issue 760 | The inverse beta CDF computes a wrong result on Linux 32 bits. | Not done | Issue 760 - http://forge.scilab.org/index.php/p/distfun/issues/760/ |
| 18th-24th June | Binomial distribution | Added macros and corresponding tests of the binomial distribution functions | Done | |
| | Addition of more tests | Added .csv files and the binom.r benchmark for accuracy and compatibility | Done | |
| | Help pages | Added help pages for the binomial distribution macros | Done | |
| ... | ... | ... | ... | ... |
| | Implementing binomial distribution | Will be updated | Not done | |
| | Implementing hypergeometric distribution | Will be updated | Not done | |
| | Implementing chi-square distribution | Will be updated | Not done | |
| | Implementing binomial distribution | Will be updated | Not done | |
| | Implementing Student's distribution | Will be updated | Not done | |
| | Implementing F distribution | Will be updated | Not done | |
Links
[1] While loading the distfun module in Scilab on 32-bit Linux, an error pops up:
-->atomsLoad('distfun');
Start Distfun
Load macros
Load gateways
atomsLoad: An error occurred while loading 'distfun-0.2-1':
link: The shared archive was not loaded: /home/hp/scilab-5.3.3/share/scilab/contrib/distfun/0.2-1/sci_gateway/c//../../src/c/libdistfun_c.so: cannot open shared object file: No such file or directory
!--error 10000
at line 337 of function atomsLoad called by :
atomsLoad('distfun');
This error suggests that the libdistfun_c.so file is missing.
The above bug has been resolved in the new version 0.4 of the distfun module.
Weekly reports
Source Code of Other Implementations
Implementation proposals
Final Report
http://www.google-melange.com/gsoc/project/google/gsoc2012/papriwalprateek/32002