Shashank Sahni - Developing Accurate and Portable Elementary Functions for Scilab
Introduction
This is the workbench page for the project: https://scilab.gitlab.io/legacy_wiki/Contributor%20-%20APEF
Description of the Project
Scilab is a numerical computing package that provides a high-level, numerically oriented programming language for numerical calculations, simulations and scientific computing. Most of these calculations require high precision. The current elementary functions (sin, cos, pow, etc.) depend on the ones provided by the math library of each platform. On Linux this is libm from the GNU C Library (glibc), and on Windows it is the math library of the Microsoft Visual C compiler (or the Intel compiler). This project aims to provide a common library that ensures the same results, within acceptable variation, across platforms. We’ll use external math libraries that claim to be precise and free of known bugs. The task is to provide a link between the Scilab functions and the corresponding functions of the external library. This will require rewriting some functions and modifying the current implementations.
Technical Details
The libraries proposed by the organization are -
- fdlibm
- libmcr
- crlibm
I have decided to go with fdlibm. The choice was initially driven by the plan to use only one library to provide all the functions. This keeps the development simple and ensures the same results on every platform. Other factors include -
- fdlibm provides a large number of functions and can be used to port almost all elementary math functions provided by Scilab.
- fdlibm complies with multiple standards: IEEE, POSIX/ANSI, SVID and X/OPEN. Using this feature, an option could be added to return the result according to a chosen standard. This will be decided after studying the requirements of users and the current policy of Scilab.
- fdlibm is used by Matlab to provide trigonometric functions. This is a vital point, because it shows the library is already used successfully in another numerical computation package.
- libmcr is distributed as source that can be ported to multiple machines. I preferred fdlibm because it has no building or porting requirements. In addition, libmcr does not come close to the number of functions provided by fdlibm.
- crlibm is a well-tested library and supports all the functions that require high precision and are known to be buggy. But because fdlibm provides more functions, complies with multiple standards and is already used in a numerical computing package, I preferred fdlibm.
If, during implementation or testing, some functions in fdlibm are found to be buggy or to give unacceptable variation in results across platforms, we can either drop fdlibm or use crlibm to provide the missing support. The aim is computational precision, which will always take priority over ease of development and maintenance.
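This page does not define what counts as an acceptable variation. One common way to express it, assumed here purely for illustration, is the distance between two results in units in the last place (ULPs). The sketch below is in Python rather than Scilab or C, and the names ordered_bits, ulp_distance and within_tolerance are hypothetical helpers, not part of fdlibm or Scilab.

```python
import math
import struct


def ordered_bits(x):
    """Map a double to an integer that increases monotonically with x."""
    (bits,) = struct.unpack(">q", struct.pack(">d", x))
    # Doubles are stored as sign-magnitude; mirror the negative half so that
    # integer ordering matches floating-point ordering (and -0.0 maps to 0).
    return bits if bits >= 0 else -(bits & 0x7FFFFFFFFFFFFFFF)


def ulp_distance(a, b):
    """Number of representable doubles between a and b (0 if identical)."""
    if math.isnan(a) or math.isnan(b):
        raise ValueError("ULP distance is undefined for NaN")
    return abs(ordered_bits(a) - ordered_bits(b))


def within_tolerance(computed, reference, max_ulps=1):
    """True if computed is at most max_ulps away from reference.

    max_ulps=1 is an assumed threshold, not a value taken from this project.
    """
    return ulp_distance(computed, reference) <= max_ulps
```

For example, ulp_distance(1.0, 1.0 + 2**-52) is 1 because the two values are adjacent doubles. The threshold that makes a variation "acceptable" for each function would remain a project decision.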
Programming Languages
Almost all the functions that require change are provided by the elementary_functions module. The majority of those functions are written in Fortran. Because I have little experience with Fortran and fdlibm is written in C, I’ll rewrite the Fortran portion of the module in C. This will require changes in the Scilab gateway functions, but it will make the code more flexible: features such as the multi-standard option could be added easily. It will also lead to the French documentation in each Fortran source file being translated to English. This project only requires adding a common math library to Scilab, so no additional features are defined or needed for now.
Timeline
- April 24 - April 30 (1 week) - This period covers studying the Scilab API, which is required for writing the gateway functions.
- May 1 - May 23 (3 weeks, before the official coding period) - This period will involve working closely with users and developers to come up with a final list of functions that need to be linked to the fdlibm library. The interaction here could be a deciding factor in whether one or more libraries are included.
- May 23 - June 30 (4-5 weeks) - This period will be a complete coding phase. I plan to have rewritten and linked all the functions (listed in the previous phase) by the end of it. I'll keep performing basic tests of the functions during development, but complete testing (by users and other developers) won’t be possible while the module is unfinished.
- July 1 - July 15 (2 weeks, before the mid-term evaluation) - At the beginning of this phase, I should have a working module. I’ll work with other developers and users to test the functions and ensure that the variation in results (if any) across platforms is acceptable. Testing this early is required because, if something is found to be wrong with the core math library, it can be fixed by incorporating another one, preferably crlibm. If things go as planned, I should have almost all the functions linked and thoroughly tested for correctness before the mid-term evaluation.
- July 16 - Aug 9 (2-3 weeks) - This will primarily be an extension for continued testing, but will also include documentation. I prefer documenting during development, so the developer documentation should be almost complete by this phase. Since we won’t be making any changes to the user functions unless a new feature is included, the user documentation shouldn’t be affected much.
Detailed list of tasks
Tasks | Description | Status | Output
Prioritize the functions to be included in scilab | Out of all the functions provided by fdlibm, we'll sort out the ones we plan to include in scilab | Done | A 1/2 page report, with 3 lists of functions.
Investigate the packaging of fdlibm | Since we are planning to use fdlibm as part of the project, we should know who the upstream developer is, i.e. the one responsible for its updates, releases etc. | Done |
Compilation | Scilab is a cross-platform application. This task deals with the compilation requirements of fdlibm on all platforms. | Done |
Accuracy Tests | To develop a suite of accuracy tests for the most common functions. | Done | A set of .tst unit test scripts (e.g. sin.tst)
Performance Tests | Due to the lack of standard benchmarks, the current plan is to measure the performance of the new math library by comparing the execution time and complexity of the new functions with the ones in the previous release. | Basic testing is done | A 1 page report, with 1 benchmark .sce script.
Porting the code from Fortran to C | Most of the functions in the elementary_functions module are written in Fortran. Because I have little experience with Fortran and fdlibm is written in C, I plan to rewrite the Fortran portion of the module in C. | Optional |
List of Functions to be changed (categorized by priority)
High priority | Lower priority | Lowest priority
sqrt | sinh, sinhm | frexp
pow | cosh, coshm | erf
exp | tanh, tanhm | gamma
log, log10 | asinh, asinhm | ceil
sin, sind, sinm | acosh, acoshm | floor
cos, cosd, cosm | atanh, atanhm | isNaN
tan, tand, tanm | |
asin, asind, asinm | |
acos, acosd, acosm | |
atan, atand, atanm | |
Investigating the packaging of fdlibm
Netlib
The major releases of fdlibm are 5.1, 5.2 and 5.3. Sources: fdlibm is always referred to as the package provided by the netlib library. Its source can be downloaded from the netlib repository and from validlab.
The readme of the 5.3 source mentions a contact email address, fdlibm-comments@sun.com, for sending comments and bug reports. Going by the date in fdlibm.h, v5.3 was probably released sometime in 2004, and it was ported to Java in 2009.
The netlib FAQ provides a lot of useful information.
List of maintainers: http://www.netlib.org/utk/icl/maintainers.html
Maintainers mailing list: netlib_maintainers@netlib.org
A .master file available in the package has the name and email ID of the developer, Kwok C Ng <kwok.ng@sun.com>. I sent a mail to fdlibm-comments enquiring about any announcement group or version management, if available. Here's the reply from David Hough:
fdlibm is not supported by Oracle. Nor is there any organized public maintenance or development effort elsewhere that we are aware of. Feel free to start one!
The mailing list numeric-interest@ucbtest.org could be used for preliminary discussions of such efforts. But that mailing list is almost inactive. Information is at http://mailman.oakapple.net/mailman/listinfo/numeric-interest
It is also distributed as a gnuwin32 package.
Sun
For the mathematical functions in Java, the StrictMath class is used. Its description says that, to help ensure the portability of programs, some functions in this class are required to produce the same results as certain published algorithms, which are available in the fdlibm package provided by the well-known network library netlib.
Developer (StrictMath): John D. Darcy
Developer discussion mailing list: jdk6-dev@openjdk.java.net
Regarding version management and current maintenance: OpenJDK’s StrictMath library seems to be the only one currently under maintenance. We also shouldn’t forget that Java has long been making use of fdlibm; a bug posted in 2004 stated this point. This shows that fdlibm has been treated as a standard reference across various releases. So, if we can’t find any development source of fdlibm to keep track of, following OpenJDK’s development would be a good choice.
Performance Tests
The accuracy tests showed that the fdlibm library is accurate enough. The next step is performance testing. I searched extensively for existing benchmarks for the performance of elementary functions, but almost all of them tested accuracy or compared performance across programming languages. Since we are primarily interested in comparing the performance of two versions of Scilab, I took a different approach. The accuracy test scripts that I included have a large number of data samples, but they only check accuracy and return pass or fail. I tweaked them to return nothing and measured the time they took to run on a regular Scilab build as well as on an fdlibm-integrated one.
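The measurement approach described above can be sketched as follows. This is a minimal illustration in Python, not the Scilab scripts actually used: time_run, percent_increase, reference_sin and candidate_sin are hypothetical names, and both stand-in functions simply call Python's math.sin, since the two Scilab builds cannot be reproduced here.

```python
import math
import time


def time_run(func, samples, repeats=5):
    """Best wall-clock time in seconds to apply func to every sample."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for x in samples:
            func(x)  # result discarded: only the elapsed time matters
        best = min(best, time.perf_counter() - start)
    return best


def percent_increase(t_new, t_ref):
    """Relative slowdown of t_new with respect to t_ref, in percent."""
    return (t_new - t_ref) / t_ref * 100.0


def reference_sin(x):
    # Hypothetical stand-in for the build that uses the platform libm.
    return math.sin(x)


def candidate_sin(x):
    # Hypothetical stand-in for the fdlibm-integrated build.
    return math.sin(x)


if __name__ == "__main__":
    samples = [i * 0.001 for i in range(10000)]
    t_ref = time_run(reference_sin, samples)
    t_new = time_run(candidate_sin, samples)
    print(f"increase: {percent_increase(t_new, t_ref):.1f}%")
```

Taking the best of several repeats is a design choice made here to reduce noise from other processes; the page does not say how many runs were timed. Applied to the acos measurements reported on this page (0.86 against 0.77), percent_increase gives about 11.7.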
Here are the results for each function.
Function | Time by scilab+fdlibm | Time by scilab+libm | Time difference | No. of input samples | % increase in time of fdlibm with reference to libm
acos | 0.86 | 0.77 | 0.09 | 104 | 11.7
asin | 6 | 5.3 | 0.7 | 728 | 13.2
atan | 46.86 | 41.37 | 5.49 | 5616 | 13.2
cosh | 4.43 | 3.89 | 0.54 | 536 | 13.9
sinh | 3.37 | 2.96 | 0.41 | 407 | 13.9
cos | 91.16 | 80.67 | 11.49 | 10783 | 14.24
sin | 88.91 | 78.65 | 10.26 | 10526 | 13
tan | 47.75 | 42.19 | 5.56 | 5716 | 13.2
log | 9.14 | 8.07 | 1.07 | 1109 | 13.3
log10 | 0.43 | 0.36 | 0.07 | 52 | 19.4
exp | 22.85 | 20.17 | 2.68 | 2757 | 13.3
pow | 167.92 | 148.27 | 19.65 | 10000 | 13.43
The time taken by each run is the sum of three steps:
matrix creation + conversion with hex2dec + the elementary function's operation time.
My understanding is that the majority of this time is taken by the second step. Assuming that step costs the same in both versions, the time difference comes from the elementary functions' operation time alone.
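To check which of the three steps dominates, each one can be timed separately. The sketch below is a Python illustration under stated assumptions: it assumes the samples are stored as 16-digit hexadecimal IEEE 754 bit patterns, uses struct decoding as a stand-in for the hex2dec conversion step, and introduces the hypothetical helpers hex_to_double, timed and profile_phases.

```python
import math
import struct
import time


def hex_to_double(hex_str):
    """Decode a 16-digit big-endian IEEE 754 hex string into a float."""
    return struct.unpack(">d", bytes.fromhex(hex_str))[0]


def timed(fn):
    """Run fn() and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


def profile_phases(hex_samples, func):
    """Time the three steps separately; returns (timings, results)."""
    # Step 1: build the input container (stands in for matrix creation).
    matrix, t_create = timed(lambda: list(hex_samples))
    # Step 2: decode each hex string (stands in for the hex2dec conversion).
    values, t_convert = timed(lambda: [hex_to_double(h) for h in matrix])
    # Step 3: evaluate the elementary function on every decoded value.
    results, t_eval = timed(lambda: [func(v) for v in values])
    timings = {"create": t_create, "convert": t_convert, "evaluate": t_eval}
    return timings, results


if __name__ == "__main__":
    samples = ["3ff0000000000000", "4000000000000000"] * 5000
    timings, _ = profile_phases(samples, math.cos)
    for phase, seconds in timings.items():
        print(f"{phase}: {seconds:.6f} s")
```

If the conversion step does dominate and costs the same in both builds, a percentage increase computed on the total time is smaller than the relative slowdown of the elementary function itself, so timing the third step alone would give a sharper comparison.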
Intermediate Reports
https://scilab.gitlab.io/legacy_wiki/Contributor-APEF-Detail/report-2011-06-04
https://scilab.gitlab.io/legacy_wiki/Contributor-APEF-Detail/report-2011-06-17
https://scilab.gitlab.io/legacy_wiki/Contributor-APEF-Detail/report-2011-07-01
https://scilab.gitlab.io/legacy_wiki/Contributor-APEF-Detail/report-2011-07-15
https://scilab.gitlab.io/legacy_wiki/Contributor-APEF-Detail/report-2011-07-29
https://scilab.gitlab.io/legacy_wiki/Contributor-APEF-Detail/report-2011-08-21