Contributor - LLVM
Description
The goal of this project is to convert the Scilab language into LLVM bitcode.
We want to use LLVM to Just-In-Time (JIT) compile some Scilab code (see [1] for and example of jit with LLVM). This should be a very rewarding approach because Scilab has many variables that are handled as matrices but are in fact 1 X 1 matrices that could be handled much more efficiently as simple numbers.
Note that this work must be done in the Scilab git branch called YaSp which will be the base of Scilab 6 as this branch features a clean API to access the Abstract Syntax Tree [2] of a Scilab program.
Amongst the challenges of a Scilab JIT, the following issues are noteworthy :
- Handling the dynamic typing of Scilab. Languages with dynamic typing such as lua (see [3] for lua jit with LLVM) are notoriously difficult to jit because the same code can have many different instantiations according to the types of the variables at runtime. Luckily, only jitting for matrices of doubles or complexes should be enough to cover 99.9% of the needs for performance with Scilab code.
- As-needed jit. The jit cost should only be incurred when it is less than the performance gain. Therefore, heuristics should be designed to guess with code is worth the jit compilation. It could even be possible to adjust the jit cost according to how much optimization is need (see [4] for a brief discussion on this subject).
Some help from the LLVM project could be provided in the scope of this project.
Ideas
The following strategy could be used :
- When entering a loop construct, one should guess if the loop will be taken enough times to justify the jiting effort.
- If the jiting effort is deemed worthwhile, we should then infer which of the variables will have types (and dimensions) for which jiting is possible (i.e. at first only doubles, then also complex numbers, vectors, matrices).
- We should then jit the expressions involving those variables. Extra care should be given to numerical accuracy (e.g. keep the same algorithms as in Scilab, see [5], forbid overaggressive optimizations that would not respect IEEE754, ...)
Links
[1] LLVM JIT tutorial example.
[2] Scilab new Abstract Syntax Tree API
[3] LLVM back-end for a dynamic type language (lua).
[4] discussion on the cost of LLVM JIT for python code.
[5] on the robustness of some Scilab numerical computations.