CUDA implementation of numerical algorithms
To benefit from recent GPGPU technology, which allows significant speedups in numerical computation, this proposal aims to port the main SciLab algorithms (LU decomposition, FFT, eigenvalue computation...) to the CUDA architecture, so that the software can use the GPU for its computations.
Various details:
- A new type, "GPU Matrix", will be introduced for optimised interaction with the GPU.
- As there will be a new type, the "elementary" operations will be implemented:
  - toGPU (transfer a matrix from host memory to GPU memory)
  - fromGPU (transfer a matrix from GPU memory to host memory)
  The user will then be able to build GPU matrices with these operations. This should take about a month.
- The following SciLab functions will be "ported" under a gpu prefix: a new toolbox will ship with the functions gpulu, gpuinv, gpudet, gpuqr...
  This will rely extensively on the CUBLAS library (proven to work with SciLab at the time of writing) and on reusing the existing Fortran code, with workarounds where necessary (for instance, a value stored in GPU memory cannot be used directly from host code). As CUBLAS, Fortran and SciLab all work with column-major matrices, the task will be a little easier.
- Some functions from the FFTW library will be ported; at the time of writing, some investigation is still needed before a schedule can be drawn up.
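The column-major convention shared by CUBLAS, Fortran and SciLab can be illustrated with a minimal sketch in plain C. The helper names `cm_index` and `row_major_to_col_major` are hypothetical and are not part of SciLab or CUBLAS; they only show how an element (row, col) maps to a flat buffer, and the kind of reordering step that is avoided when every layer already agrees on the layout.

```c
#include <stddef.h>

/* Flat index of the zero-based element (row, col) in a column-major matrix.
   ld is the leading dimension, i.e. the number of rows stored per column. */
size_t cm_index(size_t row, size_t col, size_t ld)
{
    return row + col * ld;
}

/* Copy a rows x cols matrix from row-major storage (the usual C layout)
   into column-major storage (the Fortran layout). This is the extra pass
   that would be needed if the host code and the GPU library disagreed
   on the layout. src and dst must not overlap. */
void row_major_to_col_major(const double *src, double *dst,
                            size_t rows, size_t cols)
{
    for (size_t i = 0; i < rows; ++i) {
        for (size_t j = 0; j < cols; ++j) {
            dst[cm_index(i, j, rows)] = src[i * cols + j];
        }
    }
}
```

For example, the 2x3 matrix [1 2 3; 4 5 6] is stored column by column as 1, 4, 2, 5, 3, 6, so a buffer in this layout could be handed to a column-major library without being reordered first.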
Limitations (at least for the first release):
- Only one GPU per SciLab process will be supported* (i.e. if you have 2 GPUs, only one will be used if you launch a single instance of SciLab, and both will be used if you launch SciLab twice).
- As double is the "core" precision of SciLab, every function will work in double precision.
- The minimum targeted CUDA version (compute capability) is 1.1 (G92, i.e. from the GeForce 8400 onwards, with the notable exceptions of the 8800 GTS and 8800 GTX).
- No special optimisation for integrated chipsets (i.e. no ZeroCopy feature).
Student timeline
See here for the Google Summer of Code planning.
*If everything works as planned...