Contributor - Scilab UTF-8 Internal storage

Description

Scilab 6 currently relies heavily on wchar_t* to store strings internally and to pass them across modules. Using UTF-8 encoded char* would slightly reduce memory usage in the average case. Work has already started on a dedicated branch, but more is needed to cover all the modules.

Ideas

Rationale: http://lucumr.pocoo.org/2014/1/9/ucs-vs-utf8/

Using a UTF-8 encoding scheme avoids converting strings when calling C-style APIs (which usually take char*). It also reduces memory consumption at parsing time, since all expressions (AST nodes) are currently stored as std::wstring as well. At execution time, String values will consume less memory, but some string operations may become more CPU-intensive (e.g. locating a character by code point index rather than by byte position).
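The extra CPU cost of indexing can be illustrated with a minimal sketch. It assumes well-formed UTF-8 input, and the helper names (is_continuation, utf8_length, utf8_byte_offset) are hypothetical, not part of the Scilab code base. Both operations have to scan the string, whereas they are constant-time on a fixed-width 32-bit wchar_t buffer.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helpers for illustration only; input is assumed to be
// well-formed UTF-8.

// True when the byte is a UTF-8 continuation byte (bit pattern 10xxxxxx).
inline bool is_continuation(unsigned char byte) {
    return (byte & 0xC0) == 0x80;
}

// Number of code points: count every byte that starts a sequence. O(n).
std::size_t utf8_length(const std::string& s) {
    std::size_t count = 0;
    for (unsigned char byte : s) {
        if (!is_continuation(byte)) {
            ++count;
        }
    }
    return count;
}

// Byte offset of the code point at 0-based position `index`.
// Returns s.size() when `index` is past the last code point. O(n).
std::size_t utf8_byte_offset(const std::string& s, std::size_t index) {
    std::size_t seen = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (!is_continuation(static_cast<unsigned char>(s[i]))) {
            if (seen == index) {
                return i;
            }
            ++seen;
        }
    }
    return s.size();
}
```

For the string "café!" encoded as UTF-8, s.size() is 6 bytes while utf8_length(s) is 5 code points, and the "!" at code point index 4 sits at byte offset 5.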

Note: wchar_t is a 32-bit type on Linux systems; on Windows, using wchar_t is more acceptable, as it is only 16 bits wide there.
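To make the size difference concrete, here is a small sketch comparing the character payload of the two representations. The function names are hypothetical, and only the character data is counted (no terminator or allocator overhead).

```cpp
#include <cstddef>
#include <string>

// Hypothetical helpers for illustration only.

// Payload size in bytes of a UTF-8 encoded std::string.
std::size_t narrow_payload_bytes(const std::string& s) {
    return s.size() * sizeof(char);
}

// Payload size in bytes of a std::wstring. sizeof(wchar_t) is
// implementation-defined: typically 4 on Linux and 2 on Windows.
std::size_t wide_payload_bytes(const std::wstring& s) {
    return s.size() * sizeof(wchar_t);
}
```

For the ASCII expression x = 1 + 2 (9 characters), the UTF-8 payload is 9 bytes, while the wide payload is 36 bytes where wchar_t is 32 bits wide and 18 bytes where it is 16 bits wide.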
