Consider alternate environment object passing for speed
This would be a rather invasive change and is not urgent, but it would make an interesting experimental project.
Thread-local storage (TLS) access, or the pthread_getspecific(3) call if ECL was built without --with-__thread, incurs some overhead.
The extra ecl_process_env() call in every function could potentially be replaced by a function argument containing the environment, i.e.:
void
ecl_foobar(cl_env_ptr env, cl_index baz)
{
        /* use env directly instead of calling ecl_process_env() */
}
The stack usage shouldn't change much, since env would already be one of the function's local variables, holding the result of ecl_process_env().
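
To make the comparison concrete, here is a minimal standalone sketch of the two calling styles. It does not use ECL's headers: demo_env, demo_process_env() and the demo_foobar_* functions are hypothetical stand-ins, and the pthread key only imitates what ecl_process_env() might do in a build that relies on pthread_getspecific(3).

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for ECL's environment structure. */
struct demo_env {
    size_t nvalues;
};
typedef struct demo_env *demo_env_ptr;

static pthread_key_t demo_env_key;
static pthread_once_t demo_env_once = PTHREAD_ONCE_INIT;

static void demo_make_key(void)
{
    pthread_key_create(&demo_env_key, NULL);
}

/* Bind an environment to the calling thread (creates the key once). */
static void demo_set_env(demo_env_ptr env)
{
    pthread_once(&demo_env_once, demo_make_key);
    pthread_setspecific(demo_env_key, env);
}

/* Assumed analogue of ecl_process_env(): one lookup per call.
 * Only valid after demo_set_env() has run on this thread. */
static demo_env_ptr demo_process_env(void)
{
    return pthread_getspecific(demo_env_key);
}

/* Current style: the function fetches the environment itself. */
static size_t demo_foobar_lookup(size_t baz)
{
    demo_env_ptr env = demo_process_env();
    env->nvalues = baz;
    return env->nvalues;
}

/* Proposed style: the caller passes the environment down,
 * so no per-call lookup is needed. */
static size_t demo_foobar_passed(demo_env_ptr env, size_t baz)
{
    env->nvalues = baz;
    return env->nvalues;
}
```

In the second style, only the outermost entry point would need to perform the lookup; everything below it receives env as an ordinary argument. Whether this is measurably faster would have to be benchmarked, since the cost of a TLS access varies by platform and build configuration.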