Larry O'Brien questions the assumption that multi-core processors and languages that can leverage them will necessarily lead to performance gains.
The theory is simple. The lack of side effects in functional programming naturally lends itself to parallelism. Of particular interest of late is the map function, in which a function is applied to each element of an array.
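For readers less familiar with the idiom, a minimal sketch of map in Haskell (the language choice is mine; the article doesn't name one):

```haskell
-- Apply a function to every element of a list.
-- 'map' is pure: computing one element cannot affect any other,
-- which is why the elements could, in principle, be computed in parallel.
squares :: [Int] -> [Int]
squares xs = map (\x -> x * x) xs

main :: IO ()
main = print (squares [1 .. 10])   -- [1,4,9,16,25,36,49,64,81,100]
```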
The optimist sees this and says, "Aha! The compiler can simply distribute these calculations to a thread pool and gain a performance advantage on a many-core machine." And this is true if (a) the mapped function f is quite expensive or (b) the array is quite large. Otherwise, the overhead of distributing the calculation across cores or processors can easily exceed the cost of performing the map "in core." In the worst case, when the function and data already sit in the initial core's cache, the performance hit from distributing the work would be very substantial.
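To make the trade-off concrete, here is a hedged sketch using GHC's parallel package (my choice of library; Larry doesn't point to a specific implementation). parMap sparks one unit of work per element, which only pays off when each application of f is expensive; parListChunk batches elements to amortize the distribution overhead the article warns about.

```haskell
import Control.Parallel.Strategies (parMap, parListChunk, rdeepseq, using)

-- A deliberately expensive function; if f were as cheap as (*2),
-- the cost of creating and scheduling the parallel work would dominate.
f :: Int -> Int
f x = sum [x * i | i <- [1 .. 100000]]

-- One spark per element: reasonable when f is costly or the list is long.
parAll :: [Int] -> [Int]
parAll = parMap rdeepseq f

-- Batch the list into chunks of 64 elements to amortize per-spark overhead.
parChunked :: [Int] -> [Int]
parChunked xs = map f xs `using` parListChunk 64 rdeepseq

main :: IO ()
main = print (sum (parChunked [1 .. 1000]))
-- Build with: ghc -threaded -O2 Main.hs; run with: ./Main +RTS -N
```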
As a historical comparison, Larry mentions the C/C++ inline keyword, which he claims was for the most part a disaster: "But most developers do a poor job estimating the benefit of the inline keyword. Because, just as distributing map can be counter-productive, inlined code can decrease performance (the on-chip caches of modern processors make code size and data locality very important to performance)." Larry leaves readers with a few questions:
- Will languages that promise that every call is distributed be sophisticated enough to overcome the performance issues of premature parallelization?
- Will languages that require the programmer to specify when parallelization occurs be a disaster in the hands of mainstream programmers, as inline was?
- Is there a hybrid approach that will solve both these problems?