Ruby 1.9 adds Fibers for lightweight concurrency

Threading in Ruby has been a topic of discussion for a long time. Whether future Ruby versions (1.9 and beyond) will use kernel threads instead of userspace threads is still to be decided. Recently, another approach to this set of problems has arrived in Ruby. David Flanagan points out a new feature in the Ruby 1.9 branch called Fibers:
Here's how to use the new Fiber class (warning: class name may change) to generate an infinite sequence of Fibonacci numbers. I use "generate" in the sense of Python's generators. Ruby's new fibers are "semi-coroutines".

1  fib = Fiber.new do
2    x, y = 0, 1
3    loop do
4      Fiber.yield y
5      x, y = y, x + y
6    end
7  end
8  20.times { puts fib.resume }
The code prints out the first 20 Fibonacci numbers. The concept it uses is called a Coroutine. Basically, calling Fiber.yield stops ("suspends") execution of this code (don't confuse it with the yield keyword used for executing blocks). If you're familiar with debuggers, imagine hitting "suspend" on a thread or seeing the thread hit a breakpoint. fib is a handle to this Fiber and can be used to manipulate it. The fib.resume call in line 8 does exactly what it says: the first call starts the Fiber's code from the top, and every later call resumes execution with the statement after the Fiber.yield call.

Line 4 shows Fiber.yield called with a parameter y. In a way, this can be thought of as similar to return y. The difference between a subroutine's return and a coroutine's Fiber.yield is what happens to the context of the code after the call. return means that, say, a function call's activation frame (or stack frame) is deallocated, which means all local variables are gone. With a Coroutine, yielding keeps the activation frame and all the data therein alive, so the code can use it when it's resumed later.

Now it becomes clear how the code works: it's just a loop that iteratively calculates one Fibonacci number after the other. Once it's done with one, it suspends itself, handing control back to the code that resumed it. When that code wants the next number in the sequence, it simply resumes the Fibonacci code, which runs another iteration and then hands off control (and the next Fibonacci number) by suspending itself with Fiber.yield.
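The difference between return and Fiber.yield described above can be seen in a small sketch. The names next_count and counter are made up for illustration and are not part of the original example; the only Fiber calls used are Fiber.new, Fiber.yield and resume.

```ruby
# A plain method: its activation frame is thrown away on every return,
# so the local variable starts from scratch on each call.
def next_count
  count = 0
  count += 1
  return count
end

# A fiber: the frame survives each Fiber.yield, so the local variable
# keeps its value between resumes.
counter = Fiber.new do
  count = 0
  loop do
    count += 1
    Fiber.yield count
  end
end

method_results = Array.new(3) { next_count }      # always starts over
fiber_results  = Array.new(3) { counter.resume }  # continues where it left off

puts method_results.inspect
puts fiber_results.inspect
```

The method returns 1 every time, while the fiber counts upward, because only the fiber keeps its local state alive between calls.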

This is a rather new addition to the Ruby 1.9 branch, and it seems that specifics aren't yet decided. The term Fiber might be familiar to Windows programmers. The MSDN entry explains them as such:
A fiber is a unit of execution that must be manually scheduled by the application. Fibers run in the context of the threads that schedule them. Each thread can schedule multiple fibers. In general, fibers do not provide advantages over a well-designed multithreaded application. However, using fibers can make it easier to port applications that were designed to schedule their own threads.
Sasada Koichi, developer of the Ruby 1.9 VM, formerly known as YARV, gives some more information on the ruby-core mailing list:
These method names (resume/yield) are from Lua. "transfer" is from Modula-2. "double resume error" is from Python's generator. BTW, I'm thinking about name "Fiber". Current Fiber means Semi-Coroutine. Fiber::Core is Coroutine. Yes, name of Fiber is from Microsoft, but it's means Semi-Coroutine such as Lua's coroutine and Python's generator.
Semi-Coroutines are asymmetric Coroutines, which are limited in their choice of transfer of control. Asymmetric Coroutines can only transfer control back to their caller, whereas Coroutines are free to transfer control to any other Coroutine, as long as they have a handle to it.
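As a rough illustration of the symmetric case, the sketch below assumes a Ruby version in which symmetric transfer is available as Fiber#transfer (loaded with require 'fiber' on 1.9), rather than under the Fiber::Core name from the quote. The ping and pong fibers, the main handle and the log array are hypothetical names chosen for this example.

```ruby
require 'fiber'   # provides Fiber#transfer and Fiber.current on older Rubies

log  = []
main = Fiber.current   # handle to the fiber running the top-level code
ping = nil
pong = nil

# Each fiber names the fiber it hands control to, instead of
# implicitly returning to whoever resumed it.
ping = Fiber.new do
  3.times do |i|
    log << "ping #{i}"
    pong.transfer
  end
  main.transfer        # explicitly hand control back to the top level
end

pong = Fiber.new do
  loop do
    log << "pong"
    ping.transfer
  end
end

ping.transfer
puts log.inspect
```

With Fiber.yield, pong could only return to whoever resumed it; with transfer, each fiber picks its successor through a handle, which is the freedom the paragraph above attributes to full Coroutines.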
The Fibonacci example above shows how a Semi-Coroutine is used as a Generator, i.e. to conveniently generate Fibonacci numbers. Some languages, such as Python, support Generators in the language and have special syntax for them. From the quote, it seems that both Semi-Coroutine (Fiber) and Coroutine (Fiber::Core) behavior is supported. What will eventually show up in Ruby 1.9 and beyond, and how it will be named, remains to be seen, but Yukihiro Matsumoto, creator of the Ruby language, considers them safe:
It is still hot topic among core developers. But fibers (and external iterators) are likely to remain in the final 1.9, more likely than continuations.
Note: Continuations, a feature long absent in the Ruby 1.9 branch, were added to Ruby 1.9 in May despite concerns about whether they were feasible with Ruby 1.9's kernel threads.
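The external iterators mentioned in the quote might look like the following sketch. It assumes Enumerator.new with a block and Enumerator#next as found in released Ruby 1.9 and later; the variable names are chosen for this example only.

```ruby
# The block receives a yielder object; pushing a value onto it plays
# the role that Fiber.yield plays in the fiber-based generator.
fib = Enumerator.new do |out|
  x, y = 0, 1
  loop do
    out << y
    x, y = y, x + y
  end
end

# Internal iteration: take stops the infinite sequence after five values.
first_five = fib.take(5)

# External iteration: the caller pulls one value at a time.
stepped = [fib.next, fib.next, fib.next]

puts first_five.inspect
puts stepped.inspect
```

The caller decides when the next value is computed, which is the same pull-style control flow as calling resume on the Fibonacci fiber.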

Besides implementing control structures, Coroutines provide a way to use lightweight concurrency. In effect, they make it possible to implement userspace threads with cooperative scheduling. The Coroutines can either yield control to each other directly, or scheduling can be centralized by handing control to a single scheduler Coroutine, which then decides who runs next.
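The centralized variant can be sketched in a few lines. This is a minimal round-robin scheduler written for illustration, not an existing library; make_worker, ready and log are hypothetical names, and Fiber#alive? is assumed to be available (via require 'fiber' on older Rubies).

```ruby
require 'fiber'   # provides Fiber#alive? on older Rubies

log = []

# Each worker does one small step of work, then voluntarily yields.
make_worker = lambda do |name, steps|
  Fiber.new do
    steps.times do |i|
      log << "#{name}:#{i}"
      Fiber.yield          # cooperative: hand control back to the scheduler
    end
  end
end

# The ready queue holds every fiber that still has work to do.
ready = [make_worker.call("a", 2), make_worker.call("b", 3)]

# Round-robin scheduler: run each fiber until it yields, then requeue it
# unless its block has finished.
until ready.empty?
  fiber = ready.shift
  fiber.resume
  ready.push(fiber) if fiber.alive?
end

puts log.inspect
```

Nothing preempts a worker here. A worker that never calls Fiber.yield would starve all the others, which is the usual trade-off of cooperative scheduling.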

This can address concerns about Ruby 1.9's move to the more heavyweight kernel threads. Ruby 1.8 threads are built as a userspace threading system, which has the benefit of less thread management overhead. Creating a kernel thread involves a syscall to the OS, which takes more time than a single in-process call to a threading system. JRuby, for instance, uses kernel threads, but tries to offset the creation cost by using a thread pool.

Nevertheless, creating many kernel threads still carries a lot of overhead, and might simply cause problems on OSes that have hard thread limits or struggle with large numbers of threads. It's in these cases that a lightweight alternative is useful. It allows the code to be split among threads, if that is the logical, straightforward solution, while keeping the overhead down. Another advantage of this solution is that kernel threads are still available if a long-running operation or syscall needs to be invoked but must not block the execution of all code in the process.

A similar approach is used in Erlang, which also provides lightweight processes, with the difference that Erlang processes share nothing, whereas Fibers share an address space. However, the availability of Fibers allows Actor-style programming without having to worry about overhead.
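A heavily reduced sketch of the message-passing idea: values given to resume arrive inside the fiber as the return value of Fiber.yield, so a fiber can act as a tiny actor that receives a message, updates private state and replies. The accumulator fiber is a hypothetical example; a real actor library would add mailboxes and a scheduler on top.

```ruby
# The first resume argument becomes the block parameter; later resume
# arguments come back as the return value of Fiber.yield.
accumulator = Fiber.new do |message|
  total = 0                          # private state, visible only to this fiber
  loop do
    total += message
    message = Fiber.yield(total)     # reply with the total, wait for next message
  end
end

replies = [5, 10, 20].map { |n| accumulator.resume(n) }
puts replies.inspect
```

Unlike an Erlang process, this fiber still shares the address space with its caller, so keeping the state private is a matter of convention rather than something the runtime enforces.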

Fibers are also not entirely new in the Ruby space. Rubinius has Tasks, which are described as similar to Ruby 1.9 Fibers. (InfoQ recently featured an interview with Rubinius project lead Evan Phoenix on threading in Rubinius.) MenTaLguY details this on ruby-core:
In modern concurrency settings they [Fibers] are becoming increasingly useful, however. Without them, or something like them (e.g. Rubinius Tasks), you must play some very ugly games to get lightweight concurrency -- see the use of explicit continuation-passing (functions, not Continuations) in Scala's actors library for an example of the best that can be hoped for in their absence.
Granted, Fibers will make things harder for JRuby
This last comment brings up an important point. If Fibers get adopted in Ruby, this will create headaches for Ruby implementations targeting the JVM or CLR, such as JRuby, XRuby, Ruby.NET or IronRuby. None of them currently supports Continuations, because manipulating or reading the call stack is hard or impossible to do in these VMs. The lack of Continuations is a controversial issue, but doesn't seem to have caused problems for, e.g., JRuby, because Continuations are not widely used in Ruby. The only use of them in the Ruby 1.8 standard library is the Generator implementation, and JRuby 1.0, for instance, solved this by implementing the same functionality without using Continuations.

While it's certainly possible to implement these features using workarounds, the question is whether these workarounds will cause performance regressions. If, for instance, call stacks must be emulated on the heap instead of using the VM's stack, this can lead to lower performance or prevent (JIT) compiler optimizations from being applied. Workarounds for asymmetric Coroutines would be easier, as they could make use of the VM's stack for method invocations. Languages such as C# implement their Iterator feature, which allows writing Generators similar to the sample code above, this way.
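To make the workaround idea concrete, here is a minimal sketch of a generator built on an ordinary thread and two queues instead of a fiber or continuation. It is an assumption made for illustration only, not a description of how JRuby or any other implementation actually solves the problem; ThreadGenerator, emit and the queue names are hypothetical.

```ruby
require 'thread'   # Queue; built in on newer Rubies

# Hypothetical generator that needs no stack manipulation: the producer
# runs on its own thread and is parked on a queue between values.
class ThreadGenerator
  def initialize(&block)
    @requests = Queue.new   # consumer -> producer: "compute the next value"
    @values   = Queue.new   # producer -> consumer: the computed value

    # Called by the producer block: hand over a value, then block
    # until the consumer asks for another one.
    emit = lambda do |value|
      @values.push(value)
      @requests.pop
    end

    @producer = Thread.new do
      @requests.pop         # do nothing until the first request arrives
      block.call(emit)
    end
  end

  # Ask for the next value and wait until the producer delivers it.
  def next
    @requests.push(:next)
    @values.pop
  end
end

fib = ThreadGenerator.new do |emit|
  x, y = 0, 1
  loop do
    emit.call(y)
    x, y = y, x + y
  end
end

first_five = Array.new(5) { fib.next }
puts first_five.inspect
```

Every generator in this sketch costs a full thread, and every value costs two queue handoffs, which illustrates the kind of overhead such workarounds can introduce. The sketch also only handles producers that never finish: calling next after the block has returned would wait forever.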
