Ruby 1.9.3 introduced the Lazy Sweep Garbage Collector written by Narihiro Nakamura (InfoQ reported), which significantly reduced the worst case time of the GC. In a recent proposal, Narihiro implemented the copy-on-write friendly Bitmap Marking GC (or bmap in short), similar to Ruby Enterprise Edition's copy-on-write-friendly GC. The memory reduction comes from POSIX's fork system call, which shares a child-process's memory with its parent and only copies it when its modified. Unfortunately, current GCs for Ruby don't play well with this:
Ruby uses mark and sweep garbage collection. First it goes through all objects and marks every object which is in use. This is done by setting the FL_MARK flag on the aforementioned field. Then it goes through all objects again, and frees every object which is not marked. The problem was that this breaks copy-on-write semantics: it effectively marks nearly all pages as dirty.
InfoQ asked Narihiro how his bmap implementation improves over the current Lazy Sweep GC:
The good points of the Bitmap Marking algorithm are as follows:In CRuby, CoW friendliness is most important. Bitmap Marking improves memory usage on programs that are using fork() in Linux. We have to use fork() when we need a real parallel performance in CRuby. And, we already have many libraries that are using fork() (e.g. Unicorn, Resque).
- A bitmap stores marks much more densely than if a object header has a mark bit.
- High locality.
- Marking will not modify any object. And sweeping will not modify any live object.
- CoW friendly, Few dirty cache lines.
- We use memset() to clear mark bits.
- Sweep is a little bit fast.
InfoQ: Lazy sweep GC decreases the throughput, and bmap is also a bit slower than that. Will bmap replace the current GC or can the user/developer configure which one to run?
My plan is that Bitmap Marking GC will be the default GC. You said "bmap is also a bit slower than that". Your understanding is correct, but I think it's in an acceptable range for everybody. So, I guess users don't need some sort of configurations for Bitmap Marking.
InfoQ: You also mentioned that you're going to work on Parallel Marking GC with Bitmap Marking. This sounds like it's going to speed up GC considerably, do you already have an idea how much faster (or how much pause time will be reduced) this will be?
Actually, I already created a Parallel Marking GC without Bitmap Marking. In some case, I managed to improve all GC time by roughly 40% on 2 core CPU machine. In Parallel Marking GC with Bitmap Marking, all GC time improvement would fall slightly.
I talked about the details of Parallel Marking GC at RubyConf US 2011 (video and slides).
Matz committed the Bitmap Marking GC patch to Ruby trunk, so it will be part of the next release, which will probably be 2.0.