The CCR, or Concurrency and Coordination Runtime, first made its debut in .NET Robotics. Since then a number of companies have adopted CCR for a wide variety of non-Robotic projects including the website MySpace.
CCR uses message passing semantics to isolate different parts of the application. At the lower levels, carefully managed threads and work-stealing queues have a significantly lower overhead than traditional threads and locks.
MySpace is using CCR for their communication layer, which is a custom wire format on top of raw sockets. Each server has two queues; the first is for on-way messages, the second for synchronous messages that expect an immediate response. They can tune each server by altering the number of threads assigned to the synchronous message queue vs. the one-way queue.
Beyond the communication layer is their caching clusters. Each contains a storage component and a collection of processing components. When a one-way message is received, it goes into a third CCR queue. From there is passed along to the storage component and any processing components that are subscribed to that message type. Synchronous messages are handled in a similar fashion, though the skip the aforementioned queue.
MySpace is using CCR because it offers them significantly higher throughput than what they obtained from thread pools. One reason for this is that .NET thread pools have a significant amount of overhead just to deal with trying to determine how many threads to keep alive. There is also the context switching between threads that consume valuable processing time. CCR avoids a lot of this by using defaulting to a single thread per CPU. In order to get even more control over threads, MySpace has extended CCR so that they can dynamically change the number of threads assigned to a queue at run time.
In addition to the thousands of incoming messages a cluster receives each second, it also has to handle out-going messages. For this they use CCR’s multiple receive and other patterns to make decisions like “wait for 100 messages or 500 milliseconds”, at which time the outgoing messages are sent in a batch. According to MySpace, these patterns are essential for moving large amounts of data.
Prior to adopting CCR, MySpace would either use .NET thread pools or manually manage their threading. They claim to be unaware of any alternatives for thread management in .NET prior to CCR, which happened to advertise support for the same use cases they were trying to solve with their in-house implementations.
Currently MySpace is using 1,200 middle-tier servers and 3,000 web servers running CCR. It has become so popular at MySpace that they incorporated into their frameworks and many developers there are not even aware they are using it.
You can watch the entire MySpace interview with Erik Nelson and Akash Patel on Channel 9. For more information on CCR, check out the CCR and DSS Toolkit 2008 R2.