Dataflow blocks are the backbone of TPL Dataflow, .NET 4.5’s new high-performance parallel processing library. While they offer a lot of functionality out of the box, there will be times when a custom block is necessary. Zlatko Michailov has put together a document outlining the process and many of the traps you may encounter. The full guide, Guide to Implementing Custom TPL Dataflow Blocks, is available on the Parallel Programming with .NET blog, so we’ll just hit the highlights.
Before you begin, Zlatko asks you to consider whether you can simply glue together an existing ITargetBlock and ISourceBlock. If so, the DataflowBlock.Encapsulate method will create a new IPropagatorBlock for you. This method handles most of the boilerplate code, but you still have to explicitly state how messages are propagated from the target to the source block.
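As a minimal sketch of that approach, the following wraps two built-in blocks in a single propagator. The class and method names (EncapsulateSketch, CreateParseAndDouble, RunAsync) are made up for illustration and do not come from the guide. Here the propagation from target to source is stated with a plain LinkTo call, which is only one of several ways to do it.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class EncapsulateSketch
{
    // Hypothetical helper: a propagator that parses a string and doubles the number.
    public static IPropagatorBlock<string, int> CreateParseAndDouble()
    {
        // The "front door": callers post strings to this block.
        var target = new TransformBlock<string, int>(s => int.Parse(s));

        // The "back door": callers receive integers from this block.
        var source = new TransformBlock<int, int>(n => n * 2);

        // Encapsulate does not connect the two halves for us, so the
        // propagation (including completion) is stated explicitly here.
        target.LinkTo(source, new DataflowLinkOptions { PropagateCompletion = true });

        return DataflowBlock.Encapsulate<string, int>(target, source);
    }

    public static async Task<int> RunAsync(string input)
    {
        IPropagatorBlock<string, int> block = CreateParseAndDouble();

        block.Post(input);
        block.Complete();                       // forwarded to the target half

        int result = await block.ReceiveAsync(); // taken from the source half
        await block.Completion;                  // completes once the source half is done
        return result;
    }
}
```

From the outside, the returned block behaves like any other propagator: it can be posted to, linked from, and awaited for completion.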
For more control you can explicitly implement ITargetBlock and ISourceBlock. Unlike most abstract interfaces, not all methods are meant to be exposed on the implementing class’s public interface. Some methods such as LinkTo and Complete are meant to be called by general code, while others such as OfferMessage should only be called via the abstract interface from another block. Section 5.1 of the guide has the recommended visibility rules for each.
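The visibility split can be sketched with C#'s explicit interface implementation. RecordingTarget below is a hypothetical block written for this article, not one taken from the guide. It keeps Complete and Completion public and hides OfferMessage behind the interface. Hiding Fault as well mirrors what the built-in blocks do and is an assumption here; consult section 5.1 of the guide for the actual recommendations.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

// Hypothetical block for illustration: a synchronous target that records what it accepts.
public sealed class RecordingTarget<T> : ITargetBlock<T>
{
    private readonly object _lock = new object();
    private readonly List<T> _items = new List<T>();
    private readonly TaskCompletionSource<bool> _completion = new TaskCompletionSource<bool>();
    private bool _declining;

    // Public members: intended for general calling code.
    public Task Completion => _completion.Task;

    public void Complete()
    {
        lock (_lock) { _declining = true; }
        _completion.TrySetResult(true);   // done outside the lock
    }

    public IReadOnlyList<T> Items
    {
        get { lock (_lock) { return _items.ToArray(); } }
    }

    // Explicit implementation: reachable only through ITargetBlock<T>,
    // which is how sources and helpers such as Post call it.
    DataflowMessageStatus ITargetBlock<T>.OfferMessage(
        DataflowMessageHeader messageHeader, T messageValue,
        ISourceBlock<T> source, bool consumeToAccept)
    {
        if (!messageHeader.IsValid) throw new ArgumentException("Invalid header.", nameof(messageHeader));
        if (source == null && consumeToAccept) throw new ArgumentException("A source is required.", nameof(consumeToAccept));

        lock (_lock)
        {
            if (_declining) return DataflowMessageStatus.DecliningPermanently;

            if (consumeToAccept)
            {
                // The source asked us to claim the message before accepting it.
                messageValue = source.ConsumeMessage(messageHeader, this, out bool consumed);
                if (!consumed) return DataflowMessageStatus.NotAvailable;
            }

            _items.Add(messageValue);
            return DataflowMessageStatus.Accepted;
        }
    }

    // Explicit as well (an assumption modeled on the built-in blocks).
    void IDataflowBlock.Fault(Exception exception)
    {
        if (exception == null) throw new ArgumentNullException(nameof(exception));
        lock (_lock) { _declining = true; }
        _completion.TrySetException(exception);
    }
}

public static class VisibilitySketch
{
    public static int Run()
    {
        var target = new RecordingTarget<int>();

        target.Post(1);      // Post calls OfferMessage through the interface
        target.Post(2);
        target.Complete();
        target.Post(3);      // declined: the block no longer accepts messages

        // target.OfferMessage(...) would not compile: the method is not
        // part of the class's public surface.
        return target.Items.Count;
    }
}
```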
Zlatko then shows two detailed examples. The first is a synchronous filter block, the second a synchronous transformation block. While there is quite a bit of boilerplate code, it does illustrate a lot about how TPL Dataflow works internally.
The truly tricky code comes into play when Zlatko starts talking about the asynchronous block. Right from the beginning you have to start considering things such as a lock hierarchy. Zlatko recommends an approach used by the built-in blocks which involves an outgoing, an incoming, and a value lock.
Marking a block as completed seems like a simple thing; one simply needs to complete the task exposed by the Completion property when Complete or Fault is invoked. But once you start working with asynchronous blocks even this can be tricky. For example, the task has to be completed without holding a lock, because doing so may trigger other code, such as continuations, that runs synchronously.
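One way to picture the "no lock while completing" rule is the sketch below. CompletionState is a hypothetical helper, not the guide's code, and it assumes the completion task is backed by a TaskCompletionSource. The decision to complete is made under the lock, but the task itself is completed only after the lock has been released. A real asynchronous block would also have to wait for in-flight messages before completing, which is left out here.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper for illustration only.
public sealed class CompletionState
{
    private readonly object _lock = new object();
    private readonly TaskCompletionSource<bool> _tcs = new TaskCompletionSource<bool>();
    private bool _completionRequested;

    public Task Completion => _tcs.Task;

    // Exposed only so the demo can observe the lock state.
    public bool IsLockHeldByCurrentThread => Monitor.IsEntered(_lock);

    public void Complete() => Finish(null);

    public void Fault(Exception exception)
    {
        if (exception == null) throw new ArgumentNullException(nameof(exception));
        Finish(exception);
    }

    private void Finish(Exception exception)
    {
        lock (_lock)
        {
            // Only the first Complete/Fault call wins.
            if (_completionRequested) return;
            _completionRequested = true;
        }

        // The lock is released at this point. Continuations attached to
        // Completion may run synchronously on this thread, so running them
        // under the lock could deadlock or re-enter the block.
        if (exception == null) _tcs.TrySetResult(true);
        else _tcs.TrySetException(exception);
    }
}

public static class CompletionSketch
{
    // Returns whether the block's lock was held while a synchronous continuation ran.
    public static bool Run()
    {
        var state = new CompletionState();
        bool lockHeldDuringContinuation = true;

        Task continuation = state.Completion.ContinueWith(
            _ => { lockHeldDuringContinuation = state.IsLockHeldByCurrentThread; },
            TaskContinuationOptions.ExecuteSynchronously);

        state.Complete();
        continuation.Wait();
        return lockHeldDuringContinuation;
    }
}
```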
Another concern is whether to consume messages in a greedy or non-greedy fashion. If using a non-greedy block then additional steps must be taken to avoid collisions with other targets listening to the same source.
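The built-in grouping blocks expose this choice through GroupingDataflowBlockOptions.Greedy, which gives a feel for the behavior a custom block would have to reproduce. The sketch below, with the hypothetical names NonGreedySketch and RunAsync, uses a non-greedy JoinBlock. It shows the built-in behavior only, not how to implement the reservation protocol yourself; a custom non-greedy target would do that through the source's ReserveMessage, ConsumeMessage, and ReleaseReservation methods.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class NonGreedySketch
{
    public static async Task<Tuple<int, string>> RunAsync()
    {
        var numbers = new BufferBlock<int>();
        var words = new BufferBlock<string>();

        // Non-greedy: the join postpones offered messages and consumes them
        // only when it can get one from every input, leaving them available
        // to other linked targets in the meantime.
        var join = new JoinBlock<int, string>(
            new GroupingDataflowBlockOptions { Greedy = false });

        numbers.LinkTo(join.Target1);
        words.LinkTo(join.Target2);

        numbers.Post(1);      // stays in the buffer until a matching word exists
        words.Post("one");    // now both inputs can be reserved and consumed

        return await join.ReceiveAsync();
    }
}
```

With Greedy left at its default of true, the join would instead accept each message as soon as it is offered, even if the other input never arrives.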
Finally, Zlatko covers offering messages and linking targets. Fortunately, this is much easier because all of the operations are intended to be synchronous.
For more information see the TPL Dataflow site on DevLabs.