Darach Ennis on CEP, Stream Processing, Messaging, OOP vs Functional Architecture

1. We're here at QCon London 2013 and I'm sitting here with Darach Ennis. So Darach, who are you?

Hi, I'm Darach Ennis. I'm the chief scientist at a small start-up called Push Technology. Essentially, we break all of the rules in traditional messaging systems to deliver high-grade experiences to users in gambling, gaming, capital markets, streaming media and also in telecommunications.


2. Which rules do you break?

We break all of the rules in a traditional messaging system. They really haven't changed in terms of the architecture and structure in the past 20 or 30 years. They were designed before the internet, before an explosion of devices, before bring-your-own-device and the internet of things. So in those days, you didn't have a mobile phone that was portable. It was a large brick with a massive battery issued by someone like Motorola. Today, you probably have three or four devices in your pocket. You've got two mobile phones, maybe a Nook, an iPad, some kind of tablet device, a laptop or two. And this explosion of devices and connections also kind of means that you probably have a continuous interaction or relationship or conversation with the systems important to you. You don’t check email. It tells you when you've got an important email. It sends you a notification.

So you don't need to be permanently connected. If you want to monitor a Twitter feed, you've probably opted to get an SMS text message when a certain person tweets. The same goes for work. We typically like to get notified of things as they change rather than have to poll and track. So the web, the devices and our interaction with these systems are moving away from HTTP GET and a kind of paginated delivery of documents towards a continuous interaction with systems. So I think messaging based on old-school principles needs to change to deliver better experiences with that in mind. That's something that Push Technology does, and it's something I find interesting to apply some of my past experience to.


3. How do you push your events or what do you push? Do you push messages? What level do you push at?

Good question. So essentially, we deviate from messaging in that we don't send opaque blobs between systems. The server, the engine in our product, actually looks inside the message: if you connect and subscribe to a topic of interest, we'll send you a snapshot of the current state of that topic. So we're stateful. You don't have to be stateless. You can store information inside the messaging system. So I'll send you a snapshot. If I have a list of things like a shopping list, I can send you the shopping list when you connect. If that changes, say you're halfway down to the shops and I want you to add a bunch of tomatoes, or a pack of potatoes because obviously you're Irish or whatever, and you want to make a Cornish pasty and you say, "I need some flour as well," you might want to append that to the shopping list.

You might find when you open the fridge that you already have milk, so you can remove that from the order. Don't send a new list. A messaging system might actually have to send you a new list because it doesn't introspect the data types; it sends you a data structure or an opaque blob. Whereas we actually reflect the changes through to you, so that you stay consistent with the current view of the world, but we use significantly less bandwidth. We apply lots of other breaking changes to traditional messaging assumptions, but we do so so that you get a better experience and better utilization of the limited resources you have on a 3G connection. So we try to maximize your experience with the systems you're interacting with.
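
To make that snapshot-plus-delta idea concrete, here is a minimal Java sketch of a stateful topic; the class and method names are illustrative, not Push Technology's actual API.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.function.Consumer;

    // Stateful topic: a new subscriber gets a snapshot of the current
    // state; afterwards only the deltas (add/remove) are pushed.
    class ShoppingListTopic {
        private final List<String> items = new ArrayList<>();
        private final List<Consumer<String>> subscribers = new ArrayList<>();

        void subscribe(Consumer<String> client) {
            subscribers.add(client);
            client.accept("SNAPSHOT " + items); // replay current state once
        }

        void add(String item) {
            items.add(item);
            broadcast("ADD " + item);           // push the change, not the list
        }

        void remove(String item) {
            items.remove(item);
            broadcast("REMOVE " + item);
        }

        private void broadcast(String delta) {
            subscribers.forEach(s -> s.accept(delta));
        }
    }

A client that connects mid-stream still ends up consistent: it receives the snapshot, then applies each delta, and no full list ever needs to be resent.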


4. The product you're talking about or the technology you're talking about, what is it? Is it a server? Is it a client? Is it both?

So it's a platform that you can deploy on your own tin or in the cloud, and a set of client APIs -- in the browser, on mobile phones for Android and BlackBerry, and in Objective-C for the iPhone -- that you would use to interact with those services. So although we use things like WebSockets if you have a modern browser such as Chrome, we also work with legacy environments. If you're still dandering along with Internet Explorer 6, we will use iFrames to get data to you. So we cascade down to give you as good an experience as we can, given the limitations of the connection you've got. And that's really where we spend most of our effort: looking at the grade of the connection and the quality of service, and trying to deliver a set of seamless interactions within the bounds of the limited fidelity, connectivity and latency available to us. So that's what we do, and that's what we do well.


5. It's essentially Comet++ in a way?

You can call it Comet++. I like to think of it as more of a stream-oriented view of messaging. We're very much focused on the fact that people are interested in continuous updates to things that are continuously changing. So as data structures change, we reflect the changes through to those clients. It's a very different perspective from sending opaque blobs or snapshots of what has changed over time: we keep you up-to-date with what has changed, in consistent ways. We can do so at very large scale, so we can host up to 30,000 or 40,000 connections on a typical box that you would deploy into a production environment.

We're used in very high frequency environments -- online gambling and gaming systems, and inside capital markets, where we take a lot of their internal services and streams, such as prices and bets, and make them available to clients such as mobile phones. The difference, I guess, is that you can be connected over a 3G network on a mobile phone or over the corporate LAN, and if you eyeball these two systems side by side, you will see basically the same prices on each. We go to great efforts to deliver current, consistent information. So we've got a lot of techniques we can use to remove the stale information that's typically blocking up the internet with things like bufferbloat. We've made changes to how our messaging system works.

So essentially, by virtualizing the client-side queuing and moving it to the server side, it's very good for outgoing messaging because we can detect when queues are backing up. And that could be your specific queue on your device. If you're on a particularly slow connection, we work more aggressively to conflate down the information before we send it over the wire to you. So you save that data, and the data you do get, at a rate you can accept, is always as up-to-date as possible. It's a win-win situation. Of course, for every trade-off there's a nuance, and the nuance here is that this makes guaranteed, precise delivery a lot more difficult. Luckily, in most real-world systems, the high frequency data are things like price information, which is generally recoverable. If you lose the price right now because you went under a bridge and lost your connection, when you come back out on the other side and begin to recover, you don't care about the old prices; you care about the price right now. So we can send you new information and that's fine. The same doesn't hold true for orders, but luckily you probably aren't going to send a high frequency of orders on a per-second basis, so it turns out not to be an issue. You use guaranteed delivery or acknowledgments for that, and we provide that as well. So in most cases the trade-offs we make are sensible, given the frequency of delivery for the systems our clients are building.
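
The conflation he describes can be sketched as a latest-value queue in a few lines of Java; this illustrates the technique, not the product's actual implementation.

    import java.util.LinkedHashMap;
    import java.util.Map;

    // Conflating queue: if a price for a symbol is still waiting to be
    // sent when a newer one arrives, the stale value is overwritten, so a
    // slow client drains fewer but fresher messages.
    class ConflatingQueue {
        private final Map<String, Double> pending = new LinkedHashMap<>();

        synchronized void offer(String symbol, double price) {
            pending.put(symbol, price); // replaces any undelivered stale price
        }

        // Called at whatever rate the connection can actually sustain.
        synchronized Map<String, Double> drain() {
            Map<String, Double> batch = new LinkedHashMap<>(pending);
            pending.clear();
            return batch;
        }
    }

Note the trade-off he mentions: conflation is fine for recoverable data like prices, but an order stream would use acknowledged, guaranteed delivery instead.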


6. I've seen talks by you on various technologies to process events and treat them as streams; for example, Microsoft has the Rx system that allows you to abstract over streams. So can you tell us about the work you did there?

So I spent a number of years working in the complex event processing community, building algorithmic trading platforms based on essentially a streaming calculus, if you will, abstracted into an engine that gives you a SQL-like language for processing continuous streams of events. It's a lot more in-depth or algorithmic than my current job, which is really about the quality of messaging or data delivery between systems connected over the internet. But a lot of the principles are shared in common. In the systems I used to build, I would take windows or streams of data and then produce results continuously. So I'm coming from the perspective that most things can be processed or computed incrementally: you amortize the cost of the computation over time without really noticing it, soaking up the cost continuously, so the cost at any one point in time is negligible but the aggregate value is delivered continuously. A lot of systems are still designed in a very batch-oriented way, whether you're batching on a second-by-second basis or in terms of hours or weeks or months for some reporting systems or access to a data warehouse like Hadoop. And that's perfectly fine: you've already stored the data, so if you want to run some new query, you need to access the data. Hadoop is very good for those kinds of use cases. But if you have a query that you're running continuously and it's mission-critical to your environment, we should take these batch queries and, rather than running them at the weekend, run them continuously, so that you can react to changing thresholds, environments and situations continuously.
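
A tiny Java sketch of the incremental style he contrasts with batch processing: rather than re-scanning stored data, an aggregate such as a mean is updated in constant time per event. The class is illustrative, not taken from any engine he mentions.

    // Incremental aggregate: the mean is refreshed on every event, so the
    // cost of the "query" is amortized across the stream.
    class RunningMean {
        private long count = 0;
        private double mean = 0.0;

        void onEvent(double value) {
            count++;
            mean += (value - mean) / count; // standard incremental-mean update
        }

        double current() {
            return mean;
        }
    }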

So I come from this kind of real-time world where you're interested in making decisions right now: should I buy or sell, or should I hold on to my position? Should I alter it in some way? Should I make a trade? Should I make a bet? Whereas a lot of the systems that most developers work with are fine with eventual consistency. For example, as long as I can add my items to a shopping cart and the items that I originally ordered eventually arrive in the post, I'm perfectly happy. It's consistent with the expectation that if I buy three or four books on Amazon, I receive three or four books. Of course, Amazon doesn't just sell books. It sells services now. It's a much wider company than it started out to be, but they realized through automation that there was value in what they were doing to deliver physical product out to remote and disparate environments, and that they could do the same for services, because they had to build this automated infrastructure to make it work.

I think the same holds true for many of the systems we're building today: as our relationship with devices and systems and services moves from an occasional, weekly, monthly or second-by-second basis to a continuous relationship, a conversation, we're starting to move towards stream-oriented thinking. I've been doing that for a few years and I'm applying it right now. So I probably seem weird and different to most regular developers.


7. How do you explain your approaches to these developers? I can understand batch processing; I have a big list and I run over it in a for loop. How do you work?

Say I was processing this big batch: you'd give it to me a page at a time, and I wouldn't worry about the batch at all. I'd just worry about each page as it arrives and process it. So think of it as taking that batch and breaking it into a bunch of smaller batches. If you then break those smaller batches down into logical, atomic, indivisible units, you have individual events or messages, as the case may be. So for me these messages are actually discrete events, and I look at aggregations or compound effects in terms of calculating interesting results, for some notion of interesting, continuously and in real time. For example, as people are buying things in my set of shops through my point-of-sale systems, I might want to monitor my stock in real time, so that I know I need to get new stockings for some shop in London, or new hats for a shop in Manchester because, for some strange reason, there's a rush on hats. Go figure.
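
As a hedged illustration of that event-at-a-time view, here is a Java sketch that treats each sale as a discrete event and keeps stock levels as a continuously maintained aggregate; the methods and the reorder threshold are hypothetical.

    import java.util.HashMap;
    import java.util.Map;

    // Each point-of-sale event updates a live aggregate instead of
    // waiting for a nightly batch report.
    class StockMonitor {
        private final Map<String, Integer> stock = new HashMap<>();

        void onDelivery(String shop, String product, int quantity) {
            stock.merge(shop + "/" + product, quantity, Integer::sum);
        }

        void onSale(String shop, String product, int quantity) {
            int left = stock.merge(shop + "/" + product, -quantity, Integer::sum);
            if (left < 10) { // hypothetical reorder threshold
                System.out.println("Reorder " + product + " for " + shop);
            }
        }
    }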

So you want to get the right product into the right places at the right time, because no one has an infinite amount of warehouse space. Manufacturing is a long time in my past, but I worked in a just-in-time manufacturing environment, and we didn't have enough floor space to store the parts for the products we were making. So I guess I got used to managing resources and capacity planning at quite a young age; it's kind of baked into me through Six Sigma and other punishments like SEI CMM Level 5. Luckily, developers today don't have to worry about anything like that, because we're all Agile and Agile is awesome.

So for developers today, I mainly try to show that, yeah, we've made a lot of mistakes over the years, and some of the things we've done are actually really, really good; it's just that now we need something totally different, because our needs have changed. No one really likes email anymore. You want a continuous update. As we use social systems that have a kind of continuously updating timeline, we come to expect all of our systems to be designed like continuously updating systems. The tolerance for receiving some old, stale result on a periodic polled basis is reducing as people become more used to having those continuous feeds streamed around them. And I think as that continues, and as devices proliferate, that problem becomes more and more interesting to solve looking forward.


8. Moving down the abstraction level, are there any systems or languages that are better suited to this event processing than others? I mean, InfoQ, where this is going to be published, is more of a Java space, and Java is synchronous and multi-threaded, whereas languages like Erlang or JavaScript focus on asynchronicity. Are they better suited to this, or does it not matter?

I think that's an excellent question, and I'm trying to answer it myself, so I can't give you an answer; I can tell you what I've found out. I've earned most of my money over the past ten years from implementing systems in Java using the Java ecosystem. I enjoy working in Erlang the most in my own time. So when I have a problem to solve, I solve it in Erlang first and then I write it in the language that I'm paid to write code in, which is invariably Java 90% of the time. I also like some newer languages and techniques. In the functional community, Erlang being one of my favourite languages, I've tried functional reactive programming, and I like what the Haskell community has done. I really like what the Node.js community has done by diverging from most other environments and languages and insisting on asynchronous, non-blocking APIs and libraries out of the box.

Now, I don't really have a lot of experience with JavaScript. I'm a bit of a newbie when it comes to web-based technologies, although I currently work in that space. But I really like that community because they're willing to think different. They're solving problems by building asynchronous APIs and getting into trouble that no other language community has really gotten into yet. There are libraries that I think are relevant. Ben Christensen from Netflix gave an excellent talk earlier today on functional reactive programming at Netflix, and how their use of functional reactive programming in their APIs is now leaking back into their backend systems, because it's a better way of building robust, highly changeable, reliable systems.

And I think that is going to become more prevalent. I don't think object-oriented techniques are going to survive much beyond macro-level architectures, beyond the shell of a service-oriented system that makes it modular and manageable. Inside, I think we'll be using more domain-specific languages, probably things like Reactive Extensions from .NET, which now has Java versions -- thanks to Netflix -- and JavaScript versions. These kinds of domain-specific systems, which let you think in terms of streams and flows of continuously changing data and that continuing interaction, will be much more prevalent, because doing that in a batch, imperative style just doesn't make sense. It's streaming data. It's always changing. So you want to process it, to operate on it, from a design perspective that recognizes it as continuously changing data.

And there are techniques out there that are very successful in certain niche domains, particularly capital markets, that have been solving this problem for the past 10 or 15 years: the complex event processing community and, to some extent, the distributed stream processing community. Unfortunately, they call it things like "complex", and no developer wants to deal with "complex" things. So the best way to socialize the values of complex event processing is just to make it as simple as it can possibly be. Maybe remove some of the really clever things that no one actually uses when implementing real-world algorithms. So I'm looking at the 20% that I find 80% useful, and just focusing on that. And I think if I can socialize the value of having small, single-purpose libraries that deal with windows of things in the right way, and re-evaluate that, it's probably a good way to get things rolling and get people thinking. It seems to have worked, because I'm having lots of interesting discussions with Matt Podwysocki from Microsoft, who works on the .NET team. In fact, he talked about one of my open source projects, SDC. He's talking about it now. I've had interesting discussions with the guys on the Play Framework; I really like what they've done with Iteratees. And with Ben Christensen and what he's doing with Rx for Java and Observables, I think there's a lot of movement in various communities converging towards a stream-oriented view of processing.

And because we're dealing with more streaming systems -- Twitter is probably having a large effect on this -- we're now continuously connected. We want results right now. Data distribution and getting high-grade experiences with data is just a small part of how I interact with streaming systems from a work perspective, but I think it's now pervading all of our lives. I mean, when is the last time someone reminded you of something using an email if they already have a connection with you on Twitter? You probably get a direct message on Twitter, right? I forgot to give John Davies my slides earlier today; I got a direct message on Twitter, not an email: "Oh, could you just give me a copy of your slides?" So I think the writing is on the wall: things are moving towards continuous streamed delivery. I think we're going to see innovation around the near-real-time aspects, just as we've already seen lots of innovation in the database community, which has moved towards eventual consistency for similar reasons, in that case for scale. Because if you do have something like an Amazon shopping cart, you don't need the results right now. You're not going to get the product for a while, so it's okay for that to partially fail and then complete successfully at some point later. It's okay for that to be a protracted, asynchronous experience. It doesn't have to be right now. But humans are different. We don't like inconsistencies. We don't like it when a system or a service fails, fails obviously, and interrupts our flow, interrupts our thinking. We don't like being interrupted to deal with something and then picking it up later. When such situations inevitably occur, we want that transition to be as smooth as possible.

So I think systems will need to be adapted to the way the humans sitting in front of them are actually thinking and interacting. How that is going to work, I don't know, because our interactions are changing. Our needs are changing. So certainly, the solutions we have today are wrong for what we'll need in five or ten years' time. My children, who are under ten, just have an organic, natural relationship with computers. It's pretty obvious to them how to use a modern smartphone and flick through photos: you just flick through as if it were a physical interaction, and it just works. My one-year-old understands flicking through a set of photos on a device. My father still can't use a computer. He's an electrical engineer. He still hunts and pecks. He just missed the boat. So I think changing generations will force through changes as well. We older engineers need to keep up with the young guys now and make sure we dot our i's and cross our t's, because they will have the ideas that move things forward, because they're growing up from a different experience.


9. And so basically they should watch for terms like CEP, complex event processing, or stream processing?

I think every new generation can learn something from an older generation. What older generations need to learn is how to accept that newer generations have a different starting point. It's a different experience for them. The world is a different place. So the solutions that were right for us aren't necessarily right for them. Maybe it was okay back in the '50s and '60s to consider transactions in terms of ACID guarantees; today, an eventually consistent view of a transaction is often perfectly fine. I know from my younger experiences in a manufacturing system that I used to get woken up at 4:00 in the morning to drive ten miles into a manufacturing plant because there was a pending transaction in the database. So yeah, transactions are atomic, consistent, isolated and durable when they work. When they fail, they hold up the manufacturing line, and I had to get out of bed and drive to a manufacturing plant to figure it out, resolve the conflict and get things going again. It was a manual process back then, and this is only 10 or 15 years ago.

Really interesting database environments like Riak, for example, from Basho, and others in the NoSQL community, are trying to fix this automatically. CRDT research by Inria in France is now being applied by companies like Basho and others. What's happening in the database community is very, very interesting. I think the rest of us, at web scale, are actually starting to need to think about how we apply that to the systems we're building and the experiences we're offering. We're still thinking in terms of AJAX and Comet, and basically no one cares anymore, so we need to move on.


10. You mentioned OOP and that it might be a model of the past. I'll stick up for the OOP guys now, which is weird, but isn't OOP, essentially the old style of OOP, all just about messaging and sending messages to objects? Isn't that basically where it came from, and then it got perverted, but...

I don't know. Is it? Isn't it more about encapsulating state than it is about sending messages between objects? I think you can use it to send messages between instances of objects, but I don't think the intent was really about messaging. The intent was more of a structural thing: it was a way of structuring systems for reuse. Did we reuse any of it? I don't think so. So that major point, I think, is fairly moot. I just don't think humans are good at collecting reusable artefacts and then actually reusing them. Look at the waste we generate in our regular lives. So with an understanding of that, I'm thinking of reuse and recycling in a different way; thinking of waste management, maybe more importantly, as something we can get real, tractable benefit from.

So in complex event processing, for example, the first thing you learn, in a financial environment where everything is time-sensitive and where there is a time value to the data you're processing, is that there is a currency, if you will: what you don't process costs nothing. There is no cheaper performance optimization than doing nothing with something. By throwing it away, you're saving yourself a lot of effort. Not a lot of people think like that. What can I avoid doing? What can I avoid processing, and how can I avoid it? I think waste management is probably much more important to our thinking today than a lot of the other things we've discussed, because in most environments not a lot of effort goes into asking: what can we avoid? What can we not do? Can we pare this down to something really simple and get away with it?

I tend to go to those kinds of thinking points first and see what I can get away with. And a really big side benefit of this is that invariably I oversimplify and get the solution wrong, and that forces me to fix it for the right reasons. So we can evolve quickly, at a fairly reasonable pace, because we're evolving for the right reasons, and there is a set of measurements, whether positive -- a test or a benchmark -- or negative -- I got it wrong and I received a bug. That's still positive feedback. It's still an opportunity for the next iteration or version to be better than the one before. So I think that kind of thinking is a more stream-oriented way of thinking about production or development or programming. I guess once you've hit the continuous slippery slope, that's it: you're on a timeline for pretty much everything that you do. Batch becomes something that is the way other people think.

Werner: So recently you've been working with Erlang, or you like to work with Erlang, which is a functional language, and there's this question of whether we can program large architectures with functional programming. It seems to me that event processing, as you mentioned, works on snapshots and then applies changes to them, which seems to be the functional model.

In the product I work on, you could consider snapshot-and-delta to be a functional style, but that's more of a data distribution thing. Event processing is more the operations you perform on that data as a stream, or maybe how you optimize it for given interests. For example, if I'm sending data to you and you're on a mobile phone, there's no point in me sending you 10 or 100 megabytes per second. You're on a mobile phone, on a 3G network. You can't possibly receive that flood of information clogging up the pipes. So I should probably drop that down to a rate that will be acceptable to you. As a human -- I'm assuming you're human, not some alien from outer space -- you can only detect a change in your visual landscape at a resolution of maybe 40 milliseconds.

You can't focus on anything bigger than your thumbnail at the end of your fully extended arm. So if I look at you like this, your nose is in focus; everything else is out of focus. People don't realize that the flood of data I'm sending can't possibly be processed by them anyway, unless the user interface is actually optimized to deliver information to them. We can produce and generate data at a much higher rate than we can actually process and make decisions upon. Maybe we just shouldn't send that much data. I don't think that's a crazy proposition. I think it's actually quite sensible, but I still see a lot of folks saying, "Well, you should just use messaging over the internet." Why? You're just going to send me messages at the rate that the system is generating them. That's not intelligent. Shouldn't we maybe think about only sending to a human endpoint at a rate that a human endpoint can actually process? Now, if you're a very advanced gamer and it's your favourite game, say it's Halo, maybe you have visual acuity in certain parts of the game that you know very well, and an ability to respond that's finer grained than 40 milliseconds. Maybe it's 20 milliseconds. So maybe I should send you double the rate of data. But should I send you the full flood of information? No. And if that logic is right for humans, can it also be applied to systems? Do I really need to send you that much information? Do you need to be that up-to-date? For a monitoring system, probably not. If there's some manual element to the response, definitely not. So why would I send data at a rate higher than it can possibly be handled? That doesn't seem to make sense. And that kind of thinking is baked into functional reactive programming: you have tools that you can use to make sensible decisions on how to manage, conflate, merge and process data, whereas an object-oriented system just gives you a morass of related objects and, maybe, if you're lucky, a staged event-driven architecture.
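
A minimal Java sketch of that idea: updates arrive as fast as the feed produces them, but only the latest value is delivered, on a tick matched to roughly human-scale perception. The 40 ms figure comes from his argument above; the class and its names are hypothetical.

    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.atomic.AtomicReference;

    // Deliver at the rate the endpoint can use, not the rate the
    // system generates: keep only the newest value between ticks.
    class ThrottledFeed {
        private final AtomicReference<String> latest = new AtomicReference<>();
        private final ScheduledExecutorService ticker =
                Executors.newSingleThreadScheduledExecutor();

        void onUpdate(String value) {
            latest.set(value); // overwrite anything not yet delivered
        }

        void start() {
            ticker.scheduleAtFixedRate(() -> {
                String value = latest.getAndSet(null); // conflate to newest
                if (value != null) {
                    System.out.println("deliver: " + value);
                }
            }, 0, 40, TimeUnit.MILLISECONDS);
        }
    }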

But I think a staged event-driven architecture moves you into a state where you start to think about flow and streams again. A staged event-driven architecture is a poor man's way of thinking about flow and streaming; you're better off just giving in and thinking about flow and streaming directly. And the people who gave in first were really the complex event processing community. Before them, it was actually the security guys, processing log files looking for attack events and evidence. So some of the formative experiences that developed into event processing came from the security community, which is interesting.

So I think that kind of stream-oriented thinking is becoming more prevalent. Look at Iteratees in the Play Framework, or functional reactive programming, which is spreading like wildfire: it started out inside .NET and C#, and now we're seeing it in Java and JavaScript and other environments. I think these are all related. It's all related to the fact that it's becoming more natural for us to deal with asynchronous systems that are non-blocking and just get on with things. It's no longer acceptable to just poll or to wait, because those introduce unnatural delays. So we're learning how to move beyond those unnatural ways and work around them. And I think that is something that will become a lot more common.


11. To wrap up, I think you have a few projects that you might want to talk about?

I have two fairly harebrained experiments up on GitHub. One is called EEP, which sounds like a very scary event processing system, except it's not actually complex; it's really simple. It culls the 80% to 90% of the things in a CEP engine that you'll never use, so it's a very simple interpretation of an event processing engine. Currently, there is someone in Germany building a Clojure implementation of it. I've done Erlang and Node.js/JavaScript ones, and Ian Barber from Google did a PHP one. I'm going to do a Java one -- in fact, I implemented it last night and did it differently to Node.js. Different ideas are coming out of that; people seem to approach it in different languages in different ways. Both are kind of OO-ey, but you've got generics in Java, so it forced me down a different path. I think that evolutionary divergence is actually quite interesting. It forced me down a very different way; I structured the code differently. Go figure. That's because I was really oriented towards implementing the patterns, the sliding windows and tumbling windows, which are the things inside this simplified event processing engine.
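
To show what one of those window patterns does, here is a minimal Java tumbling window in the spirit of EEP, though not its actual API: buffer N events, emit an aggregate, then start again with an empty window. A sliding window differs only in that it evicts the oldest event rather than clearing the buffer.

    import java.util.ArrayDeque;
    import java.util.Deque;

    // Tumbling window of fixed size: emit, then reset.
    class TumblingWindow {
        private final int size;
        private final Deque<Double> buffer = new ArrayDeque<>();

        TumblingWindow(int size) {
            this.size = size;
        }

        void onEvent(double value) {
            buffer.add(value);
            if (buffer.size() == size) {
                double sum = buffer.stream()
                        .mapToDouble(Double::doubleValue).sum();
                System.out.println("window sum: " + sum);
                buffer.clear(); // tumble: the next window starts empty
            }
        }
    }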

I have another one that I call Beam, which is kind of a streams-and-pipes thing. It's an interpretation of everything else you get inside an event processing engine, but a much simpler one; it's the simplest, most trivial implementation of a CEP engine I could build. Basically, it just does pipelining, branching and combining of things, and you plug in everything else as an operator. It was less than 150 lines of code. So it doesn't do anything by itself; it's not very useful. Kind of like Haskell, except less useful than Haskell, a lot less, and I'm not as intelligent as a lot of the Haskell crew. But that seems to be getting some interest as well, and I think everything I left out is mainly why people like it: it's embeddable, and they can change it and adapt it to their needs. So they're more like social experiments than anything else. It's on GitHub; have a play and have some fun.
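
As a rough illustration of that pipes-and-operators shape, a sketch of the general idea rather than Beam's actual code, a pipeline can be little more than function composition, with every operator plugged in from outside:

    import java.util.function.Function;

    // A pipeline is just composed stages; operators are supplied by
    // the user, so the core stays tiny. Branching and combining are
    // omitted for brevity.
    class Pipe<A, B> {
        private final Function<A, B> stage;

        Pipe(Function<A, B> stage) {
            this.stage = stage;
        }

        <C> Pipe<A, C> then(Function<B, C> next) {
            return new Pipe<>(stage.andThen(next));
        }

        B push(A event) {
            return stage.apply(event);
        }
    }

    // Example: add 20% tax, then round to two decimal places.
    // new Pipe<Double, Double>(x -> x * 1.2)
    //         .then(x -> Math.round(x * 100) / 100.0)
    //         .push(10.0); // => 12.0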


12. Do they draw inspiration from Rx or from other sources?

I actually recommend that if you're using .NET, just use Reactive Extensions in production. It's awesome. If you're in Java, it looks like RxJava is something I need to check out for my own needs. But in terms of what I find interesting: being able to do a sliding window faster than anyone else is interesting, and the way commercial event processing engines do it is actually wrong. It has exponentially degrading performance: as the size of the windows you're processing increases, performance degrades exponentially. So if the computation you're running inside that window is exponentially degrading, it's not going to work for windows that have a large N, a large size.

So I've been trying to look at fixing some of these issues. I'm looking at what goes into a good sliding window and what new properties I can discover, and I've discovered one or two. You can actually do this in pretty much amortized linear time, but only for certain types of functions or calculations. There's a physicist in Europe and one in Waterford in Ireland who are quite interested in that aspect of this, and I'm learning a lot there, because I missed one of the properties, for example, which I'm talking about in my Erlang talk at the Erlang user group at QCon tomorrow. What's interesting to me is that there are a lot of clever people out there, and if you do re-evaluate something, someone will find some aspect of it that you missed. And then it's just an experiment. It's fun.
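
A sketch of why only certain functions allow this, using addition as the example: because addition has an inverse, a sliding-window sum can be maintained in O(1) per event, amortized linear over the stream, by adding the new value and subtracting the evicted one, with no re-scan of the window. This illustrates the general trick rather than his actual algorithm, and it does not carry over to non-invertible aggregates such as max.

    import java.util.ArrayDeque;
    import java.util.Deque;

    // Sliding-window sum maintained incrementally: no recomputation
    // over the whole window as it slides.
    class SlidingSum {
        private final int size;
        private final Deque<Double> window = new ArrayDeque<>();
        private double sum = 0.0;

        SlidingSum(int size) {
            this.size = size;
        }

        double onEvent(double value) {
            window.add(value);
            sum += value;
            if (window.size() > size) {
                sum -= window.remove(); // evict the oldest value in O(1)
            }
            return sum;
        }
    }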

Werner: Well, it means we have a lot to check out and thank you, Darach.

You're very welcome. It's my pleasure.

May 09, 2013
