I work for a small consulting organization called "ThoughtWorks", who you may have heard of, and I do a lot of SOA and Web services work for them, particularly with an emphasis on dependable systems. Maybe it's because I am a pessimist, but I look for those kinds of situations when things go wrong and figure out ways of mitigating that kind of risk.
A lot of the SOA projects I have seen have been somewhat akin to mobilizing an army. You have hundreds of consultants and a whole bunch of armaments in the form of huge, sophisticated middleware platforms. The whole thing is very heavyweight and cumbersome. I feel that when you go for those kinds of big, upfront SOA deployments, you lose a lot of opportunities to prioritize, to deliver business processes based on your business priorities and your business values. The Guerilla SOA aspect tries to turn that around a little, so we're looking for much more lightweight engagements, if you like, in military terms. We want to address specific, discrete business problems, prioritized by the business stakeholders, and get those processes implemented rapidly, in an incremental way, with lots of feedback. So we can actually start to prioritize across the business (which processes are the most valuable, which ones are most heavily used) and implement those first, without having to wait for a big program of work to be established to put the enterprise service bus in place, or for other kinds of technical dependencies. So it's tongue in cheek really, but it's a hit-and-run, deliver often and incrementally kind of SOA option.
The nice thing is that it works in either, because you have specific priorities that the business gives you at any given time and you focus on those. As long as the business can keep coming to you and saying "I now need this process implemented", you can scale up ad infinitum, until the point where you have automated all of the processes of a given domain in a given business, so it scales fairly well.
I think the approach helps us to decouple dependencies between what the business people want and the tools we have available. So although I poke fun at the ESBs and so on, I would absolutely use those tools where it makes sense to me to implement the process that I have been instructed to automate. If it doesn't make sense then I won't use them and I will use other tools. I will use anything from simple Java apps right the way through to full message broker, store-and-forward-based architectures where it makes sense within my current context. But the important thing is that I don't let my current development context bleed. I don't let those abstractions leak into other development projects.
We like to keep each process that we are implementing relatively isolated, so that the service ecosystem grows and can be reused. It has this emergent behavior that we never expected. On the other hand, if we allow everything to bleed together in a big SOA platform, you tend to get tight coupling, and that restricts your options for evolution further down the line. It also restricts your options for this kind of interesting emergent behavior, which you and I as geeks couldn't see but the business people could, because they have this much broader view of processes as a whole.
5. So you mentioned tight coupling as a risk. Can you elaborate on that?
Sure. It is the classic scenario. If I've got two systems which are tightly bound and I change one, I risk breaking the other. We saw this back in the day with CORBA applications, where we tightly coupled through IDL, and we see it today in Web services, where we tightly couple through another IDL called WSDL. If we are sharing type systems and I want to change my type system in my program, that can have a ripple-through effect which is going to hurt you. So when I come to you and say "I'm going to make changes", your first reaction is "No, because you're going to break me!" and then we get into this paralysis, where neither of us can make progress because we're so scared of damaging each other. Then you need strong governance and so on, and someone to come with a strong arm and make both parties move. It's a reluctant, high-friction environment to be in, and yet had we decided not to share technical abstractions at that level, chances are that we'd be much freer to evolve and innovate locally without disturbing or breaking anyone else globally, because the abstractions we use internally would be different to the abstractions we share with other services around the ecosystem.
6. So what would be an alternative to that approach?
An alternative to the approach of sharing type system, for example?
Instead of sharing types, I think we should start to share business messages, or schemas for business messages, owned not by the technical people, but by the business people. That gives me as a developer of a service an interface which I can make sure that I adhere to, and a contract I can honor in my service implementation. You can also see that contract in your service implementation and you can understand that you're going to get these kinds of messages in and out. The point being that the technical abstractions you are using to implement that service (you may have some interesting class hierarchy) are never exposed, so I can never bind to them, so we never get coupled at that level. The coupling we have is just on the messages we depend on.
You look at my service's contract, you see the messages that come in and out of my service, and somewhere deep in the bowels of my service I have ways of extracting the information and using it to do some processing, and you do the same in your service. In between we have this very neutral integration domain, which is just the business messages as recognized by the business stakeholders. Another benefit of that is that the business stakeholders can tell you when you've got things right and when you've got things wrong, which is tremendously difficult if you are using lower level abstractions like the type system. Because the business guys know that this message used to get sent by fax from Sydney to London and they knew the semantics of that, and if you can show them the same thing in your automated electronic workflows they can say: "Yes. That's right!" or perhaps even more valuable: "No you have got it wrong! Stop! Do it this way!" So you don't go off on a tangent building a solution to what you think is the problem, you build a solution to the actual problem.
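As a minimal sketch of sharing messages rather than types, the Python below has two services that agree only on the shape of a purchase order message and each keep a private internal model. Everything in it is a hypothetical illustration: the field names, the two services, and the use of JSON as a stand-in for whatever schema language the business messages are actually written in.

```python
import json
from dataclasses import dataclass

# Hypothetical business message. This document shape is the only thing
# the two services share; neither sees the other's classes.
PURCHASE_ORDER = json.dumps({
    "orderId": "PO-1001",
    "customer": "ACME",
    "lines": [{"sku": "WIDGET", "quantity": 3, "unitPrice": 2.50}],
})

REQUIRED_FIELDS = {"orderId", "customer", "lines"}


def validate(raw: str) -> dict:
    """Check a raw message against the shared contract (field names only)."""
    doc = json.loads(raw)
    missing = REQUIRED_FIELDS - doc.keys()
    if missing:
        raise ValueError(f"not a purchase order, missing: {sorted(missing)}")
    return doc


@dataclass
class BillingOrder:
    """Private model of the billing service; free to change at any time."""
    reference: str
    total: float


def billing_service(raw: str) -> BillingOrder:
    doc = validate(raw)
    total = sum(line["quantity"] * line["unitPrice"] for line in doc["lines"])
    return BillingOrder(reference=doc["orderId"], total=total)


@dataclass
class PickList:
    """Private model of the warehouse service; a different view of the same message."""
    skus: list


def warehouse_service(raw: str) -> PickList:
    doc = validate(raw)
    skus = [line["sku"] for line in doc["lines"] for _ in range(line["quantity"])]
    return PickList(skus=skus)
```

Either service can rename or restructure its internal class without the other noticing, as long as the message itself keeps its agreed shape.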
8. So you mentioned services and messages as two abstractions? What about operations?
Operations are an abstraction which I do not believe exists in a service oriented architecture. They may well exist in your implementation of a service but that is nothing that I want to share with you. This is a technical detail which is my business inside my implementation. When I think about an SOA, I like to think about the notion of letter boxes. So all I can do is deliver a message to you and at some point you might open it, read it, think: "Yes, I understand what that message is." and then you will go away and process it - or not. If I send you a nonsense message you may be graceful enough to fault and tell me so, but literally we don't have any tight coupling in the form of an operation abstraction. I can't invoke you because for all I know you are in a 3rd party system in a different organization so I don't have that strength. You are not a local object to me, we don't have a call stack, I can't poke you. All I can do is request: "Could you possibly have a look at this message and maybe if it suits you do some processing on it", rather than the more tightly coupled operation abstraction.
Absolutely. And this is from a style of architecture which we called MEST or Message Exchange, which was a deliberate paying of respect to REST, where some of our inspiration came from, in so far as this letterbox is a uniform interface through which we poke messages. If we mapped it onto HTTP it would be a POST. You can also use SMTP SEND or whatever else you choose. The message would contain two things: the business payload, which is effectively the purchase order, the invoice, those kinds of things that business people process, and, potentially anyway, some metadata which sets the processing context for that payload. So it may set security context, it may set transaction context, that kind of thing. The MEST idea is that I'm delivering you a message; you are going to go away, set the context for processing that message, examine that message, find out whether it makes sense, and go away and process that message. End of story. At some point later a message comes into my letterbox, I open it and say: "Ok. That's from Stefan." And I know what this means.
It's actually correlated somehow, typically with WS-Addressing RelatesTo and so on, with that message I sent him earlier. Now I can go into my implementation and finish the processing I was doing, which originally caused the message to be sent to you. And that's a really nice decoupled way of doing things. I'm not binding to you directly; the only things I'm binding to in a technical sense are messages which are in my stack, in my process space, which is very safe to bind to. Whereas if we go back to the operation abstraction, if I'm bound to you and invoking you, and for some reason you're down because the network is down, or you're in a different company and the firewall rules got restrictive, suddenly I break; I get this horrible "Internet timed out" exception or something meaningless. If I'm just treating messages going up and down in my stack as the things I use to cause processing, or things that I create as a side effect of processing, it's actually a robust pattern for implementing individual services as well as a nice decoupled, scalable pattern for building up service ecosystems.
Sure. The commonalities are pretty obvious: a uniform interface. REST has five operations that each resource implements; in MEST every service has one interface, which is effectively "poke a message in here". The differences are that MEST is very much more akin to traditional MOM: it's about passing messages over some transport, whereas REST uses the hypermedia engine. They are kin because they both aim for large, scalable systems, but whereas RESTful systems tend to look like the web, MESTy systems tend to look like TCP: just make a connection, post the message, close the connection, that kind of thing. So there are similarities, and I think both models have been proven out. Tongue in cheek, I'd say TCP happens to be slightly bigger than the web, so maybe the MEST solution is more scalable, but I'm not out to upset the REST jihadists at this point.
Right. WSDL is an IDL. WSDL's abstractions are operations. It has some other drawbacks in so far as it's quite a verbose IDL. I think the difficulty comes when you start to get past "StockQuote" web services and you need to be able to have a conversation, a long-lived conversation with a service, which WSDL doesn't have the abstractions to support. The longest conversation you can have with a WSDL-described service is request/response. Some time ago this started to become quite a chafing limitation for me and some other guys, Savas Parastatidis and some of the guys working at CSIRO in Sydney, Australia, and we decided we were going to do something about it. That is when we wrote SSDL. SSDL has a spectrum of possibilities: at one end it's just a less verbose replacement for WSDL 2, completely isomorphic to the capabilities that WSDL 2 gives you, and at the other end it's a superset of what's available in WS-CDL, WS-BPEL and WSDL.
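The letterbox idea can be sketched in a few lines of Python. This is an in-memory illustration under stated assumptions, not a MEST implementation: the `Letterbox`, `OrderingService` and `SupplierService` names are hypothetical, a `deque` stands in for the transport, and the `relates_to` field only mimics the role that WS-Addressing RelatesTo plays on the wire.

```python
import uuid
from collections import deque
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Message:
    payload: dict                                   # the business document
    metadata: dict = field(default_factory=dict)    # processing context (security, transactions, ...)
    message_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    relates_to: Optional[str] = None                # id of the message this one answers


class Letterbox:
    """The uniform interface: all anyone can do is deliver a message."""

    def __init__(self):
        self._queue = deque()

    def deliver(self, message: Message) -> None:
        self._queue.append(message)

    def collect(self) -> Optional[Message]:
        return self._queue.popleft() if self._queue else None


class OrderingService:
    def __init__(self):
        self.letterbox = Letterbox()
        self.pending = {}      # message_id -> work waiting for a correlated reply
        self.completed = []

    def place_order(self, supplier_box: Letterbox, order: dict) -> str:
        request = Message(payload=order)
        self.pending[request.message_id] = order
        supplier_box.deliver(request)   # no call stack shared with the supplier
        return request.message_id

    def process_next(self) -> None:
        message = self.letterbox.collect()
        if message is None:
            return
        order = self.pending.pop(message.relates_to, None)
        if order is not None:           # uncorrelated messages are simply ignored here
            self.completed.append((order, message.payload))


class SupplierService:
    def __init__(self):
        self.letterbox = Letterbox()

    def process_next(self, reply_box: Letterbox) -> None:
        message = self.letterbox.collect()
        if message is None or "item" not in message.payload:
            return                      # nothing to do, or a message we don't understand
        reply_box.deliver(Message(payload={"status": "accepted"},
                                  relates_to=message.message_id))
```

Note that `place_order` returns as soon as the message is delivered; the ordering service only finishes its work when a correlated reply later turns up in its own letterbox, so a slow or unreachable supplier never surfaces as a failed invocation.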
So we are able to describe long-lived conversations between multiple web services in a structured way, in a way that we can verify that that conversation won't deadlock so we can put it through model checkers and so on; so we can actually get a whole lot of static analysis about how end-to-end systems are going to look and still support this notion of quite intricate conversations with a service. A typical message exchange pattern in SSDL might be two requests, followed by seven responses, followed by another request, an optional response, and three more requests. And we can build arbitrary conversation patterns in it, which is really good when you think that most web services are going to be used to host business processes and most business processes are workflows which have this kind of more chatty or conversational kind of interaction pattern which is really difficult to capture in WSDL being limited to requests and responses. So SSDL gives you this capability to describe workflows effectively which I think is going to be one of the sweet spots of the SOA web services going forward.
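To give a feel for what describing and checking a conversation might involve, here is a toy Python sketch. It is not SSDL syntax and not the pi-calculus analysis mentioned above: it assumes a conversation can be approximated by a finite state machine, and the `PROTOCOL` table, message names and function names are all hypothetical.

```python
# Hypothetical conversation for one service. Each state lists the messages
# allowed next: "in" is a message the service receives, "out" one it sends.
PROTOCOL = {
    "start":    {("in", "Order"): "ordered"},
    "ordered":  {("out", "Quote"): "quoted"},
    "quoted":   {("in", "Accept"): "accepted", ("in", "Reject"): "done"},
    "accepted": {("out", "Invoice"): "invoiced"},
    "invoiced": {("in", "Payment"): "done"},
    "done":     {},
}
FINAL = {"done"}


def conforms(trace, protocol=PROTOCOL, start="start", final=FINAL):
    """Return True if a sequence of (direction, message) steps is a complete, legal conversation."""
    state = start
    for step in trace:
        next_state = protocol[state].get(step)
        if next_state is None:
            return False
        state = next_state
    return state in final


def stuck_states(protocol=PROTOCOL, start="start", final=FINAL):
    """Return reachable states from which the conversation can never complete."""
    # Forward pass: which states can the conversation actually reach?
    reachable, todo = {start}, [start]
    while todo:
        for target in protocol[todo.pop()].values():
            if target not in reachable:
                reachable.add(target)
                todo.append(target)
    # Backward pass: grow the set of states that can still reach a final state.
    can_finish = set(final)
    changed = True
    while changed:
        changed = False
        for state, edges in protocol.items():
            if state not in can_finish and any(t in can_finish for t in edges.values()):
                can_finish.add(state)
                changed = True
    return reachable - can_finish
```

An empty result from `stuck_states` says only that this one state machine can always run to completion; checking that several services' conversations fit together without deadlock is a harder problem, which is where model checkers come in.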
No. SSDL was an effort by some academic researchers and practitioners to see what a contract metadata language would look like, if we were freed from the tyranny of the operation abstraction. Right now it's been in the community for a couple of years, it's got some pretty good feedback, a lot of the web services guys know about it and have commented favorably about it, but it doesn't have the backing of any of the large vendors, although some of the people involved in it now work for large vendors and large research organizations, there is nothing official. Our hope originally was maybe we can just inspire some thinking in the vendors that are providing tools, so the vendors can give us tools that do this workflowy type stuff. And that's happening, there has been some discussion in the community, but now to keep momentum going the community itself has started to develop tools.
13. Are there any implementations of SSDL yet?
Yes. When we first released SSDL, Savas Parastatidis of Microsoft had a simple SSDL tool that would do some basic contract generation, validation and so on. But more recently Patrick Fornasier of the University of New South Wales in Sydney has built a complete SSDL stack on top of Windows Communication Foundation. That stack is fabulous, a really neat piece of engineering; it looks like WCF and it behaves like WCF, so the programming experience is consistent, friendly and familiar. Currently it only implements the part of SSDL which looks like WSDL, but the framework is extensible enough that you can then implement what we believe to be the higher-value aspects of SSDL, the pi-calculus-based stuff that enables you to describe choreographies. Patrick has been kind enough to open-source that, and as of a couple of weeks ago there is a SourceForge project where people can contribute. Hopefully that toolkit will go on to become richer and richer, and in my ultimate fantasy scenario it just becomes a de facto standard that people use when they are going to build WCF web services.
14. Any hope of a similar thing for Java yet?
It's something we have been thinking about (my colleagues at ThoughtWorks in Sydney, Josh Graham and those guys). We meant to do it; in fact, when Indigo was being built we had the first skeleton of a project called Dingo, a kind of tongue-in-cheek version of Indigo, which we were going to make SSDL-centric, and that has languished a little because we have day jobs to do. My hope is that Soya, which is the WCF SSDL toolkit, starts to get some momentum, and that folks on the Java side, maybe folks like Arjen [Poutsma] who have seen SSDL and have spoken favorably about it, may just build up the Java side of the stack, or the guys in the Ruby community may just build up something on the Ruby side of the stack. So it's optimism, maybe unfounded.
I think as a developer, my day to day work is really constricted by the fact that the tools I'm given all reinforce the RPC mindset, and I have to fight really quite hard to beat the RPC mindset down and try to make these current tools behave in a more messagy way. Over the years we've developed a bunch of patterns, for example, for using tools like Axis, which are very RPC-centric, and being able to abstract away that there's an RPC interface here and turn it into a more message-passing kind of system. I don't think most developers who are under the cosh will necessarily have the time or the inclination to do that, and when the vendors come along and say: "Yes. Just take your EJBs, put them through this machine and out comes your WSDL and there is a SOA", I think they find that appealing because they have got a million other things they have to do. So I don't necessarily despair that we will never move toward a more asynchronous messaging environment. I think maybe we'll get burned a few times with some famous web services failures, where an RPC-style implementation built out a system which is not very evolvable, which is high friction and so on, before people start to think: "Yes, I need better tooling." You see tentative efforts in this area, like the Spring Web Services stack for example. That's going to be a bit rude for a lot of developers, but you can see the same kind of messagy ideas ("Here is a lump of XML, deal with it!") starting to percolate now into more mainstream frameworks, so I'm cautiously optimistic that it may not be a bleak RPC future.
Absolutely. And that's the beauty of it. You forget that it's there because it's just a means of transferring a message from a sender to a recipient with the implicit request or hope that the recipient would process that message in some meaningful way in his context. So if I could get away with having a uniform interface which has zero logical operations I would be happy with that. Unfortunately my own mental crutch was I thought services implement an operation "process this message for me please" and it takes "message" as its parameter and it returns "message" as a result. So for me that was kind of a crutch.
Absolutely. In the REST world you are typically taking advantage of existing web infrastructure, so you can do idempotent GETs, cache the results from those GETs and so on, and get performance optimizations in that way. In a messaging world you can't, because we see the mechanism of transfer between two services as a pipe. That may be an HTTP pipe, as it commonly is today, in which case maybe the transport does optimizations, but we are specifically decoupling the notion of the message from the way it's transported. You can implement optimizations at the transport level, but that doesn't affect the message payload and vice versa, so those issues are decoupled in the MEST world.
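The "process this message for me please" crutch can be sketched as a single entry point that takes a message and returns a message. This Python is an assumption-laden illustration: the message shapes, the `QuoteRequest` type and the handler names are hypothetical, and the internal dispatch table is exactly the kind of implementation detail a consumer would never see.

```python
_PRICES = {"WIDGET": 2.50}  # hypothetical private data of the service


def _handle_quote_request(body: dict) -> dict:
    sku = body["sku"]
    return {"type": "Quote", "body": {"sku": sku, "price": _PRICES[sku]}}


# Private dispatch table: consumers never learn these handlers exist.
_HANDLERS = {"QuoteRequest": _handle_quote_request}


def process_message(message: dict) -> dict:
    """The one logical operation the service exposes: message in, message out."""
    handler = _HANDLERS.get(message.get("type"))
    if handler is None:
        # A nonsense message gets a graceful fault, not an exception at the caller.
        return {"type": "Fault", "body": {"reason": "message not understood"}}
    try:
        return handler(message.get("body", {}))
    except KeyError as missing:
        return {"type": "Fault",
                "body": {"reason": f"missing or unknown value: {missing}"}}
```

Adding a new kind of message means adding a handler inside the service; the interface seen from outside does not change.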
Of course, I should use this opportunity to draw out an inconsistency within the REST camp: they assume a lot of REST practitioners are making use of the valuable features that the web provides, when the fact is that most people now are tunneling XML over HTTP, or tunneling method plus parameters in a URL, and that's an even more horrible form of RPC than you can achieve with SOAP and WSDL. At least with SOAP and WSDL RPC you can get tool support to generate stubs and skeletons, which is something you can't do with REST RPC. (Trademark, just invented that.) Mark Baker has been very good at advocating the benefits of REST, and yet now that it is in the hands of developers, the same kind of degenerative behavior that bugged web services is now bugging the REST community. It's going to be an interesting learning curve for REST people when they realize they have to start marking resources as cacheable to exploit the web, and that they have to have structured, readable URLs in order to identify resources in some sane, meaningful way within their application context. And they can't just use HTTP as an XML tunnel, because there are no benefits to that over and above any other RPC technology.
Absolutely. I really think that WSDL 2.0 is better than WSDL 1.1. However I'm yet to see any services being built using WSDL 2.0. The cycle between WSDL 1.1 and WSDL 2.0 has been so many years. In fact I was working in the UK when I remember e-mailing the WSDL working group saying: "Let's call this 2.0." and that must have been at least three years ago, and that is a long time to wait between releases. My concern for WSDL, as much as I have concern for WSDL, which isn't very much, is that it has been such a long cycle that WSDL 2.0 is in danger of being irrelevant or stillborn, because WSDL 1.1 does the same things. Any manner of cleaner syntax, more strongly defined MEPs and so on really doesn't deal with what most developers are dealing with now, which is the operation abstraction, which WSDL 1.1 covers reasonably well.
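What "marking resources as cacheable" might look like can be sketched with standard HTTP headers. The Python below is a framework-free illustration under stated assumptions: `get_resource` is a hypothetical handler, the five-minute `max-age` is arbitrary, and the entity tag is simply a hash of the body.

```python
import hashlib


def get_resource(body: bytes, if_none_match=None, max_age=300):
    """Answer a GET so that browsers and intermediaries are allowed to cache it.

    Returns (status, headers, body) for a hypothetical resource representation.
    """
    # Entity tag derived from the representation; it changes whenever the body does.
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    headers = {
        "Cache-Control": f"public, max-age={max_age}",
        "ETag": etag,
    }
    if if_none_match == etag:
        # The client's copy is still current: send no body at all.
        return 304, headers, b""
    return 200, headers, body
```

A client that echoes the `ETag` value back in `If-None-Match` receives a 304 with an empty body, which is the kind of saving that tunneling XML through POST gives up.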