Ok, I am a consultant, not an analyst. Everybody thinks I am an analyst, but I am a consultant at the Burton Group, which is a research and advisory company, an analyst company if you will. It has a consulting arm, which I am a part of. We are segmented; I am in the Application Platform Strategies group, which focuses basically on development in the enterprise, as opposed to, say, Network and Identity and other services that we have. I have been with Burton a little over three years now. Prior to that I was with Systinet. I know you are familiar with Systinet, but they were one of the first providers of a commercial SOAP, WS-* implementation, not that there were that many WS-* specs when they started. They are now part of HP; they were bought by Mercury. Since I will presumably be talking about REST and web services: prior to that my background was only tangential at Cisco, but I was at Netscape for a while, where we got to play with some CORBA and of course all the early web stuff. And then it goes backwards into places that nobody cares about. My background, as far as web services go, really starts with Systinet in early 2002.
Yeah, I mean I guess I am definitely a vocal REST proponent and, I hope to a lesser degree, a WS-* basher. But I wasn't always this way. When I was working for Systinet I was a true believer, but then, as many people have experienced, the more you try to use it, the more time you spend bashing your head against the wall, especially in those early days when interoperability was incredibly difficult. Then you dive deep into the XML schemas and specifications, and you look really closely at the WSDL specification, and you find that there are a lot of cockroaches in the corners. And while I knew of REST, I wasn't paying attention to it. I knew it was there, and I knew it was something I had to look at, but I didn't really start looking at it until I started getting pretty frustrated with WS-*, with SOAP I should say, because it was really SOAP at the time.
My movement into the REST camp was lengthy, I think like everybody's. It started not as looking for a pure architecture to build scalable distributed applications; it started with "I need something simpler than what I've got". So I started looking more closely, and the understanding only comes over time; I still have "Aha" moments. But over time you learn that it's not just a simpler SOAP, which is where I approached it from originally, but that it actually is a complete application protocol for a distributed computing system. And it has a lot of very interesting properties around it. And then when you dive deep into it and actually try writing some code, it becomes much more attractive.
Is there something fundamentally wrong with it? Let's imagine even the simplest case, where it's the same platform talking to itself, client and server: you are using .NET on both sides, or you are using one of the Java implementations on both sides, so interoperability is not a concern; that's basically what I am saying. Though I think interoperability issues still exist and probably always will to one degree or another, in that case, when everything is working as advertised, you can't say that there is something fundamentally wrong with it. But what you can say is that it is fundamentally mispositioned. Despite the ambiguity around the term, SOA – Service Oriented Architecture–
I think it is safe to say that the two go hand in hand, SOA and WS-*. One of the things SOA advocates is that you make as much business logic a service as possible. Everything is a service, right? And if everything is a service and the WS part is assumed, then it is assumed that every service should be a WS-* implementation, a SOAP service. And that is fundamentally wrong, because SOAP has properties that do not make it ideal for general consumption or large scale consumption in the enterprise or without.
What I would say is, if you are building services explicitly for re-use, or maybe you're a little bit more enlightened and you are building them for their possible emergent properties, what you want is a platform or an architecture that gives you by default the scalability that you may need later on, gives you the accessibility that you really want now, and gives you a large number of non-functional requirements that are very important and shouldn't be shoehorned into the architecture at a later date. And you reserve your use of SOAP WS-* solutions for those needs where REST doesn't apply, for instance pub/sub, and for those solutions where REST doesn't apply and you have reasonably tight control over both sides of the conversation; in other words, you, or your team, or a very closely related team are building the client and the server in lock step with each other. Because, frankly, they are just too tightly coupled to be used as a service for general consumption.
I miss that one. When I was selling Systinet's products I talked about loose coupling, and when I present information for Burton Group on SOA I certainly stress "Try to do your best to loosely couple client to server", because obviously loose coupling is a very good idea. The issue, and I have said this before in print, is that you can strive for the largest amount of loose coupling possible in the SOAP WS-* world, and when you do the best job possible and obey all the best practices, you still end up with a tightly coupled system.
And you know the answer, but for the audience out there, the fact of the matter is that when you are creating a client for a SOAP web service, you have to know a great deal about that service. Whether you learnt it via WSDL or some other mechanism, you have to know a great deal about it, and generally if the service changes in any way, even a not terribly significant one, your client has to change with it or it will simply stop working. You can go through heroic efforts to keep the existing clients alive, and you can do your best at the design and development phase to loosely couple them, but the fact that the client has to know all the operation names and all the message formats before it can get any value out of a web service is tight coupling to me.
Whereas, in contrast, all I need to know about a RESTful web service is its URI. Now, I might not be able to derive all of the business value that the service can offer, but I can "GET" it and maybe something interesting will come from that, and it may be all the information I need. So a properly designed RESTful system is dramatically loosely coupled, whereas a properly designed SOAP WS-* based system is unfortunately tightly coupled, and all you can do is make your best effort to avoid coupling more tightly than necessary.
5. Can you give us a three minute introduction to REST?
Yes, it actually can be done. The way I do it is by stealing Dave Megginson's elevator pitch that he had in a blog post, I don't know, less than a year ago. He has a one bullet point elevator pitch for REST and it is "every piece of information has its own URI". And I like to just say that to some people and then not say anything for a few minutes and let them ingest that. Because it's extremely powerful and it has implications. It means that in the REST world you have a large universe of services, resources if you will; even in what you might think of as a standard simple enterprise system, something that does HR type stuff, you can have literally millions of resources.
And each one of them is URI addressable. It sounds unmanageable, but obviously every URL is a resource on the web and there are trillions. And we manage … Now if you think that your sales for the last quarter have a URI and you can get that in your browser, in a shell script, a Ruby script, Excel, and then do interesting things with that information, maybe just memorize it: "Look at our sales for Q2". But if the sales for Q3 and Q4 and all of 2006 and yesterday and last week, every basic division of information that is reasonably important to somebody, has a URI, then everybody in your enterprise, or everybody who has access to this system, these resources, has the information they need at their fingertips.
No longer is information in a walled garden in your section IT group, or in your shadow IT group; it's a URI away. If you think about it, it's what Google does for you on a daily basis: every time you want to know something, you go to Google and Google it up, and it sends you to Wikipedia or some relevant blog entry or relevant page. If I could do that in the enterprise I think that would be fantastic. I know that Google's search appliance isn't quite the same thing as Google, obviously.
But imagine if every single piece of information in your enterprise had a URI, and then you spidered it, and then you had a little search box on your portal web page, and you typed in sales Q2/06 and the results came up, and there they were: in XML format, in HTML format, as a PDF, in somebody else's mashup, in an internal blog post where somebody says: "The Q2 numbers were fantastic". I just think that's eye opening. That was probably more than three minutes, but it was really kind of a rant. The main point is you've got a lot of resources and every one of them is individually accessible.
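As an illustration of "every piece of information has its own URI", here is a minimal sketch in Python. The `/sales/<year>/<quarter>` path scheme, the function names, and the figures are all assumptions made up for this example; nothing here comes from a real system.

```python
# Hypothetical in-memory "enterprise data". The figures are placeholders.
SALES = {
    ("2006", "q1"): 1200,
    ("2006", "q2"): 1350,
}


def sales_uri(year, quarter):
    """Build the (assumed) URI that names one quarter's sales figure."""
    return f"/sales/{year}/{quarter}"


def resolve(uri):
    """Map a URI back to the single piece of information it identifies."""
    parts = uri.strip("/").split("/")
    if len(parts) == 3 and parts[0] == "sales":
        return SALES.get((parts[1], parts[2]))
    return None  # unknown resource


def all_uris():
    """List every addressable resource, as a spider might discover them."""
    return sorted(sales_uri(year, quarter) for (year, quarter) in SALES)
```

With this sketch, `resolve("/sales/2006/q2")` returns the Q2 figure, and `all_uris()` is the list a crawler could index behind the search box described above.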
7. Right. Like flickr and del.icio.us and many others.
Yeah, ok, all the examples that I held up as a way not to do things. The answer is pretty straightforward: HTTP is a specification, in a sense a contract, and it defines how you interact with HTTP based resources. And it says that GET is used for retrieving information safely, where safe means without significant change to server side state. Now you are free to make GET do anything you want, and in the del.icio.us or flickr examples they obviously create and delete resources because they're basically tunneling actions into the URI, and if you do so that's your prerogative.
The problem with that is that the contract that I have with you, HTTP, says that I am free to GET anything I want, and if my GET happens to destroy resources or create them on your side and that's not what you wanted, then it's your problem, it's your fault. That sounds kind of arcane, I guess, but I think the best example is the Google Web Accelerator example from roughly two years ago. Google is out there spidering, and your browser is out there pre-fetching; Firefox can fetch pages ahead of time, because the web server can tell it that a person who visits this page is very likely to go on to that page, so go get it now, and then you click on it and it pops right up.
And there are all sorts of times when non humans are trolling through the web, and they're issuing GETs by the truck load. And in general it doesn't hurt anything because those destructive GETs tend to be behind log-ins. But what happened with the Google Web Accelerator thing is that it basically was a proxy on your desktop so when you logged into some application, so did Google and it started spidering, and pre-fetching and proxying the website and started deleting stuff all over the place. The Rails people who were deeply affected by this made some very quick changes, everybody learnt a very valuable lesson, but the lesson is that everybody and everything is free to do a GET, and if that GET does something it shouldn't and you think it's safe because it is behind a log-in, it's not. So obey the rules, play safely.
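The difference between a tunneled, destructive GET and a safe one can be sketched in a few lines of Python. This is a toy model, not how del.icio.us, flickr, Rails, or the Google Web Accelerator actually worked; the class names, the `/items/<id>` paths, and the `prefetch` helper are hypothetical.

```python
class TunnelingApp:
    """Anti-pattern: the action is tunneled into the URI, so a GET can delete."""

    def __init__(self):
        self.items = {"1": "first", "2": "second"}

    def handle(self, method, path):
        parts = path.strip("/").split("/")
        # e.g. GET /items/1/delete destroys item 1
        if method == "GET" and len(parts) == 3 and parts[2] == "delete":
            self.items.pop(parts[1], None)
            return 200
        if method == "GET" and len(parts) == 2:
            return 200 if parts[1] in self.items else 404
        return 405


class SafeApp:
    """GET only reads; destruction requires an explicit DELETE."""

    def __init__(self):
        self.items = {"1": "first", "2": "second"}

    def handle(self, method, path):
        parts = path.strip("/").split("/")
        if len(parts) != 2 or parts[0] != "items":
            return 404
        item_id = parts[1]
        if method == "GET":
            return 200 if item_id in self.items else 404
        if method == "DELETE":
            existed = self.items.pop(item_id, None) is not None
            return 204 if existed else 404
        return 405


def prefetch(app, links):
    """Stand-in for a spider or accelerator: it only ever issues GETs."""
    return [app.handle("GET", link) for link in links]
```

Running `prefetch` over the delete links of a `TunnelingApp` empties its data, while the same crawl against a `SafeApp` leaves everything in place, because nothing it can reach with GET changes server state.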
8. How do I actually map my business operations to REST?
Yeah, this is the sticky part. It's not sticky technically, but this is the point where people who might be understanding or seeing the value have to cross a bit of a bridge. For years and years everybody has been writing objects, classes with methods, and in SOAP and similar worlds, the CORBA and DCOM world, you are writing services with operations, and it is very difficult to try and change your point of view and say "I used to have a service called 'pay bill' or an operation called 'pay bill', but now I only have GET, PUT, POST and DELETE and there is no way I can make this work". You do have to learn how to do it.
I found that once you have it, it becomes very difficult to think the other way, so you really are kind of altering some path in your brain. But the trick is to think of everything as a noun, and to realize that you have an infinite universe of nouns at your disposal. So, in the instance that I just made up of pay bill, in the SOAP world, the service world, you would have a billing service, and you would be able to retrieve prior bills with a 'get bills' that needs a unique identifier or a date for a bill, and then you would have 'pay bill' and maybe 'set up recurring bill payments' and all that stuff.
Whereas in the resource world, you have to think that there is this thing out there called the bill; an invoice is probably easier to say, an invoice. And in fact there are lots and lots of invoices. If it's, say, my cell phone bill, I have one every month, so I know that for as long as I have been a customer of my cell phone provider, my wireless provider, I have invoices out there, and they are sitting there, and when I want to get them I simply get them: "get, in my case, Cingular.com/invoice/whatever date" and it comes.
And when I want to pay it, in the case of Cingular, what I have is some kind of payment resource that I would post my check to. And it will apply it and do stuff to backends. One easy way of doing it is to think about what happens if you are living in a paper world and you only have to deal with humans: you have bank tellers and you have deposit tickets, and you have a lot of nouns, and the action is really just handing these deposit tickets and money over to these nouns, and then the back-end processing happens and something comes back the other way. It's a mind shift, and it's not necessarily easy, but I think if you do one or two reasonably real world RESTful applications the shift is made in your head.
That's a question I am not so sure I am going to be able to answer extremely well. This is the kind of business/IT alignment question. In the business world I have identified a service that might make a payment to whoever you owe money to. And then I go to the IT world and I say we may need a 'make payment' service, and I line all those things up perfectly so that the IT group is doing exactly what the business group does. I think that sounds really nice, and I certainly think that mapping is appropriate when you are gathering requirements and doing business analysis with the customer, the business unit. But when it is time to implement that in software, I don't necessarily see that the business wins because it happens to have a service of a known name, simply because they really can't get to it.
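The invoice and payment nouns above can be sketched as follows. This is a minimal in-memory model, assuming an `/invoice/<date>` path like the one mentioned and a hypothetical `/payment` collection; the class, the field names, and the amounts are invented for illustration.

```python
import itertools


class BillingResources:
    """Toy resource server: invoices and payments are nouns with URIs."""

    def __init__(self):
        # Hypothetical data: one monthly invoice with a placeholder amount.
        self.invoices = {"2007-05": {"amount_due": 60}}
        self.payments = {}
        self._ids = itertools.count(1)

    def get(self, path):
        """GET: read a resource without changing server state."""
        kind, _, key = path.strip("/").partition("/")
        store = {"invoice": self.invoices, "payment": self.payments}.get(kind)
        if store is None or key not in store:
            return 404, None
        return 200, dict(store[key])

    def post(self, path, body):
        """POST to /payment: create a new payment resource."""
        if path.strip("/") != "payment":
            return 405, None
        invoice = self.invoices.get(body.get("invoice"))
        if invoice is None:
            return 400, None
        payment_id = str(next(self._ids))
        self.payments[payment_id] = dict(body)
        invoice["amount_due"] -= body["amount"]  # the "back-end processing"
        return 201, f"/payment/{payment_id}"  # URI of the new payment
```

There is no 'pay bill' operation here: paying is a POST of a payment document to the payment collection, and the reply names a new resource that can itself be fetched with GET.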
But the business would then definitely have another win if you showed them the same functionality with all of the properties of the web at their fingertips. Not only is there a series of resources out there that you can get – you wouldn't say this to your business, you would say "Just put this in your browser", and a web application might come up, and remember it's possible to have a web application that is entirely one hundred percent RESTful.
So you might design it in such a way, and I think people get websites, and I think people get URIs, 'get' in the understand/fathom sense of the word. And I would imagine, though I haven't seen this firsthand, that if you mapped the business requirements to a web based system that was entirely RESTful, the business unit would get it, and be twice as happy or unreasonably happy that they actually can get it with a capital GET. I don't think there is a lot of benefit in saying that we have created something in software called 'make payment' and you have a business process called 'make payment' and therefore everything is great. I think the idea is to discover the business process, then go back over here and map that to the technical process that has the best win.
Well, there is a whole bunch of them, at least a dozen or so. Take the most popular ones, and some people would say that you can't prove this, but we can: if you obey the constraints of REST, that is to say "You shouldn't do this and you should do that", if you obey them all, then those constraints have been put there to induce what I like to call non-functional requirements into your business application. And those properties, those non-functional properties, start with scalability, which I give two definitions for.
One is the ability to host a large number of clients, for whatever value of large you want, and the other is the ability to have a large number of clients, which is really important: if you can host a hundred thousand customers on your RESTful web server, the next important thing is to build it so that you get a hundred thousand people to interact with your service. So there's scalability; there's performance, which is primarily influenced by caching, which is built deep into the bones of REST and HTTP; and there is simplicity of architecture, not necessarily of development, and this is especially true if you are a client side developer. The point is that we don't want to hide the network from you, and so you actually have to do more work as a RESTful developer of clients than you would by simply spitting out code from a WSDL.
Simplicity of the architecture. Modifiability, which has a number of properties: the ability to extend a system by simply introducing new resources; the ability to evolve the system over time without impacting the clients, or at least without impacting all of the clients; and the ability to customize behavior at runtime, all come out of the RESTful properties. What do I say now? Simplicity, scalability, modifiability, performance; the rest are escaping me right now. It's a powerful list of things about which you would generally say: "I want that in my application", and if you do everything by the book then you get that more or less for free.
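Caching being "built deep into the bones" of HTTP can be illustrated with a conditional GET. The sketch below models only the ETag / If-None-Match exchange and its 304 Not Modified reply; the `Resource` class and `conditional_get` function are hypothetical, and real servers also handle weak validators, lists of tags, and Cache-Control, which are left out here.

```python
import hashlib


class Resource:
    """A resource whose entity tag is derived from its current body."""

    def __init__(self, body):
        self.body = body

    @property
    def etag(self):
        digest = hashlib.sha256(self.body.encode("utf-8")).hexdigest()[:16]
        return f'"{digest}"'  # entity tags are quoted strings in HTTP


def conditional_get(resource, if_none_match=None):
    """Return (status, headers, body) roughly as an origin server might."""
    headers = {"ETag": resource.etag}
    if if_none_match == resource.etag:
        # The client's cached copy is still current: send no body.
        return 304, headers, ""
    return 200, headers, resource.body
```

A client or intermediary that kept the `ETag` from its first 200 response can revalidate cheaply; it only receives the full body again once the resource has actually changed.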
The answer is yes and no, depending on which one of those you pick. I mean, for transactions, for instance, I don't think that transactions are missing. I actually have a feeling, I could be wrong, I don't think I am, that nobody is actually going to use distributed WS-* transactions, simply because that's never going to scale in any meaningful way, horizontally or vertically. So that's a case of "you aren't going to need it", I think. But if it turns out that you do need it, we actually have some nice patterns for you, and if you don't want to use the RESTful patterns for transactions, nobody is going to come and arrest you for cheating and inventing your own kind of transaction processing on the fly if you care to.
But you also said security, and I would say there is kind of a big gaping hole as regards security, in a sense. REST doesn't really talk about security. We talk about security for HTTP, and that amounts to SSL, HTTP Basic Authentication and HTTP Digest; really, that's about it. Now the good thing is that SSL just rocks my world. It's proven, it's been beaten to death for over ten years, trillions of dollars of business are conducted over SSL, it does its job. In fact there is a very strong argument that says you don't have security unless you have SSL. You combine SSL with HTTP Basic and you have a lot of what you need. You can meet more than ninety percent of all your security needs.
But ten percent is not an insignificant number and there are some things that I would like to see in the security picture that aren't there yet. One is the SSL problem of blasting right through all your intermediaries. One of the REST properties is visibility to monitor intermediaries and that makes caching work and that makes all proxies work, and with SSL you just don't get it, it just ploughs right through them from endpoint to endpoint unless you terminate the connection and redo it on the other side. But again a lot of people would say that's exactly the behavior I want, I don't want any man in the middle to be able to see it.
It would be nice if we had a mechanism to do some of the things we might need, like encryption and signing without making the entire data stream opaque, especially signing, because if you want encryption you really want encryption end to end, not just the message body but the headers as well. And this is what Jim Clark is working on these days and we will see if anything comes from there. The other issue is HTTP Basic. I mean everyone knows it's not secure but it's fine with SSL, but it's limited in the fact that it's using name/password. You can't easily do multi factor authentication without breaking out of HTTP. You can't easily do clever things like giving a certain amount of authority to others on your behalf.
If you have a web based calendar, with HTTP I can't necessarily give you limited access to my calendar … certainly you can write an application that does that, but that's behind the scenes. We can use a few things; covering ninety percent of the use cases and kludging the other ten percent has gotten us very far, but I would like to see more.
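Why HTTP Basic is "fine with SSL" but not secure on its own is easy to see from how the header is built: the credentials are only Base64-encoded, not encrypted. The sketch below uses Python's standard `base64` module; the two helper names are hypothetical, and the sample credentials are the well-known example pair from the HTTP Basic specification.

```python
import base64


def basic_auth_header(username, password):
    """Build the value of an HTTP 'Authorization' header for Basic auth."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def decode_basic_auth(header_value):
    """Recover the credentials, as anyone who can read the header could."""
    scheme, _, token = header_value.partition(" ")
    if scheme != "Basic":
        raise ValueError("not a Basic credential")
    username, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return username, password
```

Because `decode_basic_auth` needs no secret, the header is only as private as the connection carrying it, which is where SSL comes in. The scheme also carries nothing but a name and a password, which is the limitation around multi-factor and delegated access described above.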
Shine up my crystal ball here. I don't know how good my predictive powers are, but if I were forced to make a seven cent bet on the future, as we do internally at Burton Group, it has always been seven cents, then I would say: is WS-* going to disappear? No. Is it going to be significantly marginalized? Yes. I don't think it will be marginalized to the point that CORBA has been today; obviously there are still lots and lots of CORBA systems running and being maintained, but no new CORBA development is happening. I don't think it's going to go that far.
I think WS-* will be used in those use cases I mentioned before, where you absolutely need some of its functionality and you have tight control over the developers of both the clients and the services and you can move them all forward at the same time. In those enterprise cases where you really, really need MQ Series, in the future you might be able to get by with a web services solution. And that might provide more value. I am the person most reluctant to chuck out a working system, and if MQ is meeting my needs, great. But if it turns out that the ability to view the messages in flight is important to me, or the ability to run those things through some intermediary and perform other business processes and technical processes on them is important,
I think that's going to have some value. People will continue to do WS-* simply because it's baked right into the Microsoft DNA right now, and similarly in Java. IBM is still pushing it, so is Oracle, so is Microsoft, so it is not going away any time soon. But it will be increasingly marginalized to doing what it does well. And on the other side of the coin, as far as REST goes, I would love to say that everybody will be RESTful, in the strict sense of the term, in all other use cases. I predict, however, that everybody will start using the web, but there will probably be a large number of flickrs and del.icio.uses created internally, because they get the job done or because the development team hasn't invested in really understanding it. We will probably go through a phase where people start doing it, do it incorrectly, and then learn over time.
So I do predict an increasing marginalization of WS-* and an increasing acceptance of REST, but I think the acceptance of REST will be inelegant for a while, until ultimately people get it and it just becomes as natural to us as writing a SQL query or using object oriented programming. We had to make that jump from C, or whatever you were writing in, to Java, from structured languages to object languages, and I am sure there is a lot of really bad OO code out there from 1995. But now everybody does it without thinking. I didn't make this leap myself, but a lot of people made the leap from non SQL databases to SQL, and now it is natural. In a few more years REST will be natural too.