CORBA Guru Steve Vinoski on REST, Web Services, and Erlang


1. I'm here with Steve Vinoski, one of my childhood heroes. What are you up to these days?

I can't really say what my company does. I left IONA Technologies in February and the new company is in stealth mode, so the founders don't want any details about it to be leaked out, but I can tell you that I'm having a lot of fun. It's like a breath of fresh air. This is very different compared to ten years at IONA and I'm having a lot of fun.

   

2. Can you tell us if it's in any way related to middleware or some new kind of Distributed Objects?

No, it's a totally different industry. I started life as a hardware engineer so there are some hardware guys involved and it is sort of back to some of my roots. I'm not working on the hardware, but there is middleware work involved. I've gone from being a vendor to being a user.

   

3. One could say that maybe this is reflected in the statements on your blog, which is fortunately available again. You said some not too nice things about vendors, middleware, WS-* and ESBs. Can you elaborate a bit?

I think if you go back and read my columns from Internet Computing from four years ago (in fact the first REST column I wrote was five years ago), some of them have been like this: "This is a good way of doing things using WSDL as an abstraction". Some of the other columns said: "This is not really standardized; there are too many specs, and all the usual vendor wars and clashes". I haven't really been kind to it all along, but I couldn't really say what I really felt being part of IONA because that was their business. I think once that weight was lifted from me I became able to say what I really felt. It's not too far off from what I said before; it's just that now it is completely honest. I have no agenda.

   

4. If you were now an architect in a large company faced with designing an architecture for a set of systems or a large distributed system, what would you choose?

I would look at REST to begin with. If you look at SOA it is more about business, about culture. It's all about how do we get our business to work together, how do we make things work together and make shared components that we can all reuse and how can we avoid duplicating effort and stuff like that. It's more about culture than it is about technical architecture. Some people talk about technical SOA, but technical SOA really depends on the product that you're using because every product is different. SOA isn't specific enough from a technical perspective to make them all look the same. Then you turn around and look at REST and it is a whole new architectural style; it's all about constraints and what you get from applying those constraints. Someone has gone to all the effort of applying a bunch of constraints to a distributed system and getting the desirable properties as a result of doing that. Why should I go and think I can do any better? The work has been done for me, and it's also defined loosely enough that if I have to tweak those constraints, I can do that. Just from a pure engineering cost perspective it makes sense to look at REST in my opinion.

   

5. So what would be the use cases where you'd use CORBA?

I would use CORBA if I had to talk to something already written and using CORBA. I started working with CORBA in 1991 and it is still around, and I just recently got a royalty check from the book that Michi Henning and I wrote. It's not as much as it used to be but I'm not going to turn it down. There are industries that still use CORBA and those interfaces are not going to go away tomorrow; they are going to be around for probably 5 or 10 years. If I had to talk to something that was built using CORBA I'd use CORBA. If I was doing some very small-scale system where the developers were familiar with the approach, I would use it, but if I had to build an enterprise-scale system I would look at REST.

   

6. If we are to talk about one difference, one topic that comes up often in the discussions about REST is that there is no description, no contract apart from the one defined in the REST dissertation, which is the generic one. Don't you perceive this as a problem because CORBA is so strong in this regard with IDL? Is this something that's missing?

I've given that a lot of thought, as you might imagine. In CORBA there are obviously different layers, different areas that one can work on. I've worked on pretty much everything, but when I was working on CORBA I focused mostly on IDL and mapping it to languages. For what it was, I think we did a reasonable job. I know there are a lot of people who have a problem with the C++ mapping, but it is written for very strong C++ programmers. I personally don't have any problems with it. There's a problem if you have to define something in IDL just to know how to use it. That doesn't really work. No one takes an IDL and says: "Here's this method, I call this and I pass this. I just look at the IDL and I know what to do".

Nobody does that. IDL is really for code generation. If I want to know how to use a service, whether it has IDL or not, I go and talk to the developers if they're nearby; if they're not, I look at their documentation. Think about the REST services of Amazon or Google or any other site: they have documentation on the web, I go look on the web, I read it and I figure it out. I don't know if having an IDL would help. The interface is fixed - it's HTTP verbs. You have to deal with data definitions, and the data definitions, the media types, are usually types registered with IANA; if you want to know how the data looks you go and look at those media types or MIME types. I don't see it as being the same kind of problem as the CORBA style of Distributed Objects.
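
A minimal sketch of the "fixed interface" idea described above, in Python. It is not from the interview: the helper name build_request, the resource URL, and the choice of the Atom media type are assumptions made for illustration. The point it tries to show is that every resource is addressed with the same few HTTP verbs, and only the media type says what the data looks like.

```python
from urllib.request import Request

# The fixed interface: the same small set of HTTP verbs for every resource.
UNIFORM_INTERFACE = ("GET", "PUT", "POST", "DELETE")

def build_request(method, url, media_type, body=None):
    """Build (but do not send) an HTTP request for any resource.

    Hypothetical helper, for illustration only.
    """
    if method not in UNIFORM_INTERFACE:
        raise ValueError(f"not part of the uniform interface: {method}")
    # With a body, the media type describes what we send;
    # without one, it describes what we are willing to receive.
    header = "Content-Type" if body is not None else "Accept"
    return Request(url, data=body, headers={header: media_type}, method=method)

# Hypothetical resource URL; example.org is reserved for documentation.
ORDER_URL = "http://example.org/orders/42"
fetch = build_request("GET", ORDER_URL, "application/atom+xml")
update = build_request("PUT", ORDER_URL, "application/atom+xml", b"<entry/>")

print(fetch.get_method(), fetch.full_url)
print(update.get_method(), update.full_url)
```

There is no per-service operation such as getOrder to generate code for here; a client that knows the verbs and the media type has everything the interface itself defines.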

   

7. One of the main arguments I hear is that if you use a typical statically typed language like Java or C++ then from the code generation step what you get is type safety when you build up those objects that you exchange when you call those implementations. If you don't have a description language that can generate the code, you don't have your code completion in your IDE and all the stuff that we've gotten used to.

I suppose there is something to that, but I don't use IDEs. I've been having a discussion with a former colleague of mine about that in my blog comments, where they said "You should be using IDEs and everything". I've always used Emacs. I've tried using Eclipse and it does some things nicely but I guess I'm just an "old dog". When it comes to the type safety problem you can call it pseudo-type safety at best, because I can take a message that was supposedly type-safe in my client application and send it to your server, and your server can be compiled with completely different definitions and still be able to read those bits off the wire, and somehow they look like they fit your message definition, where the two definitions could be completely different. Similarly, your object or service or whatever it is that I'm getting type safety from using an IDL could have a completely different type in reality than what I have in my client, because it is all distributed. Your versions change at a different rate than mine do ... it's sort of pseudo-type safety at best. But I think that whole thing turns the whole equation around, because you're building a distributed system; you're not building a local program and distributing it, but you're building a distributed system and you happen to be writing pieces of it with the language that you've chosen. I think the focus should be on the distributed system, and making a particular language easier to use in that context is the wrong focus. I know a lot of people disagree with me.
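
The "pseudo-type safety" argument can be illustrated with a small Python sketch. It is not from the interview, and the message layouts and field names (order_id, quantity, price) are invented for the example. A client packs a message according to its own definition; a server built against a different definition reads the same bytes without any error and gets a meaningless value.

```python
import struct

# Client's definition of the message: order_id (int32), quantity (int32).
CLIENT_FORMAT = "<ii"
# Server's definition, from a different version: order_id (int32), price (float32).
SERVER_FORMAT = "<if"

# The client builds a message that is perfectly well-typed on its side.
wire_bytes = struct.pack(CLIENT_FORMAT, 42, 3)

# The server decodes the same 8 bytes with its own definition.
# Nothing fails: the bits fit, but the second field is nonsense.
order_id, price = struct.unpack(SERVER_FORMAT, wire_bytes)

print(len(wire_bytes), order_id, price)
```

Both sides were type-checked locally, yet the quantity 3 arrives as a tiny floating-point "price". Whatever safety the compilers provided ended at each process boundary.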

   

8. You spent quite some time discussing dynamic languages. Can you elaborate a little bit on that? I wouldn't have expected it from an old C++ programmer to suddenly switch to Ruby.

By "old" you mean that I've been using it since 1988, right? Not that I'm old ... I've been a C++ programmer for a long time, but I've also been a dynamic language fan for a long time. My degree is in Electrical Engineering but I've never taken any form of computer science classes. I always felt I had to learn computer science on my own and back when I was teaching myself different languages, C & C++ primarily, I didn't have anyone else around to bounce ideas off because I was in a hardware group. When I joined Apollo Computer in 1987 I started working with some software people, but they were primarily embedded developers mostly using Assembly language and some using C. I started using C++ and that just freaked them out. C was radical to be used in that environment, and C++ was completely off the charts.

I didn't have anyone to bounce these ideas off. Maybe I was missing something; I should be looking at all kinds of languages, not just these. So I just studied languages constantly on my own. I looked at pretty much everything. Not that I developed real applications in them, but at least I read books about them. I also got involved with Unix early on. There was a hardware test machine that I had to use; I had Berkeley Unix running on it so I learnt Unix on my own, learning all the tools of Unix, the greps and the seds and the awks. When Larry Wall came out with Perl I looked at it and I said "Well, there's all this other stuff I've learned, but it's all in one language". In 1988 I ported Perl to Domain OS, which is Apollo's operating system, and I think you can still find my name in the Perl source for doing that. The dynamic language stuff goes way back to the same year I started using C++. It's not a new thing; I've done it all along.

   

9. When you mentioned that instead of using CORBA you would now use REST, is the same true for the language thing as well? Would you now rather use Ruby or another dynamic language instead of C++ or Java?

I do tend to look at those languages first; sometimes they are not the right language. What I like to do is take multiple languages and just have them at my fingertips and look at a problem and say: "What's the easiest way to solve this? What language would make this easiest to solve?" Not only easiest to solve but easiest to maintain going forward, easiest to extend. I look at the problem domain, I look at the languages I have in my toolbox and choose the right one. I prefer dynamic languages just because they are so capable and so brief; you can write programs that are at least an order of magnitude smaller than Java, C++ or C and still do the same thing. They are fast. People tend to say they are slow but that's not usually true. Some are slower, some aren't. Python is very fast. I don't rule out Java or C++. I'm not a big Java fan, to be honest, because if I want to use something like that I think I will go to C++. If I want something that's totally different than C++ I go to the dynamic language side. Java for me is too close to C++ to make that much of a difference.

   

10. You spent a lot of time playing with Erlang recently. I don't know whether playing is the right word, but I saw you implementing Tim Bray's Wide Finder. Can you give us a little background both on the Wide Finder idea in general and on your experience with Erlang?

I've been looking at Erlang for a couple of years actually. I haven't been using it for a couple of years, but probably two years ago I started seeing references to it. Usually someone says "There's this language you should look at it" and my initial reaction is "Ok I will take a look". If I don't see an immediate use for it, I'll get back to my real work. That is what happened, but it sort of intrigued me because of the reliability and concurrency aspects that it has. Being a long time middleware developer I spent a lot of time trying to make sure that things are production-hardened. Getting messages from here to there, translating data, that's the easy part.

It's when the thing has to stay up, it has to fail over in case of problems with one of the nodes, or you need fault tolerance, all the reliability issues, and then the whole concurrency thing, which is where you spend a lot of time just figuring out ... I've got to lock this piece of data, shared across these threads, and if I miss one, bad things are going to happen. Those are two hard problem areas middleware developers deal with constantly. I look at Erlang and it is sort of built in. I thought that might bear more investigation, so I kept looking at it. When I was at IONA I was working on the Advanced Message Queuing Protocol (AMQP) implementation that Apache is working on; it's called the Qpid project. I was working on that and someone asked me to look at making it fault tolerant. I said "If you are going to make it fault tolerant you should be doing it in Erlang, it would save a lot of trouble."

Two weeks later a company called Rabbit MQ comes up with an Erlang version of AMQP. They had obviously been working on it for a while. It's still around and people are using it. I guess I wasn't too far off the mark there. When it came to Tim's Wide Finder ... Tim Bray works for Sun and he wanted to analyze his weblog; probably a quarter gigabyte of data for the smallest log, a lot of data to analyze. He thought: "Sun has this new machine coming out. How could I make use of a language like Erlang to parallelize the analysis of this data?" He wrote an Erlang program and he was very unhappy with it. If you go back in his blog you can see he's quite unhappy and he thinks Erlang is not what it's cracked up to be.

I saw that and I thought that maybe I could do a little bit better. I started working on it, and other people in the Erlang community were working on it. We just saw the time dropping. I think Tim's initial stab took 30-40 seconds to analyze this particular data set. I got it down to 2-3 seconds. A real Erlang person took it over and he got it down to around 0.8 seconds. I think now the fastest implementation of Tim's system is in something called OCaml, which is another functional language; Python is number two and Erlang is number three. A lot of people say that Erlang can't do file IO and that it's really bad at that, but obviously it must be ok at that because it is pulling in these huge data files and analyzing them for the top ten hits on Tim's website.
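
For readers unfamiliar with the task, here is a rough single-process sketch in Python of the kind of analysis described: count article fetches in a web server log and report the most requested ones. It is not one of the implementations mentioned in the interview. The URL pattern, the log line layout, and the article names in sample_log are assumptions for illustration only.

```python
import re
from collections import Counter

# Assumed pattern for an article fetch; the real log format is not given here.
ARTICLE = re.compile(r"GET /ongoing/When/\d\d\dx/(\d\d\d\d/\d\d/\d\d/[^ .]+) ")

def top_hits(lines, n=10):
    """Return the n most frequently fetched articles as (name, count) pairs."""
    counts = Counter()
    for line in lines:
        match = ARTICLE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts.most_common(n)

# Hypothetical log lines, invented for this example.
sample_log = [
    'host1 - - [01/Oct/2007] "GET /ongoing/When/200x/2007/09/20/Wide-Finder HTTP/1.1" 200 1234',
    'host2 - - [01/Oct/2007] "GET /ongoing/When/200x/2007/09/20/Wide-Finder HTTP/1.1" 200 1234',
    'host3 - - [01/Oct/2007] "GET /ongoing/When/200x/2007/10/01/Other-Post HTTP/1.1" 200 999',
    'host4 - - [01/Oct/2007] "GET /ongoing/When/200x/2007/10/01/picture.png HTTP/1.1" 200 50',
    'host5 - - [01/Oct/2007] "GET /favicon.ico HTTP/1.1" 404 0',
]

for name, count in top_hits(sample_log):
    print(count, name)
```

The loop itself is trivial; the interesting part of the exercise is splitting a very large file across many cores and merging the per-chunk counts, which this sketch deliberately leaves out.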

   

11. Do you see this as something that will continue to happen, that languages become more powerful instead of a general purpose language with a huge set of libraries, tools or middleware that sits below it or adds to it? Is this a trend that languages include features that we expect to be in libraries?

There are a couple of things going on in the whole concurrency thing with multi-core systems ... when you have two cores you can take any old application, throw it on that machine and it's going to do ok. When you have eight cores it gets a little more interesting because you can see that some of them are kind of idle when you run your application. If you don't have the right language to take advantage of that, then your application can use only one of the cores. There's nothing the operating system can do to help you because it's not going to take your application and break it up for you. You have to explicitly go in and make it multi-threaded. Threads in languages like Java and C++ are fairly heavyweight. Even though they are lighter than processes, they are still heavyweight. It takes something like Erlang, or languages like it that have very, very lightweight threads; it's able to run 50,000-60,000 threads on my MacBook Pro easily. It is a very different style of language.

Then there's also object-oriented versus functional, and there seems to be a resurgence in functional languages right now. I don't know why that is; it may be because they are so small, in that you can do so much in just a few lines of code. And even languages like Ruby and Python have functional aspects to them; that may be what's driving it. I think there's a bit of a resurgence in language design and people looking at languages. There has always been C as the assembly language for higher-level languages; not only C++ but others like Python, Perl, etc. are all built on top of C. There's a lot going on in Java. Java is like the assembly language for the JVM, which becomes the VM for a number of languages, like Scala, Groovy, and Jython. People are moving in these two directions, and it's the same direction in fact: building smaller languages better suited to specific problems on top of these general purpose languages underneath.

   

12. Of all those languages that you mentioned which one would you recommend?

I think over the past decade or two there has been a search for the language. A lot of people felt C++ was maybe the language that people should be using; then Java came along and a lot of people latched on to Java. I've met many programmers who seem to know only Java. If you start recommending to them that maybe they should start looking at other languages, some of them get argumentative and they say "Java can do it all!" I think if you were to talk to the people who built these languages they would never claim that their language can do it all. All through this, there have been the multi-language communities that have been rolling along working on these other little languages. Erlang is twenty-one years old, Smalltalk has been around forever and people still use it. I think because no language can do it all, developers really owe it to themselves to learn multiple languages and be able to integrate them.

When you have that choice, when you have a toolbox full of languages and you have a problem and you solve it in two lines of Ruby versus two hundred lines of Java, it's a really nice feeling. It just makes you a better developer because you start to see how idioms in different languages can be applied and you learn from different languages. In Python there are list comprehensions, which are very cool; there's one line that can do all kinds of stuff iterating over a list. Erlang has the same thing. You go to Erlang and you say "That's a list comprehension that's almost the same syntactically and does the same things". It's not like every language is a whole different world that you have to completely start from scratch. You learn one, you see some of its idioms, you start to learn another, and you see similar things.
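
As a small illustration of the list comprehension idiom mentioned above (the example values are invented, not from the interview), here is a Python one-liner that filters and transforms a list, with the equivalent Erlang expression shown in a comment for comparison.

```python
numbers = [1, 2, 3, 4, 5, 6]

# Keep the even numbers and square them, in a single expression.
even_squares = [n * n for n in numbers if n % 2 == 0]

# The equivalent Erlang list comprehension reads almost the same:
#   [N * N || N <- Numbers, N rem 2 =:= 0]

print(even_squares)  # [4, 16, 36]
```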

Switching from an OO language to a functional language is going to be a little bit different. Languages like Ruby and Python in particular cross those boundaries, and using those you can get a lot of work done and also expand your own horizon at the same time. In terms of concurrency, if you're writing middleware I think you owe it to yourself to look at Erlang. The language itself has the primitives, then there are libraries called the Open Telecom Platform that come with it, that build on those primitives to make reliable software almost simple. It's never simple, but compared to what you have to do, jumping through hoops in other languages, it's kind of a no-brainer. So - there is not one language, look at all of them.

Feb 26, 2008