Neal Ford On Programming Languages and Platforms

1. I am with Neal Ford at QCon San Francisco. Neal, why don't you start by telling us about yourself and what you have been up to?

I work for ThoughtWorks, we are a global consultancy company and my title at ThoughtWorks used to be Application Architect, but too often in the US that title means post-useful or it means I wrote some rocking good code 10 years ago in COBOL. One of the advantages at ThoughtWorks is that you can pick your own title, so I've made up a title of Meme Wrangler: Meme as a unit of thought - that's the mathematical term for unit of thought, unit of information - and Wrangler because you wrangle stuff. So my official title at ThoughtWorks is ThoughtWorker and Meme Wrangler, but the role that I play most of the time is architect, technical leader on projects and I do that in .NET occasionally and Java and now a lot of Ruby projects.

   

2. I heard that you are interested in alternative languages and the JVM, can you tell us about that?

That's one of the really interesting spaces in Java right now because Java actually consists of 2 different things: it is a language, but it is also a platform. If you look at the Java language, it looks pretty old, I mean it looks like a 12 year old language. We've learned a lot about languages in the 12 years since the Java language came out, but it started out with a lot of baggage because it was trying to be backwards compatible with C and C++ to attract developers from those communities.

The problem with that backwards compatibility is that at some point it ceases being backwards compatibility and becomes just baggage, and Java has a lot of baggage. It needed that stuff when it first came out because it had to attract C and C++ developers; if it hadn't, it probably wouldn't have been very successful. But there is an inflection point where that switches over from being an advantage to being a disadvantage, and we are clearly past it now, even in the core Java language stuff, even simple things like counting from 0.

Counting from 0 makes no sense whatsoever in a language without pointers. In a language with pointers like C it makes perfect sense, because the array is actually just a pointer to a chunk of memory, and the index number times the size of the elements of the array is the offset into that chunk of memory, so it's a nice, efficient way to create arrays. But in Java we don't have pointers, and so it doesn't make any sense to have 0 based numbering in Java; we should have 1 based numbering, but they wanted that backward compatibility with C, so now we are left with that baggage. Fortunately though, Java is in 2 pieces: language and platform.
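The offset arithmetic behind 0 based indexing can be sketched in a few lines. This is only an illustration, written in Ruby rather than C: a packed byte string stands in for the "chunk of memory", and the names `ELEMENT_SIZE`, `memory` and `element_at` are made up for the example.

```ruby
# Pack three 32-bit little-endian integers into one contiguous byte string,
# standing in for the chunk of memory behind a C array.
ELEMENT_SIZE = 4
memory = [10, 20, 30].pack("l<*")

# Element i lives at byte offset i * ELEMENT_SIZE from the start of the chunk.
def element_at(memory, index)
  offset = index * ELEMENT_SIZE
  memory.byteslice(offset, ELEMENT_SIZE).unpack1("l<")
end

first = element_at(memory, 0)   # offset 0: the start of the chunk
third = element_at(memory, 2)   # offset 8
```

With this scheme index 0 is simply "no offset at all", which is why counting from 0 falls out naturally once an array is a pointer.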

And we can take advantage now of the Java platform and start using the Java language as the assembly language of the Java platform and write in much more expressive languages like Groovy, for example, which is kind of Java with a modern dialect. Groovy is where they try to take Java and push it as far towards Ruby as they can and still have some of the same kind of common things that you are used to in the Java world. Or look at something like JRuby, which is a really nice, modern, super powerful language that runs on the JVM and now compiles to Java bytecode. Ruby is a much more expressive language than either Java or Groovy.

In fact, it's probably the most powerful mainstream language out there today and, as Java developers, we can take advantage of that because now JRuby runs on the Java platform and it's not like some sort of special thing in the Ruby world, it's literally just another port of Ruby to a different platform: just like you have a Windows version of Ruby and a Mac version of Ruby, you now have a Java version of Ruby. And that is really significant, because Ruby is such a powerful language. If you look at Paul Graham, he has a great blog entry, or rather essay, about the power of languages and he has a yardstick where he compares language power. If you look at something like Java or C# or modern statically typed languages, they score 4 out of 9 on Paul Graham's scale.

If you look at something like Ruby, it scores an 8 out of 9 on that scale, so it's a very powerful language and a lot of that is embodied in Ruby on Rails, which takes advantage of a lot of the super powerful stuff in Ruby. We've had a lot of experience within ThoughtWorks with Ruby, we are doing a lot of Ruby projects now and we have clients coming in and asking us to do Ruby projects. We have just released our first ever commercial piece of software, Mingle, which is an Agile project management tool. It is written in Ruby on Rails because we wanted to get it to market fast, but it is deployed on JRuby, because the JRuby deployment scenario was much easier than deploying it in the standard Ruby on Rails way, with hosting and Mongrel clusters and all that stuff.

So we get to take advantage of the best of both worlds: we've got the productivity and the power of Ruby on Rails, but we've got the convenience of deploying this thing on the Java platform. I think that is something we are going to see more and more. I coined this term, about a year ago, of polyglot programming, where a polyglot is someone who speaks many languages. I think that this idea that there is one perfect language for everything is flawed. I don't think that is true, because there are lots of special purpose languages out there that make it much easier to solve particular problems. In fact, we already do this all the time, because virtually every application you write is written in some sort of base language like Java, but then you have to talk to a database, so you use SQL, and you have to write some sort of web UI, so you use JavaScript. Of course, XML has its own languages, because every XML document type has its own grammar and is therefore its own language.

We are pretty much already doing this, but we can take this a lot further now because of the way the Java platform works and because of these alternative languages on the JVM. I think you are going to see a lot more of that, of using special purpose languages to solve very specific problems. One of the problems we are going to run into really soon is this problem of concurrency. Because Moore's Law has stopped making processors more powerful by adding transistors, what they are doing now is adding cores, which means that our code has to be much more intelligent about multithreading. We don't have to worry too much about that right now, but 3 years ago we didn't have to worry too much about JavaScript either, because we had gotten our users used to looking at static, not very interesting web pages, and they were OK with that until the guys at Google came along and showed people that web applications don't have to suck anymore.

Then they came out with Gmail and now all of a sudden we have to be able to support the same kind of stuff, which is the rise of Ajax. I think the same thing is going to happen in the concurrency world. We don't have to worry that much right now, but somebody is going to come along with some sort of killer application which is really going to take advantage of the power of the processors we have now, and all of a sudden we are going to have to start worrying about it in our applications so that our applications don't suck again. Writing good multithreaded code is really hard, especially in Java. If you doubt that, there is a great book called Java Concurrency in Practice by Brian Goetz that explains once and for all how difficult multithreading is in Java.

The nice thing is that we don't have to do this in Java anymore. With this idea of polyglot programming, if you have a multithreaded part of your code that's going to be really hard to write in Java, don't write it in Java. Write it in something like Scala, which is a functional language that produces Java bytecode, or in Jaskell, which is a Java dialect of Haskell. Those functional languages are inherently thread-safe so you don't have to write any synchronized blocks or worry about contention, you just write your code and the language takes care of the concurrency problem for you.
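The reason a functional style needs no locks can be illustrated in a small sketch. Ruby is not a functional language in the sense of Scala or Haskell, so this only imitates the style: the input is frozen, the function is pure, and each thread returns a fresh value instead of mutating shared state. `PRICES` and `with_tax` are hypothetical names chosen for the example.

```ruby
# Immutable input shared by every thread.
PRICES = [100, 250, 75, 310].freeze

# A pure function: its result depends only on its argument.
def with_tax(price)
  price * 120 / 100
end

# Each thread works on its own slice and returns a new array.
# Nothing is mutated in place, so there is nothing to synchronize.
threads = PRICES.each_slice(2).map do |slice|
  Thread.new { slice.map { |price| with_tax(price) } }
end

# Thread#value waits for the thread and returns the block's result.
taxed = threads.flat_map(&:value)
```

The original list is untouched afterwards, and the result is the same however the threads happen to be scheduled.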

The problem, of course, is that it's hard to write an entire application in these languages, but you don't have to, because you can write just the nasty multithreaded part in the functional language, write the UI part in something like Ruby on Rails, write the messaging to the mainframe part in Java because you have already got a library for it, and deploy it all on the exact same Java Virtual Machine. That's actually a much easier problem than things like object relational mapping, because in that case you are polyglot programming but you are crossing machine boundaries, and that's always difficult. When everything runs on the exact same JVM, the mismatch problem is much less troublesome. You do have to worry a little bit about formats, but most of the languages that run on the JVM support and respect the Java variable formats, etc., so they very easily interoperate with the underlying Java stuff.

In fact, even in JRuby you can turn around and call Java libraries, if you want, and you can do things like map Ruby mixins to interfaces in Java. They've gone a long way and they've done a really good job of making a language that is so fundamentally different from Java, like Ruby, work with and interoperate very nicely with the Java Virtual Machine, meaning that now you can start taking advantage of all this really cool neat stuff and still deploy it on the JVM, but not frighten your infrastructure people to death because you are switching languages on them.

   

3. You mentioned Groovy, so what do you think is the importance of meta-programming?

Both in Groovy and JRuby, one of the things that they have over Java is the ability to do meta-programming, which is programming that actually affects your code. "Meta" means "outside" or "about" and just as metadata is data about your data, meta-programming is programming your program. It sounds like an esoteric computer science kind of topic, but it turns out that it's not. And that's one of the things that Groovy and JRuby really have over Java itself, because Java has very limited meta-programming support. You can invoke methods and inspect fields via reflection in Java and there are a few places where it makes good sense to do that: one of them is the ability to test private methods.

There is an add-on library for JUnit called JUnitX, which uses reflection to reach your code's private methods so you can test them. But Java is pretty locked down as far as meta-programming is concerned. You cannot, for example, synthesize new methods on the fly in Java because the language flat-out won't let you. That's one of the things you need to be able to do to make really powerful programs where you don't have to write a whole lot of code. That's an important distinction: it doesn't mean that a lot of the stuff that you do in Ruby is impossible in Java, but it's so incredibly difficult that you just never do it.

For example, you can actually write in Java a way to open a class and add new methods to it, but what you end up doing to make that happen is writing another language on top of Java, which is exactly what the Groovy guys did. They sit on top of Java and produce Java bytecode, but it's a completely different language syntax, so they are producing bytecode differently from what Java does. And they allow you to do things like open classes, so you can go in and reopen java.lang.String and add new methods to it, which is important because every Java developer on every project has a StringUtils class floating around.

That's crazy because Java is an object-oriented language and the way you extend things in an object-oriented language is to sub-class them, but the fundamental types, like String, you are not allowed to sub-class; they made it final. I don't know why. What they do is force you to switch to procedural coding and create a StringUtils class, when you should be able to sub-class String, or open the String class and make changes to it. The Groovy guys let you do that and so do the JRuby guys, and the way they both do it is by wrapping a proxy class around java.lang.String. When you make method invocations on java.lang.String it's actually making invocations on that proxy class, which is where you add these new methods. That's a really important capability because you need to be able to do that.
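An open class looks like this in plain Ruby. The sketch reopens Ruby's own built-in String, not java.lang.String through JRuby, and `initials` is a hypothetical helper of the kind that would otherwise end up in a StringUtils class.

```ruby
# Reopen the built-in String class and add a method to it.
# Existing String behaviour is left untouched.
class String
  # Hypothetical helper: first letter of each word, upper-cased.
  def initials
    split.map { |word| word[0].upcase }.join
  end
end

abbreviation = "java virtual machine".initials
```

Every string in the program, including ones created before the class was reopened, now responds to `initials`.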

On every project you'll ever work on, the stuff that is in String is not sufficient to do all the work that you need to get done. So Groovy takes meta-programming to a certain level and makes writing Java code much easier, but Ruby takes it even further, because in Ruby you can not only open classes but also do a lot of really sophisticated stuff at runtime and interact with the runtime in a way that neither Groovy nor Java will let you do. Ruby was designed to be this really super powerful meta-programming language, and so there are several things that you can do in Ruby that are either difficult or impossible in either Groovy or Java. One of them is the ease with which you can eval stuff: put code in a string and then turn around and execute this code.
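Evaluating code held in a string can be sketched with three of Ruby's built-in forms, which differ in the scope the string runs in. The `Account` class and its `rich?` method are hypothetical, made up for this illustration.

```ruby
# Kernel#eval runs the string in the current scope.
source = "1 + 2 * 3"
result = eval(source)

class Account
  def initialize(balance)
    @balance = balance
  end
end

account = Account.new(50)

# instance_eval runs the string with self set to the account,
# so the code can see that object's instance variables.
balance = account.instance_eval("@balance")

# class_eval runs the string inside the class body, so it can define methods.
Account.class_eval("def rich?; @balance > 1000; end")
is_rich = account.rich?
```

Here `result` is 7, `balance` is 50 and `is_rich` is false. Evaluating strings from untrusted input is unsafe, so this belongs in code you control.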

You can do that in Groovy, but you have to create a Groovy builder to be able to do that. In Ruby there are actually several different ways you can do that, which control the scope under which that code gets evaled. So you can coerce Groovy into doing something similar to what Ruby does there. But Ruby has a lot of sophisticated stuff behind the scenes as well, and this is one of the things that makes Java developers' heads explode. One of the things you can do in Ruby is open classes like ArrayList and add new methods to them. Maybe that's too frightening, maybe you don't want to add a new method to every instance of ArrayList. What you can do in Ruby is actually add a new method to a single instance of ArrayList.

It means that the new method only exists for that instance of ArrayList and that's a really funky thing to be able to do, but it turns out it's really useful. Ruby has this idea of a shadow meta-class: for every object there is a special class that sits behind it, where you can add methods to that instance alone. There are a bunch of places in Ruby on Rails that take advantage of this because it narrows the scope of the open class you are adding methods to. It allows you to add new methods to classes in a very dynamic way, in a way that most Java developers wouldn't even think of doing, because that facility just does not exist at all in Java, nor in Groovy.
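Adding a method to one instance only can be sketched as follows. The example uses a plain Ruby Array rather than java.util.ArrayList under JRuby, and `total` is a made-up method name.

```ruby
plain   = [1, 2, 3]
special = [1, 2, 3]

# Define a method on this one object only. It is stored in the object's
# singleton class, the hidden per-object class that sits behind the instance.
def special.total
  sum
end

total_of_special   = special.total
plain_has_total    = plain.respond_to?(:total)
methods_on_special = special.singleton_methods
```

The two arrays hold the same elements and share the same class, yet only `special` responds to `total`.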

Having a really powerful language for meta-programming is super important and here is why: for the last decade or so, the way that we have tried to achieve productivity in the software world is by creating these locked down environments: Java is an example of that, C# is an example. There is the idea that we are going to protect bad programmers from doing too much damage, and so you create strongly typed languages with static type systems and you restrict things like being able to subclass String and Object and some of the important internal types, because you are afraid the bad developers are going to make some sort of horrific mess out of it. Glenn Vanderburg, an acquaintance of mine, has this great quote which says "Bad developers will move heaven and earth to do the wrong thing".

And it is true, my experience bears that out too: you can create a really restricted environment just to keep bad developers out of trouble, but these restricted environments harm the productivity of your best programmers. Basically, you are not speeding up your bad developers and you are slowing down your best developers, and that's why our productivity stinks in software right now, but the attitude is changing. Ruby on Rails is a good embodiment of where I think we are going as an industry, which is away from creating a restricted, locked down environment and then adding capabilities to it with frameworks, which is exactly the way we extend Java. For example, AspectJ: aspects are the perfect example of the kind of facility that should exist in a language, but doesn't, and so you have to write a framework that actually weaves its code into your bytecode to give you those capabilities.

You don't need that in Ruby because the language itself has all that capability built in. The problem with things like AspectJ is that they are very complex: it's a completely different language syntax and a different compiler; it's a massively complicated thing. In Ruby you don't need that because the language supports that concept at the core. So, instead of creating a locked down environment and extending it with frameworks, what you do is the opposite: you take a super powerful language and write simpler abstractions on top of it, and that's exactly what Ruby on Rails is. Ruby on Rails is a domain specific language that sits on top of Ruby, which greatly simplifies building web applications in Ruby, so you don't have to understand meta-programming and all that other sophisticated stuff that goes on underneath in Ruby. But if you do understand it, you can take advantage of it when you need to.

So you can just drop one level of abstraction down into Ruby and get the really cool stuff done that you need to get done to make your life a lot easier. Let me give you a concrete example of that. On one of the projects that I was doing a little bit of work on recently, the guys didn't like it when tests took a long time to run, so they did a lot of work speeding up the tests. They also realized that when you run unit tests you don't want them to talk to the database, because it slows things down, so they wrote a little bit of meta-programming code so that if anybody did accidentally talk to the database during a unit test, it would just fail the test.

And then they realized that in the functional tests you want to test real things, you don't want to mock anything out, so they also wrote into this the ability that if you are running a functional test and you accidentally mock something out, it will fail the test. It took very few lines of code to do that because they have access to that super powerful meta-programming stuff underneath the simpler abstraction of Rails, and not everybody on the project had to understand exactly how they did that, but everybody on the project could take advantage of it and use it. In fact, a couple of guys who worked on this, Jay Fields and Dan Manges, released this thing as a RubyGem to the rest of the world.
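The idea of failing any unit test that touches the database can be sketched in a few lines of meta-programming. This is not the code from that project or from the released gem: `Connection`, `DatabaseAccessInUnitTest` and `forbid_database_access!` are hypothetical names standing in for a real database layer and test setup.

```ruby
# Hypothetical stand-in for a database connection class.
class Connection
  def execute(sql)
    "ran: #{sql}"
  end
end

class DatabaseAccessInUnitTest < StandardError; end

# Called once from the unit-test setup: reopen the class at runtime and
# replace execute with a version that raises instead of querying.
def forbid_database_access!
  Connection.class_eval do
    define_method(:execute) do |sql|
      raise DatabaseAccessInUnitTest, "unit tests must not hit the database: #{sql}"
    end
  end
end

forbid_database_access!

# Any code under test that reaches the database now fails loudly.
begin
  Connection.new.execute("SELECT 1")
  outcome = :passed
rescue DatabaseAccessInUnitTest
  outcome = :failed
end
```

Nobody writing tests has to know how the guard works; they only see a test fail with a clear message when it reaches for the database.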

That's one of the things that ThoughtWorks is doing in the Ruby community. It is exactly what we did in the Java community: we started out in Java and a lot of the really important infrastructure that we needed wasn't around, there wasn't a good way to do continuous integration, a lot of the testing stuff was not up to par, so ThoughtWorks created that and open-sourced it. We are doing the exact same thing in the Ruby community now. You look at CruiseControl.rb, continuous integration for Ruby: that's an open source project created by ThoughtWorks. A lot of the infrastructure pieces that we write on our projects, like the little testing thing I was just talking about, have been open-sourced and released as the Dust gem, so if you say "gem install dust", you can get that now on your own project.

With the other guys on the project, we have created a site called somethingnimble.com, which is actually a blog of a bunch of cool stuff that they have come up with on projects. And one of the cool things that they recently released on somethingnimble.com is called "mixology". The Ruby mechanism for something like an interface, although it is not the same thing, is a mixin. You can mix in behaviors to classes, but every once in a while you would like the ability to mix things in dynamically. That's what mixology lets you do: you install it, and it lets you mix in or not mix in based on criteria that you determine at runtime. That's like the ability of being able to say "I want this Java class to implement this interface, but only if this particular condition is true. If some other condition is true I want it to implement this other interface instead".
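Choosing a mixin at runtime can be sketched with nothing but Ruby's built-in `Object#extend`. This does not use the mixology gem or its API; `Item`, `Discounted`, `Taxed` and `build_item` are hypothetical names for the illustration.

```ruby
class Item
  attr_reader :base_price

  def initialize(base_price)
    @base_price = base_price
  end

  def price
    base_price
  end
end

# Two alternative behaviours; each wraps the original price via super.
module Discounted
  def price
    super * 90 / 100
  end
end

module Taxed
  def price
    super * 120 / 100
  end
end

# Decide at runtime which behaviour this particular object gets.
def build_item(base_price, on_sale:)
  item = Item.new(base_price)
  item.extend(on_sale ? Discounted : Taxed)
  item
end

sale_item = build_item(100, on_sale: true)
full_item = build_item(100, on_sale: false)
```

Both objects are instances of the same class, but `sale_item.price` is 90 and `full_item.price` is 120, because each one picked up a different module based on a condition evaluated at runtime.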

You absolutely can't do that in Java because the language is not designed to give you that level of access, but you can do that in Ruby because it has such good, strong meta-programming support. That's another example of something that's trivial to do in Ruby and essentially impossible to do in Groovy, because they have just never built any of these facilities into the language itself. The nice part about this is that you don't have to abandon your platform to do all this: you can write all this in JRuby and it works perfectly fine on the Java platform, and you don't have to frighten your infrastructure people by convincing them that you need to switch platforms to run another completely new language. You can just give them a WAR file and they can deploy it, and they don't even need to know that you didn't use Java code to produce this, you used some other more productive language to produce this stuff.

   

4. You have talked about the language implemented on the JVM but you didn't talk about .NET. What do you think about that?

I actually think that this polyglot programming idea works perfectly well, in fact maybe even better, on .NET, because the CLR was designed for multiple languages and the JVM had to catch up a little bit in that arena: they never really intended the JVM to support multiple languages, but they designed it in such a way that it was flexible enough that it does. A lot of interesting stuff is happening in the language world on the CLR and .NET too. They started the IronRuby project at Microsoft to be able to support Ruby on the CLR just like it is now supported on the JVM with JRuby. One of the things that is interesting is the experience they had with IronPython: the guy who created IronPython is a guy named Jim Hugunin.

He created a bunch of cool stuff, he worked on some of the original AspectJ stuff, and he also created Jython, which is Python on the JVM. He decided that he wanted to see what it would look like on the CLR and he implemented it on the CLR, and much to his surprise Python was faster on the CLR than on either the C runtime or the Java runtime. He was so impressed with that that he went to work for Microsoft and he helped produce not only IronPython but also what they are calling the Dynamic Language Runtime, which sits on top of the regular virtual machine. What they've been able to do is take some of the common characteristics that dynamic languages need and build a framework that sits on top of the VM and supports such things. So presumably, IronRuby is going to be easier to write on the CLR than it would be on the JVM, because they have already abstracted a lot of the stuff that's required to support dynamic languages on the CLR into this Dynamic Language Runtime.

I think that this idea of polyglot programming will work in either environment that you run on, and more and more people are going to embrace this idea that there is no one single language, and we are going to start treating platforms as managed runtimes. The two most important platforms that we are going to talk about over the next decade or so are the JVM and the CLR. The machine platforms and operating systems don't matter that much anymore. When you have this managed runtime, you can deploy onto it, and who cares what operating system it is actually running on? It's a level of virtualization that we have finally gotten to because we have these mature, really well engineered platforms like the JVM and the CLR, and we don't really have to worry about physical operating systems anymore because we can just deploy on these managed runtimes. I think that's a good thing, it's one less detail that you have to worry about.

Another huge advantage that Ruby has is that, as a dynamic language, it is going to run on every platform. Python is already there and Ruby is going to be there really soon: it already runs on all the operating systems, it runs on the JVM and soon it's going to run on the CLR.

   

5. You were talking about a lot of languages. Aren't you afraid that it will be difficult for developers to learn all these languages?

That's actually a common complaint that people lodge against domain specific languages: this problem of language cacophony, that we are going to have all these different languages we have to learn and be experts in. And that is true, that is an issue, but if you look at Java frameworks, every single one of them comes with its own language embedded inside it, in the form of an XML document. Every one of those XML documents, even though they use the same syntax, is a different language, because every one of them has its own grammar. There is no way that you can download Struts and start using it right away; first you have to learn the Struts configuration language, which uses its own grammar.

It's almost like having languages that use the same character set, like English and French, but they are completely different languages, they have different meanings even for the words they have in common. We are already fighting this battle in the Java world, because every single framework you configure with one or more XML documents, all of which have their own dialect. We can actually cut down on the number of languages that we need to talk about because we can cut down the number of frameworks that we have to use to actually produce work. There is a funny, almost cartoon kind of ad for Ruby on Rails that says that to learn to do Java web development you need like fifteen books, because you need the JVM, you need the Struts book, the Hibernate book, the Spring book, and all of these have their own dialects.

For Ruby on Rails you need two books: the Ruby book and the Rails book, and that's all you actually need. Even if you throw something else into the mix, like Scala, Scala is not a language that has been mashed into XML syntax; it has its own syntax. I think what we are going to have are people specializing more in special purpose languages. There will be people that can write Scala code; not everybody on the project will have to do that, but, just like you consume jar files now that someone else wrote, you do the exact same thing with the code that those people write. That's actually very similar and related to this rise of domain specific languages, which is one of the things I am talking about here, at this conference.

I've been interested in this for a long time. This is a really powerful approach for writing software, because what it does is make simpler abstractions on top of the existing languages. We do this all the time: every time our abstractions get complicated we build simpler abstractions that sit on top of them, and that's what the DSL movement is really about: building these simpler abstractions on languages. That's a really common technique in the Ruby world, because a lot of aspects of Ruby on Rails are domain specific languages. If you look at interesting behavior driven testing frameworks like RSpec, that's a domain specific language written on top of Ruby.

If you look at Rake, which is the make utility in Ruby, it's a domain specific language written on top of Ruby. These are really powerful base languages, where you can write simpler abstractions on top of them. It means that you can operate mostly on the simpler abstraction and then dive down one level and get something more powerful, if you want to. One of the reasons I am so interested in this subject is that several of my colleagues and I are actually writing a book, for Pragmatic Press, on building internal DSLs in Ruby, and we are really investigating a lot of techniques that are commonly used in the Ruby world, because it's a very common way to write software there. In fact, you probably have as many DSL driven approaches as you have frameworks in the Ruby world, whereas the Java world is 99% frameworks and 1% everything else.
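An internal DSL of this kind can be sketched in a few lines. This is a toy in the spirit of Rake, not Rake's implementation: `TaskList`, `define_tasks`, `task` and `run` are hypothetical names, and the only Ruby machinery involved is blocks plus `instance_eval`.

```ruby
# Holds named tasks and supplies the DSL keyword `task`.
class TaskList
  attr_reader :tasks

  def initialize
    @tasks = {}
  end

  # DSL keyword: register a named task together with its body.
  def task(name, &body)
    @tasks[name] = body
  end

  def run(name)
    @tasks.fetch(name).call
  end
end

# Evaluate the block with self set to a fresh TaskList, so the bare
# `task` calls inside the block are sent to that list.
def define_tasks(&block)
  list = TaskList.new
  list.instance_eval(&block)
  list
end

build = define_tasks do
  task(:compile) { "compiling" }
  task(:test)    { "running tests" }
end

output = build.run(:test)
```

Someone writing the `define_tasks` block works only with the simple vocabulary, while the full language is still one level down when it is needed.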

I think that's a really interesting topic. Martin Fowler, who I did the tutorial with on DSLs, is also investigating and formalizing a lot of knowledge around domain specific languages. I think that's something that people have been talking about for the last couple of years in a slow burn kind of way, but over the next few years you are going to see more and more people talking about that, because it is a really convenient mechanism for building simpler abstractions on top of complicated building blocks. We certainly have lots and lots of complicated building blocks in the software development world right now. Any kind of technique that we can use to write simpler abstractions on top of those is going to be better for everybody because then you don't have to understand all the complicated building blocks and how they work together anymore.

   

6. What are your three favorite IT books of all time?

That's a really tough question. I would have to say probably The Pragmatic Programmer is number one. In that book, even though it's 10 years old, every single page has still got great stuff in it, so it's a classic, timeless book. I would say probably the Refactoring book is a really great book because it codifies a lot of the stuff that you normally think about and the kind of things that you do. It was revolutionary in its time: it put nomenclature to the kind of things that a lot of really smart people were already doing in code and it made it part of the standard vocabulary when you write code. And the other one, I would say it's an interesting choice, but it's a fascinating book: Smalltalk Best Practice Patterns by Kent Beck.

Even if you don't know Smalltalk, it's really interesting to be a tourist in another language world for a while. We think that we keep discovering new cool stuff in Java, but the Smalltalk guys figured all that stuff out 20 years ago. There was just not enough tribal knowledge transfer between the Smalltalk community and the Java community, so we keep reinventing the same stuff over and over again. If you read something like Smalltalk Best Practice Patterns, it has a lot of good advice in it that you can apply to Java development or C# development or the kind of development that you do today, because they encountered the same problems and they came up with really elegant solutions to them.

I think that one of the problems that we have in software is that there is not enough knowledge of history, of what came before. Everybody thinks "Old technology is just old technology, we shouldn't care about that", but even with old technology they figured out some really cool stuff and we should take advantage of that and have more of a sense of history, of where we came from, and some of the cool ideas they had before so we don't have to keep reinventing all these cool ideas every single time we get a new language or technology.

Aug 24, 2008
