InfoQ: Can you give us a quick overview of JCR/JSR-170?
David Nuescheler (DN): I think this boils down to explaining the feature set of a content repository.
Generally in a one sentence description I always describe a Content Repository as the “best of both worlds” from “relational databases” and “filesystems” plus all the good stuff that we always missed and had to build into our own applications.
This includes things like transactions, scalability and query on the DB side; being really good with very large files, streaming, access controls and hierarchies on the filesystem side; and things like versioning, fulltext search and, most importantly, a “data first” approach, which neither of the two supports.
JCR is a Java API describing all these features.
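To make the "best of both worlds" idea a little more concrete, here is a minimal sketch in plain Java. It is a toy in-memory model, not the `javax.jcr` API: `ToyNode`, `ToyRepositoryDemo` and all method names are hypothetical and exist only for illustration. It shows a filesystem-like hierarchy, schema-free ("data first") properties, and a naive full-text search.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy stand-in for a repository node; NOT the javax.jcr.Node interface.
final class ToyNode {
    final String name;
    final ToyNode parent;
    final Map<String, ToyNode> children = new LinkedHashMap<>();
    final Map<String, String> properties = new LinkedHashMap<>();

    ToyNode(String name, ToyNode parent) {
        this.name = name;
        this.parent = parent;
    }

    // Create (or reuse) a child, much like creating a folder in a filesystem.
    ToyNode addNode(String childName) {
        return children.computeIfAbsent(childName, n -> new ToyNode(n, this));
    }

    // "Data first": any property can be set without declaring a schema beforehand.
    ToyNode setProperty(String key, String value) {
        properties.put(key, value);
        return this;
    }

    // Absolute path, built by walking up to the root.
    String path() {
        if (parent == null) return "/";
        String p = parent.path();
        return p.equals("/") ? "/" + name : p + "/" + name;
    }

    // Naive case-insensitive full-text search over all property values
    // on this node and below; returns the paths of matching nodes.
    List<String> search(String term) {
        List<String> hits = new ArrayList<>();
        collect(term.toLowerCase(), hits);
        return hits;
    }

    private void collect(String term, List<String> hits) {
        for (String value : properties.values()) {
            if (value.toLowerCase().contains(term)) {
                hits.add(path());
                break; // report each node at most once
            }
        }
        for (ToyNode child : children.values()) child.collect(term, hits);
    }
}

final class ToyRepositoryDemo {
    // Hypothetical sample content used only for this illustration.
    static ToyNode sample() {
        ToyNode root = new ToyNode("", null);
        ToyNode news = root.addNode("news");
        news.addNode("launch").setProperty("title", "Product launch").setProperty("author", "dn");
        news.addNode("archive").addNode("2007").setProperty("title", "Older launch notes");
        return root;
    }

    public static void main(String[] args) {
        // prints [/news/launch, /news/archive/2007]
        System.out.println(sample().search("launch"));
    }
}
```

A real content repository would add transactions, access control, versioning and indexed queries on top of this kind of structure; the sketch only covers the hierarchy, the free-form properties and the search.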
InfoQ: How do you feel about JCR adoption in the market today?
DN: When it comes to adoption I always think that it is important to think of both sides, the implementors and the users of an API.
First of all I am excited that I am aware of over a dozen repositories that are compliant with JCR v1.0 (aka JSR-170) only two years after the initial release of JCR. As expected, some repositories are compliant through third-party connectors, but the majority already ship JCR compliance out of the box. This includes the major repository vendors as well as young and innovative new repositories. Another very encouraging fact is that there are already four open source implementations of JCR, which I think shows the great support for the specification on the implementation side and is definitely comparable with the most widely adopted specs in the Java space.
Even more importantly, the adoption amongst users of the API has been fantastic. In many ways this is what really fuels the repository consolidation that everybody was hoping for. Every single new content management initiative can nowadays rely on a standards-based, ready-to-use content repository instead of building yet another content repository of “their own” from scratch. This means for application developers that they can rely on features like access control, full-fledged versioning or full text search right from the start of their project. We see a great deal of new applications built on top of Apache Jackrabbit, the most accessible content repository implementation, on a weekly basis, and I think you cannot overstate the fact that for each and every one of those applications people would probably have built a new proprietary content repository on top of something like Hibernate in the absence of JCR.
InfoQ: How does your employer, Day, monetize this?
DN: When we started the specification process of JCR back in 2001 we did that because there was no industry standard and we considered ourselves primarily an application vendor that had to come up with their own proprietary content repository. Which we did. But it was very unsatisfactory for our customers to have all their most valuable content locked into a proprietary silo, so we wanted to make sure that we offer an open platform for all our content applications (which include Web Content Management, Digital Asset Management and Social Collaboration software) that is based on a standard. Given the absence of such a standard we owed it to our customers to do something about that.
So generally, being open and standards compliant is a key selling feature of our content applications.
In the meantime, through our involvement in JCR and our open source activities around Apache Jackrabbit, we started to sell our highly scalable, enterprise-ready, fully compliant commercial content repository called CRX.
CRX is a very surprising product in many different ways. For example, it allows the entire fine-grained content in the repository to be persisted in simple old-fashioned tar files in a transactional way that in our load tests outperforms the traditional RDBMS-backed persistence layers by factors.
InfoQ: There has recently been a lot of buzz about Atom and AtomPub, and even some claims that they obsolete JCR. What are your thoughts on this?
DN: Frankly, I still have a hard time understanding why people think that protocols and APIs are competing. I remember back in the days when people compared WebDAV and JCR (which btw I think is a much more appropriate comparison from a feature set perspective) we drew the comparison to HTTP and the Servlet API. You wouldn’t see anybody saying things like “Now I have HTTP, why do I need the Servlet API”. Programmers use APIs, not protocols.
Now comparing Atom and AtomPub to JCR is a bit of a joke from a functional perspective. While Atom and AtomPub offer reading and writing, JCR certainly goes beyond that in many different ways (Search, Locking, Versioning, Access Control, …). From a technology perspective I think Atom and AtomPub are a more light-weight (and probably needed) replacement for the WebDAV collections handling, but that’s about it.
InfoQ: But isn’t the common problem with API standards that they tie the solution to a particular programming language (in this case, Java)? Wouldn’t I want my content to be accessible from other platforms, too?
DN: I don’t think of it as a problem. It is a feature. Protocols, as mentioned, allow access in a “platform neutral way”. Unfortunately that's entirely pointless for the developer, since they will have to come up with their own APIs wrapping the parser. As mentioned, the Servlet API is tied to the Java platform and makes HTTP accessible to Java programmers. Now saying that the Servlet API has the “common problem” that it is tied to a particular language would not be a fair statement. In the world of content repositories (as opposed to the example with HTTP and Servlets) we lack a widely adopted content repository protocol. WebDAV and Atom are probably the best candidates, but I am confident that there will be more widely adopted specs on a protocol level in the future. Personally, I don’t see how the lack of a good, well adopted protocol spec could be considered a flaw in the API spec.
I would like to mention that JCR has been ported to a large variety of language environments, including .NET, PHP, Perl and, amongst others, JavaScript.
InfoQ: Given that REST inventor Roy Fielding is Day’s chief scientist, what’s your opinion on REST?
DN: I consider myself a “child of the web”, in the sense that I started to use the web fairly early on and learned to understand and love the architecture of the web before I learned traditional application development techniques. So I never ran into the usual problem that app developers have, trying to make the web fit into their stateful application development paradigms. I came from the other side and had to make apps fit into my web-world.
So when I first met Roy I realized that we had already built our applications according to a RESTful architecture without even knowing about or formally referring to REST. It just felt like the only natural architectural style at the time and of course it still feels that way. So count me in as a “full on RESTafarian” or whatever the nickname of the season is, or let’s just say I am a “web guy” in contrast to an “app guy”. Of course the formalization of the REST architectural style in Roy’s dissertation in 2000 is a very important asset for the web community and I am amazed how long it took the general public to appreciate its value.
InfoQ: How, if at all, does JCR relate to REST?
DN: In my mind JCR and REST are related in various different ways. First of all, they both are information centric and support hierarchical addressing of the information. So the JCR paths map very intuitively to URLs, just like paths in filesystems. One of the first exercises we went through was to map all the JCR API calls to WebDAV to offer a complete remoting of the JCR API in a RESTful manner.
This was important not only to be sure that we are aligned with WebDAV from a feature perspective but also to demonstrate that we did not violate any of the constraints imposed by the REST architectural style.
For static filesystem-based websites there is a natural mapping from a URL to the filesystem. If I look at the URL, I pretty much know what’s going on. This is due to the natural and intuitive mapping of the hierarchic filesystem path to the URL. A content repository brings back a store that supports a hierarchy and allows for this very intuitive mapping.
In my mind JCR is the ideal information store for web centric, REST oriented applications.
InfoQ: Can you give us a little background on Apache Sling? Why does the world need another Java web framework?
DN: Playing devil’s advocate I might assume the position that I have not seen a Java “web” framework yet.
I would argue that the world is full of Java “application” frameworks that sort of expose their services to the “web”, but really I would not call them “web frameworks”.
Generally I think there is a void for a framework that does not consider the stateless architecture of the web a necessary evil and that pays more attention to the URL than “it’s just a string”. Sling really is not another “Java Application Framework” and does not try to be one. Sling is the first “Web Framework” that I have seen that not just respects the fundamental principles and constraints of the web but actually “feels good” about them.
We like the web the way it is and don’t try to circumvent the basic principles as soon as possible by injecting sessions or continuations. One of the biggest differentiators that I have seen comparing Sling to existing application frameworks is its relationship to the URL. In legacy application frameworks like J2EE or Struts, the URL was mainly used to address your scripts or controllers (.jsp, .php, .do) and pass in parameters to execute certain operations. So one would end up with ugly URLs like …/view.jsp?id=123465 that are clearly not even resource oriented. In more modern frameworks people started to be more flexible with URLs, just treating the URL “as a string” and allowing, for example, regexp-based dissection of the URL.
While this allows for a more resource-oriented or even a RESTful architecture, it by no means leads people to do the right thing. It just gives them more choices, and in most cases choices, or let’s say the lack of a good default behavior, are bad.
Sling is very different. Since Sling is backed by a content repository, it exposes a hierarchic namespace that is mapped very naturally to the hierarchic namespace exposed through a URL. While this still gives people full control over the URL, it also gives them very clear guidance on how to expose their content nodes as resources in a well designed manner. Sling puts the “web” back into “web framework”.
InfoQ: I have to admit I haven’t seen the argument about non-hierarchical URLs being unRESTful before. How well does Sling do in terms of other REST aspects, such as hypermedia?
DN: The hierarchical aspect of the URL has nothing to do with RESTful-ness per se. Back in the days when websites were still file system based it was reasonably apparent that the URL /news/newsitem.html was a “newsitem.html” file sitting somewhere on a filesystem in a “news” folder. Adding an archive folder or another news item was very straightforward because the mapping of the hierarchical URL was very transparent. While of course the web servers give you all sorts of possibilities to do crazy mappings, the simple case is solved in a very simple and intuitive way. Sling adopts that very same concept, where by default the path of a URL is mapped to a node in the content repository. Of course Sling allows you to take partial or full control of the URL, but I think it is of great importance to provide a simple, intuitive and very powerful default mapping that is proven and scalable and adheres to best “web” practices. Just providing the ability to a developer to go in and map out their own URL space is not useful at all, since this will be the first thing that I will have to come up with when I start a project, and every single developer will come up with a completely different mapping.
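The default mapping described above can be sketched in a few lines of plain Java. This is an illustration of the idea only, not Sling's actual resolver: `ContentNode`, `DefaultUrlMapping` and the rule of dropping the extension from the last URL segment are assumptions made for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical in-memory content node; not a Sling or javax.jcr type.
final class ContentNode {
    final String path;
    final Map<String, ContentNode> children = new LinkedHashMap<>();

    ContentNode(String path) {
        this.path = path;
    }

    // Create (or reuse) a child node directly below this one.
    ContentNode child(String name) {
        String childPath = path.equals("/") ? "/" + name : path + "/" + name;
        return children.computeIfAbsent(name, n -> new ContentNode(childPath));
    }
}

final class DefaultUrlMapping {
    // Resolve a URL path to a node by walking the tree one segment at a time.
    // Assumption for this sketch: an extension on the last segment (".html")
    // selects a rendering and is not part of the node name.
    static Optional<ContentNode> resolve(ContentNode root, String urlPath) {
        int slash = urlPath.lastIndexOf('/');
        int dot = urlPath.lastIndexOf('.');
        String nodePath = dot > slash ? urlPath.substring(0, dot) : urlPath;

        ContentNode current = root;
        for (String segment : nodePath.split("/")) {
            if (segment.isEmpty()) continue; // skip the leading slash
            current = current.children.get(segment);
            if (current == null) return Optional.empty(); // no such node
        }
        return Optional.of(current);
    }

    // Hypothetical sample tree mirroring the /news example from the interview.
    static ContentNode sampleTree() {
        ContentNode root = new ContentNode("/");
        ContentNode news = root.child("news");
        news.child("newsitem");
        news.child("archive");
        return root;
    }

    public static void main(String[] args) {
        ContentNode root = sampleTree();
        // prints /news/newsitem
        System.out.println(resolve(root, "/news/newsitem.html").map(n -> n.path).orElse("not found"));
        // prints not found
        System.out.println(resolve(root, "/news/missing.html").map(n -> n.path).orElse("not found"));
    }
}
```

Because the URL space simply mirrors the content tree, adding a node under `/news/archive` makes a new URL available without any routing table or regular expression having to change.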
Generally, I would argue that Sling does very well on encouraging all four constraints that make up the REST architecture. This may be the most important differentiator: while other application frameworks may allow you to build RESTful applications if you really spend the time, Sling is “built” to develop RESTful, repository-backed web apps right out of the box and makes it hard to ignore the REST architecture.
InfoQ: Thanks for your time!
David Nuescheler is responsible for the technology strategy and ongoing product development at Day Software in Switzerland. He is the specification lead on JSR 170 and JSR 283, Content Repository for Java Technology API. His group has been working for over 4 years to standardize the content repository market. David is also a committer on the Apache Jackrabbit Project and a member of the Apache Software Foundation.