Cloudera announced at the end of June that it has released Cloudera Enterprise, a commercial packaging of the open-source project Apache Hadoop and many other supporting projects. The intent behind Cloudera Enterprise is to expand Hadoop's reach further into the corporate world.
Hadoop and its sub-projects like Hive and HDFS have been covered extensively here on InfoQ. For the latest on Cloudera Enterprise as a commercial packaging of those projects, InfoQ caught up with Charles Zedlewski, Senior Director of Product Management at Cloudera:
"This is for companies that want to run Hadoop in production. So you've got the first use case to work; yeah, it's great, Hadoop does everything that people said it would. Now you've got a second use case and a third and a fourth. This is starting to take over fairly serious workloads inside my business ... for a typical enterprise their mindset is, 'Ah, OK, now this Hadoop will be a part of my datacenter for some time to come.' There's a whole bunch of things a company expects from a platform that will live for any protracted period of time. That's really what [Cloudera] Enterprise is for."
We asked Mr. Zedlewski to describe some of these expectations. He suggested that enterprise operators:
"...need some consistent behavior out of this platform: I need to be able to tie it to some SLAs. I need it to integrate with and snap into the other tools and technologies I own and fit in with the other IT policies I have. I need to be able to administrate it, and it can't be that I have to hire somebody out of Facebook or Google to get it working. It needs to be that with training I can get it working with the people I have on staff today."
Zedlewski went on to describe the various tools for analysts that allow them "to work in the tool they already know but just point it at this big processing engine called Hadoop." The list included current and future Cloudera partners like Karmasphere, Datameer, MicroStrategy, Quest Software, and "a ton more."
Zedlewski's view is that in the future the majority of users will work with Hadoop through these sorts of tools rather than directly:
"...at Facebook, Yahoo! and even Twitter, currently 90% of the workload that goes into Hadoop is no longer originating in MapReduce. Instead users originate with higher level things like Pig or Hive which are compiling into MapReduce. MapReduce is like the [operating system] kernel everyone uses but doesn't interact with any more."
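To make the idea of "compiling into MapReduce" concrete, here is a minimal sketch in plain Python (it does not use Hadoop's API) of how a Hive-style query such as `SELECT page, COUNT(*) FROM visits GROUP BY page` could break down into map, shuffle, and reduce phases. The table, its columns, the sample rows, and the function names are all hypothetical and chosen only for illustration; real Hive query plans are considerably more involved.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for a "visits" table: (user, page).
VISITS = [("ann", "/home"), ("bob", "/home"), ("ann", "/docs"), ("cy", "/home")]


def map_phase(rows):
    """Emit one (key, 1) pair per row, keyed on the GROUP BY column."""
    for _user, page in rows:
        yield page, 1


def shuffle_phase(pairs):
    """Group emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Aggregate each key's values; here the aggregate is COUNT(*)."""
    return {key: sum(values) for key, values in groups.items()}


def count_by_page(rows):
    """Run the three phases in order, as a single-machine illustration."""
    return reduce_phase(shuffle_phase(map_phase(rows)))


if __name__ == "__main__":
    print(count_by_page(VISITS))  # {'/home': 3, '/docs': 1}
```

An analyst writing the one-line query never sees these phases, which is the sense in which MapReduce sits underneath like a kernel.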
However, Mr. Zedlewski went on to explain that there will not be a one-size-fits-all interface to Hadoop, saying, "the idea that there will be one UI that everyone passes through to get to Hadoop would be a big mistake. If you think about databases there's not one UI that everyone passes through to do things with databases and neither will that be true for Hadoop."
Before signing off, we asked Mr. Zedlewski for a preview of what may be coming in advance of Hadoop World in October, and he wrapped up with: "our current plan is to do one final beta update to our distribution for Hadoop, so we'll have a number of significant enhancements to talk about ... On the enterprise side, it will be principally a preview."