Mesosphere Inc. has released the Mesosphere software development kit (SDK) for creating datacenter services that will run on its Mesos-powered Datacenter Operating System (DCOS). The Mesosphere blog states that the SDK currently supports Java, Go and Python, and that services can be packaged to enable cluster-wide installation and execution via the DCOS web UI or command line interface.
The Mesosphere DCOS is a datacenter-scale operating system that enables the running of services and applications across a cluster of machines within a datacenter or cloud. The DCOS combines the Apache Mesos cluster manager with a number of open source and proprietary components, and allows the deployment and management of services through a custom web UI and CLI tooling.
The Mesosphere website states that two versions of the DCOS are currently available: a community edition, which includes a free cloud license and currently runs on Amazon Web Services (with planned support for Microsoft Azure and Google Cloud Platform); and a commercial enterprise edition, which can be run on-premises or on a public cloud, and includes 24x7 support.
A number of existing schedulers, or frameworks in Mesos parlance, are available for running applications, including Mesosphere Marathon for long-running (micro)services, Chronos for running batch workloads, Spark for large scale data processing, Cassandra for data storage, and many others. The Mesosphere website states that the aim of the DCOS is to make the installation and management of these frameworks simpler in comparison with working with a standard Mesos cluster. The release of the Mesosphere SDK aims to lower the barrier to entry for developers to create their own framework or datacenter service.
InfoQ sat down with Michael Hausenblas, datacenter application architect at Mesosphere, and asked questions about Mesos, the DCOS and the new Mesosphere SDK.
InfoQ: Many companies including Mesosphere, Twitter and Apple are building frameworks on top of Mesos. Spark was one of the first frameworks to be created for Mesos. Could you explain to InfoQ readers about that history, and why it's been appealing for framework creators to build on top of Mesos?
Hausenblas: The Mesos story starts in 2009 at the UC Berkeley AMPLab, where the PhD students Benjamin Hindman, Andy Konwinski, and Matei Zaharia were researching cluster resource sharing and scheduling, and they built Mesos. In order to demonstrate how fast one would be able to create a completely new distributed system on top of Mesos, they developed Spark in a mere 1300 lines of code. Spark was a sample application for Mesos written in Ben's parents' ski cabin over a long weekend.
This is creating a new class of 'datacenter developers' who can build distributed systems over the weekend because, rather than re-inventing the wheel, they leverage the cluster resource sharing primitives Mesos provides. This results in little to no networking code and faster development cycles while being able to rely on Mesos for fault tolerance and scalability.
As you probably know, Benjamin Hindman is now Chief Architect at Mesosphere, the commercial entity behind Mesos. Andy Konwinski and Matei Zaharia are co-founders of Databricks, the commercial entity behind Spark. We will see many more businesses forming around this ecosystem.
InfoQ: What is the difference between Apache Mesos and DCOS?
Hausenblas: The Mesosphere DCOS is a new kind of operating system that spans all of the machines in a datacenter or cloud and pools them together so that they behave like one big computer. Mesos, the Apache open source project, is the kernel inside this operating system. This is analogous to how it works in the Linux world. You have the Linux kernel, which isn't much use by itself, and then you have a distro like Ubuntu where they've added all of the system services and tooling around the kernel to make it a complete product. We've done the same for the datacenter. We wrapped the Mesos kernel with a lot of other components, such as an init system (Marathon), a file system (HDFS), an application packaging and deployment system, a graphical UI and a CLI. All of these things together comprise the DCOS. For a discussion of Mesos vs Mesosphere, see https://mesosphere.com/blog/2015/07/09/watch-matt-trifiro-explain-the-difference-between-mesos-and-mesosphere/
InfoQ: How would you compare writing a datacenter service on top of a datacenter operating system to how developers would write applications to run natively on personal computer operating systems, or mobile apps to mobile OSs?
Hausenblas: It's directly analogous. Back in the PC era, we had operating systems like MacOS and Windows that were abstracting away the hardware and the multi-core processors, giving app developers powerful APIs and primitives that made it possible to write desktop applications quickly, exploit the underlying hardware, and make it easy for operators (end users) to run dozens of applications simultaneously.
What's different about today, with a datacenter operating system, is that you're building apps to a different form factor. In the datacenter, you are programming against the 40-thousand cores in rows of racks versus the 4 cores in your laptop. But it's really the same thing. You request memory allocation. You spawn new tasks. You abstract the hardware and provide a uniform API for building applications. It's the scale that's different.
To clarify: while the datacenter developer is going mainstream, most people out there will probably not need to write native datacenter services. They can build applications and datacenter workloads that leverage tools such as Marathon. If you have the requirement to actually develop a datacenter service, our FAQ covers that. Other than that, from an applications perspective it's really straightforward: in the simplest case, if you want to run an existing application, such as a Java application using a JAR file or a Python script, you can do that without any modifications. In a more sophisticated setup, you would build a microservices-oriented application, for example, based on Docker images, with service discovery using Mesos-DNS and Marathon as the component that launches and orchestrates your containers.
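As a rough illustration of the "simplest case" described above, the following Python sketch builds a minimal Marathon app definition for an unmodified script. The field names (id, cmd, cpus, mem, instances) follow Marathon's app definition format as commonly documented; the helper function, the app name and the command are hypothetical and were not part of the interview. The sketch only prints the JSON and does not contact a cluster.

```python
import json


def marathon_app(app_id, cmd, cpus=0.1, mem=32, instances=1):
    """Build a minimal Marathon app definition as a plain dict.

    Hypothetical helper for illustration; it performs no validation
    beyond making sure the app id starts with a slash.
    """
    if not app_id.startswith("/"):
        app_id = "/" + app_id
    return {
        "id": app_id,            # unique path-style name of the app
        "cmd": cmd,              # shell command Marathon should run
        "cpus": cpus,            # CPU share requested per instance
        "mem": mem,              # memory in MiB requested per instance
        "instances": instances,  # number of copies to keep running
    }


# Assumed example: run an existing Python command unmodified, twice.
app = marathon_app(
    "hello-python",
    "python3 -m http.server 8080",
    cpus=0.25,
    mem=64,
    instances=2,
)
payload = json.dumps(app, indent=2, sort_keys=True)
print(payload)
```

A definition like this would typically be submitted to Marathon's REST API (an HTTP POST to the /v2/apps endpoint); the host and port depend on the cluster and are deliberately left out here.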
InfoQ: Tell us about the SDK itself - what types of things does it define?
Hausenblas: The Mesosphere SDK (Software Development Kit) is intended for DCOS developers, allowing them to more easily create new datacenter services using various languages (Java, Go, Python) and deliver those applications in a package that can be installed with a single command on the CLI. The Mesosphere SDK exposes the Mesos API, but it also provides DCOS extensions and contains libraries (scheduler development, executors, etc.) for building services, a checklist for certification through Alpha, Beta and Production, and a developer cookbook. There is also a very robust community growing around the Developer Program, which is an important part of being successful with app development.
InfoQ: What's the ultimate reason for exposing these services as an operating system abstraction? What does it buy for the creator of the framework, or for the ultimate consumer of applications built on that service?
Hausenblas: Machines are the wrong level of abstraction for building and running distributed applications because they force you to reason about machine-specific details like IP addressing and local storage. By combining all of the machines, virtual or physical, into one pool of resources, the DCOS provides built-in primitives for automatic placement of tasks, service discovery and messaging for task coordination. Because you don't have to write (test and debug) this code yourself, your efforts are highly leveraged and you can deliver a better user experience more quickly. By building to the abstraction of a Datacenter Operating System, you can deliver new datacenter-scale products to market quickly. That's the catalyst that operating systems provide. Imagine trying to build and deliver Uber without iOS and Android. It would take years and years and years.
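To make the idea of "automatic placement of tasks" over a pooled set of machines more concrete, here is a toy Python sketch using a simple first-fit strategy. It is not how Mesos or the DCOS actually schedule work; the Node and Task types, the place function and all of the numbers are invented for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    cpus: float  # free CPU cores
    mem: int     # free memory in MiB


@dataclass
class Task:
    name: str
    cpus: float  # CPU cores required
    mem: int     # memory in MiB required


def place(tasks, nodes):
    """First-fit placement: put each task on the first node with room.

    Mutates the nodes' free resources and returns a dict mapping
    task name to node name, or to None if no node can fit the task.
    """
    placement = {}
    for task in tasks:
        for node in nodes:
            if node.cpus >= task.cpus and node.mem >= task.mem:
                node.cpus -= task.cpus
                node.mem -= task.mem
                placement[task.name] = node.name
                break
        else:
            # No node had enough free resources for this task.
            placement[task.name] = None
    return placement


# Hypothetical pool of two machines and four tasks.
nodes = [Node("node-a", 2.0, 4096), Node("node-b", 4.0, 8192)]
tasks = [
    Task("web", 1.5, 2048),
    Task("db", 2.0, 4096),
    Task("cache", 1.0, 1024),
    Task("batch", 4.0, 1024),
]
result = place(tasks, nodes)
print(result)
```

The point of the sketch is that the developer describes what a task needs (CPU, memory) rather than which machine it runs on; a real cluster manager adds fault tolerance, constraints and fairness on top of this basic idea.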
But there is also a business reason for using the abstraction of a datacenter operating system. An operating system provides a platform on which you can distribute applications that work across different infrastructures. This gives developers access to a broad and friction-free market into which they can distribute these apps. We've made it easy for developers to distribute their work, for example, by building into the DCOS a very sophisticated packaging system that allows you to package up a complex distributed system (like Cassandra, Kafka, Spark, Kubernetes and so on) and make it possible for an operator or developer to install, configure and run that system with a single command - in minutes. This is unprecedented. Our CEO, Florian, talks about how it took over three weeks to stand up Cassandra when he was at Airbnb. I just installed Cassandra in 2 minutes with one command.
InfoQ: Where will all of these datacenter services be stored, and how will InfoQ readers get access to them?
Hausenblas: All of these services for the datacenter will be available in a repository that will continually expand and grow. The DCOS repository is a way to package and deliver a large-scale distributed system. It works hand-in-hand with a lower-level container repository like DockerHub (where you store your container images), but the DCOS repository contains all the metadata that allows for the complex installation and provisioning of a distributed system like Hadoop or Kafka.
As new datacenter services are developed, developers will add them to the DCOS repository (there is no charge for this) and then they become available to customers and prospective customers via a single command installation. Developers are excited to get frictionless access to a growing ecosystem of DCOS users, from Fortune 500 companies to emerging startups. The DCOS provides a standardized platform for deployment and management of datacenter apps, both on-cloud and on-premises, which has historically been a big hurdle to enterprises adopting the latest technologies.
When you build for the DCOS you are building for the Mesos community. Because the Mesosphere DCOS is built on top of Mesos, your app will have all of the benefits of being part of the open-source Mesos community, but with the turnkey deployment, scaling, and management features of the DCOS. For developers, the DCOS SDK provides community and reusable code. For your customers, it provides easy app discovery and ease of installation.
InfoQ: How would InfoQ readers get access to the SDK?
Hausenblas: As a member of the developer program, you are given access to the full suite of DCOS SDK development tools and repositories, as well as access to the growing community of DCOS developers. By joining the developer program, you are eligible to become a VIP partner. As a VIP partner, in addition to the developer program, you are given high-priority access to Mesosphere developers, early access to DCOS Service builds, and the ability to contribute fully-certified DCOS services. The developer program is free and open to everyone. The VIP partner program is invite only, and is available to select partners.
InfoQ: Can you provide some examples of services that have been written to the SDK so far (and how specifically are they using the common services provided by the Mesosphere DCOS)?
Hausenblas: We have a number of key VIP partners building datacenter services for the DCOS, whose services are available in different stages of alpha, beta or production. These services include Kafka, Kubernetes, HDFS, Hadoop, YARN, MemSQL, Cassandra, ArangoDB, Crate.io, Spark, and Quobyte, among others. These are the highest-demand core services for the DCOS, and we expect to have over 100 by the end of the year. For a continually updated list of DCOS Services, see https://docs.mesosphere.com/reference/servicestatus
InfoQ: Many thanks for your time today, Michael. Is there anything else you would like to share with InfoQ readers?
Hausenblas: Thank you. I think we've covered everything - we're very much looking forward to feedback on the SDK from your readers, and also new datacenter service contributions!
Additional information on the Mesosphere SDK for DCOS can be found on the Mesosphere blog, and registration for access to the SDK via the Mesosphere developer program can be found on the company's website.