Introduced as a public preview at AWS re:Invent 2018, Amazon Managed Streaming for Apache Kafka (MSK) is now generally available. Amazon MSK aims to make it easy to build and run streaming applications based on Apache Kafka.
The GA release extends support for Kafka to version 2.1.0 while maintaining full compatibility with Kafka 1.1.1 for applications created during the preview period. Additionally, Amazon added a number of new features based on early customer feedback. Those include support for TLS encryption in transit, both between clients and brokers and between brokers; integration with AWS CloudTrail for logging; and the ability to define IAM policies based on tags assigned to clusters at the time of their creation. At the moment, the only ways to provision MSK in your AWS cloud environment are the AWS Management Console and the AWS CLI. However, Amazon is working on adding AWS CloudFormation support to enable MSK modeling and provisioning using a JSON or YAML textual description.
Amazon's effort to simplify Kafka integration goes beyond removing the inherent difficulty of setting up, scaling, and managing a self-hosted Kafka server. Indeed, its MSK service also includes an Apache ZooKeeper node, which Amazon will not charge for, to ensure high availability and security. As Amazon product manager for data streaming Damien Wylie stated:
We are going to detect that failure automatically and then reintroduce a new node. Hence the IPs remain intact, and finally, any patches that are required throughout the time you are running the cluster we automatically apply those for you.
The process of creating a Kafka cluster in the AWS Management Console is extremely streamlined. You only need to choose the Kafka version you want to use, decide how many brokers you want in each availability zone, and set encryption and storage options, with most of the settings providing a reasonable default value. If you want to change the broker instance type or the Amazon EBS volume size, you can do that in the advanced custom settings section.
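The same choices can also be expressed programmatically. The following is a minimal sketch, not Amazon's reference code, that only assembles the request parameters and does not call AWS. The field names follow the shape of the MSK CreateCluster request as exposed by the boto3 `kafka` client, to the best of our knowledge, and should be checked against the current API reference. The helper name, the subnet IDs, and the default instance type and volume size are illustrative assumptions.

```python
from typing import Any, Dict, List


def build_create_cluster_request(
    cluster_name: str,
    kafka_version: str,
    client_subnets: List[str],
    brokers_per_az: int = 1,
    instance_type: str = "kafka.m5.large",  # assumed default, for illustration only
    ebs_volume_gib: int = 1000,             # assumed default, for illustration only
) -> Dict[str, Any]:
    """Assemble keyword arguments for an MSK CreateCluster call (hypothetical helper)."""
    if brokers_per_az < 1:
        raise ValueError("brokers_per_az must be at least 1")
    if not client_subnets:
        raise ValueError("at least one client subnet is required")

    return {
        "ClusterName": cluster_name,
        "KafkaVersion": kafka_version,
        # Assumes one subnet per availability zone, so the broker total
        # is always a multiple of the number of zones.
        "NumberOfBrokerNodes": brokers_per_az * len(client_subnets),
        "BrokerNodeGroupInfo": {
            "ClientSubnets": list(client_subnets),
            "InstanceType": instance_type,
            "StorageInfo": {"EbsStorageInfo": {"VolumeSize": ebs_volume_gib}},
        },
        # TLS between clients and brokers, and between brokers.
        "EncryptionInfo": {
            "EncryptionInTransit": {"ClientBroker": "TLS", "InCluster": True},
        },
    }


request = build_create_cluster_request(
    cluster_name="demo-cluster",
    kafka_version="2.1.0",
    client_subnets=["subnet-aaaa", "subnet-bbbb", "subnet-cccc"],  # placeholder IDs
    brokers_per_az=2,
)
print(request["NumberOfBrokerNodes"])  # 6
```

With valid credentials and real subnet IDs, a dictionary like this could presumably be passed as `create_cluster(**request)` on a boto3 `kafka` client, although that call is not exercised here.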
Once you have your Apache Kafka cluster in place, you can set up any number of topics for producers to use when sending messages. Consumers receive all messages published to the topics they subscribe to. All of those tasks can be carried out using standard Apache Kafka tools.
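To make the topic model concrete, here is a toy in-memory sketch of the semantics just described. It is not the Kafka client API and does not talk to MSK. The `ToyBroker` class and its methods are hypothetical names introduced only for illustration. It shows that each topic behaves like an append-only log and that every consumer reads the topic independently.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class ToyBroker:
    """In-memory stand-in for a broker: one append-only log per topic."""

    def __init__(self) -> None:
        self._logs: Dict[str, List[str]] = defaultdict(list)
        # Next offset to read, tracked per (consumer, topic) pair.
        self._offsets: Dict[Tuple[str, str], int] = defaultdict(int)

    def produce(self, topic: str, message: str) -> int:
        """Append a message to a topic and return its offset."""
        log = self._logs[topic]
        log.append(message)
        return len(log) - 1

    def poll(self, consumer: str, topic: str) -> List[str]:
        """Return the messages this consumer has not yet seen on the topic."""
        key = (consumer, topic)
        log = self._logs[topic]
        unread = log[self._offsets[key]:]
        self._offsets[key] = len(log)
        return unread


broker = ToyBroker()
broker.produce("orders", "order-1")
broker.produce("orders", "order-2")
broker.produce("payments", "payment-1")

print(broker.poll("billing", "orders"))   # ['order-1', 'order-2']
print(broker.poll("shipping", "orders"))  # ['order-1', 'order-2']
print(broker.poll("billing", "orders"))   # []
```

A real deployment would replace this class with a standard Kafka client pointed at the cluster's broker endpoints; partitions, consumer groups, and replication are deliberately left out of the sketch.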
Finally, Amazon is offering a service level agreement for MSK that commits to 99.9% availability.
Apache Kafka was originally developed at LinkedIn and open sourced in 2011. Amazon is not the only provider of a managed Kafka service. In particular, Confluent, the company founded by the LinkedIn team that originally developed Kafka and that focuses on building a platform on top of Kafka, recently launched a managed Kafka service called Confluent Cloud.
If you want to get started with Amazon MSK, the best place to begin is Amazon's step-by-step guide.