Uber released Kraken, an open source, peer-to-peer (P2P) Docker registry on March 5. Kraken was designed to provide Docker registry services for large-scale systems, solving challenges like cross-region support, performance bottlenecks, and hybrid cloud environments.
Loosely based on the BitTorrent protocol, Kraken is compatible with the Docker registry API and offers configurable storage backends such as S3 and HDFS. Kraken was first deployed internally at Uber in early 2018 before being made available to the open source community.
Docker images, from which containers are run, are composed of layers. Each layer contains the changes relative to the layer before it and is associated with a binary large object, or blob, holding the image's files and executables. Docker registries are server-side applications that store and distribute an image's layers and blobs. Docker offers a free registry at Docker Hub as well as a commercial offering. Outside of Docker, other registries have been created to meet specialized needs such as private hosting or using IPFS as a storage backend.
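To make the layer-and-blob relationship concrete, the sketch below shows how a registry can address a blob by a digest of its content, which is how Docker registries identify blobs (`sha256:<hex>`). The function name `blob_digest` and the sample layer bytes are illustrative assumptions, not part of Kraken or Docker.

```python
import hashlib


def blob_digest(data: bytes) -> str:
    """Return a registry-style content digest for a blob (hypothetical helper)."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


# Made-up layer contents, purely for illustration.
base_layer = b"base filesystem contents"
app_layer = b"application files added on top"

# A blob store keyed by digest keeps identical content only once,
# even when several images reference the same layer.
store = {}
for layer in (base_layer, app_layer, base_layer):
    store[blob_digest(layer)] = layer

print(len(store))  # two distinct blobs for three layer references
```

Because the key is derived from the content itself, any peer holding a blob can verify it after download without trusting the sender.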
Uber runs large-scale, distributed clusters in a hybrid cloud environment. Despite efforts to improve performance with image caching and database sharding, the Docker registry was unable to meet the growing demands of Uber's environment, and the team chose to build its own solution.
Since its implementation at Uber, Kraken has supported the distribution of more than 1 million blobs per day. At peak production load, Kraken distributes 20,000 blobs of 100MB to 1GB each in under 30 seconds. Per the Kraken documentation, Kraken is capable of distributing Docker images at more than 50% of the maximum download speed limit on every host. Additionally, cluster size and image size do not have a significant impact on download speed. Future enhancements for Kraken will focus on additional performance gains with large image sizes, security improvements, and support for Docker tag mutation.
Kraken's architecture was key to Uber delivering a scalable and highly available registry. Fundamental to the design is a custom P2P network, in which a limited number of hosts seed content to a network of agents. Agents in the network form a pseudo-random regular graph with high connectivity and a small diameter, an important factor in the system's download speed. Agents are seeded content from an origin, which stores image blobs on backend storage. They connect with peers in the network and return images for docker pull requests.
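The effect of a pseudo-random regular graph on diameter can be illustrated with a small simulation. The sketch below is not Kraken's peer-selection algorithm. It is a stand-in, using assumed sizes (200 nodes, degree 4), that starts from an orderly ring in which every node has the same number of neighbors and then randomizes the edges with degree-preserving swaps. All function names are hypothetical.

```python
import random
from collections import deque


def ring_lattice(n, k):
    """k-regular ring: each node links to its k/2 nearest neighbors on each side."""
    assert k % 2 == 0 and k < n
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, k // 2 + 1):
            u = (v + step) % n
            adj[v].add(u)
            adj[u].add(v)
    return adj


def randomize(adj, swaps, rng):
    """Apply double-edge swaps (a,b),(c,d) -> (a,d),(c,b); every degree is preserved."""
    edges = [(a, b) for a in adj for b in adj[a] if a < b]
    for _ in range(swaps):
        i, j = rng.sample(range(len(edges)), 2)
        a, b = edges[i]
        c, d = edges[j]
        # Skip swaps that would create a self-loop or a duplicate edge.
        if len({a, b, c, d}) < 4 or d in adj[a] or b in adj[c]:
            continue
        adj[a].remove(b); adj[b].remove(a)
        adj[c].remove(d); adj[d].remove(c)
        adj[a].add(d); adj[d].add(a)
        adj[c].add(b); adj[b].add(c)
        edges[i], edges[j] = (a, d), (c, b)
    return adj


def diameter(adj):
    """Longest shortest path in hops, via BFS from every node; inf if disconnected."""
    worst = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            v = queue.popleft()
            for u in adj[v]:
                if u not in dist:
                    dist[u] = dist[v] + 1
                    queue.append(u)
        if len(dist) < len(adj):
            return float("inf")
        worst = max(worst, max(dist.values()))
    return worst


ring = ring_lattice(200, 4)
mixed = randomize(ring_lattice(200, 4), swaps=4000, rng=random.Random(0))
print("ring diameter:", diameter(ring))
print("randomized diameter:", diameter(mixed))
```

Both graphs give every node exactly four connections, but the randomized one needs far fewer hops to reach the farthest node. Fewer hops means a blob seeded at one point can reach every agent in fewer relay steps.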
Kraken was initially built with BitTorrent, but differences in the problem space led the Kraken team to build their own P2P driver. However, the team actively reviews the Kraken protocol for the possibility of once again making it compatible with BitTorrent.
Dragonfly, a Cloud Native Computing Foundation (CNCF) project from Alibaba, is also an open source P2P image and file distribution system that addresses the challenges of distribution in cloud native applications. Key differences between the two are noted in Kraken's documentation:
Dragonfly cluster has one or a few "supernodes" that coordinates transfer of every 4MB chunk of data in the cluster. While the supernode would be able to make optimal decisions, the throughput of the whole cluster is limited by the processing power of one or a few hosts, and the performance would degrade linearly as either blob size or cluster size increases.
Kraken's tracker only helps orchestrate the connection graph, and leaves negotiation of actual data transfer to individual peers, so Kraken scales better with large blobs. On top of that, Kraken is HA and supports cross cluster replication, both are required for a reliable hybrid cloud setup.
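The scaling argument in the documentation can be put into rough numbers. The following back-of-the-envelope model is an illustration only: it assumes a central coordinator makes one decision per 4MB chunk per host, and that a tracker handles a fixed number of requests per host regardless of blob size (one per host here, which is an assumption). Neither function reflects the actual implementation of Dragonfly or Kraken.

```python
import math

CHUNK_BYTES = 4 * 1024 * 1024  # the 4MB chunk size quoted above, taken as 4 MiB


def central_chunk_decisions(blob_bytes, hosts):
    """Coordination events if a central node schedules every chunk for every host."""
    return math.ceil(blob_bytes / CHUNK_BYTES) * hosts


def tracker_requests(hosts, announces_per_host=1):
    """Coordination events if the tracker only hands out peer lists (assumed count)."""
    return hosts * announces_per_host


one_gib = 1024 ** 3
for hosts in (100, 1000):
    print(hosts, central_chunk_decisions(one_gib, hosts), tracker_requests(hosts))
```

Under these assumptions, the central coordinator's workload grows with both blob size and cluster size, while the tracker's workload grows only with cluster size. That matches the documentation's claim that leaving data-transfer negotiation to peers scales better for large blobs.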
Further information about Kraken can be found on GitHub or by joining the Uber Slack channel.