The Cloud Native Computing Foundation (CNCF) announced the graduation of the open source registry Harbor as part of its class of 2020.
Harbor is a registry that stores OCI-based artifacts securely using policies and RBAC, scans images for vulnerabilities, and provides signed trusted images. Backed by a Redis key-value data store, Harbor provides foundational registry services such as replication to other cloud-based registries and access from consumers such as the Docker CLI and the Notary client. A more detailed architecture for Harbor is shown below.
(image courtesy: Architecture Overview of Harbor)
InfoQ caught up with Harbor maintainer Michael Michael, also director of product management at VMware, regarding the graduation of Harbor as a CNCF project.
Michael talks about the origin of the project, the challenges behind ensuring diversity amongst maintainers, support for OCI artifacts in Harbor, and differences between Harbor and other cloud-based registries. He also talks about the pluggable security layer and the roadmap for Harbor.
InfoQ: First things first, let’s explore the journey of Harbor from its conception in 2014 to graduation in 2020. Can you talk about that journey and the challenges that had to be overcome to get to graduation?
Michael Michael: Harbor was originally developed in 2014 by a team at VMware out of a need for a private, enterprise-grade container registry that would enable VMware to assess the security posture of images and enforce compliance. Over time, it grew into a popular open source container image registry that is able to secure images with role-based access control, scan images for vulnerabilities, and sign images for authenticity and traceability.
When Harbor was initially started within VMware's China R&D organization, it was leveraged for a handful of internal projects to manage container images. To allow more developers in the community to use and contribute to the project, VMware open sourced Harbor in March of 2016. We have seen a steady gain of users and contributors since, with a big inflection point in July 2018 when Harbor was donated to the CNCF.
Just a few months after donation, Harbor advanced to the next tier in CNCF and became an incubating project.
Fast forward to today, and Harbor has a rich ecosystem of users, contributors, and community. The graduation of Harbor in CNCF is a testament to the hard work of our community and the importance of Harbor in the ecosystem. We have twelve thousand stars on GitHub, multiple major releases including the big v2.0 release, and multiple organizations that have not only made a bet on Harbor but are also contributing to Harbor’s success.
One of the big challenges we had to solve along the way is ensuring we have diversity in maintainers and contributing companies. We are proud of the 13 maintainers (from five different companies) that are driving the technical direction of Harbor, their efforts further amplified by the 200+ committers from 83 contributing companies.
InfoQ: Harbor is mainly for OCI artifacts, correct? From a container development perspective, what exactly does this mean?
Michael: Pretty much, but we also continue to support Helm charts through ChartMuseum. Harbor initially started supporting container images and Helm charts. Today, Harbor supports any OCI-compliant artifact, extending all the familiar operations and key benefits of Harbor to OCI.
OCI is a tried-and-true industry standard that defines specifications around format, runtime, and the distribution of cloud-native artifacts. Most users are familiar with some of the more popular OCI-compliant artifacts, like Docker images and Helm charts. The OCI specification helps bring artifact authors and registry vendors together behind a common standard. As a developer, I can now adopt the OCI standard for my artifacts and be confident that I can use an OCI-compliant registry like Harbor with minimal to no changes. This means I can push and pull any OCI-compliant artifact to Harbor, and I get to take advantage of Harbor’s policies like RBAC, authentication, quotas, retention policies, garbage collection, and scanning where applicable.
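To make the "common standard" point concrete, here is a minimal sketch of how a client might turn an artifact reference into the manifest request defined by the OCI distribution specification (`GET /v2/<name>/manifests/<reference>`). The host `harbor.example.com`, the helper names `parse_ref` and `manifest_request`, and the fallback to the `latest` tag (a Docker CLI convention) are assumptions made for illustration; they are not part of Harbor. The sketch only builds the request and does not contact a registry or handle authentication.

```python
from __future__ import annotations

from dataclasses import dataclass

# Media type defined by the OCI image specification for image manifests.
OCI_MANIFEST_MEDIA_TYPE = "application/vnd.oci.image.manifest.v1+json"


@dataclass(frozen=True)
class ArtifactRef:
    registry: str    # e.g. "harbor.example.com" (hypothetical host)
    repository: str  # e.g. "library/nginx" (project/repository)
    reference: str   # a tag or a digest


def parse_ref(ref: str) -> ArtifactRef:
    """Split 'registry/repository[:tag]' or 'registry/repository@digest'."""
    registry, _, remainder = ref.partition("/")
    if not registry or not remainder:
        raise ValueError(f"expected 'registry/repository[:tag]', got {ref!r}")
    if "@" in remainder:
        # Digest form: everything after '@' is the reference.
        repository, _, reference = remainder.partition("@")
    elif ":" in remainder.rsplit("/", 1)[-1]:
        # Tag form: the colon must be in the last path component.
        repository, _, reference = remainder.rpartition(":")
    else:
        # Assumption: fall back to 'latest', as the Docker CLI does.
        repository, reference = remainder, "latest"
    return ArtifactRef(registry, repository, reference)


def manifest_request(ref: ArtifactRef) -> tuple[str, dict[str, str]]:
    """Return the URL and headers for fetching the artifact's manifest."""
    url = f"https://{ref.registry}/v2/{ref.repository}/manifests/{ref.reference}"
    return url, {"Accept": OCI_MANIFEST_MEDIA_TYPE}


if __name__ == "__main__":
    url, headers = manifest_request(parse_ref("harbor.example.com/library/nginx:1.19"))
    print(url)
    print(headers)
```

Because the endpoint shape comes from the specification rather than from any one vendor, the same request would apply to any OCI-compliant registry; only the host name changes.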
InfoQ: How does Harbor differ from other registries, like Docker Hub, ACR, ECR, GCR and so on?
Michael: The main difference from other registries is that Harbor is a packaged offering. Users can install Harbor on the infrastructure of their choice, on-premises or in the cloud. As an organization, you are in charge of the deployment, operations, and usage of Harbor. This is a big benefit for some users who can’t take advantage of the public cloud due to regulatory, compliance, or data access reasons. Furthermore, having a registry colocated with your compute cluster (for example Kubernetes) reduces latency and increases reliability in image transfers.
Another big benefit is extensibility. Harbor offers a huge ecosystem of pluggable options, from replication adapters to scanning adapters and, in a couple of months, P2P adapters.
InfoQ: Can you talk about the image scanning layer specifically? There is a choice of using different scanners. Can users pick and choose this and if so, why and when?
Michael: Yes, absolutely. About a year ago, we started down a path of creating an out-of-tree plugin architecture for scanning. Until that time, CoreOS Clair was the only static analysis scanner available with Harbor. The plugin architecture enables security vendors to implement a set of interfaces, perform static analysis on images stored in Harbor, and produce the results that Harbor’s policy uses to satisfy user intent.
Aqua, Anchore, and DoSec were part of the initial set of security vendors that created pluggable scanners for Harbor. Since the initial release of the pluggable architecture, we have also been working with Sysdig on a pluggable scanner that will be released very soon.
In fact, with Harbor v2.0, we have replaced Clair with Aqua’s Trivy as the default image scanner. Trivy takes container image scanning to higher levels of usability and performance than ever before. Since adding support for Trivy through our pluggable scanning framework in Harbor v1.10, we have received great feedback and have seen increasing traction among the Harbor community, making Trivy the perfect complement to Harbor.
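The pluggable scanning idea described above, where a vendor implements an interface and a policy consumes the results, can be sketched roughly as follows. This is an illustration of the pattern only and not Harbor's actual scanner adapter API: the `Scanner` interface, the `StaticListScanner` toy plugin, the `pull_allowed` policy function, and the CVE identifier are all hypothetical.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


@dataclass(frozen=True)
class Finding:
    cve_id: str
    package: str
    severity: Severity


class Scanner(ABC):
    """Hypothetical interface that each scanner plugin would implement."""

    @abstractmethod
    def scan(self, artifact_digest: str) -> list[Finding]:
        """Return the vulnerabilities found in the given artifact."""


class StaticListScanner(Scanner):
    """Toy plugin backed by a fixed lookup table, for illustration only."""

    def __init__(self, known: dict[str, list[Finding]]) -> None:
        self._known = known

    def scan(self, artifact_digest: str) -> list[Finding]:
        return list(self._known.get(artifact_digest, []))


def pull_allowed(scanner: Scanner, artifact_digest: str, block_at: Severity) -> bool:
    """Policy: deny the pull if any finding is at or above `block_at`.

    Assumption: an artifact with no findings is allowed. A stricter policy
    could instead refuse artifacts that have never been scanned.
    """
    return all(f.severity < block_at for f in scanner.scan(artifact_digest))


if __name__ == "__main__":
    digest = "sha256:demo"  # placeholder digest, not a real image
    plugin = StaticListScanner(
        {digest: [Finding("CVE-0000-0001", "example-pkg", Severity.HIGH)]}  # made-up finding
    )
    print(pull_allowed(plugin, digest, Severity.CRITICAL))
    print(pull_allowed(plugin, digest, Severity.HIGH))
```

The policy function depends only on the abstract interface, which is what allows one scanner to be swapped for another without touching the policy logic.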
InfoQ: Can you talk about other details of Harbor that developers/architects should care about and the roadmap for Harbor?
Michael: I encourage everyone to view the webinar “Harbor, the trusted cloud native registry for Kubernetes”. This webinar goes through the key benefits of Harbor and includes a powerful demo that demonstrates the extensibility and policy engine of Harbor. We encourage users to attend our community meetings or engage with us on Slack or Twitter. If you want to learn what’s coming up next in Harbor, or want to find ways to contribute, start with our public roadmap.
Further technical details are in the GitHub repository. The project main page has more information on the project, including an install guide.