Grafana Labs Announces GA of Cortex v1.0 and Discusses Architectural Changes

Grafana Labs, the company behind the popular open-source monitoring projects Grafana and Loki, announced the General Availability of Cortex v1.0. Cortex is a clustered Prometheus implementation that includes features such as horizontal scalability, multi-tenancy, durability, and long-term storage.

Started in June 2016, Cortex is an Apache-licensed, open-source project and is part of the Cloud Native Computing Foundation (CNCF) Sandbox. It has been in production for three years as the backend for Grafana Cloud, Grafana Labs' managed logging and metrics platform, whose clusters handle on the order of tens of millions of time series. When asked why Cortex is going GA now, Tom Wilkie, VP of product for Grafana Labs and creator of Cortex, said:

If you’ve been running Cortex over the last three years, you’ve probably been one of the maintainers. The real push here is that we think it’s stable. We think it’s easy to use, and we think it’s ready for more wide adoption.

Prometheus is a real-time metrics and alerting system built around a time series database. It was the second project accepted into the CNCF, after Kubernetes, and has become the de facto standard for monitoring Kubernetes today. Prometheus adoption has seen a 15x growth since its inception and has over 250,000 active instances.

One of the challenges of running Prometheus in larger organizations is that installations can often be very dispersed with many teams running their own Prometheus instances. Thanos, a similar open-source CNCF sandbox project that operates in this space, excels at keeping data closer to those Prometheuses and federating queries to them. Cortex, on the other hand, takes a more centralized approach and allows organizations to have a common observability team that provides more of an internal Prometheus service to an organization.

In addition to its managed cloud offering, Grafana Labs offers enterprise support subscriptions for organizations running Cortex on-premises. Wilkie says: "Cortex has really hit the mainstream this past year helping enterprises adopt Prometheus, and leading the charge to deliver a scalable solution with blazing fast Prometheus metrics."

It is this experience helping enterprises adopt Cortex that has led to improved stability, documentation, and a reduction in the complexity of the system.

Early complaints when installing Cortex involved the complexity required in running its microservice architecture and a lack of support for block storage. Both issues have been addressed in the GA release.

Cortex can now be installed as independently scalable microservices or as a single process. Gone is the hard dependency on installing, running, and orchestrating 15 different microservices for a simple installation. Cortex can be easily installed as a single process -- a single binary -- on your laptop with one command.

In addition, Cortex originally depended on a NoSQL storage engine, such as Apache Cassandra. This storage has allowed Grafana Labs to run some of the largest Prometheus installations in the world. However, that dependency on NoSQL also brought operational costs that proved a stumbling block for many enterprises. Cortex can therefore also be run in a mode that only needs an object store (such as S3, GCS, or Azure Blob Store).

Working together with the team building Thanos, the Cortex team has contributed improvements to Thanos and shared the code base; features such as query caching, parallelization, and sharding have all been added. As a result of this collaboration, both Cortex and Thanos are now seeing comparable query performance between block and NoSQL storage. Wilkie said, “Cortex is now easy to operate, well documented and comes with all the playbooks, dashboards, and alerts you need for real production operation and it’s cheap to run because you can now use a block store and get those kinds of total cost of ownership advantages.”

Cortex v1.0 is available immediately and you can find more information on the project online.
