Pinterest has modernized and enhanced its Goku time-series database. The recent updates focus on optimizing storage and resource usage without compromising service quality.
Pinterest built Goku, an in-house database engine, to address specific limitations in OpenTSDB. You can read a detailed explanation of what motivated Pinterest in this article.
In a recent blog post, the Goku team introduced two features, a metrics namespace and providing top write-heavy metrics, to the client Observability team that helped them reduce the data stored on Goku. These innovations have significantly reduced data storage requirements. The metrics namespace organizes metric configurations, allowing for efficient data management. Meanwhile, the system for identifying top metrics helps the Observability team block unnecessary data, reducing stored time series by 37%.
The namespace configurations (Fig 1) are stored in a dynamic shared config file watched by all hosts in the Goku ecosystem (Fig 2). Any moment the contents of this file change, the Goku process running on the hosts is notified, and it parses the new content to understand the changes.
Several architectural changes further reduce infrastructure costs. Indexing improvements for metric names have cut memory usage from 12 GB to 3 GB per host. Additionally, the introduction of dictionary encoding in the Goku Compactor has eliminated out-of-memory issues, allowing for the use of less expensive hardware.
Pinterest has also focused on optimizing memory allocation. By addressing internal fragmentation and over-allocated memory, the team has achieved significant memory savings. For instance, changes to the folly::IOBuf structure have reduced memory usage by 8–11 GB per host.
Time-series compression algorithms are crucial for efficiently storing and processing large volumes of time-stamped data. These algorithms reduce the data size by identifying patterns and redundancies, allowing for faster query processing and reduced storage costs. Techniques such as delta encoding, delta-of-delta encoding, and XOR-based compression are employed to achieve significant storage efficiencies. For instance, TimescaleDB, an open-source time-series database, uses these algorithms to achieve over 90% storage efficiency, translating into substantial cost savings and improved query performance.
Meta, in its Gorilla time-series database, also leverages compression techniques like delta-of-delta timestamps and XOR’d floating point values to reduce storage footprint by 10x, significantly enhancing query efficiency. More details can be found in their research paper.
Pinterest's efforts are part of a broader trend in the tech industry towards optimizing time series data management systems. Similar initiatives include Apple's FiloDB, Netflix's Atlas, Uber’s M3, Meta Gorilla and Salesforce's Argus. These projects, like Goku, focus on efficient time-series data management and some are available as open-source solutions on platforms like GitHub. They represent a collective move towards more scalable and cost-effective data infrastructure.
These enhancements have led Pinterest to achieve a 40% reduction in time series storage and a 70% decrease in costs. The improvements also allow Pinterest to accommodate a 30% increase in organic storage growth without additional capacity. A relevant discussion on observability spending and efficiency can be found in this Reddit thread, where industry professionals share insights and benchmarks. For example, serverlessmom on Reddit, stated:
To level set: in many enterprises Observability is the #2 costs after actual infra/hosting. People are paying a lot to know what their software is doing.