Part of the upcoming Grafana Tempo 2.0, TraceQL is a query language aiming to make it simple to interactively search and extract traces. This will speed up the process of diagnosing and responding to root causes, says Grafana.
Distributed traces contain a wealth of information that can help you track down bugs, identify root causes, analyze performance, and more. And while tools like auto-instrumentation make it easy to start capturing data, it can be much harder to extract value from that data.
According to Grafana, existing tracing solutions are not flexible enough when it comes to searching for traces if you do not know exactly which traces you need or if you want to reconstruct the context of a chain of events. That is why TraceQL has been designed from the ground up to work with traces. The following example shows how you can find traces in which database insert operations took longer than one second on average:
{ .db.statement =~ "INSERT.*" } | avg(duration) > 1s
TraceQL can select traces using spans, timing, and durations; aggregate data from the spans in a trace; and use structural relationships between spans. A query is built as a set of chained expressions that select or discard spansets, e.g.:
{ .http.status = 200 } | by(.namespace) | count() > 3
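To make the chaining concrete, here is a minimal Python sketch of how the query above could be read, stage by stage, against a single trace. It is only a simplified in-memory model, not how Tempo evaluates TraceQL: spans are plain dictionaries, and the helper names `spanset_filter`, `group_by`, and `trace_matches` are hypothetical.

```python
from collections import defaultdict

# Assumption: a trace is modeled as a list of spans, each a dict of attributes.

def spanset_filter(spans, predicate):
    """Keep only the spans for which predicate(span) is true."""
    return [s for s in spans if predicate(s)]

def group_by(spans, attribute):
    """Split one spanset into several, one per distinct attribute value."""
    groups = defaultdict(list)
    for s in spans:
        groups[s.get(attribute)].append(s)
    return list(groups.values())

def trace_matches(spans):
    """Rough analogue of: { .http.status = 200 } | by(.namespace) | count() > 3"""
    selected = spanset_filter(spans, lambda s: s.get("http.status") == 200)
    spansets = group_by(selected, "namespace")
    # Discard every spanset that does not contain more than three spans.
    surviving = [ss for ss in spansets if len(ss) > 3]
    return len(surviving) > 0

# Hypothetical trace: four successful spans in "checkout", two in "cart".
trace = (
    [{"http.status": 200, "namespace": "checkout"}] * 4
    + [{"http.status": 500, "namespace": "checkout"}]
    + [{"http.status": 200, "namespace": "cart"}] * 2
)
print(trace_matches(trace))
```

In this model each stage receives the spansets produced by the previous one, and the trace is selected only if at least one spanset survives the final stage.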
TraceQL supports attribute fields, expressions over those fields, combining and aggregating spansets, grouping, pipelining, and more. The next example shows how you can find all traces that crossed two regions in a specific order:
{ .region = "eu-west-0" } >> { .region = "eu-west-1" }
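The `>>` operator in the query above expresses a structural relationship: it keeps spans matching the right-hand filter that sit somewhere below a span matching the left-hand filter. The following Python sketch illustrates that idea on a toy trace. The span layout (`id`, `parent_id`, `region`) and the function name `descendants_matching` are assumptions made for illustration, not Tempo's data model or API.

```python
def descendants_matching(spans, is_ancestor, is_descendant):
    """Return spans satisfying is_descendant that have an ancestor satisfying is_ancestor."""
    by_id = {s["id"]: s for s in spans}
    result = []
    for s in spans:
        if not is_descendant(s):
            continue
        # Walk up the parent chain until a matching ancestor or the root is reached.
        parent_id = s.get("parent_id")
        while parent_id is not None:
            parent = by_id.get(parent_id)
            if parent is None:
                break
            if is_ancestor(parent):
                result.append(s)
                break
            parent_id = parent.get("parent_id")
    return result

# Hypothetical trace: the request starts in eu-west-0 and later reaches eu-west-1.
trace = [
    {"id": 1, "parent_id": None, "region": "eu-west-0"},
    {"id": 2, "parent_id": 1, "region": "eu-west-0"},
    {"id": 3, "parent_id": 2, "region": "eu-west-1"},
]
matches = descendants_matching(
    trace,
    lambda s: s["region"] == "eu-west-0",
    lambda s: s["region"] == "eu-west-1",
)
print([s["id"] for s in matches])
```

A trace that visited the regions in the opposite order would produce no matches in this model, which is what makes the query order-sensitive.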
TraceQL is data-type aware, meaning you can express queries in terms of text, integers, and other data types. Additionally, TraceQL supports the new Apache Parquet-compatible storage format in Tempo 2.0. Parquet is a columnar data file format that is supported by a number of databases and analytics tools.
As mentioned, TraceQL will be part of Tempo 2.0, which will be released in the coming weeks, but it can also be previewed in Grafana 9.3.1.