Netflix engineers recently described how they use Elasticsearch Percolate Queries to "reverse search" entities in a connected graph. In a reverse search, instead of searching for documents that match a query, they search for queries that match a document. This powers dynamic subscription and notification scenarios in which there is no direct association between the subscriber and the subscribed entities.
Netflix's engineers Ricky Gardiner, Alex Hutter, and Katie Lefevre describe a core scenario in which an employee wishes to receive update notifications on various events related to a subset of movies based on dynamic criteria like "movies shooting in Mexico City which don't have a key role assigned":
"Tiffany [the employee] is not subscribing to updates of particular movies, but subscribing to queries that return a dynamic subset of movies. This poses an issue for those of us responsible for sending her those notifications. When a movie changes, we don't know who to notify, since there's no association between employees and the movies they're interested in."
A naive solution would be to re-execute every saved search query on every change event to determine the relevant subscribers. However, because fetching data from the Netflix Federated Graph generates heavy traffic, this approach would force the team to choose between timely notifications and lower load on the graph.
Instead, Netflix's engineers implemented this functionality using Percolate Queries. A Percolate Query is a specialized query mechanism within Elasticsearch and OpenSearch that allows users to index queries themselves and later match these queries against incoming documents.
[Figure: Netflix's Reverse Search]
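To make the percolation mechanism concrete, the following is a minimal sketch of the three request bodies involved: an index mapping with a percolator field, a document that stores a query, and a percolate query that matches a changed document against the stored queries. It only builds the JSON bodies and does not talk to a cluster. The field names (query, subscriber, shooting_location, key_role_assigned) are assumptions made up for illustration and are not taken from Netflix's schema.

```python
import json

# Mapping for an index that stores queries. The "percolator" field type holds
# the stored query; every field a stored query refers to must also be mapped.
# All field names here are hypothetical.
PERCOLATOR_MAPPING = {
    "mappings": {
        "properties": {
            "query": {"type": "percolator"},
            "subscriber": {"type": "keyword"},
            "shooting_location": {"type": "keyword"},
            "key_role_assigned": {"type": "boolean"},
        }
    }
}


def saved_search_document(subscriber):
    """Body of a document that indexes a query instead of data.

    The query encodes "movies shooting in Mexico City which don't have a
    key role assigned", using the hypothetical fields above.
    """
    return {
        "subscriber": subscriber,
        "query": {
            "bool": {
                "filter": [
                    {"term": {"shooting_location": "Mexico City"}},
                    {"term": {"key_role_assigned": False}},
                ]
            }
        },
    }


def percolate_request(changed_movie):
    """Search body asking: which stored queries match this document?"""
    return {
        "query": {
            "percolate": {
                "field": "query",           # the percolator field in the mapping
                "document": changed_movie,  # the document to match against
            }
        }
    }


if __name__ == "__main__":
    movie = {"shooting_location": "Mexico City", "key_role_assigned": False}
    print(json.dumps(percolate_request(movie), indent=2))
```

The hits returned for the percolate request are the stored query documents, so the subscriber can be read directly from each hit.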
To allow users to use this functionality, the Netflix team added a new ReverseSearch Domain Graph Service (DGS). Through this DGS, they expose a new SavedSearch entity:
type SavedSearch {
    id: ID!
    filter: String
    index: SearchIndex!
}
The filter is written in Graph Search DSL, converted to an Elasticsearch query, and indexed in a percolator field. When a change event occurs, the changed document is evaluated against the index with a percolate query, and the relevant subscribers are determined from the saved queries that match it.
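The notification flow described above can be sketched in plain Python, with an in-memory list standing in for the percolator index. This is an illustration of the idea only, not Netflix's implementation: the subscriber attribute, the criteria dictionary (a stand-in for the Graph Search DSL filter, limited to equality checks), and the function names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SavedSearch:
    id: str
    subscriber: str  # hypothetical: who to notify when the search matches
    criteria: dict   # stand-in for the DSL filter: field -> required value


def matches(search, document):
    """True if the document satisfies every equality criterion of the search."""
    return all(document.get(field) == value
               for field, value in search.criteria.items())


def subscribers_to_notify(saved_searches, changed_document):
    """Reverse search: find the saved searches that match one document."""
    return sorted({s.subscriber for s in saved_searches
                   if matches(s, changed_document)})


if __name__ == "__main__":
    searches = [
        SavedSearch("s1", "tiffany",
                    {"shooting_location": "Mexico City",
                     "key_role_assigned": False}),
        SavedSearch("s2", "sam", {"shooting_location": "Toronto"}),
    ]
    changed = {"title": "Movie A",
               "shooting_location": "Mexico City",
               "key_role_assigned": False}
    print(subscribers_to_notify(searches, changed))  # ['tiffany']
```

Unlike the naive approach, a single change event triggers one matching pass over the stored searches rather than one graph query per saved search.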
Netflix faced significant versioning challenges in its indexing system as the team expanded the capabilities of Graph Search. Specifically, when new fields were introduced, the existing search indices had to be updated to include them. Because the existing index had no mappings for these new fields, it was impossible to filter on them without creating a new index version.
Netflix solved this by implementing a dual-pipeline indexing system in which each version of an index has its own dedicated pipeline. When a change required a new index mapping, Netflix would create a new version of the Elasticsearch index alongside the existing one. The team used log-compacted topics from their Data Mesh platform to feed the new pipeline, which allowed the entire corpus to be reindexed without requiring the data sources to resend all past events.
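Why a log-compacted topic is enough to rebuild an index can be shown with a small simulation. The sketch below assumes Kafka-style compaction semantics, where only the latest value per key is retained and a None value acts as a delete marker; the source does not detail how Data Mesh topics behave, so treat both the semantics and the names as assumptions.

```python
def compact(log):
    """Reduce an ordered list of (key, value) events to the latest state per key.

    Assumes Kafka-style compaction: a later event for a key replaces the
    earlier one, and a value of None removes the key.
    """
    latest = {}
    for key, value in log:
        if value is None:
            latest.pop(key, None)  # delete marker: drop the key if present
        else:
            latest[key] = value
    return latest


def rebuild_index(log):
    """Populate a fresh index (here a dict) by replaying the compacted log."""
    return dict(compact(log))


if __name__ == "__main__":
    events = [
        ("movie-1", {"shooting_location": "Toronto"}),
        ("movie-2", {"shooting_location": "Mexico City"}),
        ("movie-1", {"shooting_location": "Mexico City"}),  # supersedes first
        ("movie-2", None),                                   # movie-2 deleted
    ]
    print(rebuild_index(events))
```

Replaying the compacted state yields the same final index as replaying the full history, which is why the sources do not need to resend past events.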
Running the old and new pipelines in parallel ensured continuous service while the new index was being populated. Once the backlog was processed and the new index was ready, Netflix switched to the newer version using Elasticsearch index aliases, which streamlined the transition and minimized disruptions in search functionality.
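The alias switch can be illustrated with the request body for the Elasticsearch aliases API (POST /_aliases), which applies a list of actions atomically. The sketch below only builds that body; the alias and index names are hypothetical, and the article does not say exactly how Netflix issues the switch.

```python
import json


def alias_swap_actions(alias, old_index, new_index):
    """Body for POST /_aliases that repoints an alias in one atomic step.

    Readers that query the alias never observe a moment in which it points
    at no index or at both indices.
    """
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


if __name__ == "__main__":
    # Hypothetical names: clients always query "movies", never a versioned index.
    body = alias_swap_actions("movies", "movies_v1", "movies_v2")
    print(json.dumps(body, indent=2))
```

Because clients address the alias rather than a versioned index name, the cutover requires no client-side change.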