At QCon New York 2019, Andjelko Iharos, director of engineering at HAProxy Technologies, presented how the HAProxy Technologies CTO Willy Tarreau and the team implemented an EBtree data structure to optimize performance and memory usage of the HAProxy load balancer scheduler.
HAProxy is an open source load balancer providing a range of load balancing methods, custom routing, ACL support, and Lua scripting. HAProxy is designed to handle high traffic websites, and its development has continually iterated on improving load balancing performance and features.
A key component of the HAProxy load balancer is the scheduler, which maintains queues of running and suspended processing tasks for the handling of active connections. The scheduler also manages task insertion and deletion while allowing for identical expiration times, and sorts task reads based on task expiration time and priority.
HAProxy Scheduler, from Andjelko Iharos's QCON Presentation Slides
Due to the high traffic environments HAProxy is designed to handle, Iharos highlighted that the scheduler needed to handle a high frequency of events, a large number of queue entries, frequent lookups, and a wide range in the rate of entry change. When designing the scheduler, Iharos noted the desired qualities where speed, predictability, and simplicity.
With these requirements in hand, Tarreau began the implementation by reviewing how common data structures would perform against the desired characteristics. An early build of the scheduler used linked lists because they offer quick iteration and constant time insert. However, the benefits of linked linked were lost when the scheduler needed to prioritize tasks by their expiration time. Prioritization required task lists to be sorted after every insert and so the constant time insert benefit was lost.
To better meet the prioritization requirements, tree data structures that better supported sorting events were evaluated. Prefix trees, where nodes share a common prefix, provided several features the scheduler could leverage. Prefix trees offer quick comparison of nodes even with long keys and naturally enable prefix matching. However, memory management is more complex relative to other trees and prefix trees can more easily become imbalanced.
To compensate these downsides, Tarreau designed an Elastic Binary tree (EBtree) data structure. The EBtree is optimized for frequent inserts and deletes without additional memory management. EBtree elements are nodes and leaves where the nodes maintain the relationship between upper and lower nodes and leaves while leaves store the key data and a pointer to its upper node. As new data is inserted, a combined node and leaf structure is allocated, the pointers of the nodes and leaves are adjusted, and new nodes can be inserted between an existing node and leaf pair. Due to the nature of the pointer readjustment algorithms, memory allocated for nodes doesn’t need to be accounted for separately, and is released at the same time as leaf memory. This eliminates the need for additional memory allocation and cleanup on insert and delete operations.
EBTree Data Structure, from Andjelko Iharos's QCON Presentation Slides
EBtrees were implemented to manage suspended and active tasks in the HAProxy scheduler and have been extended for use cases including timers, ACL, and LRU caches. The EBtree performance in the HAProxy scheduler has supported inserts down to 100 nanoseconds for more than 200,000 TCP connections per second and 350,000 HTTP requests per second while using up only 3-5% CPU.
Tarreau covers the EBtree implementation in detail in his blog. You can access the slides from Iharos’s presentation on the QCON website.