Twitter recently open sourced AnomalyDetection, an anomaly detection package for R. Anomaly detection is an active field of study because an anomaly can mean very different things. A spike in followers or favorites around a topic may signal that something important is happening and should be surfaced across the network. The same spike can also be caused by bots and spammers, in which case the activity has to be contained instead.
Late last year, Twitter released BreakoutDetection, an open source R package that makes breakout detection simple and fast. In this package, a breakout is characterized by two steady states and an intermediate transition period. The transition between the two states may be sudden or gradual. The package is designed to detect breakouts in a statistically robust way, even in the presence of anomalies. Using the E-Divisive with Medians algorithm, it can detect one or multiple breakouts in a given time series.
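To make the idea of a breakout between two steady states concrete, here is a minimal Python sketch. It is not the E-Divisive with Medians algorithm and not the BreakoutDetection API. It is a much simpler stand-in that finds a single split point by minimizing the absolute deviation around each segment's median. The function names, the `min_size` parameter, and the sample data are all hypothetical.

```python
import statistics


def abs_dev_cost(segment):
    """Sum of absolute deviations from the segment's median."""
    med = statistics.median(segment)
    return sum(abs(x - med) for x in segment)


def find_single_breakout(series, min_size=3):
    """Return the index where the second steady state starts.

    Illustrative assumption: exactly one sudden breakout, located by
    choosing the split that minimizes the total absolute deviation of
    both segments around their own medians.
    """
    candidates = range(min_size, len(series) - min_size + 1)
    return min(
        candidates,
        key=lambda k: abs_dev_cost(series[:k]) + abs_dev_cost(series[k:]),
    )


# Two steady states (5 and 15) with one anomalous spike in the first one.
series = [5.0] * 10 + [15.0] * 10
series[3] = 100.0

split = find_single_breakout(series)
print(split)
```

Because the cost is built on medians rather than means, the single spike at index 3 does not pull the detected split away from the real level shift.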
In contrast to breakout detection, anomaly detection refers to identifying point-in-time anomalous data points. An anomaly can be global or local. A local anomaly occurs inside a seasonal pattern, for example an extra five percent boost on top of the usual increase in activity over the Christmas period. Local anomalies are harder to detect than global anomalies, which typically extend above or below the expected seasonal range.
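The difference between global and local anomalies can be sketched in a few lines of Python. This is an illustration only, not how the AnomalyDetection package handles seasonality. The seasonal profile here is just the per-phase median, and the helper names, the period, and the tolerances are hypothetical values chosen for the example.

```python
import statistics


def seasonal_residuals(series, period):
    """Subtract the median of each seasonal phase from the series."""
    profile = [statistics.median(series[phase::period]) for phase in range(period)]
    return [x - profile[i % period] for i, x in enumerate(series)]


def flag(values, center, tolerance):
    """Indices of values farther than `tolerance` from `center`."""
    return [i for i, v in enumerate(values) if abs(v - center) > tolerance]


# Six repetitions of a seasonal pattern with period 4.
pattern = [10, 20, 50, 20]
series = pattern * 6
series[8] = 25    # local anomaly: high for its phase, but inside the usual 10..50 range
series[14] = 120  # global anomaly: far outside the usual range

# Looking at raw values only finds the global anomaly.
global_only = flag(series, statistics.median(series), tolerance=45)

# Removing the seasonal profile first exposes the local anomaly as well.
seasonal = flag(seasonal_residuals(series, period=4), center=0, tolerance=10)

print(global_only, seasonal)
```

The value 25 is unremarkable against the whole series, yet it stands out once it is compared with what is normal for its position in the seasonal cycle.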
Anomalies can also be positive or negative. A positive anomaly may be a surge in tweets during the Super Bowl, while a negative anomaly, such as a drop in transactions per second, may point to hardware or infrastructure issues.
The AnomalyDetection package is built on the Generalized ESD (extreme Studentized deviate) test. It can detect both global and local anomalies, in the positive and the negative direction. The code is available on GitHub.
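For readers unfamiliar with the Generalized ESD test, the following Python sketch implements the classic two-sided version of the test as described by Rosner, using only the standard library. It is not the package's code and makes no claim about how the package applies the test to seasonal data. The function names and sample readings are hypothetical, and the Student t quantile is computed by simple numerical integration and bisection to avoid external dependencies.

```python
import math
import statistics


def t_pdf(x, df):
    """Density of the Student t distribution."""
    c = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2))
    return c / math.sqrt(df * math.pi) * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """CDF for x >= 0, via Simpson's rule on [0, x]."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * t_pdf(j * h, df)
    return 0.5 + total * h / 3


def t_ppf(p, df):
    """Quantile for p > 0.5, found by bisection on the CDF."""
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < p:
        lo, hi = hi, hi * 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def generalized_esd(values, max_outliers, alpha=0.05):
    """Return indices of outliers found by the two-sided Generalized ESD test."""
    n = len(values)
    remaining = list(enumerate(values))
    removed = []
    n_outliers = 0
    for i in range(1, min(max_outliers, n - 2) + 1):
        xs = [v for _, v in remaining]
        mean = statistics.fmean(xs)
        sd = statistics.stdev(xs)
        if sd == 0:
            break  # all remaining values are identical
        # Test statistic R_i: the most extreme studentized deviation.
        pos = max(range(len(xs)), key=lambda j: abs(xs[j] - mean))
        r_i = abs(xs[pos] - mean) / sd
        # Critical value lambda_i from the t distribution.
        p = 1 - alpha / (2 * (n - i + 1))
        t = t_ppf(p, n - i - 1)
        lam = (n - i) * t / math.sqrt((n - i - 1 + t * t) * (n - i + 1))
        removed.append(remaining.pop(pos)[0])
        if r_i > lam:
            n_outliers = i  # the largest i with R_i > lambda_i wins
    return removed[:n_outliers]


readings = [10.0, 10.2, 9.8, 10.1, 9.9, 10.3, 9.7, 10.0, 10.2, 9.8] * 2
readings[7] = 50.0  # one injected anomaly

outliers = generalized_esd(readings, max_outliers=3)
print(outliers)
```

The test only needs an upper bound on the number of anomalies (`max_outliers`), not the exact count, which is what makes it practical when the number of anomalies is unknown in advance.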