While modern tooling for Android and iOS enables memory leak detection using local builds, this is not enough to guarantee that an app behaves correctly with respect to memory in production, where it runs on a wide range of devices in diverse conditions. For this reason, Lyft engineers combine A/B testing and memory observability to detect which features leak memory.
When a large and complex feature is released, it is important to make sure it does not introduce any memory usage regressions. This is especially important if the feature includes native C/C++ code, which has a higher chance of introducing memory leaks.
The general approach Lyft engineers follow consists of measuring the baseline memory behavior of a standard version of their app and comparing it with the data collected from a subset of their user base that received a specific feature in an A/B test.
To profile the overall memory used by their app, Lyft engineers considered a few metrics that Android APIs can report: proportional set size (PSS), unique set size (USS), and resident set size (RSS). They finally settled on RSS, given its performance characteristics and the possibility of using it without incurring any sample rate limit. Lyft engineers are also interested in JVM heap and native heap sizes, whose collection is directly supported by the Android runtime.
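To give a rough idea of what an RSS reading involves: on Linux-based systems such as Android, the kernel exposes per-process memory counters in `/proc/<pid>/statm`, whose second field is the resident set size expressed in pages. The following is a minimal sketch in Python, chosen only for brevity. The function names and the sample line are hypothetical, and the source does not say which API Lyft actually uses to read RSS.

```python
import os


def rss_bytes_from_statm(statm_line: str, page_size: int) -> int:
    """Return the resident set size in bytes from one line of /proc/<pid>/statm.

    The fields are counted in pages: size, resident, shared, text, lib, data, dt.
    """
    fields = statm_line.split()
    if len(fields) < 2:
        raise ValueError("unexpected statm format")
    resident_pages = int(fields[1])
    return resident_pages * page_size


def current_rss_bytes() -> int:
    """Read the RSS of the current process (Linux only; hypothetical helper)."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    with open("/proc/self/statm") as statm:
        return rss_bytes_from_statm(statm.read(), page_size)
```

Reading a single small file is cheap, which is consistent with the performance argument made above, although the actual cost on a given device would need to be measured.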
Memory metrics are collected in two significant scenarios: each time a UI screen closes and at 1-minute intervals to account for the possibility of users remaining a long time on the same UI screen.
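The two triggers described above could be sketched as follows, again in Python for brevity. This is not Lyft's implementation: the `MemorySampler` class, its method names, and the `read_memory` callback are all hypothetical, and the choice to measure the 1-minute interval from the most recent sample of either kind is an assumption the source does not confirm.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

SAMPLE_INTERVAL_SECONDS = 60.0  # the 1-minute interval mentioned in the text


@dataclass
class MemorySampler:
    # Hypothetical callback returning the current memory usage in bytes.
    read_memory: Callable[[], int]
    # Each sample is (timestamp, trigger, bytes).
    samples: List[Tuple[float, str, int]] = field(default_factory=list)
    _last_sample_at: float = 0.0

    def start(self, now: float) -> None:
        self._last_sample_at = now

    def on_screen_closed(self, now: float) -> None:
        # Trigger 1: sample every time a UI screen closes.
        self._record(now, "screen_closed")

    def on_tick(self, now: float) -> None:
        # Trigger 2: called regularly by a timer; samples only when a full
        # interval has elapsed since the last sample (assumption).
        if now - self._last_sample_at >= SAMPLE_INTERVAL_SECONDS:
            self._record(now, "interval")

    def _record(self, now: float, trigger: str) -> None:
        self.samples.append((now, trigger, self.read_memory()))
        self._last_sample_at = now
```

Passing the timestamp in explicitly, rather than reading a clock inside the class, keeps the sketch easy to test with simulated time.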
As mentioned, to understand whether a feature introduces a memory regression, Lyft engineers compare memory behavior for a group of users running a stock version of the app (with the feature disabled) and for a group of users running a version of the app with the feature enabled. Memory usage varies across users, so in both cases the memory footprint of the app is represented by a curve over the user population. If the memory footprint curves differ between the two groups, as shown in the picture below, this is a hint of a memory regression.
A particularly interesting case is that of a memory leak affecting only a small portion of users. In this case, the memory footprint curves will be almost the same, except at the higher percentiles, depending on the size of the affected user base. This case is shown in the following picture for a memory leak occurring in very specific circumstances and affecting only the last percentile of the user base.
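The percentile comparison can be illustrated with a small sketch on synthetic data. Everything here is an assumption made for illustration: the function names, the chosen percentiles, the 10 MB tolerance, and the generated samples are hypothetical and do not come from Lyft's article.

```python
from statistics import quantiles
from typing import Dict, Sequence


def percentile_curve(samples_mb: Sequence[float],
                     points: Sequence[int] = (50, 90, 99)) -> Dict[int, float]:
    """Return the memory footprint at the requested percentiles."""
    # quantiles(n=100) returns 99 cut points; cuts[p - 1] is the p-th percentile.
    cuts = quantiles(samples_mb, n=100, method="inclusive")
    return {p: cuts[p - 1] for p in points}


def flag_regressions(control: Dict[int, float],
                     treatment: Dict[int, float],
                     tolerance_mb: float = 10.0) -> Dict[int, float]:
    """Return the percentiles where treatment exceeds control by more than the tolerance."""
    deltas = {p: treatment[p] - control[p] for p in control}
    return {p: d for p, d in deltas.items() if d > tolerance_mb}


# Synthetic control group: 1000 users between 200 MB and about 300 MB.
control_samples = [200 + i * 0.1 for i in range(1000)]
# Synthetic treatment group: same users, but the top 2% leak an extra 300 MB.
treatment_samples = control_samples[:980] + [v + 300 for v in control_samples[980:]]

flagged = flag_regressions(percentile_curve(control_samples),
                           percentile_curve(treatment_samples))
```

With this data, the 50th and 90th percentiles are identical in both groups and only the 99th percentile is flagged, which mirrors the case where a comparison of medians or averages alone would hide the leak.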
According to Lyft engineers, the approach described in their article proved effective at detecting memory leaks, especially leaks that occur in very specific circumstances and are easy to miss when testing locally. Do not miss the original article for the full details.