The Hazards of Retention

Retention is a pretty standard way to measure the engagement of users with an app, but is often misunderstood, so I’d like to take this opportunity to make some observations about retention, and in particular debunk some commonly held perceptions of retention statistics.

We’ll kick off with a definition. Retention is usually defined as the percentage of users that return to your app a given period of time after their first session. It’s usually given in terms of days, so day-7 retention is the percentage of users that return to the app 7 days after first trying it. This sounds simple, right? It’s just how many users return to my app after being away for a bit.

Well actually (and unfortunately): no. That definition says nothing about what the users do during the period of time.

Interesting Fact #1: when measuring day-n retention, we don’t care what users do for all other days up to n.

There are in fact two very different definitions of retention, often called rolling and absolute retention. With absolute retention, we care about the users who returned to the application on the day concerned. With rolling retention, we account for all the users who’ve returned to the app at any time up to the day in question. You can hopefully see that the absolute retained users are a subset of the rolling retained users.

Some analytics vendors have always used rolling retention, which causes great confusion when comparing metrics between vendors. But the industry norm is to use the slightly more useful (and conservative) absolute retention. For day-7 retention, we look at a cohort of users who had their first session on a particular day, and report the percentage of those users that had at least one session on the day 7 days later, disregarding whether they had a session at any time in between (or after). It’s effectively a sample of user behavior, taken at specific intervals.
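To make the difference concrete, here is a minimal Python sketch of both measures, following the wording used above. The data layout (`first_seen` mapping each user to an install date, `sessions` mapping each user to the set of dates with at least one session) and the function names are assumptions made for illustration, not any vendor's API.

```python
from datetime import date, timedelta

def absolute_retention(first_seen, sessions, cohort_day, n):
    """Share of the cohort with at least one session exactly n days after install."""
    cohort = [u for u, d in first_seen.items() if d == cohort_day]
    if not cohort:
        return 0.0
    target = cohort_day + timedelta(days=n)
    retained = [u for u in cohort if target in sessions.get(u, set())]
    return len(retained) / len(cohort)

def rolling_retention(first_seen, sessions, cohort_day, n):
    """Share of the cohort with a session on any day from day 1 up to day n."""
    cohort = [u for u, d in first_seen.items() if d == cohort_day]
    if not cohort:
        return 0.0
    window = {cohort_day + timedelta(days=k) for k in range(1, n + 1)}
    retained = [u for u in cohort if sessions.get(u, set()) & window]
    return len(retained) / len(cohort)

# Hypothetical cohort of four users who all installed on the same day.
install = date(2024, 1, 1)
first_seen = {"a": install, "b": install, "c": install, "d": install}
sessions = {
    "a": {install, date(2024, 1, 8)},                    # back on day 7
    "b": {install, date(2024, 1, 3)},                    # back on day 2 only
    "c": {install},                                      # never returns
    "d": {install, date(2024, 1, 4), date(2024, 1, 8)},  # back on days 3 and 7
}

abs_d7 = absolute_retention(first_seen, sessions, install, 7)   # 0.5
roll_d7 = rolling_retention(first_seen, sessions, install, 7)   # 0.75
```

User "b" counts towards rolling day-7 retention but not absolute day-7 retention, which is why the rolling figure can never be lower than the absolute one for the same cohort.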

Interesting Fact #2: a retention statistic doesn’t give a good measure of the recurring behavior of your users.

If you think about it, retention statistics are one-shot. You get to measure the statistic once per user or per day cohort (meaning all the users who installed on a specific day). You could take a day-7 measurement, and then a day-14, and then a day-21 and so on, but from those numbers alone you’ve no idea if a user who was “day-7 retained” was also “day-14 retained”. These measures are not directly connected, other than being samples drawn from the same set of users.

To say that a user cohort has a tendency to return on a weekly basis to the app, you need to use more information than this.
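As a rough illustration of the extra information needed, the sketch below works from per-user activity rather than from the aggregate percentages. The `active_offsets` structure (each user mapped to the set of day offsets since install on which they had a session) and the function names are hypothetical.

```python
def retained_on(active_offsets, n):
    """Users who had a session exactly n days after install."""
    return {u for u, offsets in active_offsets.items() if n in offsets}

def overlap(active_offsets, n, m):
    """Of the users retained on day n, the fraction also retained on day m."""
    first = retained_on(active_offsets, n)
    if not first:
        return None  # nobody was retained on day n, so the ratio is undefined
    return len(first & retained_on(active_offsets, m)) / len(first)

# Hypothetical cohort: day-7 and day-14 retention are both 50%.
active_offsets = {
    "a": {0, 7, 14},  # returns on both days
    "b": {0, 7},      # day 7 only
    "c": {0, 14},     # day 14 only
    "d": {0},         # never returns
}

d7 = len(retained_on(active_offsets, 7)) / len(active_offsets)    # 0.5
d14 = len(retained_on(active_offsets, 14)) / len(active_offsets)  # 0.5
both = overlap(active_offsets, 7, 14)                             # 0.5
```

In this made-up cohort the two headline numbers are identical, yet only half of the day-7 users came back on day 14. The aggregate statistics alone cannot tell you that.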

Interesting Fact #3: the 40:20:10 rule isn’t global

You may have heard of the holy trinity of app retention numbers: day-1, day-7 and day-30 retention statistics, with 40%, 20% and 10% often quoted as the ‘targets’ for these respectively. For example, with a day-1 retention of 40% we’re saying that 40% of the users who installed the app yesterday returned for a session today. Here’s an app that exhibits close to this behavior:

 

The reality is this applies to certain types of apps, and in particular to entertainment applications, where the lifetime of a user is usually anticipated to be short, or where there’s a specific amount of content that a user can consume in the app. But many apps have more cyclic behavior patterns, with strong weekly retention patterns, but relatively lower daily patterns.

Interesting Fact #4: retention numbers can have periodic variation

Here’s an example of an app which shows strong periodic behavior in its day-1 retention. You’ll see this a lot in apps that are expected to be used daily. In this case (an entertainment app), the troughs in day-1 retention always happen on Saturday. And if you think about it, this makes sense: at weekends most folks do other things, and entertainment apps get most usage during the week, when travelling to work, or during lunch breaks.

Therefore Saturday is the day where you are less likely to return on the next day (i.e. Sunday), in comparison to a weekday. When looking at your retention numbers, make sure to account for this variation; in this example it varies from about 22% on Saturday to 26% mid-week. That’s 4 percentage points, or nearly a 20% variation in the metric relative to the trough.
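The arithmetic behind that figure is worth spelling out, because the answer depends on which value you divide by. A small sketch, using the 22% and 26% values quoted above (the function names are just for illustration):

```python
def spread_vs_trough(trough, peak):
    """Peak-to-trough spread as a fraction of the trough value."""
    return (peak - trough) / trough

def spread_vs_peak(trough, peak):
    """Peak-to-trough spread as a fraction of the peak value."""
    return (peak - trough) / peak

vs_trough = spread_vs_trough(22.0, 26.0)  # about 0.18
vs_peak = spread_vs_peak(22.0, 26.0)      # about 0.15
```

Either way, a swing of this size from the day of the week alone is large enough to mask or mimic the effect of a product change, so compare like weekdays with like.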

 

Interesting Fact #5: retention numbers are reported in the past

If you think about it for a minute, you’ll realise that we don’t know the day-30 retention stat for January 1st until January the 31st, when we’re finally in a position to count the number of users who installed the app on January the 1st and then had a session 30 days later. The number is normally plotted on a graph against January 1st, even though that’s now 30 days ago. Why do this? Well, it allows us to compare all of the day-n statistics for a given set of users.

 

If you look at the example above, on a given date, say the 20th of the first month, we can read the day-1, the day-7 and the day-30 statistics for the users that installed the app on that day (i.e. the 20th). We can compare like with like, with data that is collected at different points in time. You can also see what happens when we get close to “today”, which in the example above is the 23rd. For the 21st, on the right, we know the users’ day-1 retention, but we don’t have day-7 or day-30 numbers because not enough time has elapsed yet to make those measurements.
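The rule for which numbers exist on a given date can be written down in a few lines. This is a sketch under one assumption that the text above implies but does not state outright: a day-n figure is only reported once the measurement day has fully elapsed, so it must fall strictly before “today”. The function name and the specific dates are hypothetical.

```python
from datetime import date, timedelta

def available_metrics(cohort_day, today, horizons=(1, 7, 30)):
    """Day-n horizons whose measurement day (cohort_day + n) is already complete."""
    return [n for n in horizons if cohort_day + timedelta(days=n) < today]

today = date(2024, 3, 23)  # hypothetical "today", matching the 23rd in the example

recent = available_metrics(date(2024, 3, 21), today)     # [1]
yesterday = available_metrics(date(2024, 3, 22), today)  # []
older = available_metrics(date(2024, 2, 20), today)      # [1, 7, 30]
```

This is why the right-hand edge of a retention graph always looks ragged: the day-1 line runs closest to today, the day-7 line stops about a week earlier, and the day-30 line stops about a month earlier.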

As an added bonus, if you check out the graph prior to this one you’ll spot a dip in day-30 retention on the 2nd, which means that on the 1st of the subsequent month, there must have been a drop in the number of users returning to the app that day. It turns out that particular app had an issue on the 1st of the month, with some users experiencing difficulty with the app (nothing to do with Swrve, I should point out!). That reduced the number of retained users on that day, and in turn the day-30 retention number reported for the cohort 30 days earlier.

Interesting Fact #6: there is a better way

I say better, and really I should probably say “different but more easily interpreted, and potentially more useful”.  Survival Analysis is an alternative way of viewing the retention of users in an app. In survival analysis, we plot, by elapsed user time in app, the percentage of users that are still active in the app.

This is guaranteed, at any given point in time, to be a monotonically decreasing graph: it can flatten out, but it can never rise. It plots, for each elapsed day, the percentage of users still active on that day, independent of the day they started. The graph starts at day-0, the day of install for all users, and counts users who “survive” until the next day, day-1, and the percentage of those that survive to the day after, day-2, etc. You repeat this for all users. Here’s an example:

 

We can see immediately from this, that of all the users profiled, 50% make it to the second day, by day-100 approximately 25% of users are still active, and by day-300, only 6% or so of users are still using the app. As they show a theoretically smooth curve, survival curves are a great way of finding “choke points” in the app, or times when a greater portion of your users are disengaging and dropping off.
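For readers who want to try this on their own data, here is a minimal sketch of one way to build such a curve. It assumes a user counts as still active on day d if their last observed session falls on or after day d; the `last_active_offset` structure (user mapped to the number of days between install and last session) and the function name are illustrative. It also ignores the fact that recently installed users have not yet had the chance to reach later days, which a fuller survival analysis would account for.

```python
def survival_curve(last_active_offset, max_day):
    """Fraction of users whose last observed session is on or after each day 0..max_day."""
    total = len(last_active_offset)
    if total == 0:
        return []
    return [
        sum(1 for offset in last_active_offset.values() if offset >= day) / total
        for day in range(max_day + 1)
    ]

# Hypothetical users: two never come back, one leaves after day 3, one stays to day 10.
last_active_offset = {"a": 0, "b": 0, "c": 3, "d": 10}

curve = survival_curve(last_active_offset, 4)
# curve == [1.0, 0.5, 0.5, 0.5, 0.25]
```

Because a user who has dropped out by day d has also dropped out by every later day, each value is at most the one before it. A sharp step down between two adjacent days is the kind of “choke point” described above.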

By drilling into different cohorts’ survival curves you can see interesting trends emerge. Here’s an example of an app’s survival curves plotted by monthly cohorts (splitting users by the month in which they installed the app). You can clearly see a trend across the months. You can also see an early month with low user numbers, and thus a stepped graph, where there’s a big drop in surviving users after day-6. This might suggest a weekly pattern: if you manage to convince the users to stay for a full week, they see some additional value, and tend to hang around longer.

 

If you’re interested in digging into survival curves for your app, we’d be happy to help! Drop me a line...