In a past post on analyzing churn in the subscription or Software as a Service business, I talked about two different ways to quantify the dollar cost of churn. You could use 1 / churn as an estimate of mean customer lifetime (though this simple method makes a lot of assumptions). Or, you could use “pseudo-observations” to calculate the dollar value of certain groups of customers during a particular time period (which doesn’t let you quantify the full lifetime value of a customer).

But what if there was another way? What if we took our Kaplan-Meier best estimate of our churn curve, fit a linear model to *that* curve, and then projected it out?

Well, as it turns out, we’d get a reasonable estimation of our lifetime churn curve, which would let us estimate average customer lifetime, and customer lifetime value. Let’s get started.

## Fitting the basic curve

I’ve blogged about creating Kaplan-Meier estimators of churn curves in the past, so I’m going to assume everybody is up on the details. Suffice it to say, you’re just creating a graph of the percentage of customers that are still subscribed to your service a given number of time periods after signing up. It looks something like this (with a 95% confidence interval included):

The code for fitting one of these curves (for a fictional guitar-tab subscription service called NetLixx) is shown below. (And you can download a csv of the raw data here.) We just use R’s survival package to fit a curve, then extract the mean value. Like so:
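The original code block didn’t survive, so here’s a minimal sketch of the Kaplan-Meier fit. Since the NetLixx csv isn’t available here, the sketch generates synthetic data with the same shape the post describes (one row per customer, days observed, and a churn flag); the real post reads the downloadable csv instead.

```r
library(survival)

# Synthetic stand-in for the NetLixx csv: one row per customer, with the
# number of days we observed them and whether they churned (1) or were
# still subscribed when we pulled the data (0)
set.seed(42)
n <- 2000
time    <- pmin(round(rexp(n, rate = 1 / 1500)), 365)  # observe at most a year
churned <- as.numeric(time < 365)

# Fit the Kaplan-Meier estimator and plot it with a 95% confidence interval
km_fit <- survfit(Surv(time, churned) ~ 1)
plot(km_fit, xlab = 'Days since signup', ylab = 'Fraction of customers surviving')

# The (restricted) mean survival time can be printed alongside the fit
print(km_fit, print.rmean = TRUE)
```

With a mean churn time well past a year, roughly 80% of these synthetic customers are still subscribed at day 365, which matches the shape of the curve described below.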

If you’re confused about how all this works, do be sure to read the earlier post. Otherwise, keep moving on for the meat of this post!

## Modeling it

OK, so we have a survival curve that looks to be almost exactly a year long. But a good 80% of customers are still with the company. How can we know what mean customer lifetime is if most of the customers haven’t even churned yet? We’ll create a model of churn, then project it out into the future!

We’re just going to create a basic linear model, with one little complication: we’re going to model the *logged* value of our survival curve. There are a couple of good reasons for this:

- The value of our survival curve will never go below 0. You can’t have -1% of your customers remaining. If we can help it, we don’t want to project impossible values (even at the extremes).
- It makes sense that the distribution of survival times will have a positive skew. The shortest a customer can survive is 0 days. But, even if our service is bad, there’s gonna be somebody out there who won’t give up on us until somebody pries our service out of their cold, dead fingers.

This is actually a ridiculously simple process. We create an X variable (which is just a representation of time), fit a model, and then build an equation using the coefficients from our fitted linear model. We’ll also plot the results. Like this…
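Here’s a sketch of that process. The code block from the original post is missing, so this version regenerates the synthetic data from the sketch above (the real post would use the fitted NetLixx curve) and then fits the log-linear model and projects it out:

```r
library(survival)

# Synthetic stand-in for the NetLixx data (see the earlier sketch)
set.seed(42)
time    <- pmin(round(rexp(2000, rate = 1 / 1500)), 365)
churned <- as.numeric(time < 365)
km_fit  <- survfit(Surv(time, churned) ~ 1)

# X is just time; Y is the *logged* Kaplan-Meier survival estimate
x <- km_fit$time
y <- log(km_fit$surv)
model <- lm(y ~ x)

# Build the projection from the fitted coefficients, extended 1000 days
# past the end of the observed data, and exponentiate back to percentages
days      <- 0:(max(x) + 1000)
projected <- exp(predict(model, data.frame(x = days)))

plot(days, projected, type = 'l', ylim = c(0, 1),
     xlab = 'Days since signup', ylab = 'Projected fraction surviving')
lines(km_fit$time, km_fit$surv, col = 'red')  # observed KM curve for comparison
```

Because we modeled log survival with a straight line, exponentiating the predictions gives a smooth exponential decay that hugs the observed curve and never dips below zero.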

Our projection leads to a graph that looks something like this. (This projection is for an extra 1000 days, but you could go as far as you wanted.) Not too shabby!

Of course, the real magic here is the integration step. If we integrate our projected survival curve from day 0 to day Infinity, we get our mean customer lifetime! In this case, the answer comes out to 1,391 days, or around 45 months. If we multiply this by monthly revenue, we get a projected calculation of customer lifetime value.
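The integration is easy to sketch, since the projected curve is just S(t) = exp(a + b·t). The coefficients and the $20/month price below are hypothetical stand-ins for illustration (in practice you’d pull a and b from `coef(model)`, so the result won’t match the 1,391 days from the real NetLixx data exactly):

```r
# Hypothetical fitted coefficients; in practice, a <- coef(model)[[1]]
# and b <- coef(model)[[2]] from the log-linear fit
a <- -0.01
b <- -1 / 1400

# Projected survival curve S(t) = exp(a + b * t)
surv_fun <- function(t) exp(a + b * t)

# Mean customer lifetime = integral of S(t) from 0 to infinity,
# which for this functional form is exp(a) / -b in closed form
mean_lifetime <- integrate(surv_fun, 0, Inf)$value

# Multiply by monthly revenue for lifetime value (hypothetical $20/month)
ltv <- (mean_lifetime / 30.4) * 20
```

The closed-form check is handy: because the model is a simple exponential decay, the numeric integral and exp(a) / -b should agree almost exactly.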

## Conclusion

Of course, this methodology makes a lot of assumptions. (Like, really. A *lot* of assumptions. Projecting beyond existing data is always dangerous territory.) But, in a situation where you know your average customer lifetime is longer than your oldest customer has been around, you’re not going to find a methodology that *doesn’t* make a lot of assumptions. Your best bet is to document your assumptions, give everything a gut-check, and go from there.

Let me know if you have any thoughts or suggested improvements!

**flyunicorn** (October 11, 2015 / 9:32 am):

Dear Dayne,

I liked your series of posts on survival analysis. I have a few questions about the mean customer lifetime predicted in this post versus the one calculated with “rmeans” in a previous post (the one about RMST).

1. What is the difference between these two predicted average survival times? Which one is more accurate?

2. When using Cox regression, we can see multiple variables’ effects on churn rate. But there seems to be no predict function associated with Cox. Do you know ways to predict churn rate after building a Cox model? It would be great if you could write code and comments in the reply. Thanks!

**daynebatten** (October 12, 2015 / 1:40 pm):

Good questions.

1. The difference between the predicted survival times in this post and the ones from the RMST post is that this methodology allows you to project total mean survival time for your cohort, while the other is truncated (hence the name “restricted”). RMST gives you the mean survival time within the first year, for example, so it could never be more than 365 days. This methodology will take 365 days of data and project it out until your entire cohort has churned / is dead / whatever. As for which is more accurate, RMST is definitely more accurate, but it doesn’t tell you anything about the future. This methodology is potentially less accurate (it’s a model of a model, essentially, and each is prone to error), but it tells you something more about the future state of things. It’s a tradeoff.

2. You can predict from the results of a Cox model using the traditional survfit function. Documentation, including examples, is here: https://stat.ethz.ch/R-manual/R-devel/library/survival/html/survfit.coxph.html

**Frank Sauvage** (October 26, 2015 / 5:22 pm):

Thank you for this additional paper about projection of future churn!