My company has a subscription-based business model, which means we spend a lot of time analyzing customer churn. We wanted to include Kaplan-Meier survival curves in some of our executive dashboards, but neither our database (Redshift) nor any of our commonly used dashboarding tools (Tableau, Periscope, etc.) provided the necessary functionality. We could, of course, have pulled data out of the warehouse, analyzed it in R or Python, and pushed it back up, but that's pretty complicated. So we went looking for a better solution.
Based on all available evidence, the survival curve for aliens singing disco music hits 0 at about 43 seconds (N = 1).
As you likely guessed from the title of this post, that better solution involved writing our own code for calculating a Kaplan-Meier estimator in SQL. In this post, I'll be walking through our strategy step-by-step, including the SQL code for calculating the estimators and making that code reusable in Periscope. Let's do this!
Still migrating old posts due to travel. Next post will be fresh content!
In a previous post, we learned that if you want to maximize your score on any individual turn of a game of "Pass the Pigs," you should always roll when there's less than 22.5 points in your hand, and hold when there's more than 22.5 points in your hand. (If you've never heard of "Pass the Pigs," the rules are explained in the prior post.)
However, we also concluded that that's not an effective strategy for winning the game as a whole. If you have a score of 0 and your opponent has a score of 99, for example, it would be really silly to stop rolling at 23 points just because the "22.5 rule" says to. So what's a person to do? How do you play effectively? Today, we'll generate a strategy that can help you make an optimal move in any situation. (Hint: you'll need to do a lot of math.)