P-hacking and false positives are rampant in the world of academic publishing, where there’s peer review. I shudder to think how much more of it goes on in the world of web A/B testing…

The other two are definitely matters of personal preference. I choose to write it that way to make sure it’s explicit what’s going on. I do a lot of things like this… for example, if I want to do a Cartesian product join, I’ll join on `1 = 1`. It’s unnecessary, sure, but it tells me “yes, I really did want to do a Cartesian product join here.”
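For anyone unfamiliar with the idiom, here is a minimal sketch of it. It runs against Python's built-in `sqlite3` rather than Redshift, purely so it is self-contained, and the `colors` and `sizes` tables are hypothetical. It checks that joining on `1 = 1` returns the same rows as an explicit `CROSS JOIN`.

```python
import sqlite3

# In-memory SQLite database; the tables and rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE colors (color TEXT);
    CREATE TABLE sizes (size TEXT);
    INSERT INTO colors VALUES ('red'), ('blue');
    INSERT INTO sizes VALUES ('S'), ('M'), ('L');
""")

# Join condition that is always true: every row pairs with every row.
explicit = conn.execute(
    "SELECT color, size FROM colors JOIN sizes ON 1 = 1 ORDER BY color, size"
).fetchall()

# The same result written with the dedicated keyword.
cross = conn.execute(
    "SELECT color, size FROM colors CROSS JOIN sizes ORDER BY color, size"
).fetchall()

print(len(explicit), explicit == cross)
```

Both queries pair every color with every size; which spelling reads as more deliberate is, again, a matter of taste.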

• instead of `cast(XX as float)` you could write `XX::float`

• also, there is no need to cast both numerators and denominators as float: Redshift will execute the computation using the most precise type of the two numbers – e.g. if you divide an integer by a float or conversely, Redshift will express the output as a float

• instead of using the function `dateadd` you can simply add an integer to a date – e.g. `'2017-06-20'::date + 7` evaluates to `'2017-06-27'`

Not sure if you saw this post by Chris Stucchio on why bandits are most likely not well-suited for A/B testing: https://www.chrisstucchio.com/blog/2015/dont_use_bandits.html Basically, you can’t apply them without breaking some of their fundamental assumptions.

The same is true for classical tests of statistical significance, where one needs to have their sample size fixed in advance. With all the pressure you get from stakeholders to deliver results early and stop losers and promote winners, it’s near-impossible to apply in practice, even with an informed audience.
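To make the fixed-sample-size point concrete, here is a rough simulation sketch (my own illustration, not taken from the linked post). It runs A/A tests, where both arms share the same true conversion rate, and compares how often a two-proportion z-test "finds" a difference when you check after every batch and stop at the first significant result, versus testing once at the planned sample size. All parameter values and function names are arbitrary assumptions.

```python
import math
import random


def z_test_p_value(conv_a, conv_b, n):
    """Two-sided p-value of a pooled two-proportion z-test, n users per arm."""
    pooled = (conv_a + conv_b) / (2 * n)
    if pooled in (0.0, 1.0):
        return 1.0  # no variation observed, so no evidence of a difference
    se = math.sqrt(2 * pooled * (1 - pooled) / n)
    z = (conv_a - conv_b) / n / se
    return math.erfc(abs(z) / math.sqrt(2))


def simulate(n_tests=500, n_per_arm=1000, look_every=50,
             rate=0.1, alpha=0.05, seed=0):
    """Return (false positive rate with peeking, false positive rate without)."""
    rng = random.Random(seed)
    peeking_fp = fixed_fp = 0
    for _ in range(n_tests):
        conv_a = conv_b = 0
        stopped_early = False
        for i in range(1, n_per_arm + 1):
            # Both arms convert at the same rate: any "winner" is a false positive.
            conv_a += rng.random() < rate
            conv_b += rng.random() < rate
            # Peek at regular intervals and remember the first significant look.
            if i % look_every == 0 and not stopped_early:
                if z_test_p_value(conv_a, conv_b, i) < alpha:
                    stopped_early = True
        peeking_fp += stopped_early
        # Single test at the pre-planned sample size.
        fixed_fp += z_test_p_value(conv_a, conv_b, n_per_arm) < alpha
    return peeking_fp / n_tests, fixed_fp / n_tests


peeking_rate, fixed_rate = simulate()
print(f"with peeking: {peeking_rate:.3f}, fixed sample: {fixed_rate:.3f}")
```

With these settings the single test should land near the nominal 5% level, while the stop-at-first-significance rule typically comes out several times higher.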

If you are still looking for efficient ways to run A/B tests, I’d suggest giving AGILE A/B Testing a try. It is a new statistical approach to A/B testing that borrows from medical trial design.

If I use WTTE-RNN for predicting, and use the R survival package to build the probability chart as in the article above, I am afraid that the results may not be consistent with each other and my users/audience will get confused. Can I use the R survival package to predict instead? That is, predict from the survfit model?

Or would you suggest switching to WTTE-RNN from the get-go?

http://daynebatten.com/2017/02/recurrent-neural-networks-churn/

I don’t have any experience with web scraping in Python. I’ve used the Requests library, which is great for making HTTP requests, but I suspect Python has some really good packages for doing a lot of the DOM parsing for you and extracting information really easily.

Best of luck, and sorry I can’t be of further assistance!

Nice writing. I understand that this survival analysis can provide high-level information about the probability of surviving over time. What I am trying to do is predict from the model for a specific customer: what is that customer’s probability of surviving, how long would this person stay, and with what probability? Is this something that can be done?

Thanks in advance.
