Naive Bayes is an extraordinarily diverse algorithm for categorizing things… it can separate fraudulent from non-fraudulent credit card purchases, spam from legitimate email, or dinosaurs from fictional monsters. It’s most frequent application is in “bag of words” models… statistical models that classify a block of text based on the words that appear within it.
In this post, we’ll explore the basics of Naive Bayes classification, and we’ll illustrate using a relatively common problem – assigning genders to a group of people when all you have is their names. And, just for kicks and giggles, we’re going to do it in SQL. Because nobody trains models in SQL.Continue reading