Classification
Naive Bayes
Naïve Bayes
Example: Tennis and Weather
case class Tennis(outlook: String, temperature: String, humidity: String, wind: String, play: Boolean)
val events = List(
Tennis("Sunny", "Hot", "High", "Weak", false),
Tennis("Sunny", "Hot", "High", "Strong", false),
Tennis("Overcast", "Hot", "High", "Weak", true),
Tennis("Rain", "Mild", "High", "Weak", true),
Tennis("Rain", "Cool", "Normal", "Weak", true),
Tennis("Rain", "Cool", "Normal", "Strong", false),
Tennis("Overcast", "Cool", "Normal", "Strong", true),
Tennis("Sunny", "Mild", "High", "Weak", false),
Tennis("Sunny", "Cool", "Normal", "Weak", true),
Tennis("Rain", "Mild", "Normal", "Weak", true),
Tennis("Sunny", "Mild", "Normal", "Strong", true),
Tennis("Overcast", "Mild", "High", "Strong", true),
Tennis("Overcast", "Hot", "Normal", "Weak", true),
Tennis("Rain", "Mild", "High", "Strong", false))
Build a classifier to predict the Boolean feature 'play' given all the other features of the observations
import cats.implicits._
import spire.math._
import axle._
import axle.probability._
import axle.ml.NaiveBayesClassifier
val classifier = NaiveBayesClassifier[Tennis, String, Boolean, List, Rational](
events,
List(
(Variable[String]("Outlook") -> Vector("Sunny", "Overcast", "Rain")),
(Variable[String]("Temperature") -> Vector("Hot", "Mild", "Cool")),
(Variable[String]("Humidity") -> Vector("High", "Normal", "Low")),
(Variable[String]("Wind") -> Vector("Weak", "Strong"))),
(Variable[Boolean]("Play") -> Vector(true, false)),
(t: Tennis) => t.outlook :: t.temperature :: t.humidity :: t.wind :: Nil,
(t: Tennis) => t.play)
Use the classifier to predict:
events map { datum => datum.toString + "\t" + classifier(datum) } mkString("\n")
// res1: String = """Tennis(Sunny,Hot,High,Weak,false) false
// Tennis(Sunny,Hot,High,Strong,false) false
// Tennis(Overcast,Hot,High,Weak,true) true
// Tennis(Rain,Mild,High,Weak,true) true
// Tennis(Rain,Cool,Normal,Weak,true) true
// Tennis(Rain,Cool,Normal,Strong,false) true
// Tennis(Overcast,Cool,Normal,Strong,true) true
// Tennis(Sunny,Mild,High,Weak,false) false
// Tennis(Sunny,Cool,Normal,Weak,true) true
// Tennis(Rain,Mild,Normal,Weak,true) true
// Tennis(Sunny,Mild,Normal,Strong,true) true
// Tennis(Overcast,Mild,High,Strong,true) true
// Tennis(Overcast,Hot,Normal,Weak,true) true
// Tennis(Rain,Mild,High,Strong,false) false"""
Measure the classifier's performance
import axle.ml.ClassifierPerformance
ClassifierPerformance[Rational, Tennis, List](events, classifier, _.play).show
// res2: String = """Precision 9/10
// Recall 1
// Specificity 4/5
// Accuracy 13/14
// F1 Score 18/19
// """
See Precision and Recall for the definition of the performance metrics.