Naive Bayes

Example: Tennis and Weather

case class Tennis(outlook: String, temperature: String, humidity: String, wind: String, play: Boolean)

val events = List(
  Tennis("Sunny", "Hot", "High", "Weak", false),
  Tennis("Sunny", "Hot", "High", "Strong", false),
  Tennis("Overcast", "Hot", "High", "Weak", true),
  Tennis("Rain", "Mild", "High", "Weak", true),
  Tennis("Rain", "Cool", "Normal", "Weak", true),
  Tennis("Rain", "Cool", "Normal", "Strong", false),
  Tennis("Overcast", "Cool", "Normal", "Strong", true),
  Tennis("Sunny", "Mild", "High", "Weak", false),
  Tennis("Sunny", "Cool", "Normal", "Weak", true),
  Tennis("Rain", "Mild", "Normal", "Weak", true),
  Tennis("Sunny", "Mild", "Normal", "Strong", true),
  Tennis("Overcast", "Mild", "High", "Strong", true),
  Tennis("Overcast", "Hot", "Normal", "Weak", true),
  Tennis("Rain", "Mild", "High", "Strong", false))

Build a classifier that predicts the Boolean feature 'play' from the other four features of each observation:

import cats.implicits._
import spire.math._
import axle._
import axle.probability._
import axle.ml.NaiveBayesClassifier

val classifier = NaiveBayesClassifier[Tennis, String, Boolean, List, Rational](
  events, // training data
  List( // feature variables and their possible values
    (Variable[String]("Outlook") -> Vector("Sunny", "Overcast", "Rain")),
    (Variable[String]("Temperature") -> Vector("Hot", "Mild", "Cool")),
    (Variable[String]("Humidity") -> Vector("High", "Normal", "Low")),
    (Variable[String]("Wind") -> Vector("Weak", "Strong"))),
  (Variable[Boolean]("Play") -> Vector(true, false)), // class variable
  (t: Tennis) => t.outlook :: t.temperature :: t.humidity :: t.wind :: Nil, // feature extractor
  (t: Tennis) => t.play) // class extractor
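
As a rough illustration of what a naive Bayes classifier computes, the sketch below scores each class as the class prior multiplied by the per-feature conditional frequencies, then picks the higher score. It is a dependency-free sketch and not axle's implementation: `Obs`, `data`, `score`, and `predict` are hypothetical names, and no smoothing is applied.

```scala
// Dependency-free sketch of naive Bayes by counting. Not axle's implementation;
// Obs, data, score, and predict are hypothetical names, and no smoothing is applied.
final case class Obs(features: Vector[String], label: Boolean)

// The same 14 observations as `events`, as (outlook, temperature, humidity, wind, play)
val data: List[Obs] = List(
  ("Sunny", "Hot", "High", "Weak", false),
  ("Sunny", "Hot", "High", "Strong", false),
  ("Overcast", "Hot", "High", "Weak", true),
  ("Rain", "Mild", "High", "Weak", true),
  ("Rain", "Cool", "Normal", "Weak", true),
  ("Rain", "Cool", "Normal", "Strong", false),
  ("Overcast", "Cool", "Normal", "Strong", true),
  ("Sunny", "Mild", "High", "Weak", false),
  ("Sunny", "Cool", "Normal", "Weak", true),
  ("Rain", "Mild", "Normal", "Weak", true),
  ("Sunny", "Mild", "Normal", "Strong", true),
  ("Overcast", "Mild", "High", "Strong", true),
  ("Overcast", "Hot", "Normal", "Weak", true),
  ("Rain", "Mild", "High", "Strong", false)
).map { case (o, t, h, w, p) => Obs(Vector(o, t, h, w), p) }

// Unnormalized posterior: P(label) * product over i of P(feature_i = value_i | label).
// Assumes at least one observation carries `label`; otherwise the result is NaN.
def score(data: List[Obs], query: Vector[String], label: Boolean): Double = {
  val inClass = data.filter(_.label == label)
  val prior = inClass.size.toDouble / data.size
  query.zipWithIndex.foldLeft(prior) { case (acc, (value, i)) =>
    acc * inClass.count(_.features(i) == value).toDouble / inClass.size
  }
}

// Predict `true` when its score is at least as high as the score for `false`
def predict(data: List[Obs], query: Vector[String]): Boolean =
  score(data, query, true) >= score(data, query, false)
```

With these rows, `predict(data, Vector("Rain", "Cool", "Normal", "Strong"))` evaluates to `true` even though that day's actual label is `false`: the independence assumption lets the three favourable features outweigh the strong wind.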

Use the classifier to predict:

events map { datum => datum.toString + "\t" + classifier(datum) } mkString("\n")
// res1: String = """Tennis(Sunny,Hot,High,Weak,false)	false
// Tennis(Sunny,Hot,High,Strong,false)	false
// Tennis(Overcast,Hot,High,Weak,true)	true
// Tennis(Rain,Mild,High,Weak,true)	true
// Tennis(Rain,Cool,Normal,Weak,true)	true
// Tennis(Rain,Cool,Normal,Strong,false)	true
// Tennis(Overcast,Cool,Normal,Strong,true)	true
// Tennis(Sunny,Mild,High,Weak,false)	false
// Tennis(Sunny,Cool,Normal,Weak,true)	true
// Tennis(Rain,Mild,Normal,Weak,true)	true
// Tennis(Sunny,Mild,Normal,Strong,true)	true
// Tennis(Overcast,Mild,High,Strong,true)	true
// Tennis(Overcast,Hot,Normal,Weak,true)	true
// Tennis(Rain,Mild,High,Strong,false)	false"""

Measure the classifier's performance:

import axle.ml.ClassifierPerformance

ClassifierPerformance[Rational, Tennis, List](events, classifier, _.play).show
// res2: String = """Precision   9/10
// Recall      1
// Specificity 4/5
// Accuracy    13/14
// F1 Score    18/19
// """

See Precision and Recall for the definition of the performance metrics.
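
The figures above can be checked by hand from the prediction listing. The sketch below assumes `true` is the positive class and uses confusion counts read off that listing (13 correct predictions plus one false positive); `frac` is a hypothetical helper that prints a fraction in lowest terms, standing in for an exact rational type.

```scala
// Confusion counts read off the prediction listing (assumption: `true` is the positive class)
val tp = 9 // played, predicted play
val fp = 1 // did not play, predicted play: Tennis(Rain,Cool,Normal,Strong,false)
val fn = 0 // played, predicted no play
val tn = 4 // did not play, predicted no play

// Hypothetical helper: render n/d in lowest terms, dropping a denominator of 1
def frac(n: Int, d: Int): String = {
  val g = BigInt(n).gcd(BigInt(d)).toInt
  if (d / g == 1) s"${n / g}" else s"${n / g}/${d / g}"
}

val precision   = frac(tp, tp + fp)                // TP / (TP + FP)
val recall      = frac(tp, tp + fn)                // TP / (TP + FN)
val specificity = frac(tn, tn + fp)                // TN / (TN + FP)
val accuracy    = frac(tp + tn, tp + fp + fn + tn) // (TP + TN) / total
val f1          = frac(2 * tp, 2 * tp + fp + fn)   // 2TP / (2TP + FP + FN)

println(s"Precision   $precision")
println(s"Recall      $recall")
println(s"Specificity $specificity")
println(s"Accuracy    $accuracy")
println(s"F1 Score    $f1")
```

The printed values are 9/10, 1, 4/5, 13/14, and 18/19, which agree with the reported results.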