Regression Analysis
Linear Regression
axle.ml.LinearRegression makes use of axle.algebra.LinearAlgebra.
See the Wikipedia page on Linear Regression.
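Training is controlled by a learning rate α and an iteration count, both of which appear in the example below. As a reference point (the standard gradient-descent formulation of least squares, not a description of axle's internals), each iteration updates the parameter vector θ as

$$\theta_j \leftarrow \theta_j - \frac{\alpha}{m}\sum_{i=1}^{m}\left(\theta^{\mathsf{T}} x^{(i)} - y^{(i)}\right)x_j^{(i)}$$

where m is the number of training examples.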
Example: Home Prices
case class RealtyListing(size: Double, bedrooms: Int, floors: Int, age: Int, price: Double)
val listings = List(
RealtyListing(2104, 5, 1, 45, 460d),
RealtyListing(1416, 3, 2, 40, 232d),
RealtyListing(1534, 3, 2, 30, 315d),
RealtyListing(852, 2, 1, 36, 178d))
Create a price estimator using linear regression.
import cats.implicits._
import spire.algebra.Rng
import spire.algebra.NRoot
import axle.jblas._
implicit val rngDouble: Rng[Double] = spire.implicits.DoubleAlgebra
implicit val nrootDouble: NRoot[Double] = spire.implicits.DoubleAlgebra
implicit val laJblasDouble = axle.jblas.linearAlgebraDoubleMatrix[Double]
implicit val rngInt: Rng[Int] = spire.implicits.IntAlgebra
import axle.ml.LinearRegression
val priceEstimator = LinearRegression(
listings,
numFeatures = 4,
featureExtractor = (rl: RealtyListing) => (rl.size :: rl.bedrooms.toDouble :: rl.floors.toDouble :: rl.age.toDouble :: Nil),
objectiveExtractor = (rl: RealtyListing) => rl.price,
α = 0.1,
iterations = 100)
Use the estimator to predict the price of a listing.
priceEstimator(RealtyListing(1416, 3, 2, 40, 0d))
// res0: Double = 288.60017635814035
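To get a rough sense of the fit, the estimator can be applied back to the training listings and the predictions compared with the observed prices (an illustrative check, not part of the original example):

listings.map(listing => (listing.price, priceEstimator(listing)))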
Create a Plot of the error during training.
import axle.visualize._
import axle.algebra.Plottable._
val errorPlot = Plot(
() => List(("error" -> priceEstimator.errTree)),
connect = true,
drawKey = true,
colorOf = (label: String) => Color.black,
title = Some("Linear Regression Error"),
xAxis = Some(0d),
xAxisLabel = Some("step"),
yAxis = Some(0),
yAxisLabel = Some("error"))
Create the SVG
import axle.web._
import cats.effect._
errorPlot.svg[IO]("docwork/images/lrerror.svg").unsafeRunSync()
Logistic Regression
WARNING: the implementation is incorrect.
axle.ml.LogisticRegression makes use of axle.algebra.LinearAlgebra.
See the Wikipedia page on Logistic Regression.
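With a single feature (hours studied, below), the logistic model estimates the pass probability as the sigmoid of a linear function of the feature. As a reference point (the standard formulation, not a description of axle's internals):

$$P(\text{pass} \mid x) = \frac{1}{1 + e^{-(\theta_0 + \theta_1 x)}}$$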
Example: Test Pass Probability
Predict Test Pass Probability as a Function of Hours Studied
case class Student(hoursStudied: Double, testPassed: Boolean)
val data = List(
Student(0.50, false),
Student(0.75, false),
Student(1.00, false),
Student(1.25, false),
Student(1.50, false),
Student(1.75, false),
Student(1.75, true),
Student(2.00, false),
Student(2.25, true),
Student(2.50, false),
Student(2.75, true),
Student(3.00, false),
Student(3.25, true),
Student(3.50, false),
Student(4.00, true),
Student(4.25, true),
Student(4.50, true),
Student(4.75, true),
Student(5.00, true),
Student(5.50, true)
)
Create a test pass probability function using logistic regression.
import spire.algebra.Rng
import spire.algebra.NRoot
import axle.jblas._
implicit val rngDouble: Rng[Double] = spire.implicits.DoubleAlgebra
// rngDouble: Rng[Double] = spire.std.DoubleAlgebra@3825d648
implicit val nrootDouble: NRoot[Double] = spire.implicits.DoubleAlgebra
// nrootDouble: NRoot[Double] = spire.std.DoubleAlgebra@3825d648
implicit val laJblasDouble = axle.jblas.linearAlgebraDoubleMatrix[Double]
// laJblasDouble: axle.algebra.LinearAlgebra[org.jblas.DoubleMatrix, Int, Int, Double] = axle.jblas.package$$anon$5@469c0c69
implicit val rngInt: Rng[Int] = spire.implicits.IntAlgebra
// rngInt: Rng[Int] = spire.std.IntAlgebra@5a0d8dc7
import axle.ml.LogisticRegression
val featureExtractor = (s: Student) => (s.hoursStudied :: Nil)
// featureExtractor: Student => List[Double] = <function1>
val objectiveExtractor = (s: Student) => s.testPassed
// objectiveExtractor: Student => Boolean = <function1>
val pTestPass = LogisticRegression(
data,
1,
featureExtractor,
objectiveExtractor,
0.1,
10)
// pTestPass: LogisticRegression[Student, org.jblas.DoubleMatrix] = LogisticRegression(
// examples = List(
// Student(hoursStudied = 0.5, testPassed = false),
// Student(hoursStudied = 0.75, testPassed = false),
// Student(hoursStudied = 1.0, testPassed = false),
// Student(hoursStudied = 1.25, testPassed = false),
// Student(hoursStudied = 1.5, testPassed = false),
// Student(hoursStudied = 1.75, testPassed = false),
// Student(hoursStudied = 1.75, testPassed = true),
// Student(hoursStudied = 2.0, testPassed = false),
// Student(hoursStudied = 2.25, testPassed = true),
// Student(hoursStudied = 2.5, testPassed = false),
// Student(hoursStudied = 2.75, testPassed = true),
// Student(hoursStudied = 3.0, testPassed = false),
// Student(hoursStudied = 3.25, testPassed = true),
// Student(hoursStudied = 3.5, testPassed = false),
// Student(hoursStudied = 4.0, testPassed = true),
// Student(hoursStudied = 4.25, testPassed = true),
// Student(hoursStudied = 4.5, testPassed = true),
// Student(hoursStudied = 4.75, testPassed = true),
// Student(hoursStudied = 5.0, testPassed = true),
// Student(hoursStudied = 5.5, testPassed = true)
// ),
// numFeatures = 1,
// featureExtractor = <function1>,
// objectiveExtractor = <function1>,
// α = 0.1,
// numIterations = 10
// )
Use the estimator to compute the probability of passing given 2 hours of study.
pTestPass(2d :: Nil)
(Note: the implementation is incorrect, so the result is elided until the error is fixed.)
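Since the library implementation is flagged as incorrect, the following self-contained sketch (plain Scala, independent of the axle API; the learning rate and iteration count are illustrative choices) shows how a batch gradient-descent fit of the logistic model to the same data could be computed:

// Minimal logistic regression by batch gradient descent (illustrative sketch)
val xs = data.map(_.hoursStudied)
val ys = data.map(s => if (s.testPassed) 1d else 0d)

def sigmoid(z: Double): Double = 1d / (1d + math.exp(-z))

// Update the intercept t0 and slope t1 by the gradient of the average log-loss
val (t0Fit, t1Fit) =
  (1 to 5000).foldLeft((0d, 0d)) { case ((t0, t1), _) =>
    val errors = xs.zip(ys).map { case (x, y) => (sigmoid(t0 + t1 * x) - y, x) }
    val g0 = errors.map(_._1).sum / xs.size
    val g1 = errors.map { case (e, x) => e * x }.sum / xs.size
    (t0 - 0.5 * g0, t1 - 0.5 * g1)
  }

// Estimated probability of passing after 2 hours of study
sigmoid(t0Fit + t1Fit * 2d)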
Future Work
Fix Logistic Regression