Statistics
Pythagorean Means
The arithmetic, geometric, and harmonic means are collectively known as the Pythagorean means.
See the Wikipedia page on Pythagorean Means for more.
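For reference, the standard textbook definitions for positive real numbers x_1, ..., x_n are written out below. The symbols A, G, and H are notation chosen here, not names used by Axle.

```latex
A = \frac{1}{n} \sum_{i=1}^{n} x_i
\qquad
G = \left( \prod_{i=1}^{n} x_i \right)^{1/n}
\qquad
H = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}}
\qquad
\text{with } H \le G \le A .
```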
Arithmetic, Geometric, and Harmonic Mean Examples
Imports
import cats.implicits._
import spire.math.Real
import spire.algebra.Field
import spire.algebra.NRoot
import axle.math._
implicit val fieldDouble: Field[Double] = spire.implicits.DoubleAlgebra
implicit val nrootDouble: NRoot[Double] = spire.implicits.DoubleAlgebra
Examples
Arithmetic mean
arithmeticMean(List(2d, 3d, 4d, 5d))
// res0: Double = 3.5
Geometric mean
geometricMean[Real, List](List(1d, 5d, 25d))
// res1: Real = Inexact(
// f = spire.math.Real$$Lambda$9061/0x00000008029f21f8@7c897b00
// )
Harmonic mean
harmonicMean(List(2d, 3d, 4d, 5d))
// res2: Double = 3.116883116883117
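As a cross-check on the three results above, the following is a minimal sketch in plain Scala (no Axle or Spire). The object and method names are hypothetical and are not part of Axle's API, and the sketch assumes a non-empty list of strictly positive numbers. It also makes the geometric-mean result easier to read: the Real printed above represents the cube root of 1 · 5 · 25 = 125, which is 5.

```scala
// Hypothetical helper, not Axle's API: Pythagorean means on plain Doubles.
object PythagoreanMeansSketch {

  // Sum divided by count.
  def arithmetic(xs: Seq[Double]): Double =
    xs.sum / xs.size

  // n-th root of the product (may overflow for long lists; fine for a sketch).
  def geometric(xs: Seq[Double]): Double =
    math.pow(xs.product, 1.0 / xs.size)

  // Count divided by the sum of reciprocals.
  def harmonic(xs: Seq[Double]): Double =
    xs.size / xs.map(1.0 / _).sum
}
```

For List(2d, 3d, 4d, 5d) the harmonic mean is 4 / (77/60) = 240/77, which agrees with the value printed above.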
Generalized Mean
See the Wikipedia page on Generalized Mean.
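For orientation, the usual definition of the generalized (power) mean with exponent p, for positive real numbers x_1, ..., x_n and p not equal to 0, is given below. The symbol M_p is notation chosen here.

```latex
M_p(x_1, \dots, x_n) = \left( \frac{1}{n} \sum_{i=1}^{n} x_i^{\,p} \right)^{1/p},
\qquad
\lim_{p \to 0} M_p(x_1, \dots, x_n) = \left( \prod_{i=1}^{n} x_i \right)^{1/n} .
```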
When the parameter p is 1, it is the arithmetic mean.
generalizedMean[Double, List](1d, List(2d, 3d, 4d, 5d))
// res3: Double = 3.5
As p approaches 0, it is the geometric mean.
generalizedMean[Double, List](0.0001, List(1d, 5d, 25d))
// res4: Double = 5.000431733701651
When p is -1, it is the harmonic mean.
generalizedMean[Double, List](-1d, List(2d, 3d, 4d, 5d))
// res5: Double = 3.116883116883117
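The three cases above can be reproduced with a few lines of plain Scala. This is a sketch under stated assumptions, not Axle's implementation: powerMean is a hypothetical name, p must be non-zero, and the inputs must be strictly positive.

```scala
// Hypothetical helper, not Axle's API: power mean on plain Doubles.
object GeneralizedMeanSketch {

  // ((x1^p + ... + xn^p) / n)^(1/p); assumes p != 0 and positive inputs.
  def powerMean(p: Double, xs: Seq[Double]): Double =
    math.pow(xs.map(x => math.pow(x, p)).sum / xs.size, 1.0 / p)
}
```

Calling powerMean with p = 1 gives the arithmetic mean, p = -1 gives the harmonic mean, and a small positive p such as 0.0001 gives a value close to the geometric mean.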
Moving means
import spire.math._
Moving arithmetic mean
movingArithmeticMean[List, Int, Double](
  (1 to 100).toList.map(_.toDouble),
  5)
// res6: List[Double] = List(
// 3.0,
// 4.0,
// 5.0,
// 6.0,
// 7.0,
// 8.0,
// 9.0,
// 10.0,
// 11.0,
// 12.0,
// 13.0,
// 14.0,
// 15.0,
// ...
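The output above is the arithmetic mean of each run of 5 consecutive values. A minimal plain-Scala sketch of the same idea follows; movingMean is a hypothetical name rather than Axle's function, and it recomputes every window from scratch instead of updating a running sum.

```scala
// Hypothetical helper, not Axle's API: moving arithmetic mean via sliding windows.
object MovingMeanSketch {

  // Assumes xs has at least `window` elements; a shorter non-empty list
  // would yield a single partial window.
  def movingMean(xs: List[Double], window: Int): List[Double] =
    xs.sliding(window).map(w => w.sum / w.size).toList
}
```

For the values 1 to 100 with a window of 5 this yields 96 means, starting at 3.0 (the mean of 1 to 5) and ending at 98.0 (the mean of 96 to 100).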
Moving geometric mean
movingGeometricMean[List, Int, Real](
  List(1d, 5d, 25d, 125d, 625d),
  3)
// res7: List[Real] = List(
// Inexact(f = spire.math.Real$$Lambda$9061/0x00000008029f21f8@39749105),
// Inexact(f = spire.math.Real$$Lambda$8953/0x00000008029b6580@5ca9e6bb),
// Inexact(f = spire.math.Real$$Lambda$8953/0x00000008029b6580@7874dc8)
// )
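Because the printed Real values are opaque, it may help to work out by hand what the three windows of size 3 evaluate to:

```latex
\sqrt[3]{1 \cdot 5 \cdot 25} = \sqrt[3]{125} = 5,
\qquad
\sqrt[3]{5 \cdot 25 \cdot 125} = \sqrt[3]{15625} = 25,
\qquad
\sqrt[3]{25 \cdot 125 \cdot 625} = \sqrt[3]{1953125} = 125 .
```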
Moving harmonic mean
movingHarmonicMean[List, Int, Real](
  (1 to 5).toList.map(v => Real(v)),
  3)
// res8: List[Real] = List(
// Exact(n = 18/11),
// Exact(n = 36/13),
// Exact(n = 180/47)
// )
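The exact rationals above can be verified by hand, one window of size 3 at a time:

```latex
\frac{3}{\frac{1}{1} + \frac{1}{2} + \frac{1}{3}} = \frac{3}{11/6} = \frac{18}{11},
\qquad
\frac{3}{\frac{1}{2} + \frac{1}{3} + \frac{1}{4}} = \frac{3}{13/12} = \frac{36}{13},
\qquad
\frac{3}{\frac{1}{3} + \frac{1}{4} + \frac{1}{5}} = \frac{3}{47/60} = \frac{180}{47} .
```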
Mean Average Precision at K
See the page on Mean Average Precision at Kaggle.
import spire.math.Rational
import axle.ml.RankedClassifierPerformance._
Examples (from benhamner/Metrics)
meanAveragePrecisionAtK[Int, Rational](List(1 until 5), List(1 until 5), 3)
// res10: Rational = 1
meanAveragePrecisionAtK[Int, Rational](List(List(1, 3, 4), List(1, 2, 4), List(1, 3)), List(1 until 6, 1 until 6, 1 until 6), 3)
// res11: Rational = 37/54
meanAveragePrecisionAtK[Int, Rational](List(1 until 6, 1 until 6), List(List(6, 4, 7, 1, 2), List(1, 1, 1, 1, 1)), 5)
// res12: Rational = 13/50
meanAveragePrecisionAtK[Int, Rational](List(List(1, 3), List(1, 2, 3), List(1, 2, 3)), List(1 until 6, List(1, 1, 1), List(1, 2, 1)), 3)
// res13: Rational = 11/18
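The sketch below is one plain-Scala reading of average precision at K that reproduces the four results above. It is an illustration under assumptions, not Axle's implementation: the object and method names are hypothetical, it works on Double rather than Rational, and it takes the first argument to be the actual (relevant) items and the second to be the ranked predictions. A prediction counts as a hit only the first time it appears, and the score is divided by min(number of actual items, k).

```scala
// Hypothetical helper, not Axle's API: mean average precision at K on Ints.
object MapAtKSketch {

  // Average precision over the top-k predictions for one query.
  def averagePrecisionAtK(actual: Seq[Int], predicted: Seq[Int], k: Int): Double =
    if (actual.isEmpty) 0.0
    else {
      val topK = predicted.take(k)
      val (score, _) = topK.zipWithIndex.foldLeft((0.0, 0)) {
        case ((sum, hits), (p, i)) =>
          // A hit must be relevant and must not repeat an earlier prediction.
          val isNewHit = actual.contains(p) && !topK.take(i).contains(p)
          if (isNewHit) (sum + (hits + 1).toDouble / (i + 1), hits + 1)
          else (sum, hits)
      }
      score / math.min(actual.size, k)
    }

  // Mean of the per-query average precisions; assumes at least one query.
  def mapAtK(actual: Seq[Seq[Int]], predicted: Seq[Seq[Int]], k: Int): Double = {
    val scores = actual.zip(predicted).map { case (a, p) => averagePrecisionAtK(a, p, k) }
    scores.sum / scores.size
  }
}
```

In the second example the per-query scores are 5/9, 2/3, and 5/6, whose mean is 37/54.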
Uniform Distribution
Imports and implicits (for all sections below)
import cats.implicits._
import spire.algebra._
import axle.probability._
implicit val fieldDouble: Field[Double] = spire.implicits.DoubleAlgebra
Example
val X = uniformDistribution(List(2d, 4d, 4d, 4d, 5d, 5d, 7d, 9d))
// X: ConditionalProbabilityTable[Double, spire.math.Rational] = ConditionalProbabilityTable(
// p = HashMap(5.0 -> 1/4, 9.0 -> 1/8, 2.0 -> 1/8, 7.0 -> 1/8, 4.0 -> 3/8)
// )
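As the output shows, each distinct value receives probability proportional to how often it occurs in the input list (4.0 occurs 3 times out of 8, hence 3/8). A minimal plain-Scala sketch of that counting step follows; empiricalDistribution is a hypothetical name, and the sketch returns a plain Map of Doubles rather than Axle's ConditionalProbabilityTable.

```scala
// Hypothetical helper, not Axle's API: relative frequency of each distinct value.
object UniformSketch {

  // Assumes a non-empty list; each value maps to count / total.
  def empiricalDistribution[A](xs: List[A]): Map[A, Double] =
    xs.groupBy(identity).map { case (value, group) =>
      value -> group.size.toDouble / xs.size
    }
}
```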
Standard Deviation
Example
import axle.stats._
implicit val nrootDouble: NRoot[Double] = spire.implicits.DoubleAlgebra
standardDeviation(X)
// res15: Double = 2.0
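The value 2.0 can be checked by hand for the eight values 2, 4, 4, 4, 5, 5, 7, 9. The calculation below assumes the population form of the standard deviation (dividing by n rather than n - 1), which is the form consistent with the printed result.

```latex
\mu = \frac{2 + 4 + 4 + 4 + 5 + 5 + 7 + 9}{8} = \frac{40}{8} = 5,
\qquad
\sigma^2 = \frac{9 + 1 + 1 + 1 + 0 + 0 + 4 + 16}{8} = \frac{32}{8} = 4,
\qquad
\sigma = \sqrt{4} = 2 .
```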
See also Probability Model
Root-mean-square deviation
See the Wikipedia page on Root-mean-square deviation.
import cats.implicits._
import spire.algebra.Field
import spire.algebra.NRoot
import axle.stats._
implicit val fieldDouble: Field[Double] = spire.implicits.DoubleAlgebra
implicit val nrootDouble: NRoot[Double] = spire.implicits.DoubleAlgebra
Given four numbers and an estimator function, compute the RMSD:
val data = List(1d, 2d, 3d, 4d)
def estimator(x: Double): Double =
  x + 0.2
rootMeanSquareDeviation[List, Double](data, estimator)
// res17: Double = 0.4000000000000002
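A hand calculation helps interpret this number. Every estimate here is off by exactly 0.2, so the textbook root-mean-square deviation, which divides the summed squares by n = 4 before taking the root, would be 0.2. The printed 0.4 coincides with the square root of the summed squared deviations without that division. This reading is an inference from the arithmetic, not something the source states about the implementation.

```latex
\sqrt{\frac{1}{4} \sum_{i=1}^{4} (0.2)^2} = \sqrt{0.04} = 0.2,
\qquad
\sqrt{\sum_{i=1}^{4} (0.2)^2} = \sqrt{0.16} = 0.4 .
```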
Reservoir Sampling
Reservoir sampling draws a uniform random sample of k items from a stream of unknown length in a single pass. It is also the answer to a common interview question.
import spire.random.Generator.rng
import spire.algebra.Field
implicit val fieldDouble: Field[Double] = spire.implicits.DoubleAlgebra
import axle.stats._
Demonstrate uniform sampling of 15 values from roughly the first 100 integers (the state selected by drop(100).head has already seen 101, as the output shows):
val sample = reservoirSampleK(15, LazyList.from(1), rng).drop(100).head
// sample: List[Int] = List(
// 101,
// 88,
// 85,
// 77,
// 67,
// 61,
// 59,
// 56,
// 55,
// 51,
// 45,
// 40,
// 39,
// 23,
// 4
// )
The mean of the sample should be in the ballpark of the mean of the entire list (50.5):
import axle.math.arithmeticMean
arithmeticMean(sample.map(_.toDouble))
// res19: Double = 56.733333333333334
Indeed it is.
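For readers unfamiliar with the technique, here is a minimal sketch of the classic single-pass scheme (often called Algorithm R) in plain Scala. It is not Axle's reservoirSampleK: the names are hypothetical, it uses scala.util.Random instead of a Spire generator, and it returns only the final reservoir rather than a lazy sequence of intermediate states.

```scala
import scala.util.Random

// Hypothetical helper, not Axle's API: single-pass reservoir sampling.
object ReservoirSketch {

  // Keep the first k items; afterwards the item at zero-based index i
  // replaces a uniformly chosen slot with probability k / (i + 1).
  def reservoirSample[A](k: Int, items: Iterator[A], rng: Random): Vector[A] =
    items.zipWithIndex.foldLeft(Vector.empty[A]) { case (reservoir, (item, i)) =>
      if (i < k) reservoir :+ item
      else {
        val j = rng.nextInt(i + 1) // uniform over 0..i inclusive
        if (j < k) reservoir.updated(j, item) else reservoir
      }
    }
}
```

A call such as ReservoirSketch.reservoirSample(15, (1 to 100).iterator, new Random()) returns 15 distinct values between 1 and 100, each input being equally likely to end up in the sample.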
Future Work
Clarify imports starting with uniformDistribution