Statistics

Pythagorean Means

The arithmetic, geometric, and harmonic means are collectively known as the Pythagorean means.

See the Wikipedia page on Pythagorean Means for more.

Arithmetic, Geometric, and Harmonic Mean Examples

Imports

``````import cats.implicits._

import spire.math.Real
import spire.algebra.Field
import spire.algebra.NRoot

import axle.math._

implicit val fieldDouble: Field[Double] = spire.implicits.DoubleAlgebra
implicit val nrootDouble: NRoot[Double] = spire.implicits.DoubleAlgebra``````

Examples

Arithmetic mean

``````arithmeticMean(List(2d, 3d, 4d, 5d))
// res0: Double = 3.5``````

Geometric mean

``````geometricMean[Real, List](List(1d, 5d, 25d))
// res1: Real = Inexact(
//   f = spire.math.Real$$Lambda$9061/0x00000008029f21f8@7c897b00
// )``````

Harmonic mean

``````harmonicMean(List(2d, 3d, 4d, 5d))
// res2: Double = 3.116883116883117``````
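As a cross-check on the three results above, the following plain-Scala sketch computes the same means from their textbook definitions. The helper names (`plainArithmeticMean` and so on) are hypothetical and are not part of axle; the sketch uses only `Double` and the standard library, and assumes non-empty, strictly positive inputs. It also makes it easier to see that the opaque `Real` value printed for the geometric mean of 1, 5, and 25 corresponds to 5.

```scala
// Plain-Scala reference definitions (hypothetical helpers, not axle's API).
// Assumes xs is non-empty and, for the geometric and harmonic means, positive.
def plainArithmeticMean(xs: Seq[Double]): Double =
  xs.sum / xs.size

def plainGeometricMean(xs: Seq[Double]): Double =
  math.pow(xs.product, 1.0 / xs.size)

def plainHarmonicMean(xs: Seq[Double]): Double =
  xs.size / xs.map(1.0 / _).sum

val plainAm = plainArithmeticMean(Seq(2d, 3d, 4d, 5d)) // 3.5
val plainGm = plainGeometricMean(Seq(1d, 5d, 25d))     // approximately 5.0
val plainHm = plainHarmonicMean(Seq(2d, 3d, 4d, 5d))   // approximately 240/77
```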

Generalized Mean

See the Wikipedia page on Generalized Mean.

When the parameter `p` is 1, it is the arithmetic mean.

``````generalizedMean[Double, List](1d, List(2d, 3d, 4d, 5d))
// res3: Double = 3.5``````

As `p` approaches 0, it approaches the geometric mean.

``````generalizedMean[Double, List](0.0001, List(1d, 5d, 25d))
// res4: Double = 5.000431733701651``````

When `p` is -1, it is the harmonic mean.

``````generalizedMean[Double, List](-1d, List(2d, 3d, 4d, 5d))
// res5: Double = 3.116883116883117``````
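The three cases above follow from the usual definition of the generalized (power) mean, M_p = ((x_1^p + ... + x_n^p) / n)^(1/p). The sketch below is a minimal plain-Scala version of that formula, not axle's implementation; `powerMean` is a hypothetical name, and the sketch assumes positive inputs and a non-zero `p` (the case `p = 0` exists only as a limit).

```scala
// Hypothetical helper (not axle's API): textbook power mean on Doubles.
def powerMean(p: Double, xs: Seq[Double]): Double = {
  require(p != 0.0, "p = 0 is defined only as a limit (the geometric mean)")
  require(xs.nonEmpty, "at least one value is needed")
  math.pow(xs.map(x => math.pow(x, p)).sum / xs.size, 1.0 / p)
}

val pmArithmetic    = powerMean(1.0, Seq(2d, 3d, 4d, 5d))  // 3.5
val pmNearGeometric = powerMean(0.0001, Seq(1d, 5d, 25d))  // slightly above 5
val pmHarmonic      = powerMean(-1.0, Seq(2d, 3d, 4d, 5d)) // approximately 240/77
```

Choosing a small positive `p` such as 0.0001 gives a value slightly above the geometric mean, which is consistent with the 5.0004... shown above.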

Moving means

``import spire.math._``

Moving arithmetic mean

``````movingArithmeticMean[List, Int, Double](
  (1 to 100).toList.map(_.toDouble),
  5)
// res6: List[Double] = List(
//   3.0,
//   4.0,
//   5.0,
//   6.0,
//   7.0,
//   8.0,
//   9.0,
//   10.0,
//   11.0,
//   12.0,
//   13.0,
//   14.0,
//   15.0,
// ...``````

Moving geometric mean

``````movingGeometricMean[List, Int, Real](
  List(1d, 5d, 25d, 125d, 625d),
  3)
// res7: List[Real] = List(
//   Inexact(f = spire.math.Real$$Lambda$9061/0x00000008029f21f8@39749105),
//   Inexact(f = spire.math.Real$$Lambda$8953/0x00000008029b6580@5ca9e6bb),
//   Inexact(f = spire.math.Real$$Lambda$8953/0x00000008029b6580@7874dc8)
// )``````

Moving harmonic mean

``````movingHarmonicMean[List, Int, Real](
  (1 to 5).toList.map(v => Real(v)),
  3)
// res8: List[Real] = List(
//   Exact(n = 18/11),
//   Exact(n = 36/13),
//   Exact(n = 180/47)
// )``````
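A moving mean applies a mean to each consecutive window of the input. The sketch below shows that idea with the standard library's `sliding`, recomputing every window from scratch. It is only an illustration: `movingMean` is a hypothetical helper, and axle's functions may well compute the windows differently (for example incrementally).

```scala
// Hypothetical helper (not axle's API): apply `mean` to each full window.
def movingMean(xs: Seq[Double], window: Int)(mean: Seq[Double] => Double): List[Double] = {
  require(window > 0 && xs.size >= window, "need at least one full window")
  xs.sliding(window).map(mean).toList
}

// Arithmetic: 3.0, 4.0, ..., 98.0 for the integers 1 to 100 with window 5.
val movingAm = movingMean((1 to 100).map(_.toDouble), 5)(w => w.sum / w.size)

// Geometric: approximately 5, 25, 125.
val movingGm = movingMean(Seq(1d, 5d, 25d, 125d, 625d), 3)(w => math.pow(w.product, 1.0 / w.size))

// Harmonic: approximately 18/11, 36/13, 180/47.
val movingHm = movingMean(Seq(1d, 2d, 3d, 4d, 5d), 3)(w => w.size / w.map(1.0 / _).sum)
```

The geometric values here give a readable counterpart to the opaque `Real` entries printed above.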

Mean Average Precision at K

See the page on mean average precision at Kaggle.

``````import spire.math.Rational
import axle.ml.RankedClassifierPerformance._``````

Examples (from benhamner/Metrics)

``````meanAveragePrecisionAtK[Int, Rational](List(1 until 5), List(1 until 5), 3)
// res10: Rational = 1``````
``````meanAveragePrecisionAtK[Int, Rational](List(List(1, 3, 4), List(1, 2, 4), List(1, 3)), List(1 until 6, 1 until 6, 1 until 6), 3)
// res11: Rational = 37/54``````
``````meanAveragePrecisionAtK[Int, Rational](List(1 until 6, 1 until 6), List(List(6, 4, 7, 1, 2), List(1, 1, 1, 1, 1)), 5)
// res12: Rational = 13/50``````
``````meanAveragePrecisionAtK[Int, Rational](List(List(1, 3), List(1, 2, 3), List(1, 2, 3)), List(1 until 6, List(1, 1, 1), List(1, 2, 1)), 3)
// res13: Rational = 11/18``````
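To make the four results above easier to verify by hand, here is a minimal plain-Scala sketch of average precision at K and its mean, following the `apk`/`mapk` logic in benhamner/Metrics. It is not axle's implementation. The names `apAtK` and `mapAtK` are hypothetical, the sketch works on `Double` rather than `Rational`, and it assumes the first argument holds the relevant ("actual") items and the second the ranked predictions.

```scala
// Hypothetical helpers (not axle's API), assuming argument order (actual, predicted, k).
def apAtK(actual: Seq[Int], predicted: Seq[Int], k: Int): Double =
  if (actual.isEmpty) 0.0
  else {
    val topK = predicted.take(k)
    // Walk the top-k predictions, tracking (running score, hits so far).
    val (total, _) = topK.zipWithIndex.foldLeft((0.0, 0)) {
      case ((score, hits), (p, i)) =>
        // A hit is a relevant item that has not already appeared earlier in the ranking.
        if (actual.contains(p) && !topK.take(i).contains(p))
          (score + (hits + 1).toDouble / (i + 1), hits + 1)
        else (score, hits)
    }
    total / math.min(actual.size, k)
  }

def mapAtK(actual: Seq[Seq[Int]], predicted: Seq[Seq[Int]], k: Int): Double = {
  val scores = actual.zip(predicted).map { case (a, p) => apAtK(a, p, k) }
  scores.sum / scores.size
}

val map1 = mapAtK(List(1 until 5), List(1 until 5), 3)
val map2 = mapAtK(List(List(1, 3, 4), List(1, 2, 4), List(1, 3)), List(1 until 6, 1 until 6, 1 until 6), 3)
val map3 = mapAtK(List(1 until 6, 1 until 6), List(List(6, 4, 7, 1, 2), List(1, 1, 1, 1, 1)), 5)
val map4 = mapAtK(List(List(1, 3), List(1, 2, 3), List(1, 2, 3)), List(1 until 6, List(1, 1, 1), List(1, 2, 1)), 3)
```

Under that argument order the sketch agrees, up to floating-point rounding, with the values 1, 37/54, 13/50, and 11/18 shown above.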

Uniform Distribution

Imports and implicits (for all sections below)

``````import cats.implicits._
import spire.algebra._
import axle.probability._

implicit val fieldDouble: Field[Double] = spire.implicits.DoubleAlgebra``````

Example

``````val X = uniformDistribution(List(2d, 4d, 4d, 4d, 5d, 5d, 7d, 9d))
// X: ConditionalProbabilityTable[Double, spire.math.Rational] = ConditionalProbabilityTable(
//   p = HashMap(5.0 -> 1/4, 9.0 -> 1/8, 2.0 -> 1/8, 7.0 -> 1/8, 4.0 -> 3/8)
// )``````
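The probabilities in the table are the relative frequencies of each value in the input list. A minimal plain-Scala sketch of that counting step, using `Double` instead of `Rational` and hypothetical names (`observations`, `relativeFrequency`), might look like this; it is not how axle builds its `ConditionalProbabilityTable`.

```scala
// Hypothetical names, standard library only: count occurrences, divide by the total.
val observations = List(2d, 4d, 4d, 4d, 5d, 5d, 7d, 9d)

val relativeFrequency: Map[Double, Double] =
  observations.groupBy(identity).map { case (value, occurrences) =>
    value -> occurrences.size.toDouble / observations.size
  }
// 4.0 appears 3 times out of 8 (3/8), 5.0 twice (1/4), the rest once each (1/8).
```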

Standard Deviation

Example

``````import axle.stats._

implicit val nrootDouble: NRoot[Double] = spire.implicits.DoubleAlgebra``````
``````standardDeviation(X)
// res15: Double = 2.0``````
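The result 2.0 matches the population form of the standard deviation (dividing by the number of values) rather than the sample form (dividing by one less). The following plain-Scala sketch, with hypothetical names and no axle dependency, works through the arithmetic for the same eight values.

```scala
// Hypothetical names, standard library only.
val sdData = List(2d, 4d, 4d, 4d, 5d, 5d, 7d, 9d)

val sdMean = sdData.sum / sdData.size                            // 5.0
val sumOfSquares = sdData.map(x => math.pow(x - sdMean, 2)).sum  // 32.0

val populationSd = math.sqrt(sumOfSquares / sdData.size)         // 2.0
val sampleSd = math.sqrt(sumOfSquares / (sdData.size - 1))       // approximately 2.138
```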

See also Probability Model

Root-mean-square deviation

See the Wikipedia page on Root-mean-square deviation.

``````import cats.implicits._

import spire.algebra.Field
import spire.algebra.NRoot

import axle.stats._

implicit val fieldDouble: Field[Double] = spire.implicits.DoubleAlgebra
implicit val nrootDouble: NRoot[Double] = spire.implicits.DoubleAlgebra``````

Given four numbers and an estimator function, compute the RMSD:

``val data = List(1d, 2d, 3d, 4d)``
``````def estimator(x: Double): Double =
  x + 0.2

rootMeanSquareDeviation[List, Double](data, estimator)
// res17: Double = 0.4000000000000002``````
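It can help to work through the arithmetic for this example by hand. Every estimate is off by 0.2, so each squared deviation is about 0.04. The textbook RMSD divides the sum of squared deviations by the number of observations before taking the root, which gives about 0.2; the square root of the undivided sum gives about 0.4, the value printed above. The plain-Scala sketch below (hypothetical names, no axle dependency) computes both. It is an observation about the numbers only, not a statement about how axle computes its result.

```scala
// Hypothetical names, standard library only.
val rmsdData = List(1d, 2d, 3d, 4d)
def rmsdEstimator(x: Double): Double = x + 0.2

val squaredErrors = rmsdData.map(x => math.pow(x - rmsdEstimator(x), 2))

// Textbook definition: root of the MEAN of the squared deviations.
val textbookRmsd = math.sqrt(squaredErrors.sum / squaredErrors.size) // approximately 0.2

// Root of the SUM of the squared deviations (no division by n).
val rootOfSum = math.sqrt(squaredErrors.sum)                         // approximately 0.4
```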

Reservoir Sampling

Reservoir sampling is the answer to a common interview question: how to draw a uniform random sample of `k` items from a stream of unknown length in a single pass.

``````import spire.random.Generator.rng
import spire.algebra.Field

implicit val fieldDouble: Field[Double] = spire.implicits.DoubleAlgebra

import axle.stats._``````

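Demonstrate it uniformly sampling 15 integers from the start of an infinite stream. The output below includes 101, which indicates that `drop(100).head` yields the reservoir after 101 elements have been consumed, so the sample is drawn from 1 to 101: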

``````val sample = reservoirSampleK(15, LazyList.from(1), rng).drop(100).head
// sample: List[Int] = List(
//   101,
//   88,
//   85,
//   77,
//   67,
//   61,
//   59,
//   56,
//   55,
//   51,
//   45,
//   40,
//   39,
//   23,
//   4
// )``````

The mean of the sample should be in the ballpark of the mean of the integers it was drawn from (50.5 for 1 to 100, 51 for 1 to 101):

``````import axle.math.arithmeticMean

arithmeticMean(sample.map(_.toDouble))
// res19: Double = 56.733333333333334``````

Indeed it is.
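For readers who want to see the idea behind the technique, here is a minimal sketch of the classic "Algorithm R" form of reservoir sampling in plain Scala. It is not axle's `reservoirSampleK`: the name `reservoirSketch`, the use of `scala.util.Random`, and the fixed seed are all assumptions made for illustration, and it returns only the final reservoir rather than a lazy sequence of intermediate ones.

```scala
import scala.collection.mutable.ArrayBuffer
import scala.util.Random

// Hypothetical helper (not axle's API): keep a uniform sample of k items in one pass.
def reservoirSketch[A](k: Int, items: Iterator[A], rng: Random): Vector[A] = {
  val reservoir = ArrayBuffer.empty[A]
  var seen = 0
  items.foreach { item =>
    seen += 1
    if (reservoir.size < k) {
      // Fill the reservoir with the first k items.
      reservoir += item
    } else {
      // Item number `seen` replaces a random slot with probability k / seen.
      val j = rng.nextInt(seen) // uniform in [0, seen)
      if (j < k) reservoir(j) = item
    }
  }
  reservoir.toVector
}

val sketchSample = reservoirSketch(15, (1 to 100).iterator, new Random(42))
```

Because each later item is kept with probability `k / seen`, every item ends up in the final reservoir with the same probability, without the length of the stream being known in advance.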

Future Work

Clarify imports starting with `uniformDistribution`