Probability Model

Modeling probability, randomness, and uncertainly is one of the primary objectives of Axle.

The capabilies are available via four typeclasses and one trait

Sampler
Region (trait modeling σ-algebra)
Kolmogorov
Bayes
Monad (cats.Monad)

Concrete number type are avoided in favor of structures from Abstract Algebra -- primarily Ring and Field. These are represented as context bounds, usually passed implicitly.

The examples in this document use the spire.math.Rational number type, but work as well for Double, Float, etc. The precise number type Rational is used in tests because their precision allows the assertions to be expressed without any error tolerance.

Imports

Preamble to pull in the commonly-used functions in this section:

import cats.implicits._
import cats.effect._

import spire.math._
import spire.algebra._

import axle.probability._
import axle.algebra._
import axle.visualize._
import axle.web._

Creating Probability Models

There are a few type of probability models in Axle. The simplest is the ConditionalProbabilityTable, which is used throughout this document.

axle.data.Coin.flipModel demonstrates a very simple probability model for type Symbol.

This is its implementation:

val head = Symbol("HEAD")
val tail = Symbol("TAIL")

def flipModel(pHead: Rational = Rational(1, 2)): ConditionalProbabilityTable[Symbol, Rational] =
  ConditionalProbabilityTable[Symbol, Rational](
    Map(
      head -> pHead,
      tail -> (1 - pHead)))

Its argument is the bias for the HEAD side. Without a provided bias, it is assumed to be a fair coin.

val fairCoin = flipModel()
// fairCoin: ConditionalProbabilityTable[Symbol, Rational] = ConditionalProbabilityTable(
//   p = Map('HEAD -> 1/2, 'TAIL -> 1/2)
// )

val biasedCoin = flipModel(Rational(9, 10))
// biasedCoin: ConditionalProbabilityTable[Symbol, Rational] = ConditionalProbabilityTable(
//   p = Map('HEAD -> 9/10, 'TAIL -> 1/10)
// )

Rolls of dice are another common example.

def rollModel(n: Int): ConditionalProbabilityTable[Int, Rational] =
  ConditionalProbabilityTable(
    (1 to n).map(i => (i, Rational(1, n.toLong))).toMap)

The values d6 and d10 model rolls of 6 and 10-sided dice.

val d6 = rollModel(6)
// d6: ConditionalProbabilityTable[Int, Rational] = ConditionalProbabilityTable(
//   p = HashMap(5 -> 1/6, 1 -> 1/6, 6 -> 1/6, 2 -> 1/6, 3 -> 1/6, 4 -> 1/6)
// )

val d10 = rollModel(10)
// d10: ConditionalProbabilityTable[Int, Rational] = ConditionalProbabilityTable(
//   p = HashMap(
//     5 -> 1/10,
//     10 -> 1/10,
//     1 -> 1/10,
//     6 -> 1/10,
//     9 -> 1/10,
//     2 -> 1/10,
//     7 -> 1/10,
//     3 -> 1/10,
//     8 -> 1/10,
//     4 -> 1/10
//   )
// )

Define a visualization of the distribution of events in the d6 model:

val d6vis = BarChart[Int, Rational, ConditionalProbabilityTable[Int, Rational], String](
  () => d6,
  colorOf = _ => Color.blue,
  xAxis = Some(Rational(0)),
  title = Some("d6"),
  labelAngle = Some(0d *: angleDouble.degree),
  drawKey = false)

Create an SVG

d6vis.svg[IO]("docwork/images/d6.svg").unsafeRunSync()

Sampler

The Sampler typeclass provides the ability to "execute" the model and product a random sample via the sample method.

It's type signature is:

def sample(gen: Generator)(implicit spireDist: Dist[V], ringV: Ring[V], orderV: Order[V]): A

These imports make available a Generator as source of entropy

import spire.random._

val rng = Random.generatorFromSeed(Seed(42))
// rng: spire.random.rng.Cmwc5 = spire.random.rng.Cmwc5@3fae93b6

And then the .sample syntax:

import axle.syntax.sampler._

sample requires a Spire Generator. It also requires context bounds on the value type V that give the method the ability to produces values with a distribution conforming to the probability model.

(1 to 10) map { _ => fairCoin.sample(rng) }
// res1: IndexedSeq[Symbol] = Vector(
//   'HEAD,
//   'HEAD,
//   'HEAD,
//   'TAIL,
//   'TAIL,
//   'HEAD,
//   'TAIL,
//   'TAIL,
//   'TAIL,
//   'HEAD
// )

(1 to 10) map { _ => biasedCoin.sample(rng) }
// res2: IndexedSeq[Symbol] = Vector(
//   'HEAD,
//   'HEAD,
//   'HEAD,
//   'HEAD,
//   'HEAD,
//   'HEAD,
//   'HEAD,
//   'HEAD,
//   'HEAD,
//   'HEAD
// )

(1 to 10) map { _ => d6.sample(rng) }
// res3: IndexedSeq[Int] = Vector(3, 2, 5, 2, 6, 2, 6, 1, 1, 5)

Simulate 1k rolls of one d6

val rolls = (0 until 1000) map { _ => d6.sample(rng) }

Then tally them

implicit val ringInt: CRing[Int] = spire.implicits.IntAlgebra

import axle.syntax.talliable._

val oneKd6Histogram = rolls.tally
// oneKd6Histogram: Map[Int, Int] = Map(
//   5 -> 167,
//   1 -> 160,
//   6 -> 176,
//   2 -> 171,
//   3 -> 164,
//   4 -> 162
// )

Create a visualization

val d6oneKvis = BarChart[Int, Int, Map[Int, Int], String](
  () => oneKd6Histogram,
  colorOf = _ => Color.blue,
  xAxis = Some(0),
  title = Some("1,000 d6 samples"),
  labelAngle = Some(0d *: angleDouble.degree),
  drawKey = false)

Create the SVG

d6oneKvis.svg[IO]("docwork/images/d6-1Ksamples.svg").unsafeRunSync()

1k d6 samples

Sigma Algebra Regions

The sealed Region[A] trait is extended by the following case classes that form a way to describe expressions on the event-space of a probability model. In Measure Theory, these expressions are said to form a "sigma-algebra" ("σ-algebra")

In order of arity, they case classes extending this trait are:

Arity 0

RegionEmpty never matches any events or probability mass
RegionAll always matches all events and probability mass

Arity 1 (not including typeclass witnesses)

RegionEq matches when an event is equal to the supplied object, with respect to the supplied cats.kernel.Eq[A] witness.
RegionIf matches when the supplied condition function returns true
RegionSet matches when an event is contained within the supplied Set[A].
RegionNegate negates the supplied region.
RegionGTE is true when an event is greater than or equal to the supplied object, with respect to the supplied cats.kernel.Order[A]
RegionLTE is true when an event is less than or equal to the supplied object, with respect to the supplied cats.kernel.Order[A]

Arity 2

RegionAnd is the conjunction of both arguments. It can be created using the and method in the Region trait.
RegionOr is the disjunction of both arguments. It can be created using the or method in the Region trait.

Note that a "Random Variable" does not appear in this discussion. The axle.probability.Variable class does define a is method that returns a RegionEq, but the probability models themselves are not concerned with the notion of a Variable. They are simply models over regions of events on their single, opaque type that adhere to the laws of probability.

The eventual formalization of Region should connect it with a σ-algebra from Meaasure Theory.

Kolmogorov for querying Probability Models

probabilityOf (aka "P")

The method probabilityOf is defined in terms of a Region.

def probabilityOf(predicate: Region[A])(implicit fieldV: Field[V]): V

Note that probabilityOf is aliased to P in axle.syntax.kolmogorov._

import axle.syntax.kolmogorov._

The probability of a head for a single toss of a fair coin is 1/2

fairCoin.P(RegionEq(head))
// res5: Rational = 1/2

The probability that a toss is not head is also 1/2.

fairCoin.P(RegionNegate(RegionEq(head)))
// res6: Rational = 1/2

The probability that a toss is both head and tail is zero.

fairCoin.P(RegionEq(head) and RegionEq(tail))
// res7: Rational = 0

The probability that a toss is either head or tail is one.

fairCoin.P(RegionEq(head) or RegionEq(tail))
// res8: Rational = 1

Kolmogorov's Axioms

The single probabilityOf method together with the Region trait is enough to define Kolmogorov's Axioms of Probability. The axioms are implemented in axle.laws.KolmogorovProbabilityAxioms and checked during testing with ScalaCheck.

Basic Measure

Probabilities are non-negative.

model.P(region) >= Field[V].zero

Unit Measure

The sum the probabilities of all possible events is one

model.P(RegionAll()) === Field[V].one

Combination

For disjoint event regions, e1 and e2, the probability of their disjunction e1 or e2 is equal to the sum of their independent probabilities.

(!((e1 and e2) === RegionEmpty() )) || (model.P(e1 or e2) === model.P(e1) + model.P(e2))

Bayes Theorem, Conditioning, and Filtering

The Bayes typeclass implements the conditioning of a probability model via the filter (| is also an alias).

def filter(predicate: Region[A])(implicit fieldV: Field[V]): M[A, V]

Syntax is available via this import

import axle.syntax.bayes._

filter -- along with probabilityOf from Kolomogorov -- allows Bayes' Theorem to be expressed and checked with ScalaCheck.

model.filter(b).P(a) * model.P(b) === model.filter(a).P(b) * model.P(a)

For non-zero model.P(a) and model.P(b)

The theorem is more recognizable as P(A|B) = P(B|A) * P(A) / P(B)

Filter is easier to motivate with composite types, but two examples with a d6 show the expected semantics:

Filtering the d6 roll model to 1 and 5:

d6.filter(RegionIf(_ % 4 == 1))
// res9: ConditionalProbabilityTable[Int, Rational] = ConditionalProbabilityTable(
//   p = Map(5 -> 1/2, 1 -> 1/2)
// )

Filter the d6 roll model to 1, 2, and 3:

d6.filter(RegionLTE(3))
// res10: ConditionalProbabilityTable[Int, Rational] = ConditionalProbabilityTable(
//   p = Map(1 -> 1/3, 2 -> 1/3, 3 -> 1/3)
// )

Probability Model as Monads

The pure, map, and flatMap methods of cats.Monad are defined for ConditionalProbabilityTable, TallyDistribution.

Monad Laws

The short version is that the three methods are constrained by a few laws that make them very useful for composing programs. Those laws are:

Left identity: pure(x).flatMap(f) === f(x)
Right identity: model.flatMap(pure) === model
Associativity: model.flatMap(f).flatMap(g) === model.flatMap(f.flatMap(g))

Monad Syntax

There is syntax support in cats.implicits._ for all three methods.

However, due to limitations of Scala's type inference, it cannot see ConditionalProbabilityTable[E, V] as the M[_] expected by Monad.

The most straigtfoward workaround is just to conjure the monad witness directly and use it, passing the model in as the sole argument to the first parameter group.

val monad = ConditionalProbabilityTable.monadWitness[Rational]

monad.flatMap(d6) { a => monad.map(d6) { b => a + b } }
// res11: ConditionalProbabilityTable[Int, Rational] = ConditionalProbabilityTable(
//   p = HashMap(
//     5 -> 1/9,
//     10 -> 1/12,
//     6 -> 5/36,
//     9 -> 1/9,
//     2 -> 1/36,
//     12 -> 1/36,
//     7 -> 1/6,
//     3 -> 1/18,
//     11 -> 1/18,
//     8 -> 5/36,
//     4 -> 1/12
//   )
// )

Another strategy to use map and flatMap directly on the model is a type that can be seen as M[_] along with a type annotation:

type CPTR[E] = ConditionalProbabilityTable[E, Rational]

(d6: CPTR[Int]).flatMap { a => (d6: CPTR[Int]).map { b => a + b } }

Or similar to use a for comprehension:

for {
  a <- d6: CPTR[Int]
  b <- d6: CPTR[Int]
} yield a + b

Chaining models

Chain two events' models

val bothCoinsModel = monad.flatMap(fairCoin) { flip1 =>
  monad.map(fairCoin) { flip2 =>
    (flip1, flip2)
  }
}
// bothCoinsModel: ConditionalProbabilityTable[(Symbol, Symbol), Rational] = ConditionalProbabilityTable(
//   p = HashMap(
//     ('HEAD, 'HEAD) -> 1/4,
//     ('TAIL, 'HEAD) -> 1/4,
//     ('TAIL, 'TAIL) -> 1/4,
//     ('HEAD, 'TAIL) -> 1/4
//   )
// )

This creates a model on events of type (Symbol, Symbol)

It can be queried with P using RegionIf to check fields within the Tuple2.

type TWOFLIPS = (Symbol, Symbol)

bothCoinsModel.P(RegionIf[TWOFLIPS](_._1 == head) and RegionIf[TWOFLIPS](_._2 == head))
// res14: Rational = 1/4

bothCoinsModel.P(RegionIf[TWOFLIPS](_._1 == head) or RegionIf[TWOFLIPS](_._2 == head))
// res15: Rational = 3/4

Summing two dice rolls

val twoDiceSummed = monad.flatMap(d6) { a =>
  monad.map(d6) { b =>
    a + b
  }
}
// twoDiceSummed: ConditionalProbabilityTable[Int, Rational] = ConditionalProbabilityTable(
//   p = HashMap(
//     5 -> 1/9,
//     10 -> 1/12,
//     6 -> 5/36,
//     9 -> 1/9,
//     2 -> 1/36,
//     12 -> 1/36,
//     7 -> 1/6,
//     3 -> 1/18,
//     11 -> 1/18,
//     8 -> 5/36,
//     4 -> 1/12
//   )
// )

Create a visualization

val monadicChart = BarChart[Int, Rational, ConditionalProbabilityTable[Int, Rational], String](
  () => twoDiceSummed,
  colorOf = _ => Color.blue,
  xAxis = Some(Rational(0)),
  title = Some("d6 + d6"),
  labelAngle = Some(0d *: angleDouble.degree),
  drawKey = false)

Create SVG

monadicChart.svg[IO]("docwork/images/distributionMonad.svg").unsafeRunSync()

Monadic d6 + d6

Iffy

A stochastic version of if (aka iffy) can be implemented in terms of flatMap using this pattern for any probability model type M[A] such that a Monad is defined.

def iffy[A, B, M[_]: Monad](
  input      : M[A],
  predicate  : A => Boolean,
  trueClause : M[B],
  falseClause: M[B]): M[B] =
  input.flatMap { i =>
    if( predicate(i) ) {
      trueClause
    } else {
      falseClause
    }
  }

An example of that pattern: "if heads, d6+d6, otherwise d10+d10"

import cats.Eq

val headsD6D6taildD10D10 = monad.flatMap(fairCoin) { side =>
  if( Eq[Symbol].eqv(side, head) ) {
    monad.flatMap(d6) { a => monad.map(d6) { b => a + b } }
  } else {
    monad.flatMap(d10) { a => monad.map(d10) { b => a + b } }
  }
}

Create visualization

val iffyChart = BarChart[Int, Rational, ConditionalProbabilityTable[Int, Rational], String](
  () => headsD6D6taildD10D10,
  colorOf = _ => Color.blue,
  xAxis = Some(Rational(0)),
  title = Some("if heads, d6+d6, else d10+d10"),
  labelAngle = Some(0d *: angleDouble.degree),
  drawKey = false)

Create the SVG

iffyChart.svg[IO]("docwork/images/iffy.svg").unsafeRunSync()

heads => d6+d6, else d10+d10

Further Reading

Motiviating the Monad typeclass is out of scope of this document. Please see the functional programming literature for more about monads and their relationship to functors, applicative functors, monoids, categories, and other structures.

For some historical reading on the origins of probability monads, see the literature on the Giry Monad.

Future work

Measure Theory

Further refining and extending Axle to incorporate Measure Theory is a likely follow-on step.

Markov Categories

As an alternative to Measure Theory, see Tobias Fritz's work on Markov Categories

Probabilistic and Differentiable Programming

In general, the explosion of work on probabilistic and differentible programming is fertile ground for Axle's lawful approach.

Other Goals

{CPT,TD}.tailRecM then ScalaCheck Monad[CPT,TD]
Functor for CPT, TD
SigmaAlgebra for the CPT

Clean up expressions like RegionIf[TWOROLLS](_._1 == '⚃)

Laws for Region ("Sigma Algebra"? video)

OrderedRegion for the Order used in RegionLTE and RegionGTE?
Measure Theory
Rename ConditionalProbabilityTable?
Laws for Factor
Bettings odds
Multi-armed bandit
Recursive grid search
P-values
z & t scores
Correlation
Regression
Accuracy, Precision
Bias, Variance
Cohen's Kappa
Rm throws from axle.stats.TallyDistribution
do-calculus (Causality)
Stochastic Lambda Calculus
Abadi Plotkin pathology
Jacobian Vector Products (JVP)
FLDR probability

probcomp github

MIT FSAAD slides

Docs

Reorder Probability mdoc (Creation, Kolmogorov/Region, Sampler, Bayes, Monad)?

Footnotes (Giry, etc)