Probability Model
Modeling probability, randomness, and uncertainly is one of the primary objectives of Axle.
The capabilies are available via four typeclasses and one trait
- Sampler
- Region (trait modeling σ-algebra)
- Kolmogorov
- Bayes
- Monad (
cats.Monad)
Concrete number type are avoided in favor of structures from Abstract Algebra --
primarily Ring and Field.
These are represented as context bounds, usually passed implicitly.
The examples in this document use the spire.math.Rational number type, but work as well
for Double, Float, etc.
The precise number type Rational is used in tests because their precision allows the assertions to be expressed without any error tolerance.
Imports
Preamble to pull in the commonly-used functions in this section:
import cats.implicits._
import cats.effect._
import spire.math._
import spire.algebra._
import axle.probability._
import axle.algebra._
import axle.visualize._
import axle.web._
Creating Probability Models
There are a few type of probability models in Axle.
The simplest is the ConditionalProbabilityTable, which is used throughout this document.
axle.data.Coin.flipModel demonstrates a very simple probability model for type Symbol.
This is its implementation:
val head = Symbol("HEAD")
val tail = Symbol("TAIL")
def flipModel(pHead: Rational = Rational(1, 2)): ConditionalProbabilityTable[Symbol, Rational] =
ConditionalProbabilityTable[Symbol, Rational](
Map(
head -> pHead,
tail -> (1 - pHead)))
Its argument is the bias for the HEAD side.
Without a provided bias, it is assumed to be a fair coin.
val fairCoin = flipModel()
// fairCoin: ConditionalProbabilityTable[Symbol, Rational] = ConditionalProbabilityTable(
// p = Map('HEAD -> 1/2, 'TAIL -> 1/2)
// )
val biasedCoin = flipModel(Rational(9, 10))
// biasedCoin: ConditionalProbabilityTable[Symbol, Rational] = ConditionalProbabilityTable(
// p = Map('HEAD -> 9/10, 'TAIL -> 1/10)
// )
Rolls of dice are another common example.
def rollModel(n: Int): ConditionalProbabilityTable[Int, Rational] =
ConditionalProbabilityTable(
(1 to n).map(i => (i, Rational(1, n.toLong))).toMap)
The values d6 and d10 model rolls of 6 and 10-sided dice.
val d6 = rollModel(6)
// d6: ConditionalProbabilityTable[Int, Rational] = ConditionalProbabilityTable(
// p = HashMap(5 -> 1/6, 1 -> 1/6, 6 -> 1/6, 2 -> 1/6, 3 -> 1/6, 4 -> 1/6)
// )
val d10 = rollModel(10)
// d10: ConditionalProbabilityTable[Int, Rational] = ConditionalProbabilityTable(
// p = HashMap(
// 5 -> 1/10,
// 10 -> 1/10,
// 1 -> 1/10,
// 6 -> 1/10,
// 9 -> 1/10,
// 2 -> 1/10,
// 7 -> 1/10,
// 3 -> 1/10,
// 8 -> 1/10,
// 4 -> 1/10
// )
// )
Define a visualization of the distribution of events in the d6 model:
val d6vis = BarChart[Int, Rational, ConditionalProbabilityTable[Int, Rational], String](
() => d6,
colorOf = _ => Color.blue,
xAxis = Some(Rational(0)),
title = Some("d6"),
labelAngle = Some(0d *: angleDouble.degree),
drawKey = false)
Create an SVG
d6vis.svg[IO]("docwork/images/d6.svg").unsafeRunSync()
Sampler
The Sampler typeclass provides the ability to "execute" the model and product
a random sample via the sample method.
It's type signature is:
def sample(gen: Generator)(implicit spireDist: Dist[V], ringV: Ring[V], orderV: Order[V]): A
These imports make available a Generator as source of entropy
import spire.random._
val rng = Random.generatorFromSeed(Seed(42))
// rng: spire.random.rng.Cmwc5 = spire.random.rng.Cmwc5@3fae93b6
And then the .sample syntax:
import axle.syntax.sampler._
sample requires a Spire Generator.
It also requires context bounds on the value type V that give the method
the ability to produces values with a distribution conforming to the probability model.
(1 to 10) map { _ => fairCoin.sample(rng) }
// res1: IndexedSeq[Symbol] = Vector(
// 'HEAD,
// 'HEAD,
// 'HEAD,
// 'TAIL,
// 'TAIL,
// 'HEAD,
// 'TAIL,
// 'TAIL,
// 'TAIL,
// 'HEAD
// )
(1 to 10) map { _ => biasedCoin.sample(rng) }
// res2: IndexedSeq[Symbol] = Vector(
// 'HEAD,
// 'HEAD,
// 'HEAD,
// 'HEAD,
// 'HEAD,
// 'HEAD,
// 'HEAD,
// 'HEAD,
// 'HEAD,
// 'HEAD
// )
(1 to 10) map { _ => d6.sample(rng) }
// res3: IndexedSeq[Int] = Vector(3, 2, 5, 2, 6, 2, 6, 1, 1, 5)
Simulate 1k rolls of one d6
val rolls = (0 until 1000) map { _ => d6.sample(rng) }
Then tally them
implicit val ringInt: CRing[Int] = spire.implicits.IntAlgebra
import axle.syntax.talliable._
val oneKd6Histogram = rolls.tally
// oneKd6Histogram: Map[Int, Int] = Map(
// 5 -> 167,
// 1 -> 160,
// 6 -> 176,
// 2 -> 171,
// 3 -> 164,
// 4 -> 162
// )
Create a visualization
val d6oneKvis = BarChart[Int, Int, Map[Int, Int], String](
() => oneKd6Histogram,
colorOf = _ => Color.blue,
xAxis = Some(0),
title = Some("1,000 d6 samples"),
labelAngle = Some(0d *: angleDouble.degree),
drawKey = false)
Create the SVG
d6oneKvis.svg[IO]("docwork/images/d6-1Ksamples.svg").unsafeRunSync()
Sigma Algebra Regions
The sealed Region[A] trait is extended by the following case classes
that form a way to describe expressions on the event-space of a probability model.
In Measure Theory, these expressions are said to form a "sigma-algebra" ("σ-algebra")
In order of arity, they case classes extending this trait are:
Arity 0
RegionEmptynever matches any events or probability massRegionAllalways matches all events and probability mass
Arity 1 (not including typeclass witnesses)
RegionEqmatches when an event is equal to the supplied object, with respect to the suppliedcats.kernel.Eq[A]witness.RegionIfmatches when the supplied condition function returnstrueRegionSetmatches when an event is contained within the suppliedSet[A].RegionNegatenegates the supplied region.RegionGTEis true when an event is greater than or equal to the supplied object, with respect to the suppliedcats.kernel.Order[A]RegionLTEis true when an event is less than or equal to the supplied object, with respect to the suppliedcats.kernel.Order[A]
Arity 2
RegionAndis the conjunction of both arguments. It can be created using theandmethod in theRegiontrait.RegionOris the disjunction of both arguments. It can be created using theormethod in theRegiontrait.
Note that a "Random Variable" does not appear in this discussion.
The axle.probability.Variable class does define a is method that returns a RegionEq,
but the probability models themselves are not concerned with the notion of a
Variable.
They are simply models over regions of events on their single, opaque type
that adhere to the laws of probability.
The eventual formalization of Region should connect it with a σ-algebra from Meaasure Theory.
Kolmogorov for querying Probability Models
probabilityOf (aka "P")
The method probabilityOf is defined in terms of a Region.
def probabilityOf(predicate: Region[A])(implicit fieldV: Field[V]): V
Note that probabilityOf is aliased to P in axle.syntax.kolmogorov._
import axle.syntax.kolmogorov._
The probability of a head for a single toss of a fair coin is 1/2
fairCoin.P(RegionEq(head))
// res5: Rational = 1/2
The probability that a toss is not head is also 1/2.
fairCoin.P(RegionNegate(RegionEq(head)))
// res6: Rational = 1/2
The probability that a toss is both head and tail is zero.
fairCoin.P(RegionEq(head) and RegionEq(tail))
// res7: Rational = 0
The probability that a toss is either head or tail is one.
fairCoin.P(RegionEq(head) or RegionEq(tail))
// res8: Rational = 1
Kolmogorov's Axioms
The single probabilityOf method together with the Region trait
is enough to define Kolmogorov's Axioms of Probability.
The axioms are implemented in axle.laws.KolmogorovProbabilityAxioms and
checked during testing with ScalaCheck.
Basic Measure
Probabilities are non-negative.
model.P(region) >= Field[V].zero
Unit Measure
The sum the probabilities of all possible events is one
model.P(RegionAll()) === Field[V].one
Combination
For disjoint event regions, e1 and e2, the probability of their disjunction e1 or e2
is equal to the sum of their independent probabilities.
(!((e1 and e2) === RegionEmpty() )) || (model.P(e1 or e2) === model.P(e1) + model.P(e2))
Bayes Theorem, Conditioning, and Filtering
The Bayes typeclass implements the conditioning of a probability model
via the filter (| is also an alias).
def filter(predicate: Region[A])(implicit fieldV: Field[V]): M[A, V]
Syntax is available via this import
import axle.syntax.bayes._
filter -- along with probabilityOf from Kolomogorov -- allows Bayes' Theorem
to be expressed and checked with ScalaCheck.
model.filter(b).P(a) * model.P(b) === model.filter(a).P(b) * model.P(a)
For non-zero model.P(a) and model.P(b)
The theorem is more recognizable as P(A|B) = P(B|A) * P(A) / P(B)
Filter is easier to motivate with composite types, but two examples with a d6 show the expected semantics:
Filtering the d6 roll model to 1 and 5:
d6.filter(RegionIf(_ % 4 == 1))
// res9: ConditionalProbabilityTable[Int, Rational] = ConditionalProbabilityTable(
// p = Map(5 -> 1/2, 1 -> 1/2)
// )
Filter the d6 roll model to 1, 2, and 3:
d6.filter(RegionLTE(3))
// res10: ConditionalProbabilityTable[Int, Rational] = ConditionalProbabilityTable(
// p = Map(1 -> 1/3, 2 -> 1/3, 3 -> 1/3)
// )
Probability Model as Monads
The pure, map, and flatMap methods of cats.Monad are defined
for ConditionalProbabilityTable, TallyDistribution.
Monad Laws
The short version is that the three methods are constrained by a few laws that make them very useful for composing programs. Those laws are:
- Left identity:
pure(x).flatMap(f) === f(x) - Right identity:
model.flatMap(pure) === model - Associativity:
model.flatMap(f).flatMap(g) === model.flatMap(f.flatMap(g))
Monad Syntax
There is syntax support in cats.implicits._ for all three methods.
However, due to limitations of Scala's type inference, it cannot see
ConditionalProbabilityTable[E, V] as the M[_] expected by Monad.
The most straigtfoward workaround is just to conjure the monad witness directly and use it, passing the model in as the sole argument to the first parameter group.
val monad = ConditionalProbabilityTable.monadWitness[Rational]
monad.flatMap(d6) { a => monad.map(d6) { b => a + b } }
// res11: ConditionalProbabilityTable[Int, Rational] = ConditionalProbabilityTable(
// p = HashMap(
// 5 -> 1/9,
// 10 -> 1/12,
// 6 -> 5/36,
// 9 -> 1/9,
// 2 -> 1/36,
// 12 -> 1/36,
// 7 -> 1/6,
// 3 -> 1/18,
// 11 -> 1/18,
// 8 -> 5/36,
// 4 -> 1/12
// )
// )
Another strategy to use map and flatMap directly on
the model is a type that can be seen as M[_] along with
a type annotation:
type CPTR[E] = ConditionalProbabilityTable[E, Rational]
(d6: CPTR[Int]).flatMap { a => (d6: CPTR[Int]).map { b => a + b } }
Or similar to use a for comprehension:
for {
a <- d6: CPTR[Int]
b <- d6: CPTR[Int]
} yield a + b
Chaining models
Chain two events' models
val bothCoinsModel = monad.flatMap(fairCoin) { flip1 =>
monad.map(fairCoin) { flip2 =>
(flip1, flip2)
}
}
// bothCoinsModel: ConditionalProbabilityTable[(Symbol, Symbol), Rational] = ConditionalProbabilityTable(
// p = HashMap(
// ('HEAD, 'HEAD) -> 1/4,
// ('TAIL, 'HEAD) -> 1/4,
// ('TAIL, 'TAIL) -> 1/4,
// ('HEAD, 'TAIL) -> 1/4
// )
// )
This creates a model on events of type (Symbol, Symbol)
It can be queried with P using RegionIf to check fields within the Tuple2.
type TWOFLIPS = (Symbol, Symbol)
bothCoinsModel.P(RegionIf[TWOFLIPS](_._1 == head) and RegionIf[TWOFLIPS](_._2 == head))
// res14: Rational = 1/4
bothCoinsModel.P(RegionIf[TWOFLIPS](_._1 == head) or RegionIf[TWOFLIPS](_._2 == head))
// res15: Rational = 3/4
Summing two dice rolls
val twoDiceSummed = monad.flatMap(d6) { a =>
monad.map(d6) { b =>
a + b
}
}
// twoDiceSummed: ConditionalProbabilityTable[Int, Rational] = ConditionalProbabilityTable(
// p = HashMap(
// 5 -> 1/9,
// 10 -> 1/12,
// 6 -> 5/36,
// 9 -> 1/9,
// 2 -> 1/36,
// 12 -> 1/36,
// 7 -> 1/6,
// 3 -> 1/18,
// 11 -> 1/18,
// 8 -> 5/36,
// 4 -> 1/12
// )
// )
Create a visualization
val monadicChart = BarChart[Int, Rational, ConditionalProbabilityTable[Int, Rational], String](
() => twoDiceSummed,
colorOf = _ => Color.blue,
xAxis = Some(Rational(0)),
title = Some("d6 + d6"),
labelAngle = Some(0d *: angleDouble.degree),
drawKey = false)
Create SVG
monadicChart.svg[IO]("docwork/images/distributionMonad.svg").unsafeRunSync()
Iffy
A stochastic version of if (aka iffy) can be implemented in terms of flatMap
using this pattern for any probability model type M[A] such that a Monad is defined.
def iffy[A, B, M[_]: Monad](
input : M[A],
predicate : A => Boolean,
trueClause : M[B],
falseClause: M[B]): M[B] =
input.flatMap { i =>
if( predicate(i) ) {
trueClause
} else {
falseClause
}
}
An example of that pattern: "if heads, d6+d6, otherwise d10+d10"
import cats.Eq
val headsD6D6taildD10D10 = monad.flatMap(fairCoin) { side =>
if( Eq[Symbol].eqv(side, head) ) {
monad.flatMap(d6) { a => monad.map(d6) { b => a + b } }
} else {
monad.flatMap(d10) { a => monad.map(d10) { b => a + b } }
}
}
Create visualization
val iffyChart = BarChart[Int, Rational, ConditionalProbabilityTable[Int, Rational], String](
() => headsD6D6taildD10D10,
colorOf = _ => Color.blue,
xAxis = Some(Rational(0)),
title = Some("if heads, d6+d6, else d10+d10"),
labelAngle = Some(0d *: angleDouble.degree),
drawKey = false)
Create the SVG
iffyChart.svg[IO]("docwork/images/iffy.svg").unsafeRunSync()
Further Reading
Motiviating the Monad typeclass is out of scope of this document. Please see the functional programming literature for more about monads and their relationship to functors, applicative functors, monoids, categories, and other structures.
For some historical reading on the origins of probability monads, see the literature on the Giry Monad.
Future work
Measure Theory
Further refining and extending Axle to incorporate Measure Theory is a likely follow-on step.
Markov Categories
As an alternative to Measure Theory, see Tobias Fritz's work on Markov Categories
Probabilistic and Differentiable Programming
In general, the explosion of work on probabilistic and differentible programming is fertile ground for Axle's lawful approach.
Other Goals
-
{CPT,TD}.tailRecMthen ScalaCheckMonad[CPT,TD] -
Functor for CPT, TD
-
SigmaAlgebrafor the CPT
- Clean up expressions like
RegionIf[TWOROLLS](_._1 == '⚃)
- Laws for
Region("Sigma Algebra"? video)
-
OrderedRegionfor theOrderused inRegionLTEandRegionGTE? -
Measure Theory
-
Rename
ConditionalProbabilityTable? -
Laws for
Factor -
Bettings odds
-
Multi-armed bandit
-
Recursive grid search
-
P-values
-
z & t scores
-
Correlation
-
Regression
-
Accuracy, Precision
-
Bias, Variance
-
Cohen's Kappa
-
Rm throws from axle.stats.TallyDistribution
-
do-calculus (Causality)
-
Stochastic Lambda Calculus
-
Abadi Plotkin pathology
-
Jacobian Vector Products (JVP)
-
FLDR probability
Docs
- Reorder Probability mdoc (Creation, Kolmogorov/Region, Sampler, Bayes, Monad)?
- Footnotes (Giry, etc)