Probability Model
Modeling probability, randomness, and uncertainly is one of the primary objectives of Axle.
The capabilies are available via four typeclasses and one trait
- Sampler
- Region (trait modeling σ-algebra)
- Kolmogorov
- Bayes
- Monad (
cats.Monad
)
Concrete number type are avoided in favor of structures from Abstract Algebra --
primarily Ring
and Field
.
These are represented as context bounds, usually passed implicitly.
The examples in this document use the spire.math.Rational
number type, but work as well
for Double
, Float
, etc.
The precise number type Rational
is used in tests because their precision allows the assertions to be expressed without any error tolerance.
Imports
Preamble to pull in the commonly-used functions in this section:
import cats.implicits._
import cats.effect._
import spire.math._
import spire.algebra._
import axle.probability._
import axle.algebra._
import axle.visualize._
import axle.web._
Creating Probability Models
There are a few type of probability models in Axle.
The simplest is the ConditionalProbabilityTable
, which is used throughout this document.
axle.data.Coin.flipModel
demonstrates a very simple probability model for type Symbol
.
This is its implementation:
val head = Symbol("HEAD")
val tail = Symbol("TAIL")
def flipModel(pHead: Rational = Rational(1, 2)): ConditionalProbabilityTable[Symbol, Rational] =
ConditionalProbabilityTable[Symbol, Rational](
Map(
head -> pHead,
tail -> (1 - pHead)))
Its argument is the bias for the HEAD
side.
Without a provided bias, it is assumed to be a fair coin.
val fairCoin = flipModel()
// fairCoin: ConditionalProbabilityTable[Symbol, Rational] = ConditionalProbabilityTable(
// p = Map('HEAD -> 1/2, 'TAIL -> 1/2)
// )
val biasedCoin = flipModel(Rational(9, 10))
// biasedCoin: ConditionalProbabilityTable[Symbol, Rational] = ConditionalProbabilityTable(
// p = Map('HEAD -> 9/10, 'TAIL -> 1/10)
// )
Rolls of dice are another common example.
def rollModel(n: Int): ConditionalProbabilityTable[Int, Rational] =
ConditionalProbabilityTable(
(1 to n).map(i => (i, Rational(1, n.toLong))).toMap)
The values d6
and d10
model rolls of 6 and 10-sided dice.
val d6 = rollModel(6)
// d6: ConditionalProbabilityTable[Int, Rational] = ConditionalProbabilityTable(
// p = HashMap(5 -> 1/6, 1 -> 1/6, 6 -> 1/6, 2 -> 1/6, 3 -> 1/6, 4 -> 1/6)
// )
val d10 = rollModel(10)
// d10: ConditionalProbabilityTable[Int, Rational] = ConditionalProbabilityTable(
// p = HashMap(
// 5 -> 1/10,
// 10 -> 1/10,
// 1 -> 1/10,
// 6 -> 1/10,
// 9 -> 1/10,
// 2 -> 1/10,
// 7 -> 1/10,
// 3 -> 1/10,
// 8 -> 1/10,
// 4 -> 1/10
// )
// )
Define a visualization of the distribution of events in the d6
model:
val d6vis = BarChart[Int, Rational, ConditionalProbabilityTable[Int, Rational], String](
() => d6,
colorOf = _ => Color.blue,
xAxis = Some(Rational(0)),
title = Some("d6"),
labelAngle = Some(0d *: angleDouble.degree),
drawKey = false)
Create an SVG
d6vis.svg[IO]("docwork/images/d6.svg").unsafeRunSync()
Sampler
The Sampler
typeclass provides the ability to "execute" the model and product
a random sample via the sample
method.
It's type signature is:
def sample(gen: Generator)(implicit spireDist: Dist[V], ringV: Ring[V], orderV: Order[V]): A
These imports make available a Generator
as source of entropy
import spire.random._
val rng = Random.generatorFromSeed(Seed(42))
// rng: spire.random.rng.Cmwc5 = spire.random.rng.Cmwc5@3fae93b6
And then the .sample
syntax:
import axle.syntax.sampler._
sample
requires a Spire Generator
.
It also requires context bounds on the value type V
that give the method
the ability to produces values with a distribution conforming to the probability model.
(1 to 10) map { _ => fairCoin.sample(rng) }
// res1: IndexedSeq[Symbol] = Vector(
// 'HEAD,
// 'HEAD,
// 'HEAD,
// 'TAIL,
// 'TAIL,
// 'HEAD,
// 'TAIL,
// 'TAIL,
// 'TAIL,
// 'HEAD
// )
(1 to 10) map { _ => biasedCoin.sample(rng) }
// res2: IndexedSeq[Symbol] = Vector(
// 'HEAD,
// 'HEAD,
// 'HEAD,
// 'HEAD,
// 'HEAD,
// 'HEAD,
// 'HEAD,
// 'HEAD,
// 'HEAD,
// 'HEAD
// )
(1 to 10) map { _ => d6.sample(rng) }
// res3: IndexedSeq[Int] = Vector(3, 2, 5, 2, 6, 2, 6, 1, 1, 5)
Simulate 1k rolls of one d6
val rolls = (0 until 1000) map { _ => d6.sample(rng) }
Then tally them
implicit val ringInt: CRing[Int] = spire.implicits.IntAlgebra
import axle.syntax.talliable._
val oneKd6Histogram = rolls.tally
// oneKd6Histogram: Map[Int, Int] = Map(
// 5 -> 167,
// 1 -> 160,
// 6 -> 176,
// 2 -> 171,
// 3 -> 164,
// 4 -> 162
// )
Create a visualization
val d6oneKvis = BarChart[Int, Int, Map[Int, Int], String](
() => oneKd6Histogram,
colorOf = _ => Color.blue,
xAxis = Some(0),
title = Some("1,000 d6 samples"),
labelAngle = Some(0d *: angleDouble.degree),
drawKey = false)
Create the SVG
d6oneKvis.svg[IO]("docwork/images/d6-1Ksamples.svg").unsafeRunSync()
Sigma Algebra Regions
The sealed Region[A]
trait is extended by the following case classes
that form a way to describe expressions on the event-space of a probability model.
In Measure Theory, these expressions are said to form a "sigma-algebra" ("σ-algebra")
In order of arity, they case classes extending this trait are:
Arity 0
RegionEmpty
never matches any events or probability massRegionAll
always matches all events and probability mass
Arity 1 (not including typeclass witnesses)
RegionEq
matches when an event is equal to the supplied object, with respect to the suppliedcats.kernel.Eq[A]
witness.RegionIf
matches when the supplied condition function returnstrue
RegionSet
matches when an event is contained within the suppliedSet[A]
.RegionNegate
negates the supplied region.RegionGTE
is true when an event is greater than or equal to the supplied object, with respect to the suppliedcats.kernel.Order[A]
RegionLTE
is true when an event is less than or equal to the supplied object, with respect to the suppliedcats.kernel.Order[A]
Arity 2
RegionAnd
is the conjunction of both arguments. It can be created using theand
method in theRegion
trait.RegionOr
is the disjunction of both arguments. It can be created using theor
method in theRegion
trait.
Note that a "Random Variable" does not appear in this discussion.
The axle.probability.Variable
class does define a is
method that returns a RegionEq
,
but the probability models themselves are not concerned with the notion of a
Variable
.
They are simply models over regions of events on their single, opaque type
that adhere to the laws of probability.
The eventual formalization of Region
should connect it with a σ-algebra from Meaasure Theory.
Kolmogorov for querying Probability Models
probabilityOf (aka "P")
The method probabilityOf
is defined in terms of a Region
.
def probabilityOf(predicate: Region[A])(implicit fieldV: Field[V]): V
Note that probabilityOf
is aliased to P
in axle.syntax.kolmogorov._
import axle.syntax.kolmogorov._
The probability of a head
for a single toss of a fair coin is 1/2
fairCoin.P(RegionEq(head))
// res5: Rational = 1/2
The probability that a toss is not head
is also 1/2
.
fairCoin.P(RegionNegate(RegionEq(head)))
// res6: Rational = 1/2
The probability that a toss is both head
and tail
is zero.
fairCoin.P(RegionEq(head) and RegionEq(tail))
// res7: Rational = 0
The probability that a toss is either head
or tail
is one.
fairCoin.P(RegionEq(head) or RegionEq(tail))
// res8: Rational = 1
Kolmogorov's Axioms
The single probabilityOf
method together with the Region
trait
is enough to define Kolmogorov's Axioms of Probability.
The axioms are implemented in axle.laws.KolmogorovProbabilityAxioms
and
checked during testing with ScalaCheck.
Basic Measure
Probabilities are non-negative.
model.P(region) >= Field[V].zero
Unit Measure
The sum the probabilities of all possible events is one
model.P(RegionAll()) === Field[V].one
Combination
For disjoint event regions, e1
and e2
, the probability of their disjunction e1 or e2
is equal to the sum of their independent probabilities.
(!((e1 and e2) === RegionEmpty() )) || (model.P(e1 or e2) === model.P(e1) + model.P(e2))
Bayes Theorem, Conditioning, and Filtering
The Bayes
typeclass implements the conditioning of a probability model
via the filter
(|
is also an alias).
def filter(predicate: Region[A])(implicit fieldV: Field[V]): M[A, V]
Syntax is available via this import
import axle.syntax.bayes._
filter
-- along with probabilityOf
from Kolomogorov
-- allows Bayes' Theorem
to be expressed and checked with ScalaCheck.
model.filter(b).P(a) * model.P(b) === model.filter(a).P(b) * model.P(a)
For non-zero model.P(a)
and model.P(b)
The theorem is more recognizable as P(A|B) = P(B|A) * P(A) / P(B)
Filter is easier to motivate with composite types, but two examples with a d6 show the expected semantics:
Filtering the d6 roll model to 1 and 5:
d6.filter(RegionIf(_ % 4 == 1))
// res9: ConditionalProbabilityTable[Int, Rational] = ConditionalProbabilityTable(
// p = Map(5 -> 1/2, 1 -> 1/2)
// )
Filter the d6 roll model to 1, 2, and 3:
d6.filter(RegionLTE(3))
// res10: ConditionalProbabilityTable[Int, Rational] = ConditionalProbabilityTable(
// p = Map(1 -> 1/3, 2 -> 1/3, 3 -> 1/3)
// )
Probability Model as Monads
The pure
, map
, and flatMap
methods of cats.Monad
are defined
for ConditionalProbabilityTable
, TallyDistribution
.
Monad Laws
The short version is that the three methods are constrained by a few laws that make them very useful for composing programs. Those laws are:
- Left identity:
pure(x).flatMap(f) === f(x)
- Right identity:
model.flatMap(pure) === model
- Associativity:
model.flatMap(f).flatMap(g) === model.flatMap(f.flatMap(g))
Monad Syntax
There is syntax support in cats.implicits._
for all three methods.
However, due to limitations of Scala's type inference, it cannot see
ConditionalProbabilityTable[E, V]
as the M[_]
expected by Monad
.
The most straigtfoward workaround is just to conjure the monad witness directly and use it, passing the model in as the sole argument to the first parameter group.
val monad = ConditionalProbabilityTable.monadWitness[Rational]
monad.flatMap(d6) { a => monad.map(d6) { b => a + b } }
// res11: ConditionalProbabilityTable[Int, Rational] = ConditionalProbabilityTable(
// p = HashMap(
// 5 -> 1/9,
// 10 -> 1/12,
// 6 -> 5/36,
// 9 -> 1/9,
// 2 -> 1/36,
// 12 -> 1/36,
// 7 -> 1/6,
// 3 -> 1/18,
// 11 -> 1/18,
// 8 -> 5/36,
// 4 -> 1/12
// )
// )
Another strategy to use map
and flatMap
directly on
the model is a type that can be seen as M[_]
along with
a type annotation:
type CPTR[E] = ConditionalProbabilityTable[E, Rational]
(d6: CPTR[Int]).flatMap { a => (d6: CPTR[Int]).map { b => a + b } }
Or similar to use a for comprehension:
for {
a <- d6: CPTR[Int]
b <- d6: CPTR[Int]
} yield a + b
Chaining models
Chain two events' models
val bothCoinsModel = monad.flatMap(fairCoin) { flip1 =>
monad.map(fairCoin) { flip2 =>
(flip1, flip2)
}
}
// bothCoinsModel: ConditionalProbabilityTable[(Symbol, Symbol), Rational] = ConditionalProbabilityTable(
// p = HashMap(
// ('HEAD, 'HEAD) -> 1/4,
// ('TAIL, 'HEAD) -> 1/4,
// ('TAIL, 'TAIL) -> 1/4,
// ('HEAD, 'TAIL) -> 1/4
// )
// )
This creates a model on events of type (Symbol, Symbol)
It can be queried with P
using RegionIf
to check fields within the Tuple2
.
type TWOFLIPS = (Symbol, Symbol)
bothCoinsModel.P(RegionIf[TWOFLIPS](_._1 == head) and RegionIf[TWOFLIPS](_._2 == head))
// res14: Rational = 1/4
bothCoinsModel.P(RegionIf[TWOFLIPS](_._1 == head) or RegionIf[TWOFLIPS](_._2 == head))
// res15: Rational = 3/4
Summing two dice rolls
val twoDiceSummed = monad.flatMap(d6) { a =>
monad.map(d6) { b =>
a + b
}
}
// twoDiceSummed: ConditionalProbabilityTable[Int, Rational] = ConditionalProbabilityTable(
// p = HashMap(
// 5 -> 1/9,
// 10 -> 1/12,
// 6 -> 5/36,
// 9 -> 1/9,
// 2 -> 1/36,
// 12 -> 1/36,
// 7 -> 1/6,
// 3 -> 1/18,
// 11 -> 1/18,
// 8 -> 5/36,
// 4 -> 1/12
// )
// )
Create a visualization
val monadicChart = BarChart[Int, Rational, ConditionalProbabilityTable[Int, Rational], String](
() => twoDiceSummed,
colorOf = _ => Color.blue,
xAxis = Some(Rational(0)),
title = Some("d6 + d6"),
labelAngle = Some(0d *: angleDouble.degree),
drawKey = false)
Create SVG
monadicChart.svg[IO]("docwork/images/distributionMonad.svg").unsafeRunSync()
Iffy
A stochastic version of if
(aka iffy
) can be implemented in terms of flatMap
using this pattern for any probability model type M[A]
such that a Monad
is defined.
def iffy[A, B, M[_]: Monad](
input : M[A],
predicate : A => Boolean,
trueClause : M[B],
falseClause: M[B]): M[B] =
input.flatMap { i =>
if( predicate(i) ) {
trueClause
} else {
falseClause
}
}
An example of that pattern: "if heads, d6+d6, otherwise d10+d10"
import cats.Eq
val headsD6D6taildD10D10 = monad.flatMap(fairCoin) { side =>
if( Eq[Symbol].eqv(side, head) ) {
monad.flatMap(d6) { a => monad.map(d6) { b => a + b } }
} else {
monad.flatMap(d10) { a => monad.map(d10) { b => a + b } }
}
}
Create visualization
val iffyChart = BarChart[Int, Rational, ConditionalProbabilityTable[Int, Rational], String](
() => headsD6D6taildD10D10,
colorOf = _ => Color.blue,
xAxis = Some(Rational(0)),
title = Some("if heads, d6+d6, else d10+d10"),
labelAngle = Some(0d *: angleDouble.degree),
drawKey = false)
Create the SVG
iffyChart.svg[IO]("docwork/images/iffy.svg").unsafeRunSync()
Further Reading
Motiviating the Monad typeclass is out of scope of this document. Please see the functional programming literature for more about monads and their relationship to functors, applicative functors, monoids, categories, and other structures.
For some historical reading on the origins of probability monads, see the literature on the Giry Monad.
Future work
Measure Theory
Further refining and extending Axle to incorporate Measure Theory is a likely follow-on step.
Markov Categories
As an alternative to Measure Theory, see Tobias Fritz's work on Markov Categories
Probabilistic and Differentiable Programming
In general, the explosion of work on probabilistic and differentible programming is fertile ground for Axle's lawful approach.
Other Goals
-
{CPT,TD}.tailRecM
then ScalaCheckMonad[CPT,TD]
-
Functor for CPT, TD
-
SigmaAlgebra
for the CPT
- Clean up expressions like
RegionIf[TWOROLLS](_._1 == '⚃)
- Laws for
Region
("Sigma Algebra"? video)
-
OrderedRegion
for theOrder
used inRegionLTE
andRegionGTE
? -
Measure Theory
-
Rename
ConditionalProbabilityTable
? -
Laws for
Factor
-
Bettings odds
-
Multi-armed bandit
-
Recursive grid search
-
P-values
-
z & t scores
-
Correlation
-
Regression
-
Accuracy, Precision
-
Bias, Variance
-
Cohen's Kappa
-
Rm throws from axle.stats.TallyDistribution
-
do-calculus (Causality)
-
Stochastic Lambda Calculus
-
Abadi Plotkin pathology
-
Jacobian Vector Products (JVP)
-
FLDR probability
Docs
- Reorder Probability mdoc (Creation, Kolmogorov/Region, Sampler, Bayes, Monad)?
- Footnotes (Giry, etc)