# Two principles of Bayesian reasoning

Bayes’ rule is a pretty simple piece of mathematics, and it’s extraordinary to me how much deep insight can be drawn from looking closely at it and considering its implications.

## Principle 1: The surprisingness of an observation is proportional to the amount of evidence it provides.

Evidence that you expect to observe is weak evidence, while evidence that is unexpected is strong evidence.

This follows directly from Bayes’ theorem:

P(H | E) = P(E | H) · P(H) / P(E)

If E is very unexpected, then P(E) is very small. Since P(E) sits in the denominator, this puts upward pressure on the posterior probability P(H | E), entailing a large belief update. If E is thoroughly unsurprising, then P(E) is near 1, and this upward pressure disappears.
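As a toy sketch of this (the numbers are illustrative, not from the text), compare the belief update produced by a surprising observation against one produced by an unsurprising observation:

```python
def posterior(prior, likelihood, evidence):
    """Bayes' rule: P(H|E) = P(E|H) * P(H) / P(E)."""
    return likelihood * prior / evidence

prior = 0.1  # P(H) before seeing anything

# Surprising evidence: P(E) is small, so the posterior jumps.
surprising = posterior(prior, likelihood=0.9, evidence=0.15)

# Unsurprising evidence: P(E) is near 1, so the posterior barely moves.
unsurprising = posterior(prior, likelihood=0.95, evidence=0.97)

print(surprising)    # a large jump from the 0.1 prior
print(unsurprising)  # stays close to the 0.1 prior
```

Same prior, nearly the same likelihood; the only real difference is how expected the evidence was, and that difference alone drives the size of the update.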

A more precise way to say this is to talk about how surprising the evidence is given a particular theory. Dividing both sides of Bayes’ theorem by P(H) gives:

P(H | E) / P(H) = P(E | H) / P(E)

On the left is a term that (1) is large when E provides strong evidence for H, (2) is near zero when it provides strong evidence against H, and (3) is near 1 when it provides weak evidence regarding H.

On the right is a term that (1) is large if E is very unsurprising given H, (2) is near zero when E is very surprising given H, and (3) is near 1 when E is not made much more surprising or unsurprising by H.

What we get is that (1) E provides strong evidence for H when E is very unsurprising given H, (2) E provides strong evidence against H when it is very surprising given H, and (3) E provides weak evidence regarding H when it is not much more surprising or unsurprising given H.
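The three cases above can be sketched numerically (the probabilities are made up for illustration, not taken from the text):

```python
def update_factor(likelihood, evidence):
    """P(H|E) / P(H) = P(E|H) / P(E): how much observing E scales belief in H."""
    return likelihood / evidence

# (1) E is unsurprising given H but surprising overall: strong evidence for H.
case_for = update_factor(0.9, 0.1)      # factor of 9, belief in H grows a lot

# (2) E is very surprising given H: strong evidence against H.
case_against = update_factor(0.01, 0.5) # factor near zero, belief in H collapses

# (3) H barely changes how surprising E is: weak evidence either way.
case_weak = update_factor(0.5, 0.52)    # factor near 1, belief barely moves

print(case_for, case_against, case_weak)
```

The single ratio P(E | H) / P(E) captures all three regimes; which one you are in depends entirely on how much the theory changes the surprisingness of the evidence.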

This makes a lot of sense when you think it through. Theories that make strong, surprising predictions that turn out to be right are given more evidential weight than theories that make weak, unsurprising predictions.