# Proving the Completeness of Propositional Logic

The completeness of a logic is a really nice property to establish. For a logic to be complete, it must be that every semantic entailment is also syntactically entailed. Said more simply, it must be that every truth in the language is provable. Gödel’s incompleteness theorems showed us that we cannot have such high hopes for mathematics in general, but we can still establish completeness for some simple logics, such as propositional and first order logic.

I want to post a proof of the completeness of propositional logic here in full for future reference. Roughly the first half of what’s below is just establishing some necessary background, so that this post is fairly self-contained and doesn’t reference lemmas that are proved elsewhere.

The only note I’ll make before diving in is that the notation I(A,P) is a way to denote the smallest set that contains A and is closed under the operations in P. It’s a handy way to inductively define sets that would be enormously complicated to define otherwise. With that out of the way, here we go!

First we define the proof system for propositional logic.

Axiom 1: α→(β→α)
Axiom 2: (α→(β→γ)) → ((α→β)→(α→γ))
Axiom 3: ((¬α)→(¬β)) → (β→α)

The α, β, and γ symbols in these axioms are meant to stand for any well-formed formula. What this means is that we actually have a countable infinity of axioms that fall into the three categories above. For simplicity, I’ll keep calling them “Axioms 1, 2, and 3”, and assume you don’t find it too confusing.

You might also notice that the axioms only involve the symbols → and ¬, but neglect ∧ and ∨. This is okay because → and ¬ are adequate connectives for the semantics of propositional logic (which is to say that any truth function can be expressed in terms of them).

Axioms = {Axiom 1, Axiom 2, Axiom 3}
Deduction rule = Modus Ponens (MP)
….. MP(α, α→β) = β

The set of all provable sentences is just the set of all sentences that includes the axioms and is closed under modus ponens.

Theorems = I(Axioms, MP)

We can also easily talk about the set of sentences that can be proven from assumptions in a set Σ:

Th(Σ) = I(Axioms ∪ Σ, MP)
Notation: Σ ⊢ α iff α ∈ Th(Σ)

With that out of the way, let’s establish some basic but important results about the propositional proof system.

Monotonicity: If Σ ⊆ Σ’, then Th(Σ) ⊆ Th(Σ’).

Strong monotonicity: If Σ ⊢ Σ’, then Th(Σ’) ⊆ Th(Σ).

Intuitively, monotonicity says that if you expand the set of assumptions, you never shrink the set of theorems. Strong monotonicity says that if Σ can prove everything in Σ’, then Σ’ cannot be stronger than Σ. Both of these follow pretty directly from the definition of Th(Σ).

Soundness: If ⊢ α, then ⊨ α.
Proof by structural induction
….. Each axiom is a tautology.
….. Tautology is closed under MP (if ⊨ α and ⊨ (α→β), then ⊨ β).

Extended Soundness: If Σ ⊢ α, then Σ ⊨ α.
Proof by structural induction
….. Σ ⊨ α ∈ Axioms and Σ ⊨ α ∈ Σ.
….. If Σ ⊨ α and Σ ⊨ (α→β), then Σ ⊨ β.

Law of identity: ⊢ (α→α)
….. α→((α→α)→α), Axiom 1
….. (α→((α→α)→α))→((α→(α→α))→(α→α)), Axiom 2
….. (α→(α→α))→(α→α), MP
….. α→(α→α), Axiom 1
….. α→α, MP

Principle of Explosion: If Σ ⊢ α and Σ ⊢ (¬α), then Σ ⊢ β.
….. By strong monotonicity, it suffices to show that Σ ∪ {α} ∪ {¬α} ⊢ β.
……….. (¬α)→((¬β)→(¬α)), Axiom 1
……….. ¬α, Assumption
……….. (¬β)→(¬α), MP
……….. ((¬β)→(¬α))→(α→β), Axiom 3
.………. α→β, MP
……….. α, Assumption
……….. β, MP

And finally, our most important background theorem:

Deduction Theorem: Σ ⊢ (α→β) iff Σ ∪ {α} ⊢ β.
Proof =>
….. Suppose Σ ⊢ (α→β).
….. By monotonicity, Σ ∪ {α} ⊢ (α→β).
….. Also, clearly Σ ∪ {α} ⊢ α.
….. So Σ ∪ {α} ⊢ β.
Proof <=
….. Suppose Σ ∪ {α} ⊢ β.
….. Base cases
………. β ∈ Axioms. (β, β→(α→β), α→β).
………. β ∈ Σ. (β, β→(α→β), α→β).
………. β = α. (⊢ (α→α), so by monotonicity Σ ⊢ (α→α)).
….. Inductive step
………. Suppose Σ ⊢ (α→γ) and Σ ⊢ (α→(γ→𝛿)).
………. By strong monotonicity, suffices to show Σ ∪ {α→γ, α→(γ→𝛿)} ⊢ (α→𝛿)
……………. (α→(γ→𝛿)) → ((α→γ)→(α→𝛿)), Axiom 2
……………. α→(γ→𝛿), Assumption
……………. (α→γ)→(α→𝛿), MP
……………. (α→γ), Assumption
……………. (α→𝛿). MP

Now, let’s go into the main body of the proof. The structure of the proof is actually quite similar to the proof of the compactness theorem I gave previously. First we show that every consistent set of sentences Σ has a maximally consistent extension Σ’. Then show that Σ’ is satisfiable. Now since Σ’ is satisfiable and it’s an extension of Σ, Σ must also be satisfiable. From there it’s a simple matter to show that the logic is complete.

So, let’s define some of the terms I just used.

Σ is consistent iff for no α does Σ ⊢ α and Σ ⊢ (¬α)
….. Equivalently: iff for some α, Σ ⊬ α

Σ is maximally consistent iff Σ is consistent and for every α, either Σ ∪ {α} is inconsistent or Σ ⊢ α.

One final preliminary result regarding consistency before diving into the main section of the proof:

If Σ is satisfiable, then Σ is consistent.
Proof
….. Suppose Σ is inconsistent.
….. Then there’s an α such that Σ ⊢ α and Σ ⊢ (¬α).
….. By extended soundness, Σ ⊨ α and Σ ⊨ (¬α).
….. So Σ is not satisfiable.

This is the converse of the result we actually want, but it’ll come in handy. Now, let’s begin to construct our extension!

Any consistent Σ can be extended to a maximally consistent Σ’
….. Choose any ordering {αn} of well-formed-formulas.
….. Define Σ0 = Σ.
….. Σn+1 = Σn if Σn ⊢ (¬αn+1), and Σn ∪ {αn+1} otherwise.
….. For each n, (i) Σn is consistent and (ii) either Σn ⊢ αn or Σn ⊢ (¬αn)
……….. Base case: Σ0 is consistent by assumption, and (ii) doesn’t apply.
……….. Inductive step: Suppose Σn satisfies (i) and (ii). Two cases:
…………….. If Σn ⊢ (¬αn+1), then Σn+1 = Σn. Clearly consistent and satisfies (ii).
…………….. If Σn ⊬ (¬αn+1), then Σn+1 = Σn ∪ {αn+1}. Clearly satisfies (ii), but is it consistent?
………………….. Suppose not. Then Σn+1 ⊢ (¬αn+1), by explosion.
………………….. So Σn+1 ∪ {αn+1} ⊢ (¬αn+1).
………………….. So Σn+1 ⊢ (αn+1 → (¬αn+1)).
………………….. ⊢ ((α→¬α)→¬α), so Σn+1 ⊢ (¬αn+1). Contradiction!

….. Define Σ’ = ∪ Σn. Σ’ is maximally consistent.
……….. Maximality
…………….. Suppose not. Then for some αn, Σ’ ⊬ αn and Σ’ ∪ {αn} is consistent.
…………….. But Σn ⊆ Σ’, and either Σn ⊢ αn or Σn ⊢ (¬αn).
…………….. If Σn ⊢ αn, by monotonicity Σ ⊢ αn. Contradiction. So Σn ⊢ (¬αn).
…………….. By monotonicity, Σ ⊢ (¬αn), so Σ ∪ {αn} ⊢ (¬αn).
…………….. But Σ ∪ {αn} ⊢ αn. So Σ ∪ {αn} is inconsistent. Contradiction!
……….. Consistency
…………….. Suppose Σ’ is inconsistent. Then for some α, Σ’ ⊢ α and Σ’ ⊢ (¬α).
…………….. So there are proofs of α and (¬α) from Σ’.
…………….. Proofs are finite, so each proof uses only a finite number of assumptions from Σ’.
…………….. So we can choose an n such that Σn contains all the needed assumptions.
…………….. Now both proofs from Σ’ are also proofs from Σn.
…………….. So Σn ⊢ αn and Σ’ ⊢ (¬αn).
…………….. So Σn is inconsistent. Contradiction!

Alright, we’re almost there! So now we have that for any consistent Σ, there’s an extension Σ’ that is maximally consistent. We’ll take it a little further and prove that not only is Σ’ maximally consistent, it’s also complete! (This is the purely syntactic sense of completeness, which is that for every sentence α, either Σ’ proves α or refutes α. This is different from the sense of logical completeness that we’re establishing with the proof.)

Σ’ is complete.
….. Σn ⊆ Σ’, and Σn ⊢ αn or Σn ⊢ (¬αn).
….. So by monotonicity Σ’ ⊢ αn or Σ’ ⊢ (¬αn).

Now we have everything we need to show that Σ’, and thus Σ, is satisfiable.

If Σ is consistent, then Σ is satisfiable.
Proof
….. Let Σ’ be a maximally consistent extension of Σ.
….. Define vΣ’(p) over propositional variables p:
….. VΣ’(p) = T if Σ’ ⊢ p and F if Σ’ ⊬ p
….. ṼΣ’(α) = T iff Σ’ ⊢ α
……….. Base case: Let α be a propositional variable. Then ṼΣ’(α) = T iff Σ’ ⊢ α by definition of VΣ’.
……….. Inductive steps:
……….. (¬α)
…………….. If Σ’ ⊢ (¬α), then by consistency Σ’ ⊬ α, so ṼΣ’(α) = F, so ṼΣ’(¬α) = T.
…………….. If Σ’ ⊬ (¬α), then by completeness Σ’ ⊢ α. So ṼΣ’(α) = T, so ṼΣ’(¬α) = F.
……….. (α→β)
…………….. Suppose Σ’ ⊢ (α→β). By completeness Σ’ ⊢ α or Σ’ ⊢ (¬α).
………………….. If Σ’ ⊢ α, then Σ’ ⊢ β, so ṼΣ’(β) = T, so ṼΣ’(α→β) = T.
………………….. If Σ’ ⊢ (¬α), then ṼΣ’(α) = F, so ṼΣ’(α→β) = T.
……………. Suppose Σ’ ⊬ (α→β).
………………….. By completeness Σ’ ⊢ ¬(α→β).
………………….. ⊢ (β→(α→β)), so Σ’ ⊬ β on pain of contradiction. So ṼΣ'(β) = F.
………………….. Suppose ṼΣ’(α→β) = T. Then ṼΣ’(α) = F, so Σ’ ⊢ (¬α).
………………….. ⊢ (¬α→(α→β)). So Σ’ ⊢ (α→β). Contradiction.
………………….. So ṼΣ’(α→β) = F.
….. So vΣ’ satisfies Σ’.
….. Σ ⊆ Σ’, so vΣ’ satisfies Σ.
….. So Σ is satisfiable!

Now our final result becomes a four-line proof.

If Σ ⊨ α, then Σ ⊢ α.
Proof
….. Suppose Σ ⊬ α.
….. Then Σ ∪ {¬α} is consistent.
….. So Σ ∪ {¬α} is satisfiable.
….. So Σ ⊭ α.

And we’re done! We’ve shown that if any sentence α is semantically entailed by a set of sentences Σ, then it must also be provable from Σ! If you’ve followed this proof all the way, pat yourself on the back.

With the Completeness Theorem in hand, the proof of the Compactness Theorem goes from several pages to a few lines. It’s so nice and simple that I just have to include it here.

If Σ is finitely satisfiable, then Σ is satisfiable.

….. Suppose Σ is not satisfiable.
….. Then Σ is not consistent.
….. So there is some α for which Σ ⊢ α and Σ ⊢ (¬α).
….. Since proofs are finite, there must be some finite subset Σ* of Σ such that Σ* ⊢ α and Σ* ⊢ (¬α).
….. By soundness, Σ* ⊨ α and Σ* ⊨ (¬α).
….. So Σ* is not satisfiable!

In other words, if Σ is not satisfiable, then there’s some finite subset of Σ that’s also not satisfiable. This is the Compactness Theorem! Previously we proved it entirely based off of the semantics of propositional logic, but now we can see that it is also provable as a consequence of the finite nature of our proof system!

# Four Pre-Gödelian Limitations on Mathematics

Even prior to the devastating Incompleteness Theorems there were hints of what was to come. I want to describe and prove four results in mathematical logic that don’t depend on Incompleteness at all, but establish some rather serious limitations on the project of mathematics.

Here are the four. I’ll go through them in order of increasing level of sophistication required to prove them.

• Indescribable sets of possible worlds
• Noncompossibility theorem
• Inevitable nonstandard numbers
• Mysterious missing subsets

# 1. Indescribable sets of possible worlds

I already talked about this one here. The basic idea is that even in our safest and least troublesome logic, propositional calculus, it turns out that the language is insufficient to fully capture all the semantic notions. It’s the first hint at something going awry with syntax and semantics, where the semantics can outpace the syntax and leave axiomatic mathematics behind.

So to recap: the result is that in propositional logic there are sets of truth assignments that can not be “described” by any set of propositional sentences (even allowing infinite sets!). A set of sentences is said to “describe” a set of truth assignments if that set of truth assignments is the unique set of truth assignments consistent with all those sentences being true. If we think of truth assignments as possible worlds, and sets of sentences as descriptions of sets of possible worlds, then this result says that there are sets of possible worlds in propositional semantics that cannot be described by any propositional syntax.

The proof of this is astoundingly simple: just look at the cardinality of the set of descriptions and the cardinality of the set of sets of possible worlds. The second is strictly larger than the first, so any mapping from descriptions to sets of possible worlds will of necessity leave some sets of possible worlds out. In fact, it also tells us that virtually all sets of possible worlds are not describable!

# 2. Noncompossibility Theorem

The noncompossibility theorem is a little-known theorem that establishes a serious limitation on our ability to describe mathematical structures. Here’s what it says. Suppose that you have a description of a countably infinite structure (like, say, the natural numbers, which have a countable infinity of objects) that has the following three properties:

(1) The language has a term for denoting every object in the structure (like 0, 1, 2, 3, 4, and so on)
(2) The axioms in your description are weakly complete: if something is inconsistent with the axioms, it can be proven false.
(3) There is some algorithm for determining whether any given sentence is an axiom.

The noncompossibility theorem tells us that if you have all three of these properties, then your axioms will fail to uniquely pick out your intended structure, and will include models that have extra objects that aren’t in the structure.

Let’s prove this.

We’ll denote the mathematical structure that we’re trying to describe as M and our language as L. We choose L to have sufficient syntactic structure to express the truths of M. From L, we select a decidable set of sentences X with the goal that all these sentences be true of M. We now select a proof system F in L such that for any finite extension L* to L involving only new constant terms, and for any Y ⊆ L*, if X ∪ Y is not satisfiable, then F refutes some finite subset of X ∪ Y.

(As an aside: Why care about this strange weak form of completeness? Well, intuitively all that it’s saying is that our axioms should be able to rule out any set of sentences that are inconsistent with them using some finite proof, as long as those sentences only use finitely many additional constant symbols. This is relatively weak compared to the usual notions of completeness that logicians talk about, which makes it an even better choice for our purposes, as the weaker the axiom the harder to deny.)

Our assumptions can now be written:

(0) |M| is countably infinite.
(1) ∀m ∈ M, ∃tm ∈ terms(X) such that (m = tm)
(2) If we extend L to L* by adding finitely many constant symbols, then for any Y ⊆ L*, if X ∪ Y is not satisfiable, then F refutes some finite subset of X ∪ Y.
(3) X is recursively enumerable.

Our proof starts by adding a new constant term c to our language and constructing an extension of X:

Y = X ∪ {c ≠ tm | m ∈ M}

In other words, Y is X but supplemented with the assertion that there exists an object that isn’t in M. If we can prove that Y is satisfiable, then this entails that X is also satisfiable by the same truth assignment. And this means that there is a model of X in which there are extra objects that aren’t in M.

We proceed with proof by contradiction. Suppose that Y is not satisfiable. Then, by assumption (2), we must be able to refute some finite subset Z of X ∪ Y. But since Z is finite, it involves only finitely many terms tm. And since M is countably infinite, there will always be objects in M that are not equal to any of the chosen terms! So we can’t refute any finite subset of Z! Thus Y is satisfiable.

And if Y is satisfiable, then so must be X, as Y is a superset of X. And since Y is satisfiable, then there’s some truth assignment v that satisfies all of v. But then v also satisfies X, as X is a subset of Y and removing axioms cannot rule out models, only add more! So we’ve proven that X has a model in which there is an object that is not equal to any of the objects in M. That is, X is not categorical: it does not uniquely describe M.

Tennant described this theorem as saying that “in countably infinite realms, you cannot know both where you are and where you are going.” More dully, we cannot have a satisfactory theory of a countably infinite mathematical structure that is both categorical and weakly complete. This isn’t super shocking by today’s standards, but it’s quite cool when you consider how little elaborate theoretical apparatus is required to prove it.

# 3. Inevitable nonstandard numbers

Suppose we have some first-order theory T that models the natural numbers. Take this theory and append to it a new constant symbol c, as well as an infinite axiom schema saying “c > 0” , “c > 1” , “c > 2”, and so on forever. Call this new theory T*.

Does T* have a model? Well by the compactness theorem, it has a model as long as all its finite subsets have a model. And for every finite subset of T*s axioms, the natural numbers are a model! So T* does have a model. Could this model be the natural numbers? Clearly not, because to satisfy T*, there must be a number greater than all the natural numbers. So whatever the model of T* is, it’s not the standard natural numbers. Let’s call it a nonstandard model, and label it ℕ*.

Here’s the final step of the proof: ℕ* is a model of T*, and T* is a superset of T, so ℕ* must also be a model of T! And thus we find that in any logic with a compactness theorem, a theory of the natural numbers will have models with nonstandard numbers that are greater than all of ℕ.

It’s one of my favorite proofs, because it’s so easy to describe and has such a devastating conclusion. It’s also an example of the compactness theorem using the existence of one type of model (ℕ for each of the finite cases) to prove the existence of something entirely different (ℕ* for the infinite case).

# 4. Mysterious missing subsets

The Löwenheim-Skolem theorem tells us that if a first-order theory has a model with an infinite cardinality, then it has models with every infinite cardinality. This places a major restriction on our ability to describe an infinite mathematical structure using first order logic. For if we were to try to single out the natural numbers, say, we would inevitably end up failing to rule out models of our axioms that are the cardinality of the real numbers, or worse, or the set of functions from real numbers to real numbers, and so on for all other possible cardinalities.

When applied to set theory, this implies a result that seems on its face to be a straightforward contradiction. Namely, Löwenheim-Skolem tells us that any first order axiomatization of sets will inevitably have a model that contains only a countable infinity of sets. But this seems bizarre, as all we appear to need to rule out countably infinite universes of sets is one axiom that asserts the existence of a countably infinite set, and another that asserts that admits the power-set of any set to the universe of sets. Then we will be forced to admit that there is a set which is the power set of a countably infinite set, and as Cantor’s famous diagonal argument shows, that this set is uncountably large.

So on the one hand, Cantor tells us that there are sets that contain uncountably many objects. And on the other hand, Löwenheim-Skolem tell us that there is a model of set theory with only countably many objects. This dichotomy is known by the name Skolem’s paradox. It appears to be a straightforward contradiction, but it’s not.

What Skolem realized was that the formal notion of a power set, which is something like “the set P(X) such that for all sets Y, if Y is a subset of X, then Y is an element P(X)”, relies on a quantification over all sets, and that in a countable universe of sets, that quantification ranges over only a countable number of objects. In other words, P(X) is only uncountable if our quantifier ranges over all possible sets, but for a countable model, there are sets that are not describable within the model. This means that the notion of a power set is relative to your model of set theory! In fact, there’s no way in first order logic to unambiguously pin down what you mean by “power set” in such a way that all models will agree on what P(X) actually contains. It also means that the notions of cardinality and countability are relative to your model! In Skolem’s words, “even the notions ‘finite’, ‘infinite’, ’simply infinite sequence’ and so forth turn out to be merely relative within axiomatic set theory.”

# A challenge to constructivists

Constructive mathematicians do not accept a proof of existence unless it provides a recipe for how to construct the thing whose existence is being asserted. Constructive mathematics is quite interesting, but it also appears to have some big problems. Here’s a challenge for constructivists:

Suppose that I hand you some complicated function f from a set A to another set B. I ask you: “Can every element in B be reached by applying the function to an element in A?” In other words, is f surjective?

Now, it so happens that the cardinality of B is greater than the cardinality of A. That’s sufficient to tell us that f can’t be surjective, as however it maps elements there will always be some left over. So we know that the answer is “no, we can’t reach every element in B.” But we proved this without explicitly constructing the particular element in B that can’t be reached! So a constructivist will be left unsatisfied.

The trick is that I’ve made this function extremely complicated, so that there’s no clever way for them to point to exactly which element is missing. Would they say that even though |B| is strictly larger than |A|, it could still be somehow that every element in B is in the image of f? Imagine asking them to bet on this proposition. I don’t think any sane person would put any money on the proposition that f is onto.

And as a final kicker, our sets don’t even have to be infinite! Let |A| = 20 and |B| = 21. I describe a function from A to B, such that actually computing the “missing element” involves having to calculate the 21st Busy Beaver number or something. And the constructivist gets busy searching for the particular element in B that doesn’t get mapped to, instead of just saying “well of course we can’t map 20 elements to 21 elements!”

Even simpler, let F map {1} to {1,2} as follows: F(1) = 1 if the last digit of the 20th busy beaver number is 0, 1, 2, 3, or 4, and F(1) = 2 otherwise. Now to prove constructively that there is an element in {1, 2} that isn’t in the image of F requires knowing the last digit of the 20th busy beaver number, which humans will most likely never be able to calculate (we’re stuck on the fifth one now). So a constructivist will be remain uncertain on the question of if F is surjective.

But a sane person would just say “look, of course F isn’t surjective; it maps one object to two objects. You can’t do this without leaving something out! It doesn’t matter if we don’t know which element is left out, it has to be one of them!”

And if humanity is about to meet an alien civilization with immense computational power that knows all the digits of the 20th busy beaver number, the standard mathematician could bet their entire life savings on F not being surjective at any odds whatsoever, and the constructive mathematician would bet in favor of F being surjective at some odds. And of course, the constructivist would be wrong and lose money! So this also means that you have a way to make money off of any constructivist mathematicians you encounter, so long as we’re about to make contact with advanced aliens.

# Describing the world

Wittgenstein starts his Tractatus Philosophicus with the following two sentences.

1. The world is everything that is the case.

1.1 The world is the totality of facts, not of things.

Let’s take him up on this suggestion and see how far we get. In the process, we’ll discover some deep connections to theorems in mathematical logic, as well as some fascinating limitations on the expressive powers of propositional and first order logic.

We start out with a set of atomic propositions. For a very simple world, we might only need a finite number of these: “Particle 1 out of 3 has property 1 out of 50”, “Particle 2 of 3 has property 17 out of 50”, and so on. More realistically, the set of atomic propositions will be infinite (countable if the universe doesn’t have any continuous properties, and uncountable otherwise).

For simplicity, we’ll imagine labeling our set of atomic propositions P1, P2, P3, and so on (even though this entails that there are at most countably many, nothing important will rest on this assumption.) We combine these atomic propositions with the operators of propositional logic {(, ), ¬, ∧, ∨, →}. This allows us to build up more complicated propositions, like ((P7∧P2)→(¬P13)). This will be the language that we use to describe the world.

Now, the way that the world is is just a consistent assignment of truth values to the set of all grammatical sentences in our language. For example, one simple assignment of truth values is the one that assigns “True” to all atomic propositions. Once we’ve assigned truth values to all the atomic propositions, we get the truth values for the rest of the set of grammatical sentences for free, by the constraint that our truth assignment be consistent. (For instance, if P1 and P2 are both true, then (P1∧P2) must also be true.)

Alright, so the set of ways the world could be corresponds to the set of truth assignments over our atomic propositions. The final ingredient is the notion that we can encode our present knowledge of the world as a set of sentences. Maybe we know by observation that P5 is true, and either P2 or P3 is true but not both. Then to represent this state of knowledge, we can write the following set of sentences:

{P5, (P2∨P3), ¬(P2∧P3)}

Any set of sentences picks out a set of ways the world could be, such that each of these possible worlds is compatible with that knowledge. If you know nothing at all, then the set of sentences representing your knowledge will be the empty set {}, and the set of possible worlds compatible with your knowledge will be the set of all possible worlds (all possible truth assignments). On the other extreme, you might know the truth values of every atomic proposition, in which case your state of knowledge uniquely picks out one possible world.

In general, as you add more sentences to your knowledge-set, you cut out more and more possible worlds. But this is not always true! Ask yourself what the set of possible worlds corresponding to the set {(P1∨¬P1), (P2∨¬P2), (P3∨¬P3)} is. Since each of these sentences is a tautology, no possible worlds are eliminated! So our set of possible worlds is still the set of all worlds.

Now we get to an interesting question: clearly for any knowledge-set of sentences, you can express a set of possible worlds consistent with that knowledge set. But is it the case that for any set of possible worlds, you can find a knowledge-set that uniquely picks it out? If I hand you a set of truth assignment functions and ask you to tell me a set of propositions which are consistent with that set of worlds and ONLY that set, is that always possible? Essentially, what we’re asking is if all sets of possible worlds are describable.

We’ve arrived at the main point of this essay. Take a minute to ponder this and think about whether it’s possible, and why/why not! For clarification, each sentence can only be finitely long. But! You’re allowed to include an infinity of sentences.

(…)

(Spoiler-hiding space…)

(…)

If there were only a finite number of atomic propositions, then you could pick out any set of possible worlds with just a single sentence in conjunctive normal form. But when we start talking about an infinity of atomic propositions, it turns out that it is not always possible! There are sets of possible worlds that are literally not describable, even though our language includes the capacity to describe each of those words and we’re allowed to include an infinite set of sentences.

There’s a super simple proof of this. Let’s give a name to the cardinality of the set of sentences: call it K. (We’ve been tacitly acting as if the cardinality is countable this whole time, but that doesn’t actually matter.) What’s the cardinality of the set of all truth assignments?

Well, each truth assignment is a function from all sentences to {True, False}. And there are 2K such assignments. 2K is strictly larger than K, so there are more possible worlds than there are sentences. Now, the cardinality of the set of sets of sentences is also 2K. But the set of SETS of truth of assignments is 22^K!

What this means is that we can’t map sets of sentences onto sets of truth assignments without leaving some things out! This proof carries over to predicate logic as well. The language for both propositional and predicate logic is unable to express all sets of possible worlds corresponding to that language!

I love this result. It’s the first hint in mathematical logic that syntax and semantics can come apart.

That result is the climax of this post. What I want to do with the rest of this post is to actually give an explicit example of a set of truth assignments that are “indescribable” by any set of sentences, and to prove it. Warning: If you want to read on, things will get a bit more technical from here.

Alright, so we’ll use a shortcut to denote truth assignments. A truth assignment will be written as a string of “T”s and “F”s, where the nth character corresponds to how the truth assignment evaluates Pn. So the all-true truth assignment will just be written “TTTTTT…” and the all-false truth assignment will be written “FFFFF…”. The truth assignment corresponding to P1 being false and everything else true will be written “FTTTTT…”. And so on.

Now, here’s our un-describable set of truth assignments. {“FFFFFF…”, “TFFFFF…”, “TTFFFF…”, “TTTFFF…”, …}. Formally, define Vn to be the truth assignment that assigns “True” to every atomic proposition up to and including Pn, and “False” to all others. Now our set of truth assignments is just {Vn | n ∈ ℕ}.

Let’s prove that no set of sentences uniquely picks out this set of truth assignments. We prove by contradiction. Suppose that we could find a set of sentences that uniquely pick out these truth assignments and none other. Let’s call this set A. Construct a new set of sentences A’ by appending all atomic propositions A: A’ = A ∪ {P1, P2, P3, …}.

Is there any truth assignment that is consistent with all of A’? Well, we can answer this by using the Compactness Theorem: A’ has a truth assignment if and only if every finite subset of A’ has a truth assignment. But every finite subset of A’ involves sentences from A (which are consistent with Vn for each n by assumption), and a finite number of atomic propositions. Since each finite subset of A’ is only asserting the truth of a finite number of atomic sentences, we can always find a truth assignment Vk in our set that is consistent with it, by choosing one that switches to “False” long after the last atomic proposition that is asserted by our finite subset.

This means that each finite subset of A’ is consistent with at least one of our truth assignments, which means that A’ is consistent with at least one of our truth assignments. But A’ involves the assertion that all atomic propositions are true! The only truth assignment that is consistent with this assertion is the all-true assignment! And is that truth assignment in our set? No! And there we have it, we’ve reached our contradiction!

We cannot actually describe a set of possible worlds in which either all atomic propositions are false, or only the first is true, or only the first two are true, or only the first three are true, and so on forever. But this might prompt the question: didn’t you just describe it? How did you do that, if it’s impossible? Well, technically I didn’t describe it. I just described the first four possibilities and then said “and so on forever”, assuming that you knew what I meant. To have actually fully pinned down this set of possible worlds, I would have had to continue with this sentence forever. And importantly, since this sentence is a disjunction, I could not split this infinite sentence into an infinite set of finite sentences. This fundamental asymmetry between ∨ and ∧ is playing a big role here: while an infinite conjunction can be constructed by simply putting each clause in the conjunction as a separate sentence, an infinite disjunction cannot be. This places a fundamental limit on the ability of a language with only finite sentences to describe the world.

# The Surprise-Response Heuristic

Often we judge if somebody else is understanding something that we do not understand by whether the things they say in response to our questions are surprising.

When somebody understands it about as well as you do, the things they say about it will generally be fairly understandable and expected (as they mesh with your current insufficient level of understanding). But if they actually understand it and you don’t, then you should expect to be surprised by the things they say, since you couldn’t have produced those responses yourself or predicted them coming.

Compare:

Q: “In this step of the proof, are they talking about extending the model or the language?”

A1: “They’re talking about extending the model. Look at the way that they worded the description of the extension in the previous step, it specifically describes adding a character to the model, not the language.”

A2: “No, they couldn’t be extending the model even though the wording suggests that, because then the proof wouldn’t even work; it’s required that we just change the language or else we end up working with a different model and failing to prove that the original model had the desired property. Also, it doesn’t even make sense to talk about adding a character to a model, the characters are a property of the language.”

Even with no context to judge whether the claims are true, I imagine that the second response feels much more convincing than the first, even though it’s probably less likely to be understood. The first is the type of response that is unsurprising and easy to see coming, and indicates only that the person is understanding the grammar of the English sentences they’re reading. It doesn’t strongly discriminate between a person that understands what’s going on and a person that doesn’t. The second is certainly surprising; it suggests that the person objects to the specific wording of the proof because of their understanding of the way it misrepresents the logical structure of the argument. They aren’t just comprehending the grammar, they are comprehending the actual content. Ordinarily, a person wouldn’t be able to off-the-cuff make up a response like that without actually understanding what’s going on.

This is a problem when people are good at saying surprising things without understanding. I’ve met a few people that are very good “contrarians”; they are good at coming up with strange and creative ways to say things that ultimately shed very little insight on the topic at hand. I often found myself in a weird position with such people where I feel like they understand the topic at hand better than me, and yet simultaneously I’m deeply suspicious of every word coming out of their mouth.

# There’s a problem with infinity

Last post I described the Ross-Littlewood paradox, in which an ever-expanding quantity of numbered billiard balls are placed into a cardboard box in such a way that after an infinite number of steps the box ends up empty. Here’s a version of this paradox:

Process 1
Step 1: Put 1 through 9 into the box.
Step 2: Take out 1, then put 10 through 19 into the box.
Step 3: Take out 2, then put 20 through 29 into the box.
Step 4: Take out 3, then put 30 through 39 into the box.
And so on.

Box contents after each step
Step 1: 1 through 9
Step 2: 2 through 19
Step 3: 3 through 29
Step 4: 4 through 39
And so on.

Now take a look at a similar process, where instead of removing balls from the box, we just change the number that labels them (so, for example, we paint a 0 after the 1 to turn “Ball 1” to “Ball 10″).

Process 2
Step 1: Put 1 through 9 into the box
Step 2: Change 1 to 10, then put 11 through 19 into the box.
Step 3: Change 2 to 20, then put 21 through 29 in.
Step 3: Change 3 to 30, then put 31 through 39 in.
And so on.

Box contents after each step
Step 1: 1 through 9
Step 2: 2 through 19
Step 3: 3 through 29
Step 4: 4 through 39
And so on.

Notice that the box contents are identical after each step. If that’s all that you are looking at (and you are not looking at what the person is doing during the step), then the two processes are indistinguishable. And yet, Process 1 ends with an empty box, and Process 2 ends with infinitely many balls in the box!

Why does Process 2 end with an infinite number of balls in it, you ask?

Process 2 ends with infinitely many balls in the box, because no balls are ever taken out. 1 becomes 10, which later becomes 100 becomes 1000, and so on forever. At infinity you have all the natural numbers, but with each one appended an infinite number of zeros.

So apparently the method you use matters, even when two methods provably get you identical results! There’s some sort of epistemic independence principle being violated here. The outputs of an agent’s actions should be all that matters, not the specific way in which the agent obtains those outputs! Something like that.

Somebody might respond to this: “But the outputs of the actions aren’t the same! In Process 1, each step ten are added and one removed, whereas in Process 2, each step nine are added. This is the same with respect to the box, but not with respect to the rest of the universe! After all, those balls being removed in Process 1 have to go somewhere. So somewhere in the universe there’s going to be a big pile of discarded balls, which will not be there in Process 2.

This responds holds water as long as our fictional universe doesn’t violate conservation of information, as if not, these balls can just vanish into thin air, leaving no trace of their existence. But that rebuttal feels cheap. Instead, let’s consider another variant that gets at the same underlying problem of “relevance of things that should be irrelevant”, but avoids this problem.

Process 1 (same as before)
Step 1: Put 1 through 9 into the box.
Step 2: Take out 1, then put 10 through 19 into the box.
Step 3: Take out 2, then put 20 through 29 into the box.
Step 4: Take out 3, then put 30 through 39 into the box.
And so on.

Box contents after each step
Step 1: 1 through 9
Step 2: 2 through 19
Step 3: 3 through 29
Step 4: 4 through 39
And so on.

And…

Process 3
Step 1: Put 1 through 9 into the box.
Step 2: Take out 9, then put 10 through 19 into the box.
Step 3: Take out 19, then put 20 through 29 into the box.
Step 4: Take out 29, then put 30 through 39 into the box.
And so on.

Box contents after each step
Step 1: 1 through 9
Step 2: 1 to 8, 10 to 19
Step 3: 1 to 8, 10 to 18, 20 to 29
Step 4: 1 to 8, 10 to 18, 20 to 28, 30 to 39
And so on

Okay, so as I’ve written it, the contents of each box after each step are different in Processes 1 and 3. Just one last thing we need to do: erase the labels on the balls. The labels will now just be stored safely inside our minds as we look over the balls, which will be indistinguishable from one another except in their positions.

Now we have two processes that look identical at each step with respect to the box, AND with respect to the external world. And yet, the second process ends with an infinite number of balls in the box, and the first with none! (Every number that’s not one less than a multiple of ten will be in there.) It appears that you have to admit that the means used to obtain an end really do matter.

But it’s worse than this. You can arrange things so that you can’t tell any difference between the two processes, even when observing exactly what happens in each step. How? Well, if the labelling is all in your heads, then you can switch around the labels you’ve applied without doing any harm to the logic of the thought experiment. So let’s rewrite Process 3, but fill in both the order of the balls in the box and the mental labelling being used:

Process 3
1 2 3 4 5 6 7 8 9
Mentally rotate labels to the right:
9 1 2 3 4 5 6 7 8
Remove the furthest left ball:
1 2 3 4 5 6 7 8
Add the next ten balls to the right in increasing order:
1 2 3 4 5 6 7 8 10 11 12 13 14 15 16 17 18 19
Repeat!

Compare this to Process 1, supposing that it’s done without any relabelling:

Process 1
1 2 3 4 5 6 7 8 9
Remove the furthest left ball:
2 3 4 5 6 7 8 9
Add the next tell balls to the right in increasing order:
2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 18 19
Repeat!

If the labels are all in your head, then these two processes are literally identical except for how a human being is thinking about them.

But looking at Process 3, you can prove that after Step 1 there will always be a ball labelled 1 in the box. Same with 2, 3, 4, and all other numbers that are not a multiple of 10 minus one. Even though we remove an infinity of balls, there are ball numbers that are never removed. And if we look at the pile of discarded balls, we’ll see that it consists of 9, 19, 29, 39, and so on, but none of the others. Unless some ball numbers vanish in the process (which they never do!), all the remainders must still be sitting in the box!

So we have two identical-in-every-relevant-way processes, one of which ends with an infinite number of balls in the box and the other with zero. Do you find this troubling? I find this very troubling. If we add some basic assumption that an objective reality exists independent of our thoughts about it, then we’ve obtained a straightforward contradiction.

✯✯✯

Notice that it’s not enough to say “Well, in our universe this process could never be completed.” This is for two reasons:

First of all, it’s actually not obvious that supertasks (tasks involving the completion of an infinite number of steps in a finite amount of time) cannot be performed in our universe. In fact, if space and time are continuous, then every time you wave your hand you are completing a sort of supertask.

You can even construct fairly physically plausible versions of some of the famous paradoxical supertasks. Take the light bulb that blinks on and off at intervals that get shorter and shorter, such that after some finite duration it has blinked an infinity of times. We can’t say that the bulb is on at the end (as that would seem to imply that the sequence 0101010… had a last number) or that it is off (for much the same reason). But these are the only two allowed states of the bulb! (Assume the bulb is robust against bursting and all the other clever ways you can distract from the point of the thought experiment.)

Now, here’s a variant that seems fairly physically reasonable:

A ball is dropped onto a conductive plate that is attached by wire to a light bulb. The ball is also wired to the bulb, so that when the ball contacts the plate, a circuit is completed that switches the light bulb on. Each bounce, the ball loses some energy to friction, cutting its velocity exactly in half. This means that after each bounce, the ball hangs in the air for half as long as it did the previous bounce. Suppose the time between the first and second bounce was 1 second. Then the time between the second and third will be .5 seconds. And next will be .25 seconds. And so on. At 2 seconds, the ball will have bounced an infinite number of times. So at 2 seconds, the light bulb will have switched on and off an infinite number of times.

And of course, at 2 seconds the ball is at rest on the plate, completing the circuit. So at 2 seconds, upon the completion of the supertask, the light will be on.

Notice that there are no infinite velocities here, or infinite quantities of energy. Just ordinary classical mechanics applied to a bouncing ball and a light bulb. What about infinite accelerations? Well even that is not strictly speaking necessary; we just imagine that each velocity reversal takes some amount of time, which shrinks to zero as the velocity shrinks to zero in such a way as to keep all accelerations finite and sum to a finite total duration.

All this is just to say that we shouldn’t be too hasty in dismissing the real-world possibility of apparently paradoxical supertasks.

But secondly, and more importantly, physical possibility is not the appropriate barometer of whether we should take a thought experiment seriously. Don’t be the person that argues that the fat man wouldn’t be sufficient to stop a trolley’s momentum. When we find that some intuitive conceptual assumptions lead us into trouble, the takeaway is that we need to closely examine and potentially revise our concepts!

Think about Russell’s paradox, which showed that some of our most central intuitions about the concept of a set lead us to contradiction. Whether or not the sets that Bertie was discussing can be pointed to in the physical world is completely immaterial to the argument. Thinking otherwise would have slowed down progress in axiomatic set theory immensely!

These thought experiments are a problem if you believe that it is logically possible for there to be a physical universe in which these setups are instantiated. That’s apparently all that’s required to get a paradox, not that the universe we live in happens to be that one.

So it appears that we have to conclude some limited step in the direction of finitism, in which we rule out a priori the possibility of a universe that allows these types of supertasks. I’m quite uncomfortable with this conclusion, for what it’s worth, but I don’t currently see a better option.

## The Puzzle

You have in front of you an empty box. You also have on hand an infinite source of billiard balls, numbered 0, 1, 2, 3, 4, and so on forever.

At time zero, you place balls 0 and 1 in the box.

In thirty minutes, you remove ball 0 from the box, and place in two new balls (2 and 3).

Fifteen minutes after that, you remove ball 1 from the box, and place in two new balls (4 and 5).

7.5 minutes after that, you remove ball 2 and place in balls 6 and 7.

And so on. After an hour, you will have taken an infinite number of steps. How many billiard balls will be in the box?

✯✯✯

At time zero, the box contains two balls (0 and 1). After thirty minutes, it contains three (1, 2, and 3). After 45 minutes, it contains four (2, 3, 4, and 5). You can see where this is going…

Naively taking the limit of this process, we arrive at the conclusion that the box will contain an infinity of balls.

But hold on. Ask yourself the following question: If you think that the box contains an infinity of balls, name one ball that’s in there. Go ahead! Give me a single number such that at the end of this process, the ball with that number is sitting in the box.

The problem is that you cannot do this. Every single ball that is put in at some step is removed at some later step. So for any number you tell me, I can point you to the exact time at which that ball was removed from the box, never to be returned to it!

But if any ball that you can name can be proven to not be in the box.. and every ball you put in there was named… then there must be zero balls in the box at the end!

In other words, as time passes and you get closer and closer to the one-hour mark, the number of balls in the box appears to be growing, more and more quickly each moment, until you hit the one-hour mark. At that exact moment, the box suddenly becomes completely empty. Spooky, right??

Let’s make it weirder.

What if at each step, you didn’t just put in two new balls, but one MILLION? So you start out at time zero by putting balls 0, 1, 2, 3, and so on up to 1 million into the empty box. After thirty minutes, you take out ball 1, but replace it with the next 1 million numbered balls. And at the 45-minute mark, you take out ball 2 and add the next 1 million.

What’ll happen now?

Well, the exact same argument we gave initially applies here! Any ball that is put in the box at any point, is also removed at a later point. So you literally cannot name any ball that will still be in the box after the hour is up, because there are no balls left in the box! The magic of infinity doesn’t care about how many more balls you’ve put in than removed at any given time, it still delivers you an empty box at the end!

Now, here’s a final variant. What if, instead of removing the smallest numbered ball each step, you removed the largest numbered ball?

So, for instance, at the beginning you put in balls 0 and 1. Then at thirty minutes you take out ball 1, and put in balls 2 and 3. At 45 minutes, you take out ball 3, and put in balls 4 and 5. And so on, until you hit the one hour mark. Now how many balls are there in the box? Infinity! Why not zero like before? Well, because now I can name you an infinity of numbers whose billiard balls are still guaranteed to be in the box when the hour’s up. Namely, 0, 2, 4, 6, and all the other even numbered balls are still going to be in there.

Take a moment to reflect on how bizarre this is. We removed the exact same number of balls each step as we did last time. All that changed is the label on the balls we removed! We could even imagine taking off all the labels so that all we have are identical plain billiard balls, and just labeling them purely in our minds. Now apparently the choice of whether to mentally label the balls in increasing or decreasing order will determine whether at the end of the hour the box is empty or packed infinitely full. What?!? It’s stuff like this that makes me sympathize with ultrafinitists.

One final twist: what happens if the ball that we remove each step is determined randomly? Then how many balls will there be once the hour is up? I’ll leave it to you all to puzzle over!

# Are the Busy Beaver numbers independent of mathematics?

A few years ago, Scott Aaronson and a student of his published this paper, in which they demonstrate the existence of a 7918 state Turing machine whose behavior is independent of ZFC. In particular, whether the machine halts or not can not be proven by ZFC. This entails that ZFC cannot prove the value of BB(7918) – the number of steps taken by the longest running Turing machine with 7918 states before halting. And since ZFC is a first order theory and first order logic is complete, the unprovability of the value entails that BB(7918) actually has different values in different models of the axioms! So ZFC does not semantically entail its value, which is to say that ZFC underdetermines the Busy Beaver numbers!

This might sound really surprising. After all, the Busy Beaver numbers are a well-defined sequence. There are a finite number of N-state Turing machines, some subset of which are finitely-running. Just look at the number of steps that the longest-running of these goes for, and that’s BB(N). It’s one thing to say that this value is impossible to prove, but what could it mean for this value to be underdetermined by the standard axioms of math? Are there some valid versions of math in which this machine runs for different amounts of time than others? But how could this be? Couldn’t we in principle build the Turing machine in the real world and just observe exactly how long it runs for? And if we did this, then we certainly shouldn’t expect to get an indeterminate answer. So what gives?

Well, first of all, the existence of a machine like Aaronson and Yedidia’s is actually not a surprise. For any consistent theory T whose axioms are recursively enumerable, one can build a Turing machine M that enumerates all the syntactic consequences of the axioms and halts if it ever finds a contradiction. That is, M simply starts with the axioms, and repeatedly applies modus ponens and the other inference rules of T’s logic until it reaches a contradiction. Now, if T is strong enough to talk about the natural numbers, then it cannot prove whether or not M halts. This is a result of Gödel’s Second Incompleteness Theorem: If T could prove the behavior of M, then it could prove its own consistency, which would entail that it is inconsistent. This means that no consistent formal theory will be capable of proving all the values of the Busy Beaver numbers; for any theory T there will always be some number N for which the value of BB(N) is in principle impossible to derive from T.

On the other hand, this does not entail that the Busy Beaver numbers do not have definite values. This misconception arises from two confusions: (1) independence and unprovability are not the same thing, and (2) independence does not necessarily mean that there is no single right answer.

On (1): A proposition P is independent of T if there are models of T in which P is true and other models in which it is false. P is unprovable from T if… well, if it can’t be proved from the axioms of T. Notice that independence is a semantic concept (having to do with the different models of a theory), while unprovability is a syntactic one (having only to do with what you can prove using the rules of syntax in T). Those two are equivalent in first order logic, but only because it’s a complete logic: Anything that’s true in all models of a first-order theory is provable from its axioms, so if you can’t prove P from T’s axioms, then P cannot be true in all models; i.e. P is independent. Said another way, first-order theories’ semantic consequences are all also syntactic consequences.

But this is not so in second-order logic! In a second-order theory T, X can be unprovable from T but still true in all models of T. There is a gap between the semantic and the syntactic, and therefore there is a corresponding gap between independence and unprovability.

So while it’s true that the Busy Beaver numbers are independent of any first-order theory you choose, it’s not true that the Busy Beaver numbers are independent of any second-order theory that you choose. We can perfectly well believe that all the Busy Beaver numbers have unique values, which are fixed by some set of second-order axioms, and we just cannot derive the values from these axioms.

And on (2): Even the independence of the Busy Beaver numbers from any first order theory is not necessarily so troubling. We can just say that the Busy Beaver numbers do have unambiguous values, it’s just that due to first-order logic’s expressive limitations, we cannot pin down exactly the model that we want.

In other words, if BB(7918) is 𝑥 in one model and 𝑥+1 in another, this does not have to mean that there is some deep ambiguity in the value of BB(7918). It’s just that only one of the models of your theory is the intended model, the one that’s actually talking about busy beaver numbers and Turing machines, and the other models are talking about some warped version of these concepts.

Maybe this sounds a little fishy to you. How do we know which model is the “correct” one if we can’t ever rule out its competitors?

Well, the inability of first order logic to get rid of these nonstandard models is actually  basic feature of pretty much any mathematical theory. In first-order Peano arithmetic, for instance, we find that we cannot rule out models that contain an uncountable number of “natural numbers”. But we don’t then say that we do not know for sure whether or not there are an uncountable infinity of natural numbers. And we certainly don’t say that the cardinality of the set of natural numbers is ambiguous! We just say that unfortunately, first order logic is unable to rule out those uncountable models that don’t actually represent natural numbers.

If this is an acceptable response here, if you find it tempting to say that the inability of first order theories of arithmetic to pin down the cardinality of the naturals tells us nothing whatsoever about the natural numbers’ actual cardinality, then it should be equally acceptable to say of the Busy Beaver numbers that the independence of their values from any given mathematical theory tells us nothing about their actual values!

# Introduction to Mathematical Logic (Part 3: Reconciling Gödel’s Completeness And Incompleteness Theorems)

Last time we saw that no first-order theory could successfully pin down the natural numbers as a unique model. We also saw that in fact no sound and complete theory with the property of finite provability could pick out ℕ; any such theory has a Compactness Theorem, from which you can prove the existence of nonstandard models of numbers that include infinities of numbers larger than any of the naturals.

Worse, we saw that due to the Löwenheim-Skolem theorem, any first-order theory of the naturals has uncountable models, as well as models of every infinite cardinality. What followed from this was that try as we might, any first-order theory will fail to prove elementary facts about the natural numbers, like that there are a countable infinity of them or that no number is larger than 0, 1, 2, 3, and all the rest of the standard naturals.

Now, with the limitations in our ability to prove all true facts about the natural numbers firmly established, I want to talk about a theorem that seems to be in contradiction with this. This is the Completeness Theorem, proven by the same Kurt Gödel who later discovered the Incompleteness Theorems.

In a few words, the Completeness Theorem says that in any first-order theory, that which is semantically entailed is also syntactically entailed. Or, said differently, the Completeness Theorem says that a first order theory can prove everything that is true in all its models.

This sounds great! It’s the dream of mathematicians to be able to construct a framework in which all truths are provable, and Gödel appears to be telling us that first order logic is just such a framework.

But hold on, how do we reconcile this with what we just said: that there are true facts about the natural numbers that can’t be proven? And how do we reconcile this with what Gödel later proved: the Incompleteness Theorems? The First Incompleteness Theorem shows that that in any axiomatic system of mathematics, there will be true statements about arithmetic that cannot be proven within the system. These seem to be in stark contradiction, right? So which is right; Completeness or Incompleteness? Can we prove everything in first-order logic or not?

It turns out that there is no contradiction here. The Completeness Theorem is true of first order logic, as is the Incompleteness Theorem. What’s required is a subtle understanding of exactly what each theorem is saying to understand why the apparent conflict is only apparent.

There are two senses of completeness. One sense of completeness refers to theories in a particular logic. In this sense, a theory is complete if it can prove the truth or falsity of every well-formed formula. Gödel’s Incompleteness Theorems are about this sense of completeness: no theory rich enough to talk about the natural numbers can prove all statements about the natural numbers.

The second sense of completeness refers to logical frameworks themselves. A logical framework is complete if for every theory in that logic, all the semantically valid statements can be proven. Gödel’s Completeness Theorem is about this sense of completeness. In first-order logic, any theory’s semantic consequences are syntactic consequences.

Let’s go a little deeper into this.

What Gödel showed in his First Incompleteness Theorem is that one can encode the statements of any logical theory (first order or second order) as natural numbers. Then, if your theory is expressive enough to talk about the natural numbers, you produce a type of circularity: Your theory makes statements about the natural numbers, and the natural numbers encode statements from the theory. Gödel exploited this circularity to produce a sentence whose interpretation is something like “This sentence has no proof in T” (where T is your chosen theory). This sentence corresponds to some complicated fact about the properties of the natural numbers, because all sentences are encoded as natural numbers. If it’s false, then that means that the sentence does have a proof, so it’s true. And if it’s true, then it has no proof. The conclusion is that for any theory T, there are true statements about the natural numbers that cannot be proven by T.

The Incompleteness Theorem plays out slightly differently in first and second order logic. In first-order logic, anything that’s true in all models is provable. It’s just that there is no theory that has the natural numbers as a unique model. The best you can do in first-order logic is develop a theory that has the natural numbers as one out of many models. But this means that there is no first-order theory whose semantic consequences are all the truths about the natural numbers! (This is a neat way to actually prove the inevitability of nonstandard models of arithmetic, as a necessary consequence of the combination of the Completeness and Incompleteness theorems.)

And as it happens, any theory that has a model of the natural numbers is also going to have models in which Gödel’s statement is actually false! Remember that Gödel’s statement, if false, says that it is provable, which is to say that some number encodes a proof of it. In nonstandard models of arithmetic, there will be nonstandard numbers that encode a “proof” of Gödel’s statement, using Gödel’s encoding. It’s just that these “proofs” are not actually going to be things that we recognize as valid (for example, nonstandard numbers can represent infinitely long proofs). They’re proofs according to Gödel’s encoding, but Gödel’s encoding only represents valid proofs insofar as we assume that only natural numbers are being used in the encoding. So since Gödel’s statement is not true in all models of any first-order theory of arithmetic, it’s not provable from the axioms of the theory. And this is totally consistent with any first-order theory being complete! All that completeness means is that anything that is true in all models of a theory is also provable!

Now, how does this play out in second-order logic? Well, in second-order logic, you can uniquely pin down the natural numbers with a second-order theory. This is where the Incompleteness Theorem swoops in and tells you that the semantic consequences are not all syntactic consequences. Since there are second-order theories whose semantic consequences are all the truths about the natural numbers, these theories must have as semantic consequences the truth of the Gödel statement, which by construction means that they cannot have the Gödel statement as a syntactic consequence.

So either way you go, first or second order, you’re kinda screwed. In first-order, you can prove all the semantic consequences of a theory. It’s just that no theory has the full set of semantic consequences that we want, because first-order logic doesn’t have the expressive power to pin down the types of mathematical structures that we are interested in. And in second-order theories, you do have the expressive power to pin down the types of mathematical structures we care about, but as a consequence lose the property that all semantic consequences can be proved.

Regardless, there are fundamental limitations in our ability to prove all the facts that we want to prove as mathematicians. Hilbert’s program has crumbled. You can choose a theory that guarantees you the ability to prove all its semantic consequences, but lacks the expressive power to uniquely pin down the natural numbers. Or you can choose a theory that has the expressive power to uniquely pin down the natural numbers, but guarantees you the inability to prove all its semantic consequences.

# Introduction to Mathematical Logic (Part 1)

Mathematical logic is the study of the type of reasoning we perform when we do mathematics, and the attempt to formulate a general language as the setting in which all mathematics is done. In essence, it is an attempt to form a branch of mathematics, of which all other branches of mathematics will emerge as special cases.

You might sort of think that when speaking at this level of abstraction, there will nothing general and interesting to say. After all, we’re trying to prove statements not within a particular domain of mathematics, but theorems that are true across a wide swath of mathematics, potentially encompassing all of it.

The surprising and amazing thing is that this is not the case. It turns out that there are VERY general and VERY surprising things you can discover by looking at the logical language of mathematics, a host of results going by names like the Completeness Theorem, the Incompleteness Theorem, the Compactness Theorem, the Löwenheim-Skolem Theorem, and so on. These results inevitably have a great deal of import to our attitudes towards the foundations of mathematics, being that they generally establish limitations or demonstrate eccentricities in the types of things that we can say in the language of mathematics.

My goal in this post is to provide a soft introduction to the art of dealing in mathematics at this level of ultimate abstraction, and then to present some of the most strange things that we know to be true. I think that this is a subject that’s sorely missing this type of soft introduction, and hope I can convey some of the subject’s awesomeness!

— — —

To start out with, why think that there is any subject matter to be explored here? Different branches of mathematics sometimes appear to be studying completely different types of structures. I remember an anecdote from an old math professor of mine, who worked within one very narrow and precisely defined area of number theory, and who told me that when she goes to talks that step even slightly outside her area of specialty, the content of the  lectures quickly incomprehensible to her. Why think that there is such a common language of mathematics, if specialists in mathematics can’t even understand each other when talking between fields?

The key thing to notice here is that although different fields of mathematics are certainly wildly different in many ways, there nevertheless remain certain fundamental features that are shared in all fields. Group theorists, geometrists, and number theorists will all accept the logical inference rule of modus ponens (if P is true and P implies Q, then Q is true), but none of them will accept its converse (if Q is true and P implies Q, then P is true). No matter what area of mathematics you study, you will accept that if P(x) is true for all x, then it is true for any particular x you choose. And so on. These similarities may seem obvious and trivial, but they HAVE to be obvious and trivial to be things that every mathematician agrees on. The goal, then, is to formalize a language that has these fundamental inference rules and concepts built in, and that has many special cases to account for the differences between domains of math, specified by some parameters that are freely chosen by any user of the system.

There are actually several distinct systems that attempt to accomplish this task. Generally speaking, there are three main branches, in order of increasing expressive power: propositional logic, first order (predicate) logic, and second order logic.

## Reasoning In Zeroth Order

Let’s start with propositional logic, sometimes called “zeroth order logic.” Propositional logic is the framework developed to deal with the validity of the following types of arguments:

Argument 1

1. 2+2=4.
2. If 2+2=4, then 1+3=4.
3. So 1+3=4.

Argument 2

1. The Riemann hypothesis is false and P = NP.
2. So P = NP.

Notice that it doesn’t matter if our premises are true or not. Logical validity doesn’t care about this, it just cares that the conclusions really do follow from the premises. This is a sign of the great generality at which we’re speaking. We’re perfectly fine with talking about a mathematical system in which the Riemann hypothesis is false, or in which 2+2 is not 4, just so long as we accept the logical implications of our assumptions.

Propositional logic can express the validity of these arguments by formalizing rules about valid uses of the concepts ‘and’, ‘if…then…’, ‘or’, and so on. It remains agnostic to the subject matter being discussed by not fully specifying the types of sentences that are allowed to be used in the language. Instead, any particular user of the language can choose their set of propositions that they want to speak about.

To flesh this out more, propositional logic fulfills the following three roles:

1. Defines an alphabet of symbols.
2. Specifies a set of rules for which strings are grammatical and which are not.
3. Details rules for how to infer new strings from existing strings.

In more detail:

1. The set of symbols in propositional logic are split into two categories: logical symbols and what I’ll call “fill-in-the-blank” symbols. The logical symbols are (, ), , , ¬, and →. The fill-in-the-blank symbols represent specific propositions, that are specified by any particular user of the logic.
2. Some strings are sensible and others not. For example, the string “P∧∧” will be considered to be nonsensical, while “PQ” will not. Some synonyms for sensible strings are well-formed formulas (WFFs), grammatical sentences, and truth-apt sentences. There is a nice way to inductively generate the set of all WFFs: Any proposition is a WFF, and for any two WFFs F and F’, the following are also WFFs: (FF’), (FF’), ¬F, (F→F’).
3. These include rules like modus ponens (from P and P→Q, derive Q), conjunction elimination (from PQ, derive P), double negation elimination (from ¬¬P, derive P), and several more. They are mechanical rules that tell you how to start with one set of strings and generate new ones in a logically valid way, such that if the starting strings are true than the derived ones must also be true. There are several different but equivalent formulations of the rules of inference in propositional logic.

A propositional language fills in the blanks in the logic. Say that I want to talk about two sentences using propositional logic: “The alarm is going off”, “A robber has broken in.” For conciseness, we’ll abbreviate these two propositions as A for alarm and R for robber. All I’ll do to specify my language is to say “I have two propositions: {A, R}”

The next step is to fill in some of the details about the relationships between the propositions in my language. This is done by supplementing the language with a set of axioms, and we call the resulting constrained structure a propositional theory. For instance, in my example above we might add the following axioms:

1. A→R
2. AR

In plain English, these axioms tell us that (1) if the alarm is going off, then a robber has broken in, and (2) an alarm is going off or a robber has broken in.

Finally, we talk about the models of our theory. Notice that up until now, we haven’t talked at all about whether any statements are true or false, just about syntactic properties like “The string P¬ is not grammatical” and about what strings follow from each other. Now we interpret our theory by seeing what possible assignments of truth values to the WFFs in our language are consistent with our axioms and logical inference rules. In our above example, there are exactly two interpretations:

Model 1: A is true, R is true
Model 2: A is false, R is true

These models can be thought of as the possible worlds that are consistent with our axioms. In one of them, the alarm has gone off and a robber has broken in, and in the other, the alarm hasn’t gone off and the robber has broken in.

Notice that R turns out true in both models. When a formula F is true in all models of a theory, we say that the theory semantically entails F, and write this as T F. When a formula can be proven from the axioms of the theory using the rules of inference given by the logic, then we say that the theory syntactically entails F, and write this as T F.

This distinction between syntax and semantics is really important, and will come back to us in later discussion of several important theorems (notably the completeness and incompleteness theorems). To give a sneak peek: above we found that R was semantically entailed by our theory. If you’re a little familiar with propositional logic, you might have also realized that R can be proven from the axioms. In general, syntactic truths will always be semantic truths (if you can prove something, then it must be true in all models, or else the models would be inconsistent. But a model is by definition a consistent assignment of truth values to all WFFs). But a natural question is: are all semantic consequences of a theory also syntactic consequences? That is, are all universal truths of the theory (things that are true in every model of the theory) provable from the theory? If the answer is yes, then we say that our logic is complete. And it turns out that the answer is yes, for propositional logic. Whether more complex logics are complete turns out to be a more interesting question. More on this later.

This four-step process I just laid out (logic to language to theory to model) is a general pattern we’ll see over and over again. In general, we have the following division of labor between the four concepts:

1. Logic: The logic tells us the symbols we may use (including some fill-in-the-blank categories of symbols), the rules of grammar, and a set of inference rules for deriving new strings from an existing set.
2. Language: The language fills in the blanks in our logic, fully specifying the set of symbols we will be using.
3. Theory: The theory adds axioms to the language.
4. Model: A model is an assignment of truth values to all WFFs in the language, consistent with the axioms and the inference rules of our logic.

It’s about time that we apply this four-step division to a more powerful logic. Propositional logic is pretty weak. Not much interesting math can be done in a purely propositional language, and it’s wildly insufficient to capture our notion of logically valid reasoning. Consider, for example, the following argument:

1. Socrates is a man.
2. All men are mortal.
3. So, Socrates is mortal.

This is definitely a valid argument. No rational agent could agree that 1 and 2 are true, and yet deny the truth of 3. But can we represent the validity of this argument in propositional logic? No!

Consider that the three sentences “Socrates is a man”, “All men are mortal”, and “Socrates is mortal” are distinct propositions, and the relationships between them are too subtle for propositional logic to capture. Propositional logic can’t see that the first proposition is asserting the membership of Socrates to a general class of things (“men”), and that the second proposition is then making a statement about a universal property of things in this class. It just sees two distinct propositions. To propositional logic, this argument just looks like

1. P
2. Q
3. Therefore, R

But this is not logically valid! We could make it valid by adding as a premise the sentence (PQ)→R, which corresponds to the English sentence “If Socrates is a man and all men are mortal, then Socrates is mortal.” But this should be seen as a tautology, something that is provable in any first order theory that contains the propositions P Q and R, not a required additional assumption. Worse, if somebody came along and stated the proposition A = “Aristotle is a man”, then we would need a whole ‘nother assumption to assert that Aristotle is also mortal! And in general, for any individual instance of this argument, we’d need an independent explanation for its validity. This is not parsimonious, and indicative that propositional logic is missing something big.

Missing what? To understand why this argument is valid, you must be able to reason about objects, properties, and quantification. This is why we must move on to an enormously more powerful and interesting logic: first order logic.

## Reasoning In First Order

First order logic is a logic, so it must fill the same three roles as we saw propositional logic did above. Namely, it must define the alphabet, the grammar, and the inference rules.

Symbols
Logical: ¬ → ( ) =
Variables: x y z w …
Constants: ______
Predicates: ______
Functions: ______

That’s right, in first order we have three distinct categories of fill-in-the-blank symbols. Intuitively, constants will be names that refer to objects, predicates will be functions from objects to truth values, and functions will take objects to objects. To take an everyday example, if the objects in consideration are people, then we might take ‘a’ to be a constant referring to a person named Alex, ’T’ to be a predicate representing ‘is tall’, and ‘f’ to be a function representing “the father of”. So T(a) is either true or false, while f(a) doesn’t have a truth value (it just refers to another object). But T(f(a)) does have a truth value, because it represents the sentence “Alex’s father is tall.”

Next, we define well-formed formulas. This process is more complicated than it was for propositional logic, because we have more types of things than we did before, but it’s not too bad. We start by defining a “term”. The set of terms is inductively generated by the following scheme: All constants and variables are terms, and any function of terms is itself a term. Intuitively, the set of terms is the set of objects that our language is able to “point at”.

With the concept of terms in hand, we can define WFFs through a similar inductive scheme: Any predicate of terms is a WFF. And for any WFFs F and F’, (FF’), (FF’), (¬F), (F→F’), x F, x F are all WFFs. The details of this construction are not actually that important, I just think it’s nice how you can generate all valid first order formulas from fairly simple rules.

Good! We have an alphabet, a grammar, and now all we need from our logic is a set of inference rules. It turns out that this set is just going to be the inference rules from propositional logic, plus some new ones:

Quantifier elimination: From x P(x) derive P(t) (for any term t and predicate P)
Quantifier introduction: From P(t) derive x P(x) (for any term t and predicate P)

That’s it, we’ve defined first order logic! Now let’s talk about a first order language. Just like before, the language is just obtained by filling in the blanks left open by our logic. So for instance, we might choose the following language:

Constants: a
Functions: f
Predicates: none

In specifying our function, we have to say exactly what type of function it is. Functions can take as inputs a single object (“the father of”) or multiple objects (“the nearest common ancestor of”), and this will make a difference to how they are treated. So for simplicity, let’s say that our function f just takes in a single object.

A first order theory will simply be a language equipped with some set of axioms. Using our language above, we might have as axioms:

1. x (f(x) ≠ x)
2. x (f(x) ≠ a)

In plain English, we’re saying that f never takes any object to itself or to a.

And lastly, we get to the models of a first order theory. There’s an interesting difference between models here and models in propositional logic, which is that to specify a first order model, you need to first decide on the size of your set of objects (the size of the ‘universe’, as it’s usually called), and then find a consistent assignment of truth values to all propositions about objects in this universe.

So, for instance, we can start searching for models of our theory above by starting with models with one object, then two objects, then three, and so on. We’ll draw little diagrams below in which points represent objects, and the arrow represents the action of the function f on an object.

1. No 1-element universe. 2. No 2-element universe. 3. 3-element universe 4. 4 element universes    5. And so on…

It’s a fun little exercise to go through these cases and see if you can figure out why there are no models of size 1 or 2, or why the one above is the only model of size 3. Notice, by the way, that in the last two images, we have an object for which there is no explicit term! We can’t get there just using our constant and our functions. Of course, in this case we can still be clever with our quantifiers to talk about the Nameless One indirectly (for instance, in both of these models we can refer to the object that is not equal to a, f(a), or f(f(a))) But in general, the set of all things in the universe is not going to be the same as the set of things in the universe that we can name.

Here’s a puzzle for you: Can you make a first order sentence that says “I have exactly four objects”? That is, can you craft a sentence that, if added as an axiom to our theory, will rule out all models besides the ones that have exactly four elements?

(…)

(Think about it for a moment before moving on…)

(…)

Here’s how to do it for a universe with two objects (as well as a few other statements of interest). The case of four objects follows pretty quickly from this.

• “I have at most two objects” = xyz (z=x z=y)
• “I have exactly two objects” = xy (x≠y z(z=x z=y))
• “I have at least two objects” = xy (x≠y)

Another puzzle: How would you say “I have an infinity of objects”, ruling out all finite models?

(…)

(Think about it for a moment before moving on…)

(…)

This one is trickier. One way to do it is to introduce an infinite axiom schema: “I have at least n objects” for each n.

This brings up an interesting point: theories with infinite axioms are perfectly permissible for us. This is a choice that we make that we could perfectly well deny, and end up with a different and weaker system of logic. How about infinitely long sentences? Are those allowed? No, not in any of the logics we’re talking about here. A logic in which infinite sentences are allowed is called an infinitary logic, and I don’t know too much about such systems (besides that propositional, first, and second order logics are not infinitary).

Okay… so how about infinitely long derivations? Are those allowed? No, we won’t allow those either. This one is more easy to justify, because if infinite derivations were allowed, then we could prove any statement P simply by the following argument “P, because P, because P, because P, because …”. Each step logically follows from the previous, and in any finite system we’d eventually bottom out and realize that the proof has no basis, but in a naive infinite-proof system, we couldn’t see this.

Alright, one last puzzle. How would you form the sentence “I have a finite number of objects”? I.e., suppose you want to rule out all infinite models but keep the finite ones. How can you do it?

(…)

(Think about it for a moment before moving on…)

(…)

This one is especially tricky, because it turns out to be impossible! (Sorry about that.) You can prove, and we will prove in a little bit, that no first order axiom (or even infinite axiom schema) is in general capable of restricting us to finite models. We have run up against our first interesting expressive limitation of first-order logic!

Okay, let’s now revisit the question of completeness that I brought up earlier. Remember, a logic is complete if for any theory in that logic, the theory’s semantic implications are the same as its syntactic implications (all necessary truths are provable). Do you think that first order logic is complete?

The answer: Yes! Kurt Gödel proved that it is in his 1929 doctoral dissertation. Anything that is true in all models of a first order theory, can be proven from the axioms of that theory (and vice versa). This is a really nice feature to have in a logic. It’s exactly the type of thing that David Hilbert was hoping would be true of mathematics in general. (“We must know. We will know!”) But this hope was dashed by the same Kurt Gödel as above, in his infamous incompleteness theorems.

There will be a lot more to say about that in the future, but I’ll stop this post for now. Next time, we’ll harness the power of first order logic to create a first order model of number theory! This will give us a chance to apply some powerful results in mathematical logic, and to discover surprising truths about the logic’s limitations.