A Gödelian Logic Puzzle

There’s an island on which there live exactly two types of people: truthers and liars. Truthers always say true statements, and liars always say false statements. One day a brilliant logician comes to visit the island. The logician knows all of the above-stated facts about the island. It also happens that the logician is a perfectly sound reasoner – he never proves anything that is false.

The logician encounters an individual named ‘Jal’ who lives on the island. The logician knows that Jal lives on the island, and so is either a truther or a liar. Now, Jal makes a statement from which it logically follows that Jal is a truther. But the logician could never possibly prove that Jal is a truther! (Remember, we never asserted that the logician proves all true things, just that the logician proves only true things.) What type of statement could accomplish this?

This puzzle is from a paper by Raymond Smullyan on mathematical logic. Try to answer it for yourself before reading on!

(…)

Alright, so here’s one possible answer. Jal could say to the logician: “You will never prove that I am a truther.” I claim that this sentence logically entails that Jal is a truther, and yet the logician cannot possibly prove it.

First of all, why does it entail that Jal is a truther? Let’s prove it by contradiction. Suppose that Jal is not a truther. Then, since Jal is either a truther or a liar, Jal must be a liar. That means that every statement Jal makes must be false. So in particular, Jal’s statement that “you will never prove that I am a truther” must be false. This entails that the logician must eventually prove that Jal is a truther. But we assumed that Jal isn’t a truther! So the logician must eventually prove a falsehood. But remember, we assumed that our logician’s proofs were always sound, so that he will never prove a falsehood. So we have a contradiction.

Therefore, Jal is a truther.

Now, why can the logician not prove that Jal is a truther? This can be seen very straightforwardly: we just proved that Jal is a truther, which means that all of Jal’s statements must be true. So in particular, Jal’s statement that “you will never prove that I am a truther” must be true. So in other words, it’s true that the logician will never prove that Jal is a truther!

So there you have it, a statement that appears to satisfy both of the criteria!

But now the next question I have for you is a bit trickier. It appears from the line of reasoning above that we have just proven that Jal is a truther. So why couldn’t the logician just run through that exact same line of reasoning? It appears to be perfectly valid, and to use nothing more advanced than basic predicate logic.

But if the logician does go through that line of reasoning, then he will conclude that Jal is a truther, which will make Jal’s statement false, which is a contradiction! So we’ve gone from something which was maybe just unintuitive to an actual paradox. Can you see how to resolve this paradox? (Again, see if you can figure it out yourself before reading on!)

(…)

Okay, so here’s the resolution. If we say that the logician can go through the same line of reasoning as us, then we reach a contradiction (that a truther tells a false statement). So we must deny that the logician can go through the same line of reasoning as us. But why not? As I said above, the reasoning is nothing more complicated than basic predicate logic. So it’s not that we’re using some magical supernatural rules of inference that no mortal logician could get his hands on. It must be that one of the assumptions we used in the argument is an assumption that the logician cannot use.

So look back through the argument, and carefully consider each of the assumptions we used:

First of all, why does it entail that Jal is a truther? Let’s prove it by contradiction. Suppose that Jal is not a truther. Then, since Jal is either a truther or a liar, Jal must be a liar. That means that every statement Jal makes must be false. So in particular, Jal’s statement that “you will never prove that I am a truther” must be false. This entails that the logician must eventually prove that Jal is a truther. But we assumed that Jal isn’t a truther! So the logician must eventually prove a falsehood. But remember, we assumed that our logician’s proofs were always sound, so that he will never prove a falsehood. So we have a contradiction.

In order, we made use of the assumptions that (1) Jal is either a truther or a liar, (2) every statement made by a liar is false, and (3) the logician is a sound reasoner.

I told you at the beginning that facts (1) and (2) are both known to the logician, but I did not say the same of (3)! The logician can only run through this argument if he knows that he is a sound reasoner (that he only proves true things). And this is the problem assumption, which must be rejected.

It’s not that no logician can actually ever be sound (a logician who only ever reasons in first order logic and nothing more fancy would be sound). It’s that the logician, though he really is sound, cannot know himself to be sound. In other words, no sound system can prove its own soundness!

This is very similar to Gödel’s second incompleteness theorem. The only proof system which can assert its own consistency is an inconsistent proof system, and the only type of logician that can prove his own soundness will end up being unsound. Here’s the argument that the logician might make if they believe in their own soundness:

Supposing Jal is a liar, then his statement is false, so I could eventually prove that he is a truther. But then I’d have proven something false, which I know I can never do, so Jal must not be a liar. So he must be a truther. 

Since the logician has now produced a proof that Jal is a truther, Jal’s statement is false. This means that Jal cannot be a truther, so the logician has proven a false statement!

Crazy conditionals

It’s well known that the material implication → of propositional logic does not do a perfect job of capturing what we mean when we make “if… then…” statements in English. The usual examples of failure rest on the fact that any material conditional with a false antecedent is vacuously true (so “if 2 is odd then 2 is even” turns out to be true). But over time, philosophers have come up with a whole lot of different ways in which → can catch us by surprise.

Here’s a list of some such cases. In each case, I will present an argument using if…then… statements that is clearly invalid, but which is actually valid in propositional logic if the if…then… statements are translated as the material conditional!

1. Harper

If I put sugar into my coffee, it will taste fine.
Therefore, if I put sugar and motor oil into my coffee, it will taste fine.

S → F
(S ∧ M) → F

2. Distributivity

If I pull both switch A and switch B, the engine will start.
Therefore, either the engine will start if I pull switch A or the engine will start if I pull switch B.

(A ∧ B) → S
(A → S) ∨ (B → S)

3. Transitivity

If Biden dies before the election, Trump will win.
If Trump wins the election, Biden will retire to his home.
Therefore, if Biden dies before the election, Biden will retire to his home.

B → T
T → R
B → R

4. Principle of Explosion

Either zombies will rise from the ground if I bury a chicken head in my backyard, or zombies will rise from the ground if I don’t bury a chicken head in my backyard.

(B → D) ∨ (¬B → D) is a tautology

5. Contraposition

If I buy a car, I won’t buy a Pontiac.
Therefore, if I buy a Pontiac, I won’t buy a car.

C → ¬P
P → ¬C

6. Simplification

If John is in London then he’s in England, and if he’s in Paris then he’s in France.
Therefore, either (1) if John’s in London he’s in France or (2) if John’s in Paris then he’s in England.

(L → E) ∧ (P → F)
(L → F) ∨ (P → E)

7. False Antecedent

It’s not the case that if God exists then human life is a product of random chance.
Therefore, God exists.

¬(G → C)
G

8. True Consequent

If I will have eternal life if I believe in God, then God must exist.
I do not believe in God.
Therefore, God exists.

(B → E) → G
¬B
G

You can check for yourself that each of these is logically valid! Can you figure out what’s going wrong in each case?
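If you’d rather not grind through eight truth tables by hand, here’s a minimal sketch of a brute-force validity checker. The helper names (implies, is_valid, ARGUMENTS) are my own choices for illustration; the only thing taken from the list above is the formulas themselves. An argument counts as valid if no assignment of truth values makes all the premises true and the conclusion false.

```python
from itertools import product

def implies(a, b):
    """Material conditional: false only when a is true and b is false."""
    return (not a) or b

def is_valid(premises, conclusion, num_vars):
    """Return True if no truth assignment makes every premise true and the conclusion false."""
    for values in product([False, True], repeat=num_vars):
        if all(p(*values) for p in premises) and not conclusion(*values):
            return False
    return True

# Each entry: (list of premises, conclusion, number of variables).
ARGUMENTS = {
    "1. Harper": ([lambda s, m, f: implies(s, f)],
                  lambda s, m, f: implies(s and m, f), 3),
    "2. Distributivity": ([lambda a, b, s: implies(a and b, s)],
                          lambda a, b, s: implies(a, s) or implies(b, s), 3),
    "3. Transitivity": ([lambda b, t, r: implies(b, t), lambda b, t, r: implies(t, r)],
                        lambda b, t, r: implies(b, r), 3),
    "4. Principle of Explosion": ([],
                                  lambda b, d: implies(b, d) or implies(not b, d), 2),
    "5. Contraposition": ([lambda c, p: implies(c, not p)],
                          lambda c, p: implies(p, not c), 2),
    "6. Simplification": ([lambda l, e, p, f: implies(l, e) and implies(p, f)],
                          lambda l, e, p, f: implies(l, f) or implies(p, e), 4),
    "7. False Antecedent": ([lambda g, c: not implies(g, c)],
                            lambda g, c: g, 2),
    "8. True Consequent": ([lambda b, e, g: implies(implies(b, e), g), lambda b, e, g: not b],
                           lambda b, e, g: g, 3),
}

for name, (premises, conclusion, num_vars) in ARGUMENTS.items():
    print(name, "valid" if is_valid(premises, conclusion, num_vars) else "invalid")
```

The checker only confirms that the material-conditional translations are valid. It says nothing about why the English arguments sound so wrong, which is the actual puzzle.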

A Dice Puzzle

Today I have a wonderfully counterintuitive puzzle to share!

You and a friend each throw a die. Each of you can see how your own die landed, but not how your friend’s die landed. Each of you is to guess how the other’s die landed. If you both guess correctly, then you each get a reward. But if only one of you guesses correctly, neither of you gets anything.

The two die rolls are independent and you are not allowed to communicate with your friend after the dice have been thrown, though you can coordinate beforehand. Given this, you would expect that you each have a 1 in 6 chance of guessing the other’s roll correctly, coming out to a total chance of 1 in 36 of getting the reward.

The question is: Is it possible to do any better?

Answer below, but only read on after thinking about it for yourself!

 

(…)

 

(…)

 

(Spoiler space)

 

(…)

 

(…)

 

The answer is that remarkably, yes, you can do better! In fact, you can get your chance of getting the reward as high as 1 in 6. This should seem totally crazy. You and your friend each have zero information about how the other die roll turned out. So certainly each of you has a 1 in 6 chance of guessing correctly. The only way for the chance of both guessing correctly to be better than 1 in 36 would be if the two guesses being correct were somehow dependent on each other. But the two die rolls are independent of one another, and no communication of any kind is allowed once the dice have been rolled! So from where does the dependence come? Sure you can coordinate beforehand, but it’s hard to imagine how this could help out.

It turns out that the coordination beforehand does in fact make a huge difference. Here’s the strategy that both can adopt in order to get a 1 in 6 chance of getting the reward: Each guesses that the other’s die landed the same way that their own die landed. So if my die lands 3, I guess that my friend’s die landed 3 as well. This strategy succeeds whenever the dice actually do land the same way. And what’s the chance of this? 6 out of 36, or 1 out of 6!

1 1       2 1       3 1       4 1       5 1       6 1
1 2       2 2       3 2       4 2       5 2       6 2
1 3       2 3       3 3       4 3       5 3       6 3
1 4       2 4       3 4       4 4       5 4       6 4
1 5       2 5       3 5       4 5       5 5       6 5
1 6       2 6       3 6       4 6       5 6       6 6
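If the table doesn’t convince you, here’s a small sketch that counts the winning outcomes exactly by running through all 36 rolls. The function and strategy names (win_probability, mirror, always_three) are made up for illustration; a strategy is just a function from your own roll to your guess about the other roll.

```python
from fractions import Fraction
from itertools import product

def win_probability(my_strategy, friend_strategy):
    """Exact chance that both guesses are right, over all 36 equally likely rolls."""
    outcomes = list(product(range(1, 7), repeat=2))
    wins = sum(
        1
        for mine, theirs in outcomes
        if my_strategy(mine) == theirs and friend_strategy(theirs) == mine
    )
    return Fraction(wins, len(outcomes))

def mirror(own_roll):
    """Guess that the other die shows whatever my own die shows."""
    return own_roll

def always_three(own_roll):
    """Ignore my own die and always guess 3."""
    return 3

print(win_probability(mirror, mirror))              # the coordinated strategy
print(win_probability(always_three, always_three))  # two uncoordinated fixed guesses
```

The point of the comparison is that each individual guess is still right only 1 time in 6 under either strategy. What changes is that under the mirror strategy the two guesses are right on exactly the same rolls.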

Proving the Completeness of Propositional Logic

The completeness of a logic is a really nice property to establish. For a logic to be complete, it must be that every semantic entailment is also syntactically entailed. Said more simply, it must be that every truth in the language is provable. Gödel’s incompleteness theorems showed us that we cannot have such high hopes for mathematics in general, but we can still establish completeness for some simple logics, such as propositional and first order logic.

I want to post a proof of the completeness of propositional logic here in full for future reference. Roughly the first half of what’s below is just establishing some necessary background, so that this post is fairly self-contained and doesn’t reference lemmas that are proved elsewhere.

The only note I’ll make before diving in is that the notation I(A,P) is a way to denote the smallest set that contains A and is closed under the operations in P. It’s a handy way to inductively define sets that would be enormously complicated to define otherwise. With that out of the way, here we go!

First we define the proof system for propositional logic.

Axiom 1: α→(β→α)
Axiom 2: (α→(β→γ)) → ((α→β)→(α→γ))
Axiom 3: ((¬α)→(¬β)) → (β→α)

The α, β, and γ symbols in these axioms are meant to stand for any well-formed formula. What this means is that we actually have a countable infinity of axioms that fall into the three categories above. For simplicity, I’ll keep calling them “Axioms 1, 2, and 3”, and assume you don’t find it too confusing.

You might also notice that the axioms only involve the symbols → and ¬, but neglect ∧ and ∨. This is okay because → and ¬ are adequate connectives for the semantics of propositional logic (which is to say that any truth function can be expressed in terms of them).

Axioms = {Axiom 1, Axiom 2, Axiom 3}
Deduction rule = Modus Ponens (MP)
….. MP(α, α→β) = β

The set of all provable sentences is just the smallest set of sentences that includes the axioms and is closed under modus ponens.

Theorems = I(Axioms, MP)

We can also easily talk about the set of sentences that can be proven from assumptions in a set Σ:

Th(Σ) = I(Axioms ∪ Σ, MP)
Notation: Σ ⊢ α iff α ∈ Th(Σ)

With that out of the way, let’s establish some basic but important results about the propositional proof system.

Monotonicity: If Σ ⊆ Σ’, then Th(Σ) ⊆ Th(Σ’).

Strong monotonicity: If Σ ⊢ Σ’, then Th(Σ’) ⊆ Th(Σ).

Intuitively, monotonicity says that if you expand the set of assumptions, you never shrink the set of theorems. Strong monotonicity says that if Σ can prove everything in Σ’, then Σ’ cannot be stronger than Σ. Both of these follow pretty directly from the definition of Th(Σ).

Soundness: If ⊢ α, then ⊨ α.
Proof by structural induction
….. Each axiom is a tautology.
….. Tautology is closed under MP (if ⊨ α and ⊨ (α→β), then ⊨ β).

Extended Soundness: If Σ ⊢ α, then Σ ⊨ α.
Proof by structural induction
….. If α ∈ Axioms or α ∈ Σ, then Σ ⊨ α.
….. If Σ ⊨ α and Σ ⊨ (α→β), then Σ ⊨ β.

Law of identity: ⊢ (α→α)
….. α→((α→α)→α), Axiom 1
….. (α→((α→α)→α))→((α→(α→α))→(α→α)), Axiom 2
….. (α→(α→α))→(α→α), MP
….. α→(α→α), Axiom 1
….. α→α, MP
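Proofs in this system are mechanical enough that a computer can check them. Here’s a rough sketch of a checker, under some assumptions of my own: formulas are nested tuples like ("->", a, b) and ("not", a), propositional variables are plain strings, and the Greek letters in AXIOMS act as placeholders that can match any formula. None of these names come from a real library.

```python
def imp(a, b):
    return ("->", a, b)

def neg(a):
    return ("not", a)

# Axiom schemas; the strings "α", "β", "γ" are placeholders for arbitrary formulas.
AXIOMS = [
    imp("α", imp("β", "α")),
    imp(imp("α", imp("β", "γ")), imp(imp("α", "β"), imp("α", "γ"))),
    imp(imp(neg("α"), neg("β")), imp("β", "α")),
]

def match(pattern, formula, binding):
    """Check whether formula is an instance of pattern, recording placeholder bindings."""
    if isinstance(pattern, str):
        if pattern in binding:
            return binding[pattern] == formula
        binding[pattern] = formula
        return True
    return (
        isinstance(formula, tuple)
        and len(formula) == len(pattern)
        and formula[0] == pattern[0]
        and all(match(p, f, binding) for p, f in zip(pattern[1:], formula[1:]))
    )

def is_axiom(formula):
    return any(match(schema, formula, {}) for schema in AXIOMS)

def check_proof(lines, assumptions=()):
    """Every line must be an axiom instance, an assumption, or follow by MP from earlier lines."""
    proved = []
    for formula in lines:
        by_mp = any(imp(earlier, formula) in proved for earlier in proved)
        if not (is_axiom(formula) or formula in assumptions or by_mp):
            return False
        proved.append(formula)
    return True

# The five-line proof of the law of identity, with α instantiated as the variable "p".
p = "p"
identity_proof = [
    imp(p, imp(imp(p, p), p)),
    imp(imp(p, imp(imp(p, p), p)), imp(imp(p, imp(p, p)), imp(p, p))),
    imp(imp(p, imp(p, p)), imp(p, p)),
    imp(p, imp(p, p)),
    imp(p, p),
]
print(check_proof(identity_proof))
```

Passing a set of assumptions as the second argument checks a proof from Σ in the same way, so the explosion proof below can be checked with assumptions {α, ¬α}.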

Principle of Explosion: If Σ ⊢ α and Σ ⊢ (¬α), then Σ ⊢ β.
….. By strong monotonicity, it suffices to show that Σ ∪ {α} ∪ {¬α} ⊢ β.
……….. (¬α)→((¬β)→(¬α)), Axiom 1
……….. ¬α, Assumption
……….. (¬β)→(¬α), MP
……….. ((¬β)→(¬α))→(α→β), Axiom 3
.………. α→β, MP
……….. α, Assumption
……….. β, MP

And finally, our most important background theorem:

Deduction Theorem: Σ ⊢ (α→β) iff Σ ∪ {α} ⊢ β.
Proof =>
….. Suppose Σ ⊢ (α→β).
….. By monotonicity, Σ ∪ {α} ⊢ (α→β).
….. Also, clearly Σ ∪ {α} ⊢ α.
….. So Σ ∪ {α} ⊢ β.
Proof <=
….. Suppose Σ ∪ {α} ⊢ β.
….. Base cases
………. β ∈ Axioms. (β, β→(α→β), α→β).
………. β ∈ Σ. (β, β→(α→β), α→β).
………. β = α. (⊢ (α→α), so by monotonicity Σ ⊢ (α→α)).
….. Inductive step
………. Suppose Σ ⊢ (α→γ) and Σ ⊢ (α→(γ→𝛿)).
………. By strong monotonicity, suffices to show Σ ∪ {α→γ, α→(γ→𝛿)} ⊢ (α→𝛿)
……………. (α→(γ→𝛿)) → ((α→γ)→(α→𝛿)), Axiom 2
……………. α→(γ→𝛿), Assumption
……………. (α→γ)→(α→𝛿), MP
……………. (α→γ), Assumption
……………. (α→𝛿), MP

Now, let’s go into the main body of the proof. The structure of the proof is actually quite similar to the proof of the compactness theorem I gave previously. First we show that every consistent set of sentences Σ has a maximally consistent extension Σ’. Then we show that Σ’ is satisfiable. Now since Σ’ is satisfiable and it’s an extension of Σ, Σ must also be satisfiable. From there it’s a simple matter to show that the logic is complete.

So, let’s define some of the terms I just used.

Σ is consistent iff for no α does Σ ⊢ α and Σ ⊢ (¬α)
….. Equivalently: iff for some α, Σ ⊬ α

Σ is maximally consistent iff Σ is consistent and for every α, either Σ ∪ {α} is inconsistent or Σ ⊢ α.

One final preliminary result regarding consistency before diving into the main section of the proof:

If Σ is satisfiable, then Σ is consistent.
Proof
….. Suppose Σ is inconsistent.
….. Then there’s an α such that Σ ⊢ α and Σ ⊢ (¬α).
….. By extended soundness, Σ ⊨ α and Σ ⊨ (¬α).
….. So Σ is not satisfiable.

This is the converse of the result we actually want, but it’ll come in handy. Now, let’s begin to construct our extension!

Any consistent Σ can be extended to a maximally consistent Σ’
….. Choose any ordering {αₙ} of well-formed-formulas.
….. Define Σ₀ = Σ.
….. Σₙ₊₁ = Σₙ if Σₙ ⊢ (¬αₙ₊₁), and Σₙ ∪ {αₙ₊₁} otherwise.
….. For each n, (i) Σₙ is consistent and (ii) either Σₙ ⊢ αₙ or Σₙ ⊢ (¬αₙ)
……….. Base case: Σ₀ is consistent by assumption, and (ii) doesn’t apply.
……….. Inductive step: Suppose Σₙ satisfies (i) and (ii). Two cases:
…………….. If Σₙ ⊢ (¬αₙ₊₁), then Σₙ₊₁ = Σₙ. Clearly consistent and satisfies (ii).
…………….. If Σₙ ⊬ (¬αₙ₊₁), then Σₙ₊₁ = Σₙ ∪ {αₙ₊₁}. Clearly satisfies (ii), but is it consistent?
………………….. Suppose not. Then Σₙ₊₁ ⊢ (¬αₙ₊₁), by explosion.
………………….. So Σₙ ∪ {αₙ₊₁} ⊢ (¬αₙ₊₁).
………………….. So Σₙ ⊢ (αₙ₊₁ → (¬αₙ₊₁)), by the Deduction Theorem.
………………….. ⊢ ((α→¬α)→¬α), so Σₙ ⊢ (¬αₙ₊₁). Contradiction!

….. Define Σ’ = ∪ Σₙ. Σ’ is maximally consistent.
……….. Maximality
…………….. Suppose not. Then for some αₙ, Σ’ ⊬ αₙ and Σ’ ∪ {αₙ} is consistent.
…………….. But Σₙ ⊆ Σ’, and either Σₙ ⊢ αₙ or Σₙ ⊢ (¬αₙ).
…………….. If Σₙ ⊢ αₙ, by monotonicity Σ’ ⊢ αₙ. Contradiction. So Σₙ ⊢ (¬αₙ).
…………….. By monotonicity, Σ’ ⊢ (¬αₙ), so Σ’ ∪ {αₙ} ⊢ (¬αₙ).
…………….. But Σ’ ∪ {αₙ} ⊢ αₙ. So Σ’ ∪ {αₙ} is inconsistent. Contradiction!
……….. Consistency
…………….. Suppose Σ’ is inconsistent. Then for some α, Σ’ ⊢ α and Σ’ ⊢ (¬α).
…………….. So there are proofs of α and (¬α) from Σ’.
…………….. Proofs are finite, so each proof uses only a finite number of assumptions from Σ’.
…………….. So we can choose an n such that Σₙ contains all the needed assumptions.
…………….. Now both proofs from Σ’ are also proofs from Σₙ.
…………….. So Σₙ ⊢ α and Σₙ ⊢ (¬α).
…………….. So Σₙ is inconsistent. Contradiction!

Alright, we’re almost there! So now we have that for any consistent Σ, there’s an extension Σ’ that is maximally consistent. We’ll take it a little further and prove that not only is Σ’ maximally consistent, it’s also complete! (This is the purely syntactic sense of completeness, which is that for every sentence α, either Σ’ proves α or refutes α. This is different from the sense of logical completeness that we’re establishing with the proof.)

Σ’ is complete.
….. For every n, Σₙ ⊆ Σ’, and Σₙ ⊢ αₙ or Σₙ ⊢ (¬αₙ).
….. So by monotonicity Σ’ ⊢ αₙ or Σ’ ⊢ (¬αₙ).

Now we have everything we need to show that Σ’, and thus Σ, is satisfiable.

If Σ is consistent, then Σ is satisfiable.
Proof
….. Let Σ’ be a maximally consistent extension of Σ.
….. Define vΣ’(p) over propositional variables p:
….. vΣ’(p) = T if Σ’ ⊢ p and F if Σ’ ⊬ p
….. Claim: the extension ṼΣ’ of vΣ’ to all formulas satisfies ṼΣ’(α) = T iff Σ’ ⊢ α. Proof by structural induction:
……….. Base case: Let α be a propositional variable. Then ṼΣ’(α) = T iff Σ’ ⊢ α by definition of vΣ’.
……….. Inductive steps:
……….. (¬α)
…………….. If Σ’ ⊢ (¬α), then by consistency Σ’ ⊬ α, so ṼΣ’(α) = F, so ṼΣ’(¬α) = T.
…………….. If Σ’ ⊬ (¬α), then by completeness Σ’ ⊢ α. So ṼΣ’(α) = T, so ṼΣ’(¬α) = F.
……….. (α→β)
…………….. Suppose Σ’ ⊢ (α→β). By completeness Σ’ ⊢ α or Σ’ ⊢ (¬α).
………………….. If Σ’ ⊢ α, then Σ’ ⊢ β, so ṼΣ’(β) = T, so ṼΣ’(α→β) = T.
………………….. If Σ’ ⊢ (¬α), then ṼΣ’(α) = F, so ṼΣ’(α→β) = T.
……………. Suppose Σ’ ⊬ (α→β).
………………….. By completeness Σ’ ⊢ ¬(α→β).
………………….. ⊢ (β→(α→β)), so Σ’ ⊬ β on pain of contradiction. So ṼΣ'(β) = F.
………………….. Suppose ṼΣ’(α→β) = T. Then ṼΣ’(α) = F, so Σ’ ⊢ (¬α).
………………….. ⊢ (¬α→(α→β)). So Σ’ ⊢ (α→β). Contradiction.
………………….. So ṼΣ’(α→β) = F.
….. So vΣ’ satisfies Σ’.
….. Σ ⊆ Σ’, so vΣ’ satisfies Σ.
….. So Σ is satisfiable!

Now our final result becomes a four-line proof.

If Σ ⊨ α, then Σ ⊢ α.
Proof
….. Suppose Σ ⊬ α.
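….. Then Σ ∪ {¬α} is consistent. (Otherwise Σ ∪ {¬α} ⊢ α by explosion, so Σ ⊢ (¬α→α) by the Deduction Theorem, and since ⊢ ((¬α→α)→α) we’d have Σ ⊢ α.)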
….. So Σ ∪ {¬α} is satisfiable.
….. So Σ ⊭ α.

And we’re done! We’ve shown that if any sentence α is semantically entailed by a set of sentences Σ, then it must also be provable from Σ! If you’ve followed this proof all the way, pat yourself on the back.

With the Completeness Theorem in hand, the proof of the Compactness Theorem goes from several pages to a few lines. It’s so nice and simple that I just have to include it here.

If Σ is finitely satisfiable, then Σ is satisfiable.

….. Suppose Σ is not satisfiable.
….. Then Σ is not consistent.
….. So there is some α for which Σ ⊢ α and Σ ⊢ (¬α).
….. Since proofs are finite, there must be some finite subset Σ* of Σ such that Σ* ⊢ α and Σ* ⊢ (¬α).
….. By soundness, Σ* ⊨ α and Σ* ⊨ (¬α).
….. So Σ* is not satisfiable!

In other words, if Σ is not satisfiable, then there’s some finite subset of Σ that’s also not satisfiable. This is the Compactness Theorem! Previously we proved it entirely based off of the semantics of propositional logic, but now we can see that it is also provable as a consequence of the finite nature of our proof system!

Sum and Product Puzzle

X and Y are two integers.

X < Y
X > 1
Y > 1
X + Y < 100

S and P are two perfect logicians. S knows X + Y, and P knows X × Y.

Everything I’ve just said is common knowledge. S and P have the following conversation:

S: “P, you don’t know X and Y”
P: “Now I do know X and Y!”
S: “And now so do I!”

What are X and Y?

Once you figure out that, here’s a question: If instead of saying that X + Y < 100, we say X + Y < N, then what’s the range of values of N for which this puzzle has a unique solution?
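If you want to check your answer, or poke at the follow-up question about N, here’s a sketch of a brute-force solver. It rests on my reading of the conversation: S’s first line is taken to mean that S can be certain P doesn’t know, P’s reply means exactly one candidate pair with P’s product survives that announcement, and S’s reply means exactly one candidate pair with S’s sum survives P’s announcement. The name solve and its limit parameter are made up for this example.

```python
from collections import Counter, defaultdict

def solve(limit=100):
    """Return all (X, Y) with 1 < X < Y and X + Y < limit consistent with the conversation."""
    pairs = [(x, y) for x in range(2, limit) for y in range(x + 1, limit) if x + y < limit]

    # How many candidate pairs share each product, and which pairs share each sum.
    product_count = Counter(x * y for x, y in pairs)
    pairs_by_sum = defaultdict(list)
    for x, y in pairs:
        pairs_by_sum[x + y].append((x, y))

    # S: "P, you don't know X and Y."
    # S can only be sure of this if every split of the sum has an ambiguous product.
    after_s1 = [
        (x, y)
        for x, y in pairs
        if all(product_count[a * b] > 1 for a, b in pairs_by_sum[x + y])
    ]

    # P: "Now I do know X and Y!"
    # Among the surviving pairs, P's product must pick out exactly one.
    product_count_2 = Counter(x * y for x, y in after_s1)
    after_p = [(x, y) for x, y in after_s1 if product_count_2[x * y] == 1]

    # S: "And now so do I!"
    # Among the pairs that survive P's statement, S's sum must pick out exactly one.
    sum_count = Counter(x + y for x, y in after_p)
    return [(x, y) for x, y in after_p if sum_count[x + y] == 1]

if __name__ == "__main__":
    print(solve(100))
```

Calling solve with different values of limit is one way to explore for which N the puzzle has a unique solution.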

Four Pre-Gödelian Limitations on Mathematics

Even prior to the devastating Incompleteness Theorems there were hints of what was to come. I want to describe and prove four results in mathematical logic that don’t depend on Incompleteness at all, but establish some rather serious limitations on the project of mathematics.

Here are the four. I’ll go through them in order of increasing level of sophistication required to prove them.

  • Indescribable sets of possible worlds
  • Noncompossibility theorem
  • Inevitable nonstandard numbers
  • Mysterious missing subsets

1. Indescribable sets of possible worlds

I already talked about this one here. The basic idea is that even in our safest and least troublesome logic, propositional calculus, it turns out that the language is insufficient to fully capture all the semantic notions. It’s the first hint at something going awry with syntax and semantics, where the semantics can outpace the syntax and leave axiomatic mathematics behind.

So to recap: the result is that in propositional logic there are sets of truth assignments that can not be “described” by any set of propositional sentences (even allowing infinite sets!). A set of sentences is said to “describe” a set of truth assignments if that set of truth assignments is the unique set of truth assignments consistent with all those sentences being true. If we think of truth assignments as possible worlds, and sets of sentences as descriptions of sets of possible worlds, then this result says that there are sets of possible worlds in propositional semantics that cannot be described by any propositional syntax.

The proof of this is astoundingly simple: just look at the cardinality of the set of descriptions and the cardinality of the set of sets of possible worlds. The second is strictly larger than the first, so any mapping from descriptions to sets of possible worlds will of necessity leave some sets of possible worlds out. In fact, it also tells us that virtually all sets of possible worlds are not describable!
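To make the counting explicit, here’s the cardinality comparison written out. This is a sketch under the assumption that the language has countably many propositional variables (P0, P1, P2, and so on), so that sentences are finite strings over a countable alphabet.

```latex
\begin{align*}
|\text{Sentences}| &= \aleph_0 \\
|\text{Descriptions}| = |\mathcal{P}(\text{Sentences})| &= 2^{\aleph_0} \\
|\text{Worlds}| = |\{T, F\}^{\mathbb{N}}| &= 2^{\aleph_0} \\
|\text{Sets of worlds}| = |\mathcal{P}(\text{Worlds})| &= 2^{2^{\aleph_0}} > 2^{\aleph_0}
\end{align*}
```

The final inequality is Cantor’s theorem. Since at most 2^ℵ0 of the 2^(2^ℵ0) sets of possible worlds can be hit by a description, the describable ones form a vanishingly small fraction.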

2. Noncompossibility Theorem

The noncompossibility theorem is a little-known theorem that establishes a serious limitation on our ability to describe mathematical structures. Here’s what it says. Suppose that you have a description of a countably infinite structure (like, say, the natural numbers, which have a countable infinity of objects) that has the following three properties:

(1) The language has a term for denoting every object in the structure (like 0, 1, 2, 3, 4, and so on)
(2) The axioms in your description are weakly complete: if something is inconsistent with the axioms, it can be proven false.
(3) There is some algorithm for determining whether any given sentence is an axiom.

The noncompossibility theorem tells us that if you have all three of these properties, then your axioms will fail to uniquely pick out your intended structure, and will include models that have extra objects that aren’t in the structure.

Let’s prove this.

We’ll denote the mathematical structure that we’re trying to describe as M and our language as L. We choose L to have sufficient syntactic structure to express the truths of M. From L, we select a decidable set of sentences X with the goal that all these sentences be true of M. We now select a proof system F in L such that for any finite extension L* to L involving only new constant terms, and for any Y ⊆ L*, if X ∪ Y is not satisfiable, then F refutes some finite subset of X ∪ Y.

(As an aside: Why care about this strange weak form of completeness? Well, intuitively all that it’s saying is that our axioms should be able to rule out any set of sentences that are inconsistent with them using some finite proof, as long as those sentences only use finitely many additional constant symbols. This is relatively weak compared to the usual notions of completeness that logicians talk about, which makes it an even better choice for our purposes, as the weaker the axiom the harder to deny.)

Our assumptions can now be written:

(0) |M| is countably infinite.
(1) ∀m ∈ M, ∃tₘ ∈ terms(X) such that (m = tₘ)
(2) If we extend L to L* by adding finitely many constant symbols, then for any Y ⊆ L*, if X ∪ Y is not satisfiable, then F refutes some finite subset of X ∪ Y.
(3) X is decidable.

Our proof starts by adding a new constant term c to our language and constructing an extension of X:

Y = X ∪ {c ≠ tₘ | m ∈ M}

In other words, Y is X but supplemented with the assertion that there exists an object that isn’t in M. If we can prove that Y is satisfiable, then this entails that X is also satisfiable by the same truth assignment. And this means that there is a model of X in which there are extra objects that aren’t in M.

We proceed with proof by contradiction. Suppose that Y is not satisfiable. Then, by assumption (2), F must refute some finite subset Z of Y. But since Z is finite, it involves only finitely many of the terms tₘ. And since M is infinite, there will always be objects in M that are not equal to any of those finitely many terms! Interpreting c as one of those objects makes every sentence of Z true in M, so Z is satisfiable and cannot be refuted (assuming, as we should of any reasonable proof system, that F is sound). Contradiction! Thus Y is satisfiable.

And since Y is satisfiable, there’s some truth assignment v that satisfies all of Y. But then v also satisfies X, as X is a subset of Y and removing axioms cannot rule out models, only add more! So we’ve proven that X has a model in which there is an object that is not equal to any of the objects in M. That is, X is not categorical: it does not uniquely describe M.

Tennant described this theorem as saying that “in countably infinite realms, you cannot know both where you are and where you are going.” More dully, we cannot have a satisfactory theory of a countably infinite mathematical structure that is both categorical and weakly complete. This isn’t super shocking by today’s standards, but it’s quite cool when you consider how little elaborate theoretical apparatus is required to prove it.

3. Inevitable nonstandard numbers

Suppose we have some first-order theory T that is true of the natural numbers (that is, the natural numbers are a model of T). Take this theory and append to it a new constant symbol c, as well as an infinite axiom schema saying “c > 0”, “c > 1”, “c > 2”, and so on forever. Call this new theory T*.

Does T* have a model? Well by the compactness theorem, it has a model as long as all its finite subsets have a model. And for every finite subset of T*’s axioms, the natural numbers are a model! So T* does have a model. Could this model be the natural numbers? Clearly not, because to satisfy T*, there must be a number greater than all the natural numbers. So whatever the model of T* is, it’s not the standard natural numbers. Let’s call it a nonstandard model, and label it ℕ*.
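The step that does the real work is the claim that the natural numbers model every finite subset of T*. Here’s that step spelled out as a short sketch. The names Δ, n₁, …, nₖ and N are just labels I’m introducing for the finite subset, the numbers it mentions, and the value chosen for c (take N = 0 if Δ contains none of the new axioms).

```latex
\begin{align*}
\Delta &\subseteq T \cup \{\, c > n_1,\ \dots,\ c > n_k \,\}
  && \text{(any finite } \Delta \subseteq T^* \text{)} \\
N &:= \max(n_1, \dots, n_k) + 1 \\
(\mathbb{N},\ c \mapsto N) &\models \Delta
  && \text{(since } \mathbb{N} \models T \text{ and } N > n_i \text{ for each } i \text{)}
\end{align*}
```

No single choice of N works for all of T* at once, which is exactly why the model that compactness hands us cannot be ℕ itself.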

Here’s the final step of the proof: ℕ* is a model of T*, and T* is a superset of T, so ℕ* must also be a model of T! And thus we find that in any logic with a compactness theorem, a theory of the natural numbers will have models with nonstandard numbers that are greater than all of ℕ.

It’s one of my favorite proofs, because it’s so easy to describe and has such a devastating conclusion. It’s also an example of the compactness theorem using the existence of one type of model (ℕ for each of the finite cases) to prove the existence of something entirely different (ℕ* for the infinite case).

4. Mysterious missing subsets

The Löwenheim-Skolem theorem tells us that if a first-order theory has a model with an infinite cardinality, then it has models with every infinite cardinality. This places a major restriction on our ability to describe an infinite mathematical structure using first order logic. For if we were to try to single out the natural numbers, say, we would inevitably end up failing to rule out models of our axioms that have the cardinality of the real numbers, or worse, of the set of functions from real numbers to real numbers, and so on for all other infinite cardinalities.

When applied to set theory, this implies a result that seems on its face to be a straightforward contradiction. Namely, Löwenheim-Skolem tells us that any first order axiomatization of sets will inevitably have a model that contains only a countable infinity of sets. But this seems bizarre, as all we appear to need to rule out countably infinite universes of sets is one axiom that asserts the existence of a countably infinite set, and another that admits the power set of any set to the universe of sets. Then we will be forced to admit that there is a set which is the power set of a countably infinite set, and as Cantor’s famous diagonal argument shows, this set is uncountably large.

So on the one hand, Cantor tells us that there are sets that contain uncountably many objects. And on the other hand, Löwenheim-Skolem tells us that there is a model of set theory with only countably many objects. This dichotomy is known by the name Skolem’s paradox. It appears to be a straightforward contradiction, but it’s not.

What Skolem realized was that the formal notion of a power set, which is something like “the set P(X) such that for all sets Y, if Y is a subset of X, then Y is an element of P(X)”, relies on a quantification over all sets, and that in a countable universe of sets, that quantification ranges over only a countable number of objects. In other words, P(X) is only uncountable if our quantifier ranges over all possible sets, but for a countable model, there are sets that are not describable within the model. This means that the notion of a power set is relative to your model of set theory! In fact, there’s no way in first order logic to unambiguously pin down what you mean by “power set” in such a way that all models will agree on what P(X) actually contains. It also means that the notions of cardinality and countability are relative to your model! In Skolem’s words, “even the notions ‘finite’, ‘infinite’, ‘simply infinite sequence’ and so forth turn out to be merely relative within axiomatic set theory.”

A proof of the Compactness Theorem

The Compactness Theorem is one of the most powerful results in mathematical logic, and I want to prove it for you. Fair warning, the proof is not a walk in the park, and you probably won’t comprehend it if you speed through. Take your time! This is the easiest proof of Compactness that I know of which doesn’t rely on the Completeness Theorem, which is itself not easy to prove.

First, some background concepts:

Interpretation: An interpretation of a set of sentences Σ is a consistent assignment of truth values to all well-formed-formulas such that every sentence in Σ is assigned true.

Logical Entailment: Σ logically entails α (Σ ⊧ α) iff in every interpretation v of Σ, v(α) = True.

Completeness: A set of sentences Σ is complete iff for every well-formed-formula α, Σ ⊧ α or Σ ⊧ ¬α.

Satisfiability: A set of sentences Σ is satisfiable iff there is at least one interpretation under which all sentences in Σ are true.

Finite Satisfiability: A set of sentences Σ is finitely satisfiable iff every finite subset of Σ is satisfiable.

Compactness Theorem: Σ is satisfiable iff every finite subset of Σ is satisfiable.

With that aside, here’s an outline of the proof. I’ve adapted the outline from this lecture series, and added in some minor improvements.

There are five steps, and most of the work is in the middle three.

  1. Establish that there is an ordering on the set of well-formed-formulas.
  2. Prove that if Σ is finitely satisfiable, then so is either Σ ∪ {α} or Σ ∪ {¬α}.
  3. Construct Δ, an extension of Σ, which is complete and finitely satisfiable.
  4. Prove that Δ is satisfiable.
  5. Prove that Σ is satisfiable.

This will prove that Σ is satisfiable if it’s finitely satisfiable. The other direction (if it’s satisfiable then it’s finitely satisfiable) is trivial, as any subset of a satisfiable set is also satisfiable.

So let’s dive in!

1. There is an ordering on the set of well-formed-formulas.

This is a fun fact that is basically established by Gödel numbering. Assign numbers to all the symbols of your language. An example assignment for propositional logic:

“(” assigned 0
“)” assigned 1
“¬” assigned 2
“∧” assigned 3
“∨” assigned 4
“→” assigned 5
“P0” assigned 6
“P1” assigned 7
“P2” assigned 8
And so on…

Now we translate a string by going through symbol by symbol and making the number assigned to the nth symbol in the string the exponent of the nth prime. Here are a few translations to illustrate:

Initial string: “(¬P0)”
Codes for each character: [0, 2, 6, 1]
Gödel number: 2^0 · 3^2 · 5^6 · 7^1 = 984,375

Initial string: “(P0 → P2)”
Codes for each character: [0, 6, 5, 8, 1]
Gödel number: 2^0 · 3^6 · 5^5 · 7^8 · 11^1 = 144,462,310,059,375

Initial string: “((P0 ∨ P2) → (¬P1))”
Codes for each character: [0, 0, 6, 4, 8, 1, 5, 0, 2, 7, 1]
Gödel number: 2^0 · 3^0 · 5^6 · 7^4 · 11^8 · 13^1 · 17^5 · 19^0 · 23^2 · 29^7 · 31^1 ≈ 4.2 × 10^37

Since the prime factorization of a number is unique, distinct well-formed formulas get distinct numbers, which means we can go backwards from a number to its corresponding string. (Strings ending in “(” are the one exception, since a trailing exponent of 0 leaves no trace in the number, but no well-formed formula ends that way.) Now to construct our order, we just start from 2 and work our way up, discarding any strings we get that are not well-formed-formulas. This will give us an ordered list of all well-formed formulas!

2. If Σ is finitely satisfiable, then so is either Σ ∪ {α} or Σ ∪ {¬α}.

Suppose not. Then neither Σ ∪ {α} nor Σ ∪ {¬α} is finitely satisfiable. So there must exist some finite sets Σ0 and Σ1 such that Σ0 ⊆ Σ ∪ {α} and Σ1 ⊆ Σ ∪ {¬α}, where neither Σ0 nor Σ1 is satisfiable.

But now consider Σ’ = (Σ0 \ {α}) ∪ (Σ1 \ {¬α}). Σ’ is a finite subset of Σ, so it must be satisfiable (remember, Σ is finitely satisfiable by assumption). This means that there is some interpretation which satisfies Σ’. But this interpretation must assign to α either T or F.

If it assigns T to α, then Σ’ ∪ {α} is satisfiable. But Σ’ ∪ {α} is a superset of Σ0! So Σ0 must also be satisfiable. But this contradicts our assumption!

But if it assigns F to α, then Σ’ ∪ {¬α} is satisfiable. And Σ’ ∪ {¬α} is a superset of Σ1! So Σ1 must also be satisfiable. Contradiction!

Either way we get a contradiction, so this proves our theorem.

3. Extend Σ to Δ, which is complete and finitely satisfiable.

We inductively define Δ as follows:

Δ0 = Σ
Δn+1 = Δn ∪ {αn} if Δn ∪ {αn} is finitely satisfiable, otherwise Δn ∪ {¬αn}
Δ = ∪ Δn

Here we’re using our ordering that we constructed in Step 1 to go through every well-formed-formula in order. This establishes that Δ is complete, as we construct it by appending to Σ either α or ¬α for every well-formed-formula. I.e. we construct it by taking Σ and adding to it an explicit “opinion” on every possible sentence, ensuring at each stage that we remain finitely satisfiable.

How do we know that the final result Δ is also finitely satisfiable? This is due to our proof in Step 2, which allows us to say that Δn is always finitely satisfiable for any n. But any finite subset of Δ is also a subset of Δn for some n! So any finite subset of Δ is satisfiable. Thus Δ is finitely satisfiable.
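To make Steps 2 and 3 concrete, here’s a small Python sketch of the construction on a toy language with just two atoms. Everything in it is an illustrative assumption: formulas are nested tuples, the “enumeration” is a short hand-picked list rather than the full ordering from Step 1, and plain satisfiability (checked by brute force) stands in for finite satisfiability, since the two coincide for finite sets.

```python
from itertools import product

N_ATOMS = 2  # hypothetical tiny language with atoms P0 and P1


# Formulas as nested tuples: ("atom", i), ("not", f), ("and", f, g).
def evaluate(formula, world):
    kind = formula[0]
    if kind == "atom":
        return world[formula[1]]
    if kind == "not":
        return not evaluate(formula[1], world)
    if kind == "and":
        return evaluate(formula[1], world) and evaluate(formula[2], world)
    raise ValueError(f"unknown connective: {kind}")


def satisfiable(sentences):
    """Brute force over all truth assignments to the atoms."""
    return any(all(evaluate(s, world) for s in sentences)
               for world in product([False, True], repeat=N_ATOMS))


def extend(sigma, enumeration):
    """Add each alpha, or else its negation, staying satisfiable."""
    delta = list(sigma)
    for alpha in enumeration:
        if satisfiable(delta + [alpha]):
            delta.append(alpha)
        else:
            delta.append(("not", alpha))
            # Step 2 is what guarantees this branch is safe.
            assert satisfiable(delta)
    return delta


P0, P1 = ("atom", 0), ("atom", 1)
sigma = [("not", ("and", P0, P1))]  # "not both P0 and P1"
enumeration = [P0, P1, ("and", P0, P1), ("not", P0)]
delta = extend(sigma, enumeration)

for sentence in delta:
    print(sentence)
```

On this input the construction keeps P0, then is forced to add ¬P1, ¬(P0 ∧ P1), and ¬¬P0, so Δ ends up with an opinion on every formula in the list.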

4. Δ is satisfiable.

To prove this, we’ll hand-make a truth assignment for our purposes. First we define a truth assignment v over atomic propositions as follows:

v(p) = T if p ∈ Δ, otherwise F

Now we extend v to the rest of the set of well-formed-formulas in the usual way, ensuring consistency of the truth-assignment.

We’ll prove now that under this extension, v(α) = T if and only if α ∈ Δ. The proof is by structural induction.

Base case

  • α is an atomic proposition. Then v(α) = T iff α ∈ Δ, by definition of v.

Inductive step(s)

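  • If α = ¬β, then v(α) = T iff v(β) = F iff β ∉ Δ (by the inductive hypothesis) iff ¬β ∈ Δ (exactly one of β and ¬β is in Δ, since Δ is complete and finitely satisfiable) iff α ∈ Δ.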
  • If α = (β ∧ γ), then v(α) = v(β ∧ γ) = T iff (v(β) = T and v(γ) = T) iff (β ∈ Δ and γ ∈ Δ) iff (β ∧ γ) ∈ Δ iff α ∈ Δ
    • If β ∈ Δ and γ ∈ Δ, then (β ∧ γ) ∈ Δ, since Δ is complete and finitely satisfiable.
      • If (β ∧ γ) ∉ Δ, then ¬(β ∧ γ) ∈ Δ by completeness. But then {β, γ, ¬(β ∧ γ)} is a finite subset of Δ that’s not satisfiable, so Δ isn’t finitely satisfiable. Proof by contradiction.
    • If β ∉ Δ or γ ∉ Δ, then (β ∧ γ) ∉ Δ, since Δ is complete and finitely satisfiable.
      • Suppose (β ∧ γ) ∈ Δ. Since β ∉ Δ or γ ∉ Δ, at least one of ¬β and ¬γ must be in Δ. Without loss of generality, assume ¬β ∈ Δ. Then {¬β, (β ∧ γ)} is a finite subset of Δ that’s not satisfiable, so Δ isn’t finitely satisfiable. Proof by contradiction.

Since ¬ and ∧ are a complete set of truth functions (all other well-formed formulas can be converted into a form that uses only ¬ and ∧), this proves the proposition for all well-formed formulas.

We’ve shown that v(α) = T if and only if α ∈ Δ. This proves that v satisfies Δ!

5. Σ is satisfiable.

The final step is the easiest one. Δ is satisfied by v, and Δ is a superset of Σ, so Σ must also be satisfied by v. So we’re done!

A challenge to constructivists

Constructive mathematicians do not accept a proof of existence unless it provides a recipe for how to construct the thing whose existence is being asserted. Constructive mathematics is quite interesting, but it also appears to have some big problems. Here’s a challenge for constructivists:

Suppose that I hand you some complicated function f from a set A to another set B. I ask you: “Can every element in B be reached by applying the function to an element in A?” In other words, is f surjective?

Now, it so happens that the cardinality of B is greater than the cardinality of A. That’s sufficient to tell us that f can’t be surjective, as however it maps elements there will always be some left over. So we know that the answer is “no, we can’t reach every element in B.” But we proved this without explicitly constructing the particular element in B that can’t be reached! So a constructivist will be left unsatisfied.

The trick is that I’ve made this function extremely complicated, so that there’s no clever way for them to point to exactly which element is missing. Would they say that even though |B| is strictly larger than |A|, it could still be somehow that every element in B is in the image of f? Imagine asking them to bet on this proposition. I don’t think any sane person would put any money on the proposition that f is onto.

And as a final kicker, our sets don’t even have to be infinite! Let |A| = 20 and |B| = 21. I describe a function from A to B, such that actually computing the “missing element” involves having to calculate the 21st Busy Beaver number or something. And the constructivist gets busy searching for the particular element in B that doesn’t get mapped to, instead of just saying “well of course we can’t map 20 elements to 21 elements!”

Even simpler, let F map {1} to {1,2} as follows: F(1) = 1 if the last digit of the 20th busy beaver number is 0, 1, 2, 3, or 4, and F(1) = 2 otherwise. Now, proving constructively that there is an element in {1, 2} that isn’t in the image of F requires knowing the last digit of the 20th busy beaver number, which humans will most likely never be able to calculate (we’re stuck on the fifth one now). So a constructivist will remain uncertain on the question of whether F is surjective.

But a sane person would just say “look, of course F isn’t surjective; it maps one object to two objects. You can’t do this without leaving something out! It doesn’t matter if we don’t know which element is left out, it has to be one of them!”
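The counting argument can be checked mechanically for small cases. The Python sketch below (helper names are illustrative) enumerates every function from {1} to {1, 2}, and every function from a 3-element set to a 4-element set, and confirms that none of them is surjective. Note that it says nothing about which candidate the busy-beaver-defined F actually is; it only shows that the answer to “is it surjective?” is the same for all of them.

```python
from itertools import product


def all_functions(domain, codomain):
    """Yield every function domain -> codomain as a dict."""
    for images in product(codomain, repeat=len(domain)):
        yield dict(zip(domain, images))


def is_surjective(f, codomain):
    return set(f.values()) == set(codomain)


# Every possible F from {1} to {1, 2}.
A, B = [1], [1, 2]
candidates = list(all_functions(A, B))
surjective = [f for f in candidates if is_surjective(f, B)]
print(len(candidates), len(surjective))  # 2 0

# Same check for a 3-element domain and a 4-element codomain.
bigger = list(all_functions([1, 2, 3], [1, 2, 3, 4]))
print(len(bigger), sum(is_surjective(f, [1, 2, 3, 4]) for f in bigger))  # 64 0
```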

And if humanity is about to meet an alien civilization with immense computational power that knows all the digits of the 20th busy beaver number, the standard mathematician could bet their entire life savings on F not being surjective at any odds whatsoever, and the constructive mathematician would bet in favor of F being surjective at some odds. And of course, the constructivist would be wrong and lose money! So this also means that you have a way to make money off of any constructivist mathematicians you encounter, so long as we’re about to make contact with advanced aliens.

Describing the world

Wittgenstein starts his Tractatus Logico-Philosophicus with the following two sentences.

1. The world is everything that is the case.

1.1 The world is the totality of facts, not of things.

Let’s take him up on this suggestion and see how far we get. In the process, we’ll discover some deep connections to theorems in mathematical logic, as well as some fascinating limitations on the expressive powers of propositional and first order logic.

We start out with a set of atomic propositions. For a very simple world, we might only need a finite number of these: “Particle 1 out of 3 has property 1 out of 50”, “Particle 2 of 3 has property 17 out of 50”, and so on. More realistically, the set of atomic propositions will be infinite (countable if the universe doesn’t have any continuous properties, and uncountable otherwise).

For simplicity, we’ll imagine labeling our set of atomic propositions P1, P2, P3, and so on. (This entails that there are at most countably many, but nothing important will rest on that assumption.) We combine these atomic propositions with the operators of propositional logic {(, ), ¬, ∧, ∨, →}. This allows us to build up more complicated propositions, like ((P7∧P2)→(¬P13)). This will be the language that we use to describe the world.

Now, the way that the world is is just a consistent assignment of truth values to the set of all grammatical sentences in our language. For example, one simple assignment of truth values is the one that assigns “True” to all atomic propositions. Once we’ve assigned truth values to all the atomic propositions, we get the truth values for the rest of the set of grammatical sentences for free, by the constraint that our truth assignment be consistent. (For instance, if P1 and P2 are both true, then (P1∧P2) must also be true.)

Alright, so the set of ways the world could be corresponds to the set of truth assignments over our atomic propositions. The final ingredient is the notion that we can encode our present knowledge of the world as a set of sentences. Maybe we know by observation that P5 is true, and either P2 or P3 is true but not both. Then to represent this state of knowledge, we can write the following set of sentences:

{P5, (P2∨P3), ¬(P2∧P3)}

Any set of sentences picks out a set of ways the world could be, such that each of these possible worlds is compatible with that knowledge. If you know nothing at all, then the set of sentences representing your knowledge will be the empty set {}, and the set of possible worlds compatible with your knowledge will be the set of all possible worlds (all possible truth assignments). On the other extreme, you might know the truth values of every atomic proposition, in which case your state of knowledge uniquely picks out one possible world.

In general, as you add more sentences to your knowledge-set, you cut out more and more possible worlds. But this is not always true! Ask yourself what the set of possible worlds corresponding to the set {(P1∨¬P1), (P2∨¬P2), (P3∨¬P3)} is. Since each of these sentences is a tautology, no possible worlds are eliminated! So our set of possible worlds is still the set of all worlds.
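Here’s a small Python sketch of how a knowledge-set carves out possible worlds. As a simplifying assumption it restricts attention to the five atoms P1 through P5, and it represents each sentence as a Python predicate on worlds rather than parsing formulas; the names are illustrative.

```python
from itertools import product

N_ATOMS = 5  # assumption: only P1..P5 matter for this example


def worlds():
    """Every truth assignment, as a dict {1: bool, ..., 5: bool}."""
    for values in product([True, False], repeat=N_ATOMS):
        yield dict(zip(range(1, N_ATOMS + 1), values))


def compatible_worlds(knowledge):
    """Worlds in which every sentence of the knowledge-set is true."""
    return [w for w in worlds() if all(sentence(w) for sentence in knowledge)]


knowledge = [
    lambda w: w[5],                 # P5
    lambda w: w[2] or w[3],         # (P2 ∨ P3)
    lambda w: not (w[2] and w[3]),  # ¬(P2 ∧ P3)
]
# (P1 ∨ ¬P1), (P2 ∨ ¬P2), (P3 ∨ ¬P3)
tautologies = [lambda w, i=i: w[i] or not w[i] for i in (1, 2, 3)]

print(len(compatible_worlds([])))           # 32
print(len(compatible_worlds(knowledge)))    # 8
print(len(compatible_worlds(tautologies)))  # 32
```

Knowing nothing leaves all 32 worlds open, the three observations cut that down to 8, and the tautologies eliminate nothing.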

Now we get to an interesting question: clearly for any knowledge-set of sentences, you can express a set of possible worlds consistent with that knowledge set. But is it the case that for any set of possible worlds, you can find a knowledge-set that uniquely picks it out? If I hand you a set of truth assignment functions and ask you to tell me a set of propositions which are consistent with that set of worlds and ONLY that set, is that always possible? Essentially, what we’re asking is if all sets of possible worlds are describable.

We’ve arrived at the main point of this essay. Take a minute to ponder this and think about whether it’s possible, and why/why not! For clarification, each sentence can only be finitely long. But! You’re allowed to include an infinity of sentences.

(…)

(Spoiler-hiding space…)

(…)

If there were only a finite number of atomic propositions, then you could pick out any set of possible worlds with just a single sentence in conjunctive normal form. But when we start talking about an infinity of atomic propositions, it turns out that it is not always possible! There are sets of possible worlds that are literally not describable, even though our language includes the capacity to describe each of those worlds and we’re allowed to include an infinite set of sentences.
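For the finite case, here’s one way the single sentence could be built, sketched in Python under the assumption of a three-atom language. The idea is to write one clause per excluded world, where each clause is false at exactly that world, and then conjoin the clauses. The representation (a clause as a list of (index, sign) literals) and the function names are illustrative.

```python
from itertools import product

N_ATOMS = 3  # assumption: a finite language with atoms P1..P3

ALL_WORLDS = list(product([True, False], repeat=N_ATOMS))


def excluding_clause(world):
    """A disjunction that is false at `world` and true everywhere else.

    A literal is (i, sign): (i, True) means P_i, (i, False) means ¬P_i.
    """
    return [(i, not value) for i, value in enumerate(world, start=1)]


def cnf_for(target_worlds):
    """One clause per excluded world; their conjunction is the sentence."""
    return [excluding_clause(w) for w in ALL_WORLDS if w not in target_worlds]


def holds(cnf, world):
    """Evaluate a conjunction of disjunctions of literals at a world."""
    return all(any(world[i - 1] == sign for i, sign in clause)
               for clause in cnf)


target = {(True, False, False), (False, False, True)}
cnf = cnf_for(target)
picked_out = {w for w in ALL_WORLDS if holds(cnf, w)}
print(picked_out == target)  # True
```

With infinitely many atoms this breaks down at the first step: a clause that excludes a single world would need infinitely many literals.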

There’s a super simple proof of this. Let’s give a name to the cardinality of the set of sentences: call it K. (We’ve been tacitly acting as if the cardinality is countable this whole time, but that doesn’t actually matter.) What’s the cardinality of the set of all truth assignments?

Well, each truth assignment is a function from all sentences to {True, False}. And there are 2^K such assignments. 2^K is strictly larger than K, so there are more possible worlds than there are sentences. Now, the cardinality of the set of sets of sentences is also 2^K. But the cardinality of the set of SETS of truth assignments is 2^(2^K)!

What this means is that we can’t map sets of sentences onto sets of truth assignments without leaving some things out! This proof carries over to predicate logic as well. The language for both propositional and predicate logic is unable to express all sets of possible worlds corresponding to that language!

I love this result. It’s the first hint in mathematical logic that syntax and semantics can come apart.

That result is the climax of this post. What I want to do with the rest of this post is to actually give an explicit example of a set of truth assignments that are “indescribable” by any set of sentences, and to prove it. Warning: If you want to read on, things will get a bit more technical from here.

Alright, so we’ll use a shortcut to denote truth assignments. A truth assignment will be written as a string of “T”s and “F”s, where the nth character corresponds to how the truth assignment evaluates Pn. So the all-true truth assignment will just be written “TTTTTT…” and the all-false truth assignment will be written “FFFFF…”. The truth assignment corresponding to P1 being false and everything else true will be written “FTTTTT…”. And so on.

Now, here’s our un-describable set of truth assignments. {“FFFFFF…”, “TFFFFF…”, “TTFFFF…”, “TTTFFF…”, …}. Formally, define Vn to be the truth assignment that assigns “True” to every atomic proposition up to and including Pn, and “False” to all others. Now our set of truth assignments is just {Vn | n ∈ ℕ}.

Let’s prove that no set of sentences uniquely picks out this set of truth assignments. We prove by contradiction. Suppose that we could find a set of sentences that uniquely picks out these truth assignments and no others. Let’s call this set A. Construct a new set of sentences A’ by adding all atomic propositions to A: A’ = A ∪ {P1, P2, P3, …}.

Is there any truth assignment that is consistent with all of A’? Well, we can answer this by using the Compactness Theorem: A’ has a truth assignment if and only if every finite subset of A’ has a truth assignment. But every finite subset of A’ involves sentences from A (which are consistent with Vn for each n by assumption), and a finite number of atomic propositions. Since each finite subset of A’ is only asserting the truth of a finite number of atomic sentences, we can always find a truth assignment Vk in our set that is consistent with it, by choosing one that switches to “False” long after the last atomic proposition that is asserted by our finite subset.

This means that every finite subset of A’ is satisfiable, so by the Compactness Theorem A’ itself is satisfied by some truth assignment. That assignment satisfies A, so it must be one of our Vn. But A’ involves the assertion that all atomic propositions are true! The only truth assignment that is consistent with this assertion is the all-true assignment! And is that truth assignment in our set? No! And there we have it, we’ve reached our contradiction!
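The two halves of that argument can be illustrated with a short Python sketch. It only models the atomic part of A’ (the sentences from A are satisfied by every Vn by assumption, so they need no checking), and the names V and witness are made up for the example.

```python
def V(n):
    """Truth assignment V_n: P_i is true exactly when 1 <= i <= n."""
    return lambda i: 1 <= i <= n


def witness(asserted_atoms):
    """An index k such that V_k makes every listed atom true."""
    return max(asserted_atoms, default=0)


# A finite subset of A' asserts only finitely many atoms, say P3, P17, P42.
finite_subset = {3, 17, 42}
k = witness(finite_subset)
print(k, all(V(k)(i) for i in finite_subset))  # 42 True

# But no single V_n satisfies all atoms at once: V_n already fails at P_(n+1).
print(any(V(n)(n + 1) for n in range(1000)))   # False
```

Taking k to be the largest asserted index is already enough; there’s no need to switch to “False” any later than that.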

We cannot actually describe a set of possible worlds in which either all atomic propositions are false, or only the first is true, or only the first two are true, or only the first three are true, and so on forever. But this might prompt the question: didn’t you just describe it? How did you do that, if it’s impossible? Well, technically I didn’t describe it. I just described the first four possibilities and then said “and so on forever”, assuming that you knew what I meant. To have actually fully pinned down this set of possible worlds, I would have had to continue with this sentence forever. And importantly, since this sentence is a disjunction, I could not split this infinite sentence into an infinite set of finite sentences. This fundamental asymmetry between ∨ and ∧ is playing a big role here: while an infinite conjunction can be constructed by simply putting each clause in the conjunction as a separate sentence, an infinite disjunction cannot be. This places a fundamental limit on the ability of a language with only finite sentences to describe the world.

The full solution to the dog puzzle

A couple of days ago I posted A logic puzzle about dogs. Read that post first and try solving it before reading on!

Below is the full explanation of the puzzle. Heavy spoilers, obviously.

The dog society is structured identically to the Robinson axioms for natural number arithmetic. Dogs are numbers, Spot is 0, the alpha of n is n + 1, the referee for n and m is n + m, the counselor for n and m is n × m, and the strength relation is the < relation. This means that “the marriage counselor for Spot’s alpha and the referee of a fight between Spot’s alpha and Spot’s alpha” is translated to 1 × (1 + 1) = 2, which is Spot’s alpha’s alpha. In Robinson arithmetic, you can also prove that ∀n (n < n + 1).
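As a quick sanity check of the translation, here’s a Python sketch that reads the dog vocabulary as arithmetic on ordinary integers. It only covers the standard model (the actual natural numbers), and the function names, along with the choice to encode the strength relation as n < m, are illustrative assumptions.

```python
SPOT = 0


def alpha(n):
    """The alpha of dog n is its successor."""
    return n + 1


def referee(n, m):
    """The referee of a fight between n and m is their sum."""
    return n + m


def counselor(n, m):
    """The marriage counselor for n and m is their product."""
    return n * m


def strength_relation(n, m):
    """The strength relation, read as n < m."""
    return n < m


spots_alpha = alpha(SPOT)
result = counselor(spots_alpha, referee(spots_alpha, spots_alpha))
print(result == alpha(alpha(SPOT)))  # True

# n < n + 1 holds for every dog we can check in the standard model.
print(all(strength_relation(n, alpha(n)) for n in range(1000)))  # True
```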

As for the question of if it’s possible for a dog to be stronger than Spot, Spot’s alpha, Spot’s alpha’s alpha, and so on: The primary difference between Robinson arithmetic and Peano arithmetic (the usual axiomatization of natural numbers) is that the Robinson axioms have no induction axiom (which would be something like “If Spot has a property, and if the alpha of any dog with the property also has the property, then all dogs have the property”). The induction axiom serves to rule out many models of the axioms that are not actually the natural numbers.

If the induction axiom is stated using second-order logic, then the axiom system uniquely pins down (ℕ,+,×,>) and there are no dogs besides those in Spot’s hierarchy. But the induction axiom cannot be stated as a single axiom in first order logic, since it involves quantifying over all properties. For first-order Peano arithmetic, we instead have an infinite axiom schema, one for each property that is definable within the first-order language. This turns out to be strictly weaker than the single second-order axiom, as there are some properties of numbers that cannot be described in a first-order language (like being larger than a finite number of numbers).

What this amounts to is that first-order Peano arithmetic with its infinite axiom schema is too weak to pin down the natural numbers as a unique model. There are what’s called nonstandard models of first order PA, which contain the ordinary numbers but also an infinity of weird extra numbers. (In fact, there exist models of first-order PA with every infinite cardinality!) Some of these numbers have the property that they are bigger than all the regular natural numbers. And since Robinson arithmetic is strictly weaker than first-order PA (lacking an induction axiom as it does), this means that Robinson arithmetic is also not strong enough to rule out numbers greater than all elements of ℕ. Which means that we cannot prove that there are no dogs stronger than every dog in Spot’s hierarchy!

I made this puzzle to illustrate three things: First, how the same axioms, and even the same models of the same axioms, can have wildly different interpretations, and that a shift in how you think about these axioms can make seemingly impossible tasks trivial. Second, how axiom systems for structures like ℕ can (and inevitably do) fail to capture important and intuitively obvious features of the structure. And third, how logic is so interesting! Just trying to create a simple rule system to describe one of the most natural and ordinary mathematical structures that we ALL use ALL THE TIME for reasoning about the world, turns out to be so nontrivial, and in fact impossible to do perfectly!