On philosophical progress

April 3, 2020

A big question in the philosophy of philosophy is whether philosophers make progress over time. One relevant piece of evidence that gets brought up in these discussions is the lack of consensus on age-old questions like free will, normative ethics, and the mind-body problem. If a discipline is progressing steadily towards truth with time, the argument goes, then we should expect that questions that have been discussed for thousands of years should be more or less settled by now. After all, that is what we see in the hard sciences; there are no lingering disputes over the validity of vitalism or the realm of applicability of Newtonian mechanics.

There are a few immediate responses to this line of argument. It might be that the age-old questions of philosophy are simply harder than the questions that get addressed by physicists or biologists. “Harder” doesn’t mean “requires more advanced mathematics to grapple with” here, but something more like “it’s unclear what even would count as a conclusive argument for one position or another, and therefore much less clear how to go about building consensus.” Try to imagine what sort of argument would convince you of the nonexistence of libertarian free will with the same sort of finality as a demonstration of time dilation convinces you of the inadequacy of nonrelativistic mechanics.

A possible rejoinder at this point would be to take after the logical positivists and deny the meaningfulness or at least truth-aptness of the big questions of philosophy as a whole. This may go too far; it may well be that a query is meaningful but, due to certain epistemic limitations of ours, forever beyond our ability to decide. (We know for sure that such queries can exist, due to Gödelian discoveries in mathematics. For instance, we know of the existence of a series of numbers that are perfectly well defined, but for which no algorithm can exist to enumerate all of them. The later numbers in this sequence will forever be a mystery to us, and not for lack of meaningfulness.)

I think that the roughly correct position to take is that science is largely about examining empirical facts-of-the-matter, whereas philosophy is largely about analyzing and refining our conceptual framework. While we have a fairly clear set of standards for how to update theories about the empirical world, we are not in possession of such a set of standards for evaluating different conceptual frameworks. The question of “what really are the laws governing the behavior of stuff out there” has much clearer truth conditions than a question like “what is the best way to think about the concepts of right and wrong”; i.e., it’s clearer what counts as a good answer and what counts as a bad answer.

When we’re trying to refine our concepts, we are taking into account our pre-theoretical intuitions (e.g. any good theory of the concept of justice must have something to do with our basic intuitive conception of justice). But we’re not just satisfied to describe the concept solely as the messy inconsistent bundle of intuitions that constitute our starting position on it. We also aim to describe the concept simply, by developing a “theory of justice” that relies on a small set of axioms and from which (the hope is) the rest of our conclusions about justice follow. We want our elaboration of the concept to be consistent, in that we shouldn’t simultaneously affirm that A is an instance of the concept and that A is not an instance of the concept. Often we also want our theory to be precise, even when the concept itself has vague boundaries.

Maybe there are other standards besides these four: intuitiveness, simplicity, consistency, and precision. And the application of these standards is very rarely made explicit. But one thing that’s certain is that different philosophers have different mixes of these values. One philosopher might value simplicity more or less than another, and it’s not clear that one of them is doing something wrong by having different standards. Put another way, I’m not convinced that there is one unique right set of standards for conceptual refinement.

We may want to be subjectivists to some degree about philosophy, and say that there is a range of rationally permissible standards for conceptual refinement, none better than any other. This would have the result that on some philosophical questions, multiple distinct answers may be acceptable but some crazy enough answers are not. Maybe compatibilism and nihilism are acceptable stances on free will but libertarianism is not. Maybe dualism and physicalism are okay but not epiphenomenalism. And so on.

This view allows for a certain type of philosophical progress, namely the gradual ruling out of some philosophical positions as TOO weird. It also allows for formation of consensus, through the discovery of philosophical positions that are the best according to all or most of the admissible sets of standards. I think that one example of this would be the relatively recent rise of Bayesian epistemology in philosophy of science, and in particular the Bayesian view of scientific evidence as being quantified by the Bayes factor. In brief, what does it mean to say that an observation O gives evidence for a hypothesis H? The Bayesian not only has an answer to this, but to the more detailed question of to what degree O gives evidence for H. The quantity is cr(O | H) / cr(O), where cr(.) is a credence function encoding somebody’s beliefs before observing O. If this quantity is equal to 1, then O is no evidence for H. If it is greater than 1, then O is evidence for H. And if it’s less than 1, then O is evidence against H.
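To make the ratio concrete, here is a minimal Python sketch. The function name `evidence_ratio` and the example credences are made up for illustration; the only ingredients taken from the text are cr(O | H) and cr(O), with cr(O) expanded using the law of total probability.

```python
def evidence_ratio(prior_h, cr_o_given_h, cr_o_given_not_h):
    """Return cr(O | H) / cr(O) for a credence function described by three numbers.

    cr(O) is computed by total probability:
        cr(O) = cr(O | H) * cr(H) + cr(O | not-H) * cr(not-H)
    """
    cr_o = cr_o_given_h * prior_h + cr_o_given_not_h * (1 - prior_h)
    return cr_o_given_h / cr_o


# Hypothetical credences: H starts at 0.2, and O is three times likelier under H.
prior = 0.2
ratio = evidence_ratio(prior, cr_o_given_h=0.9, cr_o_given_not_h=0.3)

# By Bayes' rule, the same ratio is the factor by which O rescales cr(H).
posterior = prior * ratio

print(ratio > 1)          # O is evidence for H
print(posterior > prior)  # so the credence in H goes up
```

If O is equally likely whether or not H holds, the ratio comes out to 1, and if O is less likely under H, it drops below 1, matching the three cases above.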

Not everything in Bayesian epistemology is perfectly uncontroversial, but I would argue that on this particular issue – the issue of how to best formalize the notion of scientific evidence – the Bayesian definition survives all its challenges unscathed. What are some other philosophical questions on which you think there has been definite progress?

A Gödelian Logic Puzzle

March 11, 2020

There’s an island on which there live exactly two types of people: truthers and liars. Truthers always say true statements, and liars always say false statements. One day a brilliant logician comes to visit the island. The logician knows all of the above-stated facts about the island. It also happens that the logician is a perfectly sound reasoner – he never proves anything that is false.

The logician encounters an individual named ‘Jal’ who lives on the island. The logician knows that Jal lives on the island, and so is either a truther or a liar. Now, Jal makes a statement from which it logically follows that Jal is a truther. But the logician could never possibly prove that Jal is a truther! (Remember, we never asserted that the logician proves all true things, just that the logician proves only true things.) What type of statement could accomplish this?

This puzzle is from a paper by Raymond Smullyan on mathematical logic. Try to answer it for yourself before reading on!

(…)

Alright, so here’s one possible answer. Jal could say to the logician: “You will never prove that I am a truther.” I claim that this sentence logically entails that Jal is a truther, and yet the logician cannot possibly prove it.

First of all, why does it entail that Jal is a truther? Let’s prove it by contradiction. Suppose that Jal is not a truther. Then, since Jal is either a truther or a liar, Jal must be a liar. That means that every statement Jal makes must be false. So in particular, Jal’s statement that “you will never prove that I am a truther” must be false. This entails that the logician must eventually prove that Jal is a truther. But we assumed that Jal isn’t a truther! So the logician must eventually prove a falsehood. But remember, we assumed that our logician’s proofs were always sound, so that he will never prove a falsehood. So we have a contradiction.

Therefore, Jal is a truther.

Now, why can the logician not prove that Jal is a truther? This can be seen very straightforwardly: we just proved that Jal is a truther, which means that all of Jal’s statements must be true. So in particular, Jal’s statement that “you will never prove that I am a truther” must be true. So in other words, it’s true that the logician will never prove that Jal is a truther!

So there you have it, a statement that appears to satisfy both of the criteria!
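The case analysis is small enough to check by brute force. The sketch below is only an illustration: it models a "world" as a pair of booleans (is Jal a truther, does the logician ever prove it) and keeps the worlds compatible with the island's rules and the logician's soundness. The function and variable names are my own.

```python
from itertools import product


def consistent(jal_is_truther, logician_proves_it):
    # Jal's statement: "You will never prove that I am a truther."
    statement = not logician_proves_it
    # Truthers only say true things, liars only say false things.
    speaker_ok = (statement == jal_is_truther)
    # Soundness: whatever the logician proves is true.
    sound = (not logician_proves_it) or jal_is_truther
    return speaker_ok and sound


worlds = [w for w in product([True, False], repeat=2) if consistent(*w)]
print(worlds)  # [(True, False)]
```

The single surviving world is the one where Jal is a truther and the logician never proves it.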

But now the next question I have for you is a bit trickier. It appears from the line of reasoning above that we have just proven that Jal is a truther. So why couldn’t the logician just run through that exact same line of reasoning? It appears to be perfectly valid, and to use nothing more advanced than basic predicate logic.

But if the logician does go through that line of reasoning, then he will conclude that Jal is a truther, which will make Jal’s statement false, which is a contradiction! So we’ve gone from something which was maybe just unintuitive to an actual paradox. Can you see how to resolve this paradox? (Again, see if you can figure it out yourself before reading on!)

(…)

Okay, so here’s the resolution. If we say that the logician can go through the same line of reasoning as us, then we reach a contradiction (that a truther tells a false statement). So we must deny that the logician can go through the same line of reasoning as us. But why not? As I said above, the reasoning is nothing more complicated than basic predicate logic. So it’s not that we’re using some magical supernatural rules of inference that no mortal logician could get his hands on. It must be that one of the assumptions we used in the argument is an assumption that the logician cannot use.

So look back through the argument, and carefully consider each of the assumptions we used:

First of all, why does it entail that Jal is a truther? Let’s prove it by contradiction. Suppose that Jal is not a truther. Then, since Jal is either a truther or a liar, Jal must be a liar. That means that every statement Jal makes must be false. So in particular, Jal’s statement that “you will never prove that I am a truther” must be false. This entails that the logician must eventually prove that Jal is a truther. But we assumed that Jal isn’t a truther! So the logician must eventually prove a falsehood. But remember, we assumed that our logician’s proofs were always sound, so that he will never prove a falsehood. So we have a contradiction.

In order, we made use of the assumptions that (1) Jal is either a truther or a liar, (2) every statement made by a liar is false, and (3) the logician is a sound reasoner.

I told you at the beginning that facts (1) and (2) are both known to the logician, but I did not say the same of (3)! The logician can only run through this argument if he knows that he is a sound reasoner (that he only proves true things). And this is the problem assumption, which must be rejected.

It’s not that no logician can actually ever be sound (a logician who only ever reasons in first order logic and nothing more fancy would be sound). It’s that the logician, though he really is sound, cannot know himself to be sound. In other words, no sound system can prove its own soundness!

This is very similar to Gödel’s second incompleteness theorem. The only proof system which can assert its own consistency is an inconsistent proof system, and the only type of logician that can prove his own soundness will end up being unsound. Here’s the argument that the logician might make if they believe in their own soundness:

Supposing Jal is a liar, then his statement is false, so I could eventually prove that he is a truther. But then I’d have proven something false, which I know I can never do, so Jal must not be a liar. So he must be a truther.

Since the logician has now produced a proof that Jal is a truther, Jal’s statement is false. This means that Jal cannot be a truther, so the logician has proven a false statement!

Proving the Completeness of Propositional Logic

February 27, 2020 / April 16, 2026

The completeness of a logic is a really nice property to establish. For a logic to be complete, it must be that every semantic entailment is also syntactically entailed. Said more simply, it must be that every truth in the language is provable. Gödel’s incompleteness theorems showed us that we cannot have such high hopes for mathematics in general, but we can still establish completeness for some simple logics, such as propositional and first order logic.

I want to post a proof of the completeness of propositional logic here in full for future reference. Roughly the first half of what’s below is just establishing some necessary background, so that this post is fairly self-contained and doesn’t reference lemmas that are proved elsewhere.

The only note I’ll make before diving in is that the notation I(A,P) is a way to denote the smallest set that contains A and is closed under the operations in P. It’s a handy way to inductively define sets that would be enormously complicated to define otherwise. With that out of the way, here we go!
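As a rough illustration of what I(A,P) means, here is a small Python sketch that computes such a closure by repeatedly applying the operations until nothing new appears. The function name, the `(function, arity)` encoding of operations, and the finite `universe` cutoff are all assumptions of the sketch. The sets of formulas used below are infinite, so the cutoff is only there to make the loop terminate.

```python
from itertools import product


def inductive_closure(base, operations, universe):
    """Smallest subset of `universe` that contains `base` and is closed
    under every (function, arity) pair in `operations`."""
    universe = set(universe)
    closed = set(base) & universe
    changed = True
    while changed:
        changed = False
        for op, arity in operations:
            # sorted(...) takes a snapshot, so adding to `closed` mid-loop is safe.
            for args in product(sorted(closed), repeat=arity):
                result = op(*args)
                if result in universe and result not in closed:
                    closed.add(result)
                    changed = True
    return closed


# I({3}, {+}) restricted to the numbers below 20: the multiples of 3.
multiples = inductive_closure({3}, [(lambda x, y: x + y, 2)], range(20))
print(sorted(multiples))  # [3, 6, 9, 12, 15, 18]
```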

First we define the proof system for propositional logic.

Axiom 1: α→(β→α)
Axiom 2: (α→(β→γ)) → ((α→β)→(α→γ))
Axiom 3: ((¬α)→(¬β)) → (β→α)

The α, β, and γ symbols in these axioms are meant to stand for any well-formed formula. What this means is that we actually have a countable infinity of axioms that fall into the three categories above. For simplicity, I’ll keep calling them “Axioms 1, 2, and 3”, and assume you don’t find it too confusing.

You might also notice that the axioms only involve the symbols → and ¬, but neglect ∧ and ∨. This is okay because → and ¬ are adequate connectives for the semantics of propositional logic (which is to say that any truth function can be expressed in terms of them).
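As a quick sanity check on the missing connectives, the sketch below verifies by truth table that ∧ and ∨ can be written using only → and ¬. The particular translations, ¬(α→¬β) and (¬α)→β, are the standard ones rather than anything taken from this post. This shows only that ∧ and ∨ are definable; full adequacy then follows because ¬, ∧ and ∨ together can express any truth function.

```python
from itertools import product


def implies(a, b):
    return (not a) or b


def neg(a):
    return not a


def conj(a, b):
    # α ∧ β written as ¬(α → ¬β)
    return neg(implies(a, neg(b)))


def disj(a, b):
    # α ∨ β written as (¬α) → β
    return implies(neg(a), b)


all_match = all(
    conj(a, b) == (a and b) and disj(a, b) == (a or b)
    for a, b in product([True, False], repeat=2)
)
print(all_match)  # True
```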

Axioms = {Axiom 1, Axiom 2, Axiom 3}
Deduction rule = Modus Ponens (MP)
….. MP(α, α→β) = β

The set of all provable sentences is just the set of all sentences that includes the axioms and is closed under modus ponens.

Theorems = I(Axioms, MP)

We can also easily talk about the set of sentences that can be proven from assumptions in a set Σ:

Th(Σ) = I(Axioms ∪ Σ, MP)
Notation: Σ ⊢ α iff α ∈ Th(Σ)

With that out of the way, let’s establish some basic but important results about the propositional proof system.

Monotonicity: If Σ ⊆ Σ’, then Th(Σ) ⊆ Th(Σ’).

Strong monotonicity: If Σ ⊢ Σ’, then Th(Σ’) ⊆ Th(Σ).

Intuitively, monotonicity says that if you expand the set of assumptions, you never shrink the set of theorems. Strong monotonicity says that if Σ can prove everything in Σ’, then Σ’ cannot be stronger than Σ. Both of these follow pretty directly from the definition of Th(Σ).

Soundness: If ⊢ α, then ⊨ α.
Proof by structural induction
….. Each axiom is a tautology.
….. Tautology is closed under MP (if ⊨ α and ⊨ (α→β), then ⊨ β).
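Both steps of the soundness proof can be checked mechanically. The sketch below treats α, β and γ as arbitrary truth values, which is enough because every well-formed formula takes some truth value under each assignment, and confirms that each axiom schema always comes out true and that modus ponens never leads from truths to a falsehood. The dictionary of lambdas is just my encoding of the three schemas.

```python
from itertools import product


def implies(a, b):
    return (not a) or b


AXIOMS = {
    "Axiom 1": lambda a, b, c: implies(a, implies(b, a)),
    "Axiom 2": lambda a, b, c: implies(
        implies(a, implies(b, c)),
        implies(implies(a, b), implies(a, c)),
    ),
    "Axiom 3": lambda a, b, c: implies(implies(not a, not b), implies(b, a)),
}


def is_tautology(schema):
    return all(schema(a, b, c) for a, b, c in product([True, False], repeat=3))


# Base case: every axiom schema is a tautology.
axioms_ok = all(is_tautology(schema) for schema in AXIOMS.values())

# Inductive step: whenever α and α→β are both true, β is true.
mp_ok = all(
    b for a, b in product([True, False], repeat=2) if a and implies(a, b)
)

print(axioms_ok, mp_ok)  # True True
```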

Extended Soundness: If Σ ⊢ α, then Σ ⊨ α.
Proof by structural induction
….. If α ∈ Axioms or α ∈ Σ, then Σ ⊨ α.
….. If Σ ⊨ α and Σ ⊨ (α→β), then Σ ⊨ β.

Law of identity: ⊢ (α→α)
….. α→((α→α)→α), Axiom 1
….. (α→((α→α)→α))→((α→(α→α))→(α→α)), Axiom 2
….. (α→(α→α))→(α→α), MP
….. α→(α→α), Axiom 1
….. α→α, MP
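Derivations like this one are easy to get subtly wrong, so here is a small proof checker as a sketch. Formulas are nested tuples, axiom instances are recognized by pattern matching against the three schemas, and a line is accepted if it is an axiom instance, an assumption, or follows from two earlier lines by MP. The tuple encoding, the `?A`-style metavariables and all function names are my own choices, not part of the proof system itself.

```python
def imp(a, b):
    return ("->", a, b)


def neg(a):
    return ("not", a)


# Metavariables standing for arbitrary well-formed formulas.
A, B, C = "?A", "?B", "?C"
AXIOM_SCHEMAS = [
    imp(A, imp(B, A)),
    imp(imp(A, imp(B, C)), imp(imp(A, B), imp(A, C))),
    imp(imp(neg(A), neg(B)), imp(B, A)),
]


def match(pattern, formula, binding):
    """Check whether `formula` is an instance of `pattern`, extending `binding`."""
    if isinstance(pattern, str) and pattern.startswith("?"):
        if pattern in binding:
            return binding[pattern] == formula
        binding[pattern] = formula
        return True
    if isinstance(pattern, str) or isinstance(formula, str):
        return pattern == formula
    return (
        len(pattern) == len(formula)
        and pattern[0] == formula[0]
        and all(match(p, f, binding) for p, f in zip(pattern[1:], formula[1:]))
    )


def is_axiom(formula):
    return any(match(schema, formula, {}) for schema in AXIOM_SCHEMAS)


def check_proof(lines, assumptions=()):
    proved = []
    for formula in lines:
        by_mp = any(imp(p, formula) in proved for p in proved)
        if not (is_axiom(formula) or formula in assumptions or by_mp):
            return False
        proved.append(formula)
    return True


a = "p"
identity_proof = [
    imp(a, imp(imp(a, a), a)),                                   # Axiom 1
    imp(imp(a, imp(imp(a, a), a)),
        imp(imp(a, imp(a, a)), imp(a, a))),                      # Axiom 2
    imp(imp(a, imp(a, a)), imp(a, a)),                           # MP
    imp(a, imp(a, a)),                                           # Axiom 1
    imp(a, a),                                                   # MP
]
print(check_proof(identity_proof))  # True
```

Passing a non-empty `assumptions` argument turns the same checker into a test for Σ ⊢ α derivations such as the explosion proof.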

Principle of Explosion: If Σ ⊢ α and Σ ⊢ (¬α), then Σ ⊢ β.
….. By strong monotonicity, it suffices to show that Σ ∪ {α} ∪ {¬α} ⊢ β.
……….. (¬α)→((¬β)→(¬α)), Axiom 1
……….. ¬α, Assumption
……….. (¬β)→(¬α), MP
……….. ((¬β)→(¬α))→(α→β), Axiom 3
……….. α→β, MP
……….. α, Assumption
……….. β, MP

And finally, our most important background theorem:

Deduction Theorem: Σ ⊢ (α→β) iff Σ ∪ {α} ⊢ β.
Proof =>
….. Suppose Σ ⊢ (α→β).
….. By monotonicity, Σ ∪ {α} ⊢ (α→β).
….. Also, clearly Σ ∪ {α} ⊢ α.
….. So Σ ∪ {α} ⊢ β.
Proof <=
….. Suppose Σ ∪ {α} ⊢ β.
….. Base cases
………. β ∈ Axioms. (β, β→(α→β), α→β).
………. β ∈ Σ. (β, β→(α→β), α→β).
………. β = α. (⊢ (α→α), so by monotonicity Σ ⊢ (α→α)).
….. Inductive step
………. Suppose Σ ⊢ (α→γ) and Σ ⊢ (α→(γ→𝛿)).
………. By strong monotonicity, suffices to show Σ ∪ {α→γ, α→(γ→𝛿)} ⊢ (α→𝛿)
……………. (α→(γ→𝛿)) → ((α→γ)→(α→𝛿)), Axiom 2
……………. α→(γ→𝛿), Assumption
……………. (α→γ)→(α→𝛿), MP
……………. (α→γ), Assumption
……………. (α→𝛿), MP

Now, let’s go into the main body of the proof. The structure of the proof is actually quite similar to the proof of the compactness theorem I gave previously. First we show that every consistent set of sentences Σ has a maximally consistent extension Σ’. Then we show that Σ’ is satisfiable. Now since Σ’ is satisfiable and it’s an extension of Σ, Σ must also be satisfiable. From there it’s a simple matter to show that the logic is complete.

So, let’s define some of the terms I just used.

Σ is consistent iff for no α does Σ ⊢ α and Σ ⊢ (¬α)
….. Equivalently: iff for some α, Σ ⊬ α

Σ is maximally consistent iff Σ is consistent and for every α, either Σ ∪ {α} is inconsistent or Σ ⊢ α.

One final preliminary result regarding consistency before diving into the main section of the proof:

If Σ is satisfiable, then Σ is consistent.
Proof
….. Suppose Σ is inconsistent.
….. Then there’s an α such that Σ ⊢ α and Σ ⊢ (¬α).
….. By extended soundness, Σ ⊨ α and Σ ⊨ (¬α).
….. So Σ is not satisfiable.

This is the converse of the result we actually want, but it’ll come in handy. Now, let’s begin to construct our extension!

Any consistent Σ can be extended to a maximally consistent Σ’
….. Choose any ordering {α_n} of well-formed-formulas.
….. Define Σ₀ = Σ.
….. Σ_n+1 = Σ_n if Σ_n ⊢ (¬α_n+1), and Σ_n ∪ {α_n+1} otherwise.
….. For each n, (i) Σ_n is consistent and (ii) either Σ_n ⊢ α_n or Σ_n ⊢ (¬α_n)
……….. Base case: Σ₀ is consistent by assumption, and (ii) doesn’t apply.
……….. Inductive step: Suppose Σ_n satisfies (i) and (ii). Two cases:
…………….. If Σ_n ⊢ (¬α_n+1), then Σ_n+1 = Σ_n. Clearly consistent and satisfies (ii).
…………….. If Σ_n ⊬ (¬α_n+1), then Σ_n+1 = Σ_n ∪ {α_n+1}. Clearly satisfies (ii), but is it consistent?
………………….. Suppose not. Then Σ_n+1 ⊢ (¬α_n+1), by explosion.
………………….. That is, Σ_n ∪ {α_n+1} ⊢ (¬α_n+1).
………………….. So by the Deduction Theorem, Σ_n ⊢ (α_n+1 → (¬α_n+1)).
………………….. ⊢ ((α→¬α)→¬α), so Σ_n ⊢ (¬α_n+1). This contradicts the assumption that Σ_n ⊬ (¬α_n+1)!

….. Define Σ’ = ∪ Σ_n. Σ’ is maximally consistent.
……….. Maximality
…………….. Suppose not. Then for some α_n, Σ’ ⊬ α_n and Σ’ ∪ {α_n} is consistent.
…………….. But Σ_n ⊆ Σ’, and either Σ_n ⊢ α_n or Σ_n ⊢ (¬α_n).
…………….. If Σ_n ⊢ α_n, by monotonicity Σ’ ⊢ α_n. Contradiction. So Σ_n ⊢ (¬α_n).
…………….. By monotonicity, Σ’ ⊢ (¬α_n), so Σ’ ∪ {α_n} ⊢ (¬α_n).
…………….. But Σ’ ∪ {α_n} ⊢ α_n. So Σ’ ∪ {α_n} is inconsistent. Contradiction!
……….. Consistency
…………….. Suppose Σ’ is inconsistent. Then for some α, Σ’ ⊢ α and Σ’ ⊢ (¬α).
…………….. So there are proofs of α and (¬α) from Σ’.
…………….. Proofs are finite, so each proof uses only a finite number of assumptions from Σ’.
…………….. So we can choose an n such that Σ_n contains all the needed assumptions.
…………….. Now both proofs from Σ’ are also proofs from Σ_n.
…………….. So Σ_n ⊢ α and Σ_n ⊢ (¬α).
…………….. So Σ_n is inconsistent. Contradiction!
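Here is a finite toy version of the construction, offered only as a sketch. It works over two propositional variables and a short, hand-picked list of formulas instead of an enumeration of all well-formed formulas. It also decides "Σ_n ⊢ (¬α)" by truth tables rather than by searching for proofs, a shortcut that is only justified by soundness together with the completeness theorem being proved here. All names in the code are hypothetical.

```python
from itertools import product

VARIABLES = ["p", "q"]


def evaluate(formula, v):
    """Truth value of a formula built from variables, ("not", a) and ("->", a, b)."""
    if isinstance(formula, str):
        return v[formula]
    if formula[0] == "not":
        return not evaluate(formula[1], v)
    return (not evaluate(formula[1], v)) or evaluate(formula[2], v)


def models(sigma):
    """All truth assignments over VARIABLES that satisfy every formula in sigma."""
    result = []
    for values in product([True, False], repeat=len(VARIABLES)):
        v = dict(zip(VARIABLES, values))
        if all(evaluate(s, v) for s in sigma):
            result.append(v)
    return result


def entails(sigma, formula):
    # Semantic stand-in for "sigma proves formula".
    return all(evaluate(formula, v) for v in models(sigma))


def extend(sigma, enumeration):
    current = list(sigma)
    for alpha in enumeration:
        # Σ_n+1 = Σ_n if Σ_n refutes α_n+1, and Σ_n ∪ {α_n+1} otherwise.
        if not entails(current, ("not", alpha)):
            current.append(alpha)
    return current


sigma = [("->", "p", "q")]
enumeration = ["p", "q", ("not", "p"), ("->", "q", "p")]
extended = extend(sigma, enumeration)
print(extended)
print(models(extended))
```

In this run the extension keeps p, q and q→p but skips ¬p, and it ends up with exactly one satisfying assignment, which is the toy counterpart of a maximally consistent set pinning down a single valuation.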

Alright, we’re almost there! So now we have that for any consistent Σ, there’s an extension Σ’ that is maximally consistent. We’ll take it a little further and prove that not only is Σ’ maximally consistent, it’s also complete! (This is the purely syntactic sense of completeness, which is that for every sentence α, either Σ’ proves α or refutes α. This is different from the sense of logical completeness that we’re establishing with the proof.)

Σ’ is complete.
….. Σ_n ⊆ Σ’, and Σ_n ⊢ α_n or Σ_n ⊢ (¬α_n).
….. So by monotonicity Σ’ ⊢ α_n or Σ’ ⊢ (¬α_n).

Now we have everything we need to show that Σ’, and thus Σ, is satisfiable.

If Σ is consistent, then Σ is satisfiable.
Proof
….. Let Σ’ be a maximally consistent extension of Σ.
….. Define V_Σ’(p) over propositional variables p:
….. V_Σ’(p) = T if Σ’ ⊢ p and F if Σ’ ⊬ p
….. Claim: Ṽ_Σ’(α) = T iff Σ’ ⊢ α, where Ṽ_Σ’ extends V_Σ’ to all formulas. Proof by structural induction on α.
……….. Base case: Let α be a propositional variable. Then Ṽ_Σ’(α) = T iff Σ’ ⊢ α by definition of V_Σ’.
……….. Inductive steps:
……….. (¬α)
…………….. If Σ’ ⊢ (¬α), then by consistency Σ’ ⊬ α, so Ṽ_Σ’(α) = F, so Ṽ_Σ’(¬α) = T.
…………….. If Σ’ ⊬ (¬α), then by completeness Σ’ ⊢ α. So Ṽ_Σ’(α) = T, so Ṽ_Σ’(¬α) = F.
……….. (α→β)
…………….. Suppose Σ’ ⊢ (α→β). By completeness Σ’ ⊢ α or Σ’ ⊢ (¬α).
………………….. If Σ’ ⊢ α, then Σ’ ⊢ β, so Ṽ_Σ’(β) = T, so Ṽ_Σ’(α→β) = T.
………………….. If Σ’ ⊢ (¬α), then Ṽ_Σ’(α) = F, so Ṽ_Σ’(α→β) = T.
……………. Suppose Σ’ ⊬ (α→β).
………………….. By completeness Σ’ ⊢ ¬(α→β).
………………….. ⊢ (β→(α→β)), so Σ’ ⊬ β on pain of contradiction. So Ṽ_Σ’(β) = F.
………………….. Suppose Ṽ_Σ’(α→β) = T. Then Ṽ_Σ’(α) = F, so Σ’ ⊢ (¬α).
………………….. ⊢ (¬α→(α→β)). So Σ’ ⊢ (α→β). Contradiction.
………………….. So Ṽ_Σ’(α→β) = F.
….. Every sentence in Σ’ is provable from Σ’, so Ṽ_Σ’ makes all of Σ’ true. That is, V_Σ’ satisfies Σ’.
….. Σ ⊆ Σ’, so V_Σ’ satisfies Σ.
….. So Σ is satisfiable!

Now our final result becomes a four-line proof.

If Σ ⊨ α, then Σ ⊢ α.
Proof
….. Suppose Σ ⊬ α.
….. Then Σ ∪ {¬α} is consistent.
….. So Σ ∪ {¬α} is satisfiable.
….. So Σ ⊭ α.
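The second line of this proof hides a small argument, so here is one way to fill it in, written as a sketch. It argues by contrapositive and relies on the lemma ⊢ ((¬α→α)→α), which is an assumption here: it is the mirror image of the lemma ⊢ ((α→¬α)→¬α) used above, and like that one it is not proved in this post.

```latex
\begin{align*}
&\text{Suppose } \Sigma \cup \{\neg\alpha\} \text{ is inconsistent.} \\
&\Sigma \cup \{\neg\alpha\} \vdash \alpha
  && \text{(Principle of Explosion)} \\
&\Sigma \vdash (\neg\alpha \to \alpha)
  && \text{(Deduction Theorem)} \\
&\vdash ((\neg\alpha \to \alpha) \to \alpha)
  && \text{(assumed lemma)} \\
&\Sigma \vdash \alpha
  && \text{(monotonicity, MP)} \\[4pt]
&\text{Hence: } \Sigma \nvdash \alpha \;\Longrightarrow\;
  \Sigma \cup \{\neg\alpha\} \text{ is consistent.}
\end{align*}
```

The last line of the proof is then immediate: a truth assignment satisfying Σ ∪ {¬α} satisfies Σ while making α false, which is exactly a counterexample to Σ ⊨ α.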

And we’re done! We’ve shown that if any sentence α is semantically entailed by a set of sentences Σ, then it must also be provable from Σ! If you’ve followed this proof all the way, pat yourself on the back.

With the Completeness Theorem in hand, the proof of the Compactness Theorem goes from several pages to a few lines. It’s so nice and simple that I just have to include it here.

If Σ is finitely satisfiable, then Σ is satisfiable.

….. Suppose Σ is not satisfiable.
….. Then Σ is not consistent.
….. So there is some α for which Σ ⊢ α and Σ ⊢ (¬α).
….. Since proofs are finite, there must be some finite subset Σ* of Σ such that Σ* ⊢ α and Σ* ⊢ (¬α).
….. By soundness, Σ* ⊨ α and Σ* ⊨ (¬α).
….. So Σ* is not satisfiable!

In other words, if Σ is not satisfiable, then there’s some finite subset of Σ that’s also not satisfiable. This is the Compactness Theorem! Previously we proved it entirely based off of the semantics of propositional logic, but now we can see that it is also provable as a consequence of the finite nature of our proof system!

Four Pre-Gödelian Limitations on Mathematics

February 22, 2020 / April 16, 2026

Even prior to the devastating Incompleteness Theorems there were hints of what was to come. I want to describe and prove four results in mathematical logic that don’t depend on Incompleteness at all, but establish some rather serious limitations on the project of mathematics.

Here are the four. I’ll go through them in order of increasing level of sophistication required to prove them.

1. Indescribable sets of possible worlds
2. Noncompossibility theorem
3. Inevitable nonstandard numbers
4. Mysterious missing subsets

1. Indescribable sets of possible worlds

I already talked about this one here. The basic idea is that even in our safest and least troublesome logic, propositional calculus, it turns out that the language is insufficient to fully capture all the semantic notions. It’s the first hint at something going awry with syntax and semantics, where the semantics can outpace the syntax and leave axiomatic mathematics behind.

So to recap: the result is that in propositional logic there are sets of truth assignments that cannot be “described” by any set of propositional sentences (even allowing infinite sets!). A set of sentences is said to “describe” a set of truth assignments if that set of truth assignments is the unique set of truth assignments consistent with all those sentences being true. If we think of truth assignments as possible worlds, and sets of sentences as descriptions of sets of possible worlds, then this result says that there are sets of possible worlds in propositional semantics that cannot be described by any propositional syntax.

The proof of this is astoundingly simple: just look at the cardinality of the set of descriptions and the cardinality of the set of sets of possible worlds. The second is strictly larger than the first, so any mapping from descriptions to sets of possible worlds will of necessity leave some sets of possible worlds out. In fact, it also tells us that virtually all sets of possible worlds are not describable!
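Spelled out, the counting goes roughly as follows. This sketch assumes a countably infinite stock of propositional variables, written Var, with Sent for the set of sentences built from them; both names are introduced here just for the calculation.

```latex
\begin{align*}
|\mathrm{Sent}| &= \aleph_0
  && \text{(finite strings over a countable alphabet)} \\
|\mathcal{P}(\mathrm{Sent})| &= 2^{\aleph_0}
  && \text{(descriptions: sets of sentences)} \\
|\{\, v : \mathrm{Var} \to \{T, F\} \,\}| &= 2^{\aleph_0}
  && \text{(possible worlds: truth assignments)} \\
|\mathcal{P}(\{\, v : \mathrm{Var} \to \{T, F\} \,\})| &= 2^{2^{\aleph_0}} > 2^{\aleph_0}
  && \text{(sets of possible worlds, by Cantor's theorem)}
\end{align*}
```

Since there are only 2^ℵ₀ descriptions but 2^(2^ℵ₀) sets of possible worlds, no assignment of descriptions to sets of worlds can cover them all.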

2. Noncompossibility Theorem

The noncompossibility theorem is a little-known theorem that establishes a serious limitation on our ability to describe mathematical structures. Here’s what it says. Suppose that you have a description of a countably infinite structure (like, say, the natural numbers, which have a countable infinity of objects) that has the following three properties:

(1) The language has a term for denoting every object in the structure (like 0, 1, 2, 3, 4, and so on)
(2) The axioms in your description are weakly complete: if something is inconsistent with the axioms, it can be proven false.
(3) There is some algorithm for determining whether any given sentence is an axiom.

The noncompossibility theorem tells us that if you have all three of these properties, then your axioms will fail to uniquely pick out your intended structure, and will include models that have extra objects that aren’t in the structure.

Let’s prove this.

We’ll denote the mathematical structure that we’re trying to describe as M and our language as L. We choose L to have sufficient syntactic structure to express the truths of M. From L, we select a decidable set of sentences X with the goal that all these sentences be true of M. We now select a proof system F in L such that for any finite extension L* to L involving only new constant terms, and for any Y ⊆ L*, if X ∪ Y is not satisfiable, then F refutes some finite subset of X ∪ Y.

(As an aside: Why care about this strange weak form of completeness? Well, intuitively all that it’s saying is that our axioms should be able to rule out any set of sentences that are inconsistent with them using some finite proof, as long as those sentences only use finitely many additional constant symbols. This is relatively weak compared to the usual notions of completeness that logicians talk about, which makes it an even better choice for our purposes, as the weaker the axiom the harder to deny.)

Our assumptions can now be written:

(0) |M| is countably infinite.
(1) ∀m ∈ M, ∃t_m ∈ terms(X) such that (m = t_m)
(2) If we extend L to L* by adding finitely many constant symbols, then for any Y ⊆ L*, if X ∪ Y is not satisfiable, then F refutes some finite subset of X ∪ Y.
(3) X is decidable.

Our proof starts by adding a new constant term c to our language and constructing an extension of X:

Y = X ∪ {c ≠ t_m | m ∈ M}

In other words, Y is X but supplemented with the assertion that there exists an object that isn’t in M. If we can prove that Y is satisfiable, then this entails that X is also satisfiable by the same truth assignment. And this means that there is a model of X in which there are extra objects that aren’t in M.

We proceed with proof by contradiction. Suppose that Y is not satisfiable. Then, by assumption (2), F must refute some finite subset Z of Y. But since Z is finite, it mentions only finitely many of the terms t_m. And since M is countably infinite, there will always be objects in M that are not equal to any of those finitely many terms! The sentences of X are true of M, so interpreting c as one of those objects makes M itself a model of Z. So Z is satisfiable, and a sound proof system F cannot refute it. Contradiction! Thus Y is satisfiable.

And since Y is satisfiable, there’s some truth assignment v that satisfies all of Y. But then v also satisfies X, as X is a subset of Y and removing axioms cannot rule out models, only add more! And in v, the constant c denotes an object that is not equal to any t_m. So we’ve proven that X has a model in which there is an object that is not equal to any of the objects in M. That is, X is not categorical: it does not uniquely describe M.

Tennant described this theorem as saying that “in countably infinite realms, you cannot know both where you are and where you are going.” More dully, we cannot have a satisfactory theory of a countably infinite mathematical structure that is both categorical and weakly complete. This isn’t super shocking by today’s standards, but it’s quite cool when you consider how little elaborate theoretical apparatus is required to prove it.

3. Inevitable nonstandard numbers

Suppose we have some first-order theory T that is meant to describe the natural numbers, so that ℕ is a model of T. Take this theory and append to it a new constant symbol c, as well as an infinite axiom schema saying “c > 0”, “c > 1”, “c > 2”, and so on forever. Call this new theory T*.

Does T* have a model? Well by the compactness theorem, it has a model as long as all its finite subsets have a model. And for every finite subset of T*’s axioms, the natural numbers are a model! So T* does have a model. Could this model be the natural numbers? Clearly not, because to satisfy T*, there must be a number greater than all the natural numbers. So whatever the model of T* is, it’s not the standard natural numbers. Let’s call it a nonstandard model, and label it ℕ*.

Here’s the final step of the proof: ℕ* is a model of T*, and T* is a superset of T, so ℕ* must also be a model of T! And thus we find that in any logic with a compactness theorem, a theory of the natural numbers will have models with nonstandard numbers that are greater than all of ℕ.

It’s one of my favorite proofs, because it’s so easy to describe and has such a devastating conclusion. It’s also an example of the compactness theorem using the existence of one type of model (ℕ for each of the finite cases) to prove the existence of something entirely different (ℕ* for the infinite case).
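The step "for every finite subset, the natural numbers are a model" can be made concrete with a few lines of Python. A finite subset of the new axioms only mentions finitely many bounds k in “c > k”, so ℕ satisfies it as long as c is interpreted as something bigger than the largest bound mentioned. The function name is made up, and the sketch covers only the new axioms about c, taking for granted that ℕ already satisfies T.

```python
def natural_witness(bounds):
    """Return a natural number c with c > k for every k in the finite list `bounds`."""
    return max(bounds, default=-1) + 1


# A few finite subsets of the schema "c > 0", "c > 1", "c > 2", ...
finite_subsets = [[], [0], [0, 1, 2], [5, 1000, 42]]
for bounds in finite_subsets:
    c = natural_witness(bounds)
    assert all(c > k for k in bounds)

# No single natural number works for the whole schema: whatever c we pick,
# the axiom "c > c" is among the infinitely many we would have to satisfy.
c = natural_witness([0, 1, 2])
assert not (c > c)
```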

4. Mysterious missing subsets

The Löwenheim-Skolem theorem tells us that if a first-order theory has a model with an infinite cardinality, then it has models with every infinite cardinality. This places a major restriction on our ability to describe an infinite mathematical structure using first order logic. For if we were to try to single out the natural numbers, say, we would inevitably end up failing to rule out models of our axioms that have the cardinality of the real numbers, or worse, the cardinality of the set of functions from real numbers to real numbers, and so on for all other possible infinite cardinalities.

When applied to set theory, this implies a result that seems on its face to be a straightforward contradiction. Namely, Löwenheim-Skolem tells us that any first order axiomatization of sets will inevitably have a model that contains only a countable infinity of sets. But this seems bizarre, as all we appear to need to rule out countably infinite universes of sets is one axiom that asserts the existence of a countably infinite set, and another that admits the power set of any set to the universe of sets. Then we will be forced to admit that there is a set which is the power set of a countably infinite set, and as Cantor’s famous diagonal argument shows, this set is uncountably large.

So on the one hand, Cantor tells us that there are sets that contain uncountably many objects. And on the other hand, Löwenheim-Skolem tells us that there is a model of set theory with only countably many objects. This dichotomy is known by the name Skolem’s paradox. It appears to be a straightforward contradiction, but it’s not.

What Skolem realized was that the formal notion of a power set, which is something like “the set P(X) such that for all sets Y, if Y is a subset of X, then Y is an element of P(X)”, relies on a quantification over all sets, and that in a countable universe of sets, that quantification ranges over only a countable number of objects. In other words, P(X) is only uncountable if our quantifier ranges over all possible sets, but for a countable model, there are sets that are not describable within the model. This means that the notion of a power set is relative to your model of set theory! In fact, there’s no way in first order logic to unambiguously pin down what you mean by “power set” in such a way that all models will agree on what P(X) actually contains. It also means that the notions of cardinality and countability are relative to your model! In Skolem’s words, “even the notions ‘finite’, ‘infinite’, ‘simply infinite sequence’ and so forth turn out to be merely relative within axiomatic set theory.”

A challenge to constructivists

February 17, 2020May 25, 2020 ~ ~ 6 Comments

Constructive mathematicians do not accept a proof of existence unless it provides a recipe for how to construct the thing whose existence is being asserted. Constructive mathematics is quite interesting, but it also appears to have some big problems. Here’s a challenge for constructivists:

Suppose that I hand you some complicated function f from a set A to another set B. I ask you: “Can every element in B be reached by applying the function to an element in A?” In other words, is f surjective?

Now, it so happens that the cardinality of B is greater than the cardinality of A. That’s sufficient to tell us that f can’t be surjective, as however it maps elements there will always be some left over. So we know that the answer is “no, we can’t reach every element in B.” But we proved this without explicitly constructing the particular element in B that can’t be reached! So a constructivist will be left unsatisfied.

The trick is that I’ve made this function extremely complicated, so that there’s no clever way for them to point to exactly which element is missing. Would they say that even though |B| is strictly larger than |A|, it could still be somehow that every element in B is in the image of f? Imagine asking them to bet on this proposition. I don’t think any sane person would put any money on the proposition that f is onto.

And as a final kicker, our sets don’t even have to be infinite! Let |A| = 20 and |B| = 21. I describe a function from A to B, such that actually computing the “missing element” involves having to calculate the 21st Busy Beaver number or something. And the constructivist gets busy searching for the particular element in B that doesn’t get mapped to, instead of just saying “well of course we can’t map 20 elements to 21 elements!”

Even simpler, let F map {1} to {1,2} as follows: F(1) = 1 if the last digit of the 20th busy beaver number is 0, 1, 2, 3, or 4, and F(1) = 2 otherwise. Now to prove constructively that there is an element in {1, 2} that isn’t in the image of F requires knowing the last digit of the 20th busy beaver number, which humans will most likely never be able to calculate (we’re stuck on the fifth one now). So a constructivist will remain uncertain about whether F is surjective.

But a sane person would just say “look, of course F isn’t surjective; it maps one object to two objects. You can’t do this without leaving something out! It doesn’t matter if we don’t know which element is left out, it has to be one of them!”
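The counting argument can be checked by brute force on small sets. This is a minimal sketch, with the sets `A` and `B` shrunk to 3 and 4 elements so that every function can be enumerated; `is_surjective` and `candidates_for_F` are illustrative names. Note that this is exactly the non-constructive style of reasoning: we conclude F is not onto without knowing which definition of F is the real one.

```python
from itertools import product

def is_surjective(f, codomain):
    """True if every element of the codomain is hit by the function f (a dict)."""
    return set(f.values()) == set(codomain)

A = [1, 2, 3]
B = ["a", "b", "c", "d"]

# Every function from A to B: 4^3 = 64 of them.
functions = [dict(zip(A, images)) for images in product(B, repeat=len(A))]
surjective_count = sum(is_surjective(f, B) for f in functions)

# The two possible definitions of F: {1} -> {1, 2}. Whichever one the
# busy beaver digit selects, F misses an element of {1, 2}.
candidates_for_F = [{1: 1}, {1: 2}]
f_never_onto = all(not is_surjective(F, [1, 2]) for F in candidates_for_F)

print(surjective_count, f_never_onto)
```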

And if humanity is about to meet an alien civilization with immense computational power that knows all the digits of the 20th busy beaver number, the standard mathematician could bet their entire life savings on F not being surjective at any odds whatsoever, and the constructive mathematician would bet in favor of F being surjective at some odds. And of course, the constructivist would be wrong and lose money! So this also means that you have a way to make money off of any constructivist mathematicians you encounter, so long as we’re about to make contact with advanced aliens.

Describing the world

February 16, 2020February 16, 2020 ~ ~ 1 Comment

Wittgenstein starts his Tractatus Logico-Philosophicus with the following two sentences.

1. The world is everything that is the case.

1.1 The world is the totality of facts, not of things.

Let’s take him up on this suggestion and see how far we get. In the process, we’ll discover some deep connections to theorems in mathematical logic, as well as some fascinating limitations on the expressive powers of propositional and first order logic.

We start out with a set of atomic propositions. For a very simple world, we might only need a finite number of these: “Particle 1 out of 3 has property 1 out of 50”, “Particle 2 of 3 has property 17 out of 50”, and so on. More realistically, the set of atomic propositions will be infinite (countable if the universe doesn’t have any continuous properties, and uncountable otherwise).

For simplicity, we’ll imagine labeling our set of atomic propositions P₁, P₂, P₃, and so on. (This entails that there are at most countably many, but nothing important will rest on this assumption.) We combine these atomic propositions with the operators of propositional logic {(, ), ¬, ∧, ∨, →}. This allows us to build up more complicated propositions, like ((P₇∧P₂)→(¬P₁₃)). This will be the language that we use to describe the world.

Now, the way that the world is is just a consistent assignment of truth values to the set of all grammatical sentences in our language. For example, one simple assignment of truth values is the one that assigns “True” to all atomic propositions. Once we’ve assigned truth values to all the atomic propositions, we get the truth values for the rest of the set of grammatical sentences for free, by the constraint that our truth assignment be consistent. (For instance, if P₁ and P₂ are both true, then (P₁∧P₂) must also be true.)

Alright, so the set of ways the world could be corresponds to the set of truth assignments over our atomic propositions. The final ingredient is the notion that we can encode our present knowledge of the world as a set of sentences. Maybe we know by observation that P₅ is true, and either P₂ or P₃ is true but not both. Then to represent this state of knowledge, we can write the following set of sentences:

{P₅, (P₂∨P₃), ¬(P₂∧P₃)}

Any set of sentences picks out a set of ways the world could be, such that each of these possible worlds is compatible with that knowledge. If you know nothing at all, then the set of sentences representing your knowledge will be the empty set {}, and the set of possible worlds compatible with your knowledge will be the set of all possible worlds (all possible truth assignments). On the other extreme, you might know the truth values of every atomic proposition, in which case your state of knowledge uniquely picks out one possible world.

In general, as you add more sentences to your knowledge-set, you cut out more and more possible worlds. But this is not always true! Ask yourself what the set of possible worlds corresponding to the set {(P₁∨¬P₁), (P₂∨¬P₂), (P₃∨¬P₃)} is. Since each of these sentences is a tautology, no possible worlds are eliminated! So our set of possible worlds is still the set of all worlds.
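To make the correspondence between knowledge-sets and possible worlds concrete, here is a small sketch for a toy world with only five atomic propositions (an assumption made so the worlds can be enumerated). Sentences are modeled as Python functions from a world to a truth value; `ATOMS`, `all_worlds`, and `compatible_worlds` are illustrative names.

```python
from itertools import product

ATOMS = ["P1", "P2", "P3", "P4", "P5"]

def all_worlds(atoms):
    """Every truth assignment over the given atoms, as a list of dicts."""
    return [dict(zip(atoms, values))
            for values in product([True, False], repeat=len(atoms))]

def compatible_worlds(knowledge, atoms=ATOMS):
    """Worlds in which every sentence of the knowledge-set is true."""
    return [w for w in all_worlds(atoms) if all(s(w) for s in knowledge)]

# {P5, (P2 ∨ P3), ¬(P2 ∧ P3)}
knowledge = [
    lambda w: w["P5"],
    lambda w: w["P2"] or w["P3"],
    lambda w: not (w["P2"] and w["P3"]),
]

# {(P1 ∨ ¬P1), (P2 ∨ ¬P2), (P3 ∨ ¬P3)}: tautologies rule nothing out.
tautologies = [
    lambda w: w["P1"] or not w["P1"],
    lambda w: w["P2"] or not w["P2"],
    lambda w: w["P3"] or not w["P3"],
]

print(len(compatible_worlds([])))           # knowing nothing
print(len(compatible_worlds(knowledge)))    # the example knowledge-set
print(len(compatible_worlds(tautologies)))  # only tautologies
```

With five atoms there are 32 worlds; the example knowledge-set fixes P₅ and forces exactly one of P₂ and P₃, leaving 8.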

Now we get to an interesting question: clearly for any knowledge-set of sentences, you can express a set of possible worlds consistent with that knowledge set. But is it the case that for any set of possible worlds, you can find a knowledge-set that uniquely picks it out? If I hand you a set of truth assignment functions and ask you to tell me a set of propositions which are consistent with that set of worlds and ONLY that set, is that always possible? Essentially, what we’re asking is if all sets of possible worlds are describable.

We’ve arrived at the main point of this essay. Take a minute to ponder this and think about whether it’s possible, and why/why not! For clarification, each sentence can only be finitely long. But! You’re allowed to include an infinity of sentences.

(…)

(Spoiler-hiding space…)

(…)

If there were only a finite number of atomic propositions, then you could pick out any set of possible worlds with just a single sentence in conjunctive normal form. But when we start talking about an infinity of atomic propositions, it turns out that it is not always possible! There are sets of possible worlds that are literally not describable, even though our language includes the capacity to describe each of those worlds and we’re allowed to include an infinite set of sentences.
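The finite case can be verified directly. The sketch below assumes three atomic propositions and, for simplicity, builds the describing sentence as a disjunction with one conjunction per world (disjunctive normal form, swapped in here because it is easier to write down than the conjunctive form). It then checks all 256 sets of worlds. The names `describe`, `holds`, and `picked_out` are illustrative.

```python
from itertools import combinations, product

ATOMS = ["P1", "P2", "P3"]
WORLDS = [dict(zip(ATOMS, values))
          for values in product([True, False], repeat=len(ATOMS))]

def describe(target_worlds):
    """One clause per world; a clause lists (atom, required truth value) pairs."""
    return [[(a, w[a]) for a in ATOMS] for w in target_worlds]

def holds(sentence, world):
    """Evaluate a disjunction of conjunctions (the empty disjunction is false)."""
    return any(all(world[a] == v for a, v in clause) for clause in sentence)

def picked_out(sentence):
    """The worlds in which the sentence is true, in the order of WORLDS."""
    return [w for w in WORLDS if holds(sentence, w)]

# Check every one of the 2^8 = 256 sets of worlds.
every_set_describable = all(
    picked_out(describe(list(target))) == list(target)
    for size in range(len(WORLDS) + 1)
    for target in combinations(WORLDS, size)
)

print(every_set_describable)
```

The construction breaks down with infinitely many atoms for two reasons: each clause would have to be infinitely long, and an infinite set of worlds would need an infinite disjunction.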

There’s a super simple proof of this. Let’s give a name to the cardinality of the set of sentences: call it K. (We’ve been tacitly acting as if the cardinality is countable this whole time, but that doesn’t actually matter.) What’s the cardinality of the set of all truth assignments?

Well, each truth assignment is a function from all sentences to {True, False}. And there are 2^K such assignments. 2^K is strictly larger than K, so there are more possible worlds than there are sentences. Now, the cardinality of the set of sets of sentences is also 2^K. But the cardinality of the set of SETS of truth assignments is 2^{2^K}!

What this means is that we can’t map sets of sentences onto sets of truth assignments without leaving some things out! This proof carries over to predicate logic as well. The language for both propositional and predicate logic is unable to express all sets of possible worlds corresponding to that language!

I love this result. It’s the first hint in mathematical logic that syntax and semantics can come apart.

That result is the climax of this post. What I want to do with the rest of this post is to actually give an explicit example of a set of truth assignments that are “indescribable” by any set of sentences, and to prove it. Warning: If you want to read on, things will get a bit more technical from here.

Alright, so we’ll use a shortcut to denote truth assignments. A truth assignment will be written as a string of “T”s and “F”s, where the nth character corresponds to how the truth assignment evaluates P_n. So the all-true truth assignment will just be written “TTTTTT…” and the all-false truth assignment will be written “FFFFF…”. The truth assignment corresponding to P₁ being false and everything else true will be written “FTTTTT…”. And so on.

Now, here’s our un-describable set of truth assignments. {“FFFFFF…”, “TFFFFF…”, “TTFFFF…”, “TTTFFF…”, …}. Formally, define V_n to be the truth assignment that assigns “True” to every atomic proposition up to and including P_n, and “False” to all others. Now our set of truth assignments is just {V_n | n ∈ ℕ}.

Let’s prove that no set of sentences uniquely picks out this set of truth assignments. We prove by contradiction. Suppose that we could find a set of sentences that uniquely picks out these truth assignments and no others. Let’s call this set A. Construct a new set of sentences A’ by adding all atomic propositions to A: A’ = A ∪ {P₁, P₂, P₃, …}.

Is there any truth assignment that is consistent with all of A’? Well, we can answer this by using the Compactness Theorem: A’ has a truth assignment if and only if every finite subset of A’ has a truth assignment. But every finite subset of A’ involves sentences from A (which are consistent with V_n for each n by assumption), and a finite number of atomic propositions. Since each finite subset of A’ is only asserting the truth of a finite number of atomic sentences, we can always find a truth assignment V_k in our set that is consistent with it, by choosing one that switches to “False” long after the last atomic proposition that is asserted by our finite subset.

This means that each finite subset of A’ is consistent with at least one of our truth assignments, which means that A’ is consistent with at least one of our truth assignments. But A’ involves the assertion that all atomic propositions are true! The only truth assignment that is consistent with this assertion is the all-true assignment! And is that truth assignment in our set? No! And there we have it, we’ve reached our contradiction!
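The two halves of the argument can be illustrated with a short sketch. It models only the atomic part of A’ (the sentences of A are satisfied by every V_n by assumption, so they are left out), and it represents a finite subset by the set of indices of the atoms it asserts. The names `V`, `satisfies_atoms`, and `witness_for` are illustrative.

```python
def V(n):
    """Truth assignment V_n: P_i is True exactly when i <= n (for i >= 1)."""
    return lambda i: i <= n

def satisfies_atoms(assignment, atom_indices):
    """True if the assignment makes every listed atomic proposition true."""
    return all(assignment(i) for i in atom_indices)

def witness_for(atom_indices):
    """An index k such that V_k satisfies every atom in a FINITE collection."""
    return max(atom_indices, default=0)

# Any finite subset of {P1, P2, P3, ...} is satisfied by some V_k ...
finite_subset = {3, 7, 42}
k = witness_for(finite_subset)
print(k, satisfies_atoms(V(k), finite_subset))

# ... but each V_n falsifies P_(n+1), so no V_n satisfies ALL the atoms.
print(all(not V(n)(n + 1) for n in range(1000)))
```

The second check only samples the first thousand values of n; the general claim follows directly from the definition of V_n.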

We cannot actually describe a set of possible worlds in which either all atomic propositions are false, or only the first is true, or only the first two are true, or only the first three are true, and so on forever. But this might prompt the question: didn’t you just describe it? How did you do that, if it’s impossible? Well, technically I didn’t describe it. I just described the first four possibilities and then said “and so on forever”, assuming that you knew what I meant. To have actually fully pinned down this set of possible worlds, I would have had to continue with this sentence forever. And importantly, since this sentence is a disjunction, I could not split this infinite sentence into an infinite set of finite sentences. This fundamental asymmetry between ∨ and ∧ is playing a big role here: while an infinite conjunction can be constructed by simply putting each clause in the conjunction as a separate sentence, an infinite disjunction cannot be. This places a fundamental limit on the ability of a language with only finite sentences to describe the world.

The Surprise-Response Heuristic

February 13, 2020 ~ ~ 1 Comment

Often we judge if somebody else is understanding something that we do not understand by whether the things they say in response to our questions are surprising.

When somebody understands it about as well as you do, the things they say about it will generally be fairly understandable and expected (as they mesh with your current insufficient level of understanding). But if they actually understand it and you don’t, then you should expect to be surprised by the things they say, since you couldn’t have produced those responses yourself or predicted them coming.

Compare:

Q: “In this step of the proof, are they talking about extending the model or the language?”

A1: “They’re talking about extending the model. Look at the way that they worded the description of the extension in the previous step, it specifically describes adding a character to the model, not the language.”

A2: “No, they couldn’t be extending the model even though the wording suggests that, because then the proof wouldn’t even work; it’s required that we just change the language or else we end up working with a different model and failing to prove that the original model had the desired property. Also, it doesn’t even make sense to talk about adding a character to a model, the characters are a property of the language.”

Even with no context to judge whether the claims are true, I imagine that the second response feels much more convincing than the first, even though it’s probably less likely to be understood. The first is the type of response that is unsurprising and easy to see coming, and indicates only that the person is understanding the grammar of the English sentences they’re reading. It doesn’t strongly discriminate between a person that understands what’s going on and a person that doesn’t. The second is certainly surprising; it suggests that the person objects to the specific wording of the proof because of their understanding of the way it misrepresents the logical structure of the argument. They aren’t just comprehending the grammar, they are comprehending the actual content. Ordinarily, a person wouldn’t be able to off-the-cuff make up a response like that without actually understanding what’s going on.

This is a problem when people are good at saying surprising things without understanding. I’ve met a few people that are very good “contrarians”; they are good at coming up with strange and creative ways to say things that ultimately shed very little light on the topic at hand. I’ve often found myself in a weird position with such people, where I feel like they understand the topic at hand better than I do, and yet simultaneously I’m deeply suspicious of every word coming out of their mouths.

There’s a problem with infinity

November 4, 2019April 16, 2026 ~ ~ 3 Comments

Last post I described the Ross-Littlewood paradox, in which an ever-expanding quantity of numbered billiard balls are placed into a cardboard box in such a way that after an infinite number of steps the box ends up empty. Here’s a version of this paradox:

Process 1
Step 1: Put 1 through 9 into the box.
Step 2: Take out 1, then put 10 through 19 into the box.
Step 3: Take out 2, then put 20 through 29 into the box.
Step 4: Take out 3, then put 30 through 39 into the box.
And so on.

Box contents after each step
Step 1: 1 through 9
Step 2: 2 through 19
Step 3: 3 through 29
Step 4: 4 through 39
And so on.

Now take a look at a similar process, where instead of removing balls from the box, we just change the number that labels them (so, for example, we paint a 0 after the 1 to turn “Ball 1” to “Ball 10”).

Process 2
Step 1: Put 1 through 9 into the box
Step 2: Change 1 to 10, then put 11 through 19 into the box.
Step 3: Change 2 to 20, then put 21 through 29 in.
Step 4: Change 3 to 30, then put 31 through 39 in.
And so on.

Box contents after each step
Step 1: 1 through 9
Step 2: 2 through 19
Step 3: 3 through 29
Step 4: 4 through 39
And so on.

Notice that the box contents are identical after each step. If that’s all that you are looking at (and you are not looking at what the person is doing during the step), then the two processes are indistinguishable. And yet, Process 1 ends with an empty box, and Process 2 ends with infinitely many balls in the box!
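The step-by-step agreement between the two processes can be checked for any finite number of steps. This is a minimal sketch that models the box as a set of labels; `process_1` and `process_2` are illustrative names, and of course no finite simulation reaches the "end" of either process.

```python
def process_1(steps):
    """Remove ball n-1, then add balls 10(n-1) through 10(n-1)+9."""
    box = set(range(1, 10))
    history = [set(box)]
    for n in range(2, steps + 1):
        box.remove(n - 1)
        box.update(range(10 * (n - 1), 10 * n))
        history.append(set(box))
    return history

def process_2(steps):
    """Relabel ball n-1 as 10(n-1), then add balls 10(n-1)+1 through 10(n-1)+9."""
    box = set(range(1, 10))
    history = [set(box)]
    for n in range(2, steps + 1):
        box.remove(n - 1)        # the old label disappears ...
        box.add(10 * (n - 1))    # ... but the same ball stays, with a new label
        box.update(range(10 * (n - 1) + 1, 10 * n))
        history.append(set(box))
    return history

print(process_1(50) == process_2(50))
print(sorted(process_1(4)[3])[:3], max(process_1(4)[3]))
```

The label sets agree at every step, even though Process 1 has discarded a ball at each step after the first and Process 2 has discarded none.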

Why does Process 2 end with an infinite number of balls in it, you ask?

Process 2 ends with infinitely many balls in the box, because no balls are ever taken out. 1 becomes 10, which later becomes 100, which becomes 1000, and so on forever. At infinity you have all the natural numbers, but with an infinite number of zeros appended to each one.

So apparently the method you use matters, even when two methods provably get you identical results! There’s some sort of epistemic independence principle being violated here. The outputs of an agent’s actions should be all that matters, not the specific way in which the agent obtains those outputs! Something like that.

Somebody might respond to this: “But the outputs of the actions aren’t the same! In Process 1, ten balls are added and one removed at each step, whereas in Process 2, nine are added at each step. This is the same with respect to the box, but not with respect to the rest of the universe! After all, those balls being removed in Process 1 have to go somewhere. So somewhere in the universe there’s going to be a big pile of discarded balls, which will not be there in Process 2.”

This response holds water as long as our fictional universe doesn’t violate conservation of information; if it does, these balls can just vanish into thin air, leaving no trace of their existence. But that rebuttal feels cheap. Instead, let’s consider another variant that gets at the same underlying problem of “relevance of things that should be irrelevant”, but avoids this problem.

Process 1 (same as before)
Step 1: Put 1 through 9 into the box.
Step 2: Take out 1, then put 10 through 19 into the box.
Step 3: Take out 2, then put 20 through 29 into the box.
Step 4: Take out 3, then put 30 through 39 into the box.
And so on.

Box contents after each step
Step 1: 1 through 9
Step 2: 2 through 19
Step 3: 3 through 29
Step 4: 4 through 39
And so on.

And…

Process 3
Step 1: Put 1 through 9 into the box.
Step 2: Take out 9, then put 10 through 19 into the box.
Step 3: Take out 19, then put 20 through 29 into the box.
Step 4: Take out 29, then put 30 through 39 into the box.
And so on.

Box contents after each step
Step 1: 1 through 9
Step 2: 1 to 8, 10 to 19
Step 3: 1 to 8, 10 to 18, 20 to 29
Step 4: 1 to 8, 10 to 18, 20 to 28, 30 to 39
And so on

Okay, so as I’ve written it, the contents of each box after each step are different in Processes 1 and 3. Just one last thing we need to do: erase the labels on the balls. The labels will now just be stored safely inside our minds as we look over the balls, which will be indistinguishable from one another except in their positions.

Now we have two processes that look identical at each step with respect to the box, AND with respect to the external world. And yet, the second process ends with an infinite number of balls in the box (every number that’s not one less than a multiple of ten will be in there), and the first with none! It appears that you have to admit that the means used to obtain an end really do matter.

But it’s worse than this. You can arrange things so that you can’t tell any difference between the two processes, even when observing exactly what happens in each step. How? Well, if the labelling is all in your head, then you can switch around the labels you’ve applied without doing any harm to the logic of the thought experiment. So let’s rewrite Process 3, but fill in both the order of the balls in the box and the mental labelling being used:

Process 3
Start with:
1 2 3 4 5 6 7 8 9
Mentally rotate labels to the right:
9 1 2 3 4 5 6 7 8
Remove the furthest left ball:
1 2 3 4 5 6 7 8
Add the next ten balls to the right in increasing order:
1 2 3 4 5 6 7 8 10 11 12 13 14 15 16 17 18 19
Repeat!

Compare this to Process 1, supposing that it’s done without any relabelling:

Process 1
Start with:
1 2 3 4 5 6 7 8 9
Remove the furthest left ball:
2 3 4 5 6 7 8 9
Add the next ten balls to the right in increasing order:
2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
Repeat!

If the labels are all in your head, then these two processes are literally identical except for how a human being is thinking about them.

But looking at Process 3, you can prove that after Step 1 there will always be a ball labelled 1 in the box. Same with 2, 3, 4, and all other numbers that are not a multiple of 10 minus one. Even though we remove an infinity of balls, there are ball numbers that are never removed. And if we look at the pile of discarded balls, we’ll see that it consists of 9, 19, 29, 39, and so on, but none of the others. Unless some ball numbers vanish in the process (which they never do!), all the remainders must still be sitting in the box!
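A finite simulation of Process 3 shows the pattern in the discard pile. This is a sketch under the same modeling assumption as before (the box is a set of labels); `process_3` is an illustrative name.

```python
def process_3(steps):
    """At step n >= 2, remove ball 10(n-1)-1, then add 10(n-1) through 10(n-1)+9."""
    box = set(range(1, 10))
    discarded = []
    for n in range(2, steps + 1):
        largest = 10 * (n - 1) - 1   # 9, 19, 29, ...
        box.remove(largest)
        discarded.append(largest)
        box.update(range(10 * (n - 1), 10 * n))
    return box, discarded

box, discarded = process_3(100)
print(discarded[:4])                     # the first few discarded labels
print(all(d % 10 == 9 for d in discarded))
print({1, 2, 3, 4} <= box, len(box))
```

After N steps the box holds 9N balls, the same count as in Process 1, but here labels such as 1, 2, 3, and 4 are never scheduled for removal at any step.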

So we have two identical-in-every-relevant-way processes, one of which ends with an infinite number of balls in the box and the other with zero. Do you find this troubling? I find this very troubling. If we add some basic assumption that an objective reality exists independent of our thoughts about it, then we’ve obtained a straightforward contradiction.

✯✯✯

Notice that it’s not enough to say “Well, in our universe this process could never be completed.” This is for two reasons:

First of all, it’s actually not obvious that supertasks (tasks involving the completion of an infinite number of steps in a finite amount of time) cannot be performed in our universe. In fact, if space and time are continuous, then every time you wave your hand you are completing a sort of supertask.

You can even construct fairly physically plausible versions of some of the famous paradoxical supertasks. Take the light bulb that blinks on and off at intervals that get shorter and shorter, such that after some finite duration it has blinked an infinity of times. We can’t say that the bulb is on at the end (as that would seem to imply that the sequence 0101010… had a last number) or that it is off (for much the same reason). But these are the only two allowed states of the bulb! (Assume the bulb is robust against bursting and all the other clever ways you can distract from the point of the thought experiment.)

Now, here’s a variant that seems fairly physically reasonable:

A ball is dropped onto a conductive plate that is attached by wire to a light bulb. The ball is also wired to the bulb, so that when the ball contacts the plate, a circuit is completed that switches the light bulb on. Each bounce, the ball loses some energy to friction, cutting its velocity exactly in half. This means that after each bounce, the ball hangs in the air for half as long as it did the previous bounce.

Suppose the time between the first and second bounce was 1 second. Then the time between the second and third will be .5 seconds. And next will be .25 seconds. And so on. At 2 seconds, the ball will have bounced an infinite number of times. So at 2 seconds, the light bulb will have switched on and off an infinite number of times.

And of course, at 2 seconds the ball is at rest on the plate, completing the circuit. So at 2 seconds, upon the completion of the supertask, the light will be on.
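The timing of the bounces is a geometric series, which can be checked exactly with rational arithmetic. The sketch assumes the clock starts at the first bounce (so the second bounce is at 1 second); `time_of_bounce` is an illustrative name.

```python
from fractions import Fraction

def time_of_bounce(n):
    """Seconds after the first bounce at which bounce number n occurs (n >= 1).

    The gaps between successive bounces are 1, 1/2, 1/4, ... seconds.
    """
    return sum(Fraction(1, 2 ** i) for i in range(n - 1))

for n in (1, 2, 3, 4, 10):
    print(n, time_of_bounce(n))

# The time remaining before the 2-second mark halves with every bounce.
print(2 - time_of_bounce(20))
```

Every bounce happens strictly before the 2-second mark, and the partial sums get arbitrarily close to 2, which is why infinitely many bounces fit into a finite interval.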

Notice that there are no infinite velocities here, or infinite quantities of energy. Just ordinary classical mechanics applied to a bouncing ball and a light bulb. What about infinite accelerations? Well even that is not strictly speaking necessary; we just imagine that each velocity reversal takes some amount of time, which shrinks to zero as the velocity shrinks to zero in such a way as to keep all accelerations finite and sum to a finite total duration.

All this is just to say that we shouldn’t be too hasty in dismissing the real-world possibility of apparently paradoxical supertasks.

But secondly, and more importantly, physical possibility is not the appropriate barometer of whether we should take a thought experiment seriously. Don’t be the person that argues that the fat man wouldn’t be sufficient to stop a trolley’s momentum. When we find that some intuitive conceptual assumptions lead us into trouble, the takeaway is that we need to closely examine and potentially revise our concepts!

Think about Russell’s paradox, which showed that some of our most central intuitions about the concept of a set lead us to contradiction. Whether or not the sets that Bertie was discussing can be pointed to in the physical world is completely immaterial to the argument. Thinking otherwise would have slowed down progress in axiomatic set theory immensely!

These thought experiments are a problem if you believe that it is logically possible for there to be a physical universe in which these setups are instantiated. That’s apparently all that’s required to get a paradox, not that the universe we live in happens to be that one.

So it appears that we have to conclude some limited step in the direction of finitism, in which we rule out a priori the possibility of a universe that allows these types of supertasks. I’m quite uncomfortable with this conclusion, for what it’s worth, but I don’t currently see a better option.

A Supertask Puzzle

November 3, 2019April 16, 2026 ~ ~ 1 Comment

The Puzzle

You have in front of you an empty box. You also have on hand an infinite source of billiard balls, numbered 0, 1, 2, 3, 4, and so on forever.

At time zero, you place balls 0 and 1 in the box.

In thirty minutes, you remove ball 0 from the box, and place in two new balls (2 and 3).

Fifteen minutes after that, you remove ball 1 from the box, and place in two new balls (4 and 5).

7.5 minutes after that, you remove ball 2 and place in balls 6 and 7.

And so on.


After an hour, you will have taken an infinite number of steps. How many billiard balls will be in the box?

✯✯✯

At time zero, the box contains two balls (0 and 1). After thirty minutes, it contains three (1, 2, and 3). After 45 minutes, it contains four (2, 3, 4, and 5). You can see where this is going…

Naively taking the limit of this process, we arrive at the conclusion that the box will contain an infinity of balls.

But hold on. Ask yourself the following question: If you think that the box contains an infinity of balls, name one ball that’s in there. Go ahead! Give me a single number such that at the end of this process, the ball with that number is sitting in the box.

The problem is that you cannot do this. Every single ball that is put in at some step is removed at some later step. So for any number you tell me, I can point you to the exact time at which that ball was removed from the box, never to be returned to it!
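The claim that every ball has an exact removal time can be written out explicitly. In this sketch step 0 is the initial placement at time zero, step k happens at 60·(1 − 2⁻ᵏ) minutes, and ball n is removed at step n + 1; the function names are illustrative.

```python
from fractions import Fraction

def step_time(k):
    """Minutes after the start at which step k happens (step 0 = initial placement)."""
    return 60 * (1 - Fraction(1, 2 ** k))

def removal_time(ball):
    """Ball n is removed at step n + 1."""
    return step_time(ball + 1)

def balls_in_box_after_step(k):
    """Balls 0..2k+1 have been added and balls 0..k-1 removed."""
    return set(range(k, 2 * k + 2))

print([float(step_time(k)) for k in range(4)])   # 0, 30, 45, 52.5 minutes
print(float(removal_time(2)))                    # ball 2 leaves at 52.5 minutes
print(sorted(balls_in_box_after_step(2)))        # contents at the 45-minute mark
```

For every n the removal time is strictly less than 60 minutes, while the number of balls in the box after step k is k + 2, which grows without bound.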

But if any ball that you can name can be proven to not be in the box… and every ball you put in there was named… then there must be zero balls in the box at the end!

In other words, as time passes and you get closer and closer to the one-hour mark, the number of balls in the box appears to be growing, more and more quickly each moment, until you hit the one-hour mark. At that exact moment, the box suddenly becomes completely empty. Spooky, right??

Let’s make it weirder.

What if at each step, you didn’t just put in two new balls, but one MILLION? So you start out at time zero by putting balls 0, 1, 2, 3, and so on up to 1 million into the empty box. After thirty minutes, you take out ball 0, but replace it with the next 1 million numbered balls. And at the 45-minute mark, you take out ball 1 and add the next 1 million.

What’ll happen now?

Well, the exact same argument we gave initially applies here! Any ball that is put in the box at any point, is also removed at a later point. So you literally cannot name any ball that will still be in the box after the hour is up, because there are no balls left in the box! The magic of infinity doesn’t care about how many more balls you’ve put in than removed at any given time, it still delivers you an empty box at the end!

Now, here’s a final variant. What if, instead of removing the smallest numbered ball each step, you removed the largest numbered ball?

So, for instance, at the beginning you put in balls 0 and 1. Then at thirty minutes you take out ball 1, and put in balls 2 and 3. At 45 minutes, you take out ball 3, and put in balls 4 and 5. And so on, until you hit the one hour mark. Now how many balls are there in the box?

Infinity! Why not zero like before? Well, because now I can name you an infinity of numbers whose billiard balls are still guaranteed to be in the box when the hour’s up. Namely, 0, 2, 4, 6, and all the other even numbered balls are still going to be in there.
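Here is a short simulation of the remove-the-largest variant for a finite number of steps, again as a sketch with the box modeled as a set of labels; `remove_largest_variant` is an illustrative name.

```python
def remove_largest_variant(steps):
    """Each step: remove the largest-numbered ball, then add the next two balls."""
    box = {0, 1}
    removed = []
    for k in range(1, steps + 1):
        largest = max(box)
        box.remove(largest)
        removed.append(largest)
        box.update({2 * k, 2 * k + 1})
    return box, removed

box, removed = remove_largest_variant(50)
print(removed[:3])                          # 1, 3, 5: only odd balls leave
print(all(r % 2 == 1 for r in removed))
print(set(range(0, 101, 2)) <= box)         # every even ball so far is still inside
```

Only odd-numbered balls are ever removed, so each even-numbered ball has no removal time at all, in contrast to the original puzzle where every ball has one.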

Take a moment to reflect on how bizarre this is. We removed the exact same number of balls each step as we did last time. All that changed is the label on the balls we removed! We could even imagine taking off all the labels so that all we have are identical plain billiard balls, and just labeling them purely in our minds. Now apparently the choice of whether to mentally label the balls in increasing or decreasing order will determine whether at the end of the hour the box is empty or packed infinitely full. What?!? It’s stuff like this that makes me sympathize with ultrafinitists.

One final twist: what happens if the ball that we remove each step is determined randomly? Then how many balls will there be once the hour is up? I’ll leave it to you all to puzzle over!

Are the Busy Beaver numbers independent of mathematics?

October 22, 2019October 22, 2019 ~ ~ 4 Comments

A few years ago, Scott Aaronson and a student of his published this paper, in which they demonstrate the existence of a 7918 state Turing machine whose behavior is independent of ZFC. In particular, whether the machine halts or not cannot be proven in ZFC. This entails that ZFC cannot prove the value of BB(7918) – the number of steps taken by the longest running Turing machine with 7918 states before halting. And since ZFC is a first order theory and first order logic is complete, the unprovability of the value entails that BB(7918) actually has different values in different models of the axioms! So ZFC does not semantically entail its value, which is to say that ZFC underdetermines the Busy Beaver numbers!

This might sound really surprising. After all, the Busy Beaver numbers are a well-defined sequence. There are a finite number of N-state Turing machines, some subset of which are finitely-running. Just look at the number of steps that the longest-running of these goes for, and that’s BB(N). It’s one thing to say that this value is impossible to prove, but what could it mean for this value to be underdetermined by the standard axioms of math? Are there some valid versions of math in which this machine runs for different amounts of time than others? But how could this be? Couldn’t we in principle build the Turing machine in the real world and just observe exactly how long it runs for? And if we did this, then we certainly shouldn’t expect to get an indeterminate answer. So what gives?
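For tiny N, the definition of BB(N) can be run directly. The sketch below assumes two-symbol machines on a blank two-way tape, with the halting transition counted as a step, and it searches all 2-state machines under a step cap. The cap (`max_steps=50`) is an assumption that happens to be safe for N = 2; choosing a safe cap in general is exactly the part that cannot be computed. All names are illustrative.

```python
from itertools import product

HALT = "H"

def run(machine, max_steps):
    """Run on a blank tape; return the step count if it halts in time, else None."""
    tape, pos, state = {}, 0, "A"
    for step in range(1, max_steps + 1):
        write, move, nxt = machine[(state, tape.get(pos, 0))]
        tape[pos] = write
        pos += 1 if move == "R" else -1
        if nxt == HALT:
            return step
        state = nxt
    return None

# A well-known 2-state champion machine.
champion = {
    ("A", 0): (1, "R", "B"), ("A", 1): (1, "L", "B"),
    ("B", 0): (1, "L", "A"), ("B", 1): (1, "R", HALT),
}

def longest_halting_run(states, max_steps):
    """Largest step count among machines that halt within max_steps."""
    keys = [(s, b) for s in states for b in (0, 1)]
    actions = list(product((0, 1), "LR", list(states) + [HALT]))
    best = 0
    for choice in product(actions, repeat=len(keys)):
        steps = run(dict(zip(keys, choice)), max_steps)
        if steps is not None:
            best = max(best, steps)
    return best

print(run(champion, 100))
print(longest_halting_run("AB", 50))
```

The search treats any machine still running at the cap as non-halting, which is only justified when the cap is known to exceed BB(N); for 7918 states no such bound can be proven in ZFC.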

Well, first of all, the existence of a machine like Aaronson and Yedidia’s is actually not a surprise. For any consistent theory T whose axioms are recursively enumerable, one can build a Turing machine M that enumerates all the syntactic consequences of the axioms and halts if it ever finds a contradiction. That is, M simply starts with the axioms, and repeatedly applies modus ponens and the other inference rules of T’s logic until it reaches a contradiction. Now, if T is strong enough to talk about the natural numbers, then it cannot prove whether or not M halts. This is a result of Gödel’s Second Incompleteness Theorem: If T could prove the behavior of M, then it could prove its own consistency, which would entail that it is inconsistent. This means that no consistent formal theory will be capable of proving all the values of the Busy Beaver numbers; for any theory T there will always be some number N for which the value of BB(N) is in principle impossible to derive from T.

On the other hand, this does not entail that the Busy Beaver numbers do not have definite values. This misconception arises from overlooking two points: (1) independence and unprovability are not the same thing, and (2) independence does not necessarily mean that there is no single right answer.

On (1): A proposition P is independent of T if there are models of T in which P is true and other models in which it is false. P is unprovable from T if… well, if it can’t be proved from the axioms of T. Notice that independence is a semantic concept (having to do with the different models of a theory), while unprovability is a syntactic one (having only to do with what you can prove using the rules of syntax in T). Those two line up in first-order logic, but only because it’s a complete logic: Anything that’s true in all models of a first-order theory is provable from its axioms, so if you can’t prove P from T’s axioms, then P cannot be true in all models. If you can’t prove the negation of P either, then P cannot be false in all models; i.e. P is independent. Said another way, first-order theories’ semantic consequences are all also syntactic consequences.
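In symbols, writing ⊢ for "proves" and ⊨ for "semantically entails", the step from unprovability to independence can be spelled out as follows. The notation is mine, not anything from the paper, and the script M stands for a model (not the Turing machine M from earlier).

```latex
% Completeness of first-order logic:
T \vDash P \;\Longrightarrow\; T \vdash P

% Contrapositive: an unprovable P fails in some model of T.
T \nvdash P \;\Longrightarrow\; \exists\, \mathcal{M} \; (\mathcal{M} \vDash T \text{ and } \mathcal{M} \nvDash P)

% Applied to both P and \neg P:
T \nvdash P \text{ and } T \nvdash \neg P
\;\Longrightarrow\;
\exists\, \mathcal{M}_1, \mathcal{M}_2 \vDash T \; (\mathcal{M}_1 \vDash P \text{ and } \mathcal{M}_2 \vDash \neg P)
```

Only the first line uses anything special about first-order logic; the rest is bookkeeping.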

But this is not so in second-order logic! In a second-order theory T (under the standard semantics), P can be unprovable from T but still true in all models of T. There is a gap between the semantic and the syntactic, and therefore there is a corresponding gap between independence and unprovability.

So while it’s true that, for any consistent and recursively axiomatizable first-order theory you choose, the value of some Busy Beaver number is independent of it, the same need not be true of any second-order theory that you choose. We can perfectly well believe that all the Busy Beaver numbers have unique values, which are fixed by some set of second-order axioms, and that we just cannot derive all of the values from these axioms.

And on (2): Even the independence of the Busy Beaver numbers from any first-order theory is not necessarily so troubling. We can just say that the Busy Beaver numbers do have unambiguous values; it’s just that, due to first-order logic’s expressive limitations, we cannot pin down exactly the model that we want.

In other words, if BB(7918) is 𝑥 in one model and 𝑥+1 in another, this does not have to mean that there is some deep ambiguity in the value of BB(7918). It’s just that only one of the models of your theory is the intended model, the one that’s actually talking about busy beaver numbers and Turing machines, and the other models are talking about some warped version of these concepts.

Maybe this sounds a little fishy to you. How do we know which model is the “correct” one if we can’t ever rule out its competitors?

Well, the inability of first-order logic to get rid of these nonstandard models is actually a basic feature of pretty much any mathematical theory. In first-order Peano arithmetic, for instance, we find that we cannot rule out models that contain an uncountable number of “natural numbers”. But we don’t then say that we do not know for sure whether or not there are an uncountable infinity of natural numbers. And we certainly don’t say that the cardinality of the set of natural numbers is ambiguous! We just say that unfortunately, first-order logic is unable to rule out those uncountable models that don’t actually represent natural numbers.

If this is an acceptable response here, if you find it tempting to say that the inability of first order theories of arithmetic to pin down the cardinality of the naturals tells us nothing whatsoever about the natural numbers’ actual cardinality, then it should be equally acceptable to say of the Busy Beaver numbers that the independence of their values from any given mathematical theory tells us nothing about their actual values!