Finiteness can’t be captured in a sound, complete, finitary proof system

Consider the sentence “This blog has finitely many posts.” Do you understand what this sentence means? Then no set of rules (even infinite, even uncomputable!) can model your reasoning. This claim may sound shocking, but it can be justified on solid metamathematical grounds.

Another example: the sentence “There are finitely many planets in the universe.” You don’t have to think it’s true, you just have to think you understand what it means. What’s the common theme? It’s the notion of there being ‘finitely many’ of some class of objects. Let’s imagine building a language that has the expressive resources of first-order logic (which are quite modest), plus an additional quantifier F, whose semantics are given by the following rule: Fx φ(x) is satisfied by a model iff there are only finitely many objects in that model that satisfy φ(x).

It turns out that any language consisting of first order logic plus the quantifier F can’t be axiomatized in any sound, complete, and finitary proof system. Notice that effective enumerability of the rules of the proof system is not a requirement here! So long as the language is strong enough to express the semantics of {∧, ¬, ∀, F, variables xn, and relations Rn}, no set of sentences and sentence-manipulation rules in that language will suffice to capture these semantics.

Here’s a proof: consider the first-order theory of Peano arithmetic. This theory has nonstandard models (as any theory of arithmetic must have in a logic that is compact). All of these nonstandard models have the following feature: that there are numbers that are larger than infinitely many numbers. Think about what this means: this is a common feature of all nonstandards, so if we could write a sentence to express this feature then we could rule them all out! This is where the quantifier F steps in. With F, we can write the following simple sentence and add it to PA as an axiom:

∀x Fy (y < x)

In English: every number has finitely many numbers less than it. And with that, we’ve ruled out all nonstandard models! So now we have a theory that is categorical for ℕ. And that’s a big deal, metamathematically speaking!

Why? Well, as I’ve talked about in a previous post, you can prove some pretty strong limitative results about any logic that can produce a theory that’s categorical for ℕ. In particular, if we can produce such a theory then its logic cannot be compact. Quick proof: suppose a set of sentences Σ models ℕ. Add to Σ a constant c and the axioms “c ≠ 0”, “c ≠ 1”, “c ≠ 2”, and so on, and call this new set Σ’. Every finite subset of Σ’ models ℕ. So by compactness, Σ’ has a model. But this model is nonstandard – it contains numbers that aren’t natural numbers. And since Σ is a subset of Σ’, any model of Σ’ is also a model of Σ.

So compactness implies that no theory is categorical for ℕ. But compactness follows from the following three properties: a sound and complete proof system (Σ ⊢ α if and only if Σ ⊨ α), and that all proofs are only finitely long (try expressing this property without F!). Quick proof: If a set of sentences is finitely satisfied, then every finite subset of it has a model (by definition), so no finite subset of it can be refuted (by soundness), so the entire set can’t be refuted (by finite proofs), so the entire set is satisfied (by completeness).

So soundness + completeness + finiteness ⇒ compactness ⇒ the existence of nonstandard models of arithmetic in any theory that models ℕ. Which means that the semantics of F cannot be captured in any sound, complete, and finite proof system!

Take your pick: either you don’t really understand the semantics of the “finitely many” quantifier F, or no set of rules (not even requiring this set to be finite or computable) can fully capture your reasoning in finite-length proofs.

More information about related extensions of first-order logic and their limitations can be found here. The result I describe here is a rephrasing of results discussed there.

The Mindblowing Goodstein Sequences

Goodstein sequences are amazingly strange and unintuitive. I want to tell you all their unusual properties up-front, but I also don’t want to spoil the surprise. So let me start by defining the Goodstein sequence of a number.

We start by defining the hereditary base-n representation of a number m. We first write m as a sum of powers of n (e.g. for m = 266 and n = 2, we write 266 = 28 + 23 + 21). Now we write each exponent as a sum of powers of n. (So now we have 266 = 223 + 22+1 + 21). And we continue this until all numbers in the representation are ≤ n. For 266, our representation stabilizes at 266 = 222+1 + 22+1 + 21.

Now we define Gn(m) as follows: if m = 0, then Gn(m) = 0. Otherwise, Gn(m) is the number produced by replacing every n in the hereditary base-n representation of m by n+1 and subtracting 1. So G2(266) = G2(222+1 + 22+1 + 21) = 333+1 + 33+1 + 31 – 1 ≈ 4.4 × 1038.

Finally, we define the Goodstein sequence for a number n as the following:

a0 = n
a1 = G2(a0)
a2 = G3(a1)
a3 = G4(a2)
And so on.

Let’s take a small starting number and look at its Goodstein sequence. Suppose we start with 4.

a0 = 4 = 22
a1 = 33 – 1 = 2⋅32 + 2⋅3 + 2 = 26
a2 = 2⋅42 + 2⋅4 + 1 = 41
a3 = 2⋅52 + 2⋅5 = 60
a4 = 2⋅62 + 2⋅6 – 1 = 2⋅62 + 6 + 5 = 83
a5 = 2⋅72 + 7 + 4 = 109

Now, notice that the sequence always seems to increase. And this makes sense! After all, each step we take, we increase the base of each exponent by 1 (a big increase in the value) and only subtract 1. Here’s a plot of the first 100 values of the Goodstein sequence of 4:

And for further verification, here are the first million values of the Goodstein sequence of 4:

Looks like exponential growth to me! Which, again, is exactly what you’d expect. In addition, the higher the starting value, the more impressive the exponential growth we see. Here we have the first 100 values of the Goodstein sequence of 6:

Notice how after only 100 steps, we’re already at a value of 5 × 1010! This type of growth only gets more preposterous for larger starting values. Let’s look at the Goodstein sequence starting with 19.

a0 = 19
a1 ≈ 7 × 1012
a2 ≈ 1.3 × 10154
a3 ≈ 1.8 × 102184
And so on.

We can’t even plot a sequence like this, as it’ll just look like two perpendicular lines:

Alright, so here’s a conjecture that hopefully you’re convinced of: Goodstein sequences are always growing. In fact, let’s conjecture something even weaker: Goodstein sequences never terminate (i.e. never go to zero).

We’re finally ready for our first big reveal! Both of these conjectures are wrong! Not only do some Goodstein sequences terminate, it turns out that EVERY Goodstein sequence eventually terminates!

This is pretty mind-blowing. What about all those graphs I showed? Well, the sequences do increase for a very very long time, but they eventually turn around. How long is eventually? The Goodstein sequence of 4 takes 3 × 2402,653,210 – 1 steps to turn around!! This number is too large for me to be able to actually show you a plot of the sequence turning around. But we do know that once the Goodstein sequence stops increasing, it actually stays fixed for 3 × 2402,653,209 steps before finally beginning its descent to zero. Heuristically, it would look something like this:

(Why does it stay constant, incidentally? The reason is that when we get to a number whose hereditary base-B notation just looks like B + n for some n < B, we increase B by 1 and decrease n by 1 each step. This doesn’t change the sequence’s value! So it stays constant until we get to n = 0. At that point, we decrease by 1 until we descend all the way to zero.)

So, how do we prove this? We do it using infinite ordinals. (Now you see why this topic caught my eye recently!) The plan is to associate with Goodstein sequence a “matching sequence” of ordinals. The way to do this is pretty simple: for a number written in hereditary base-n notation, just replace every n in that representation with ω. Now you have an infinite ordinal!

So, for example, here’s the first few numbers in the Goodstein sequence of 4, along with their translation to ordinals:

4 = 22 becomes ωω
26 = 33 – 1 = 2⋅32 + 2⋅3 + 2 becomes ω2⋅2 + ω⋅2 + 2
41 = 2⋅42 + 2⋅4 + 1 becomes ω2⋅2 + ω⋅2 + 1
60 = 2⋅52 + 2⋅5 becomes ω2⋅2 + ω⋅2
83 = 2⋅62 + 2⋅6 – 1 = 2⋅62 + 6 + 5 becomes ω2⋅2 + ω + 5
109 = 2⋅72 + 7 + 4 becomes ω2⋅2 + ω + 4

Examine this sequence of ordinals for a couple of minutes. You might notice something interesting: at each step, the ordinal is decreasing! This turns out to always hold true. Let’s see why.

Notice that there are two cases: either the ordinal is a successor ordinal (like ω2 + 4), or it’s a limit ordinal (like ω2 + ω). If it’s a successor ordinal, then the next ordinal in the sequence is just its predecessor. (So ω2 + 4 becomes ω2 + 3). This is obviously a smaller ordinal.

What if it’s a limit ordinal? Then the smallest limit ordinal in its expanded representation drops to a smaller limit ordinal and some finite number is added. So for instance, in the ordinal ω2 + ω⋅2, ω⋅2 will drop to ω and some finite number will be added on depending on which step in the sequence you’re at. So we’ll end up with ω2 + ω + n, for some finite n. This ordinal is smaller than the original one, because we’ve made an infinitely large jump downwards (from one limit ordinal to a lower limit ordinal), and only increased by a finite amount.

So in either case, we go to a smaller ordinal. And now the key realization is that every decreasing sequence of ordinals MUST terminate! (This is one of my favorite facts about ordinals, by the way. Even though we’re looking at arbitrarily large infinite ordinals, it’s still the case that there is no decreasing sequence that goes on forever. Whenever we hit a limit ordinal, we have to make an infinitely large jump downwards to get to the next element in the sequence, as the ordinal has no predecessor.)

So we’ve paired every Goodstein sequence with a decreasing sequence of ordinals. And every decreasing sequence of ordinals eventually terminates. So the Goodstein sequence must terminate as well! (One easy way to see this is to notice that every value in a Goodstein sequence is ≤ the ordinal assigned to it. Another way is to notice that the only number that could possibly be assigned to the ordinal 0 is 0 itself. So when the decreasing sequence of ordinals hits 0, as it must eventually, so must the Goodstein sequence) And that’s our proof!

That Goodstein sequences always terminate is the big flashy result for this post. But there’s much more that’s cool about them. For instance, the proof that we just used did not rely on the fact that we just increased the base by 1 at each step. In fact, even if we update the base by an arbitrarily large function at each stage, the sequences still must terminate!

Even cooler, the proof that all Goodstein sequences terminate cannot be formalized in Peano arithmetic! And in fact, no proof can be found of this theorem using PA. Laurie Kirby and Jeff Paris proved in 1982 that the statement that Goodstein sequences always terminate could be reduced to a theorem of Gentzen from which the consistency of PA could be deduced. So if PA could prove that Goodstein sequences always terminate, then it could prove its own consistency! Gödel’s second incompleteness theorem tells us that this cannot be, and thus assuming that PA really is consistent, it cannot prove that that these sequences all terminate.

This was historically the third result to be found independent of PA, and it was the first that was purely number-theoretic in character (as opposed to meta-mathematical, like the Gödel sentences). It gives us a very salient way of expressing the limitations of Peano arithmetic: simply write a program that computes the Goodstein sequence of any given input, terminating only when it reaches 0 and outputting the length of the sequence. (Code Golf Stack Exchange shows that this can be done using just 77 characters of Haskell! And here’s a nice way to do it in Python). Now, it happens to be true that this program will always terminate for any natural number input (if run on sufficiently powerful hardware). But Peano arithmetic cannot prove this!

As a final fun suprise, let’s define G(n) to be the length of the Goodstein sequence starting at n. As we’ve already seen, this sequence gets really big really quickly. But exactly how quickly does it grow? Turns out that it grows much much faster than other commonly known fast-growing computable functions. For instance, G(n) grows much faster than Ackermann’s function, and G(12) is already greater than Graham’s number!

Wacky Non-Standard Models of PA

Peano arithmetic is unable to pin down the natural numbers, despite its countable infinity of axioms. In fact, assuming its consistency Peano arithmetic has models of every cardinality, meaning that as far as PA is aware, there might be uncountably many real numbers. (If PA is not consistent then it has no models.) I want to take a look at these non-standard models of PA, especially the countable ones. A natural question is, how many countable non-standard models are there alongside the standard models?

Assuming the consistency of PA (which I will leave out from now on, as it’s assumed in all that follows), there are continuum many non-isomorphic countable models. That’s a lot! That means that there’s a distinct countable non-standard model of arithmetic for every real number. This is our first result: the number of nonstandard models of PA of cardinality ℵ0 is 20. Interestingly, this result generalizes! For every infinite cardinal κ, there are 2κ non-isomorphic nonstandard models of cardinality κ. That’s a lot of nonstandard models! In fact, since any model of cardinality κ involves a specification of some number of constants, binary relations over κ and functions from κ to κ, we know that the maximum number of models of cardinality κ is 2κ. So in this sense, Peano arithmetic has as many nonstandard models of each cardinality as it can have!

Let’s take a closer look at the countable non-standard models. It turns out that we can say a lot about their order type. Namely, all countable non standard models of PA have order type ω + (ω*+ω)·η. What does this notation mean? Let me break down each of the order types involved in that formula:

  • ω: order type of the naturals
  • ω*: order type of the negative integers
  • η: order type of the rationals

So ω*+ω is the order type of the integer line, and (ω*+ω)·η is the order type of a structure that resembles an integer line for each rational number. Thus, every countable non-standard model of PA looks like a copy of natural numbers followed by as many copies of integers as there are rational numbers. The order on this structure is lexicographic: two nonstandards on the same integer line are judged according to their position on the integer line, and two nonstandards on different integer lines are judged according to the position of these integer lines on the rational line. It’s not the easiest thing to visualize, but here’s my attempt:

Visualization of the order type of countable nonstandard models of Peano arithmetic

So if all countable non-standard models have the same order type, then where do they differ? It turns out that they differ in the details of how addition and multiplication work on the non-standards. (After all, a model of PA is defined by the size of its universe and its interpretation of ≤, +, and ×. Same order type means same ≤, so what’s left to vary is + and ×.) In each of the models, + and × work exactly like normal on the naturals. And + and × must operate on the non-standards in such a way as to maintain the truth of all the axioms of PA. So, for instance, since PA can prove that 2x = x + x, the same must be true for nonstandard x. And so on. But even given these restrictions, there are still uncountably many ways to define + and × on the non-standards.

What’s more, we have a theorem known as the Tennenbaum Theorem, which tells us that it’s impossible to give recursive definitions to + and × in non-standard models. Said more simply, addition and multiplication are uncomputable in every non-standard model of arithmetic!

One thing I remain unsure of is how the nonstandard models of Peano arithmetic compare to the structure of ω in nonstandard models of ZFC. We know that there must be models of ZFC where ω is nonstandard, by the compactness theorem (define ZFC* to be ZFC with an extra constant c, with the extra axioms “c ∈ ω”, “c ≠ 0”, “c ≠ 1”, “c ≠ 2”, and so on. ZFC* has a model by compactness, and this model is also a model of ZFC by monotonicity.) But is ZFC “better” at ruling out nonstandard models of the naturals than PA is? Or is there a nonstandard ω for every nonstandard model of Peano arithmetic?

Constructing the Church-Kleene Ordinal

I want to share a way to construct the order type of the Church-Kleene ordinal ω1CK. As far as I’m aware, this construction is original to me, and I think it’s much simpler and more elegant than others that I’ve found on the internet. Please let me know if you see any issues with it!

So, we start with the notion of a computable ordinal. A computable ordinal is a computable well-order on the natural numbers. In other words, it’s a well-order < on ℕ such that there exists a Turing machine that, given any two natural numbers n and m, outputs whether n < m according to that order.

There’s an injection from computable well-orders on ℕ to Turing machines, so there are countably many computable well-orders (at most one for each Turing machine). This means that there’s a list of all such computable well-orders. Let’s label the items on this list in order: {<0, <1, <2, <3, …}. Each <n stands for a Turing machine which takes in two natural number inputs and returns true or false based on the nth computable well-ordering.

Now, we want to construct a set with order type ω1CK. We’re going to have it be an order on the natural numbers, so what we have to work with from the start is the set {0, 1, 2, 3, 4, 5, 6, …}.

We start by splitting up this set into an infinite collection of sets. First, we take every other number, starting with 0: {0, 2, 4, 6, 8, …}. That’ll be our first set. We’ve left behind {1, 3, 5, 7, 9, …}, so next we take every other number from this set starting with 1: {1, 5, 9, 13, …}. Now we’ve left behind {3, 7, 11, 15, …}. Our next set, predictably, will be every other number from this set starting with 3. This is {3, 11, 19, 27, …}. And we keep on like this forever, until we’ve exhausted every natural number.

This gives us the following set of sets:

{{0, 2, 4, 6, …}, {1, 5, 9, 13, …}, {3, 11, 19, 27, …}, {7, 23, 39, 47, …}, …}

Now, we define our order to make everything in the first set less than everything in the second set, and everything in the second set less than everything in the third set, and so on. So far so good, but this only gives us order type ω2. Next we define the order WITHIN each set.

To do this, we’ll use our enumeration of computable orders. In particular, within the nth set, the ith number will be less than the jth number exactly in the case that i <n j. You can think about this as just treating each set “as if” it’s really {0, 1, 2, 3, …}, then ordering it according to <n, and then relabeling the numbers back to what they started as.

This fully defines our order. Now, what’s the order type of this set? Well, it can’t be any particular computable order type, because every computable order type can be found as a sub-order of it! So it must be some uncomputable order type. And by construction, we know it must be exactly the order type that contains all and only computable order types as suborders. But this is just the order type of the Church-Kleene ordinal!

So there we have it! We’ve constructed ω1CK. Now, remember that the Church-Kleene ordinal is uncomputable. But I just described an apparently well-defined process for creating an ordered set with its order! This means that some part of the process I described must be uncomputable. Challenge question for you: Which part is uncomputable, and can you think of a proof of its uncomputability that is independent of the uncomputability of ω1CK?

What ordinals can be embedded in ℚ and ℝ?

Last time we talked a little bit about some properties of the order type of ℚ. I want to go into more detail about these properties, and actually prove them to you. The proofs are nice and succinct, and ultimately rest heavily on the density of ℚ.

Every Countable Ordinal Can Be Embedded Into ℚ

Take any countable well-ordered set (X, ≺). Its order type corresponds to some countable ordinal. Since X is countable, we can enumerate all of its elements (the order in which we enumerate the elements might not line up with the well-order ≺). Let’s give this enumeration a name: (x1, x2, x3, …).

Now we’ll inductively define an order-preserving bijection from X into ℚ. We’ll call this function f. First, let f(x1) be any rational number. Now, assume that we’ve already defined f(x1) through f(xn-1) in such a way as to preserve the original order ≺. All we need to do to complete the proof is to assign to f(xn) a rational number such that the ≺ is still preserved.

Here’s how to do that. Split up the elements of X that we’ve already constructed maps for as follows: A = {xi | xi ≺ xn} and B = {xi | xi > xn}. In other words, A is the subset of {x1, x2, …, xn-1} consisting of elements less than x_n and B is the subset consisting of elements greater than xn. Every element of B is strictly larger than every element of A. So we can use the density of the rationals to find some rational number q in between A and B! We define f(xn) to be this rational q. This way of defining f(xn) preserves the usual order, because by construction, f(xn) < f(xi) for any i less than n exactly in the case that xn < xi.

By induction, then, we’ve guaranteed that f maps X to ℚ in such a way as to preserve the original order! And all we assumed about X was that it was countable and well-ordered. This means that any countable and well-ordered set can be found within ℚ!

No Uncountable Ordinals Can Be Embedded Into ℝ

In a well-ordered set X, every non-maximal element of X has an immediate successor (i.e. a least element that’s greater than it.) Proof: Take any non-maximal x ∈ X. Consider the subset of X consisting of all elements greater than x: {y ∈ X | x < y}. This set is not empty because α is not maximal. Any non-empty subset of a well-ordered set has a least element, so this subset has a least element. I.e, there’s a least element greater than x. Call this element S(x), for “the successor of x”,

Now, take any well-ordered subset X ⊆ ℝ (with the usual order). Since it’s well-ordered, every element has an immediate successor (by the previous paragraph). We will construct a bijection that maps X to ℚ, using the fact that ℚ is dense in ℝ (i.e. that there’s a rational between any two reals). Call this function f. To each element x ∈ X, f(x) will be any rational such that x < f(x) < S(x). This maps every non-maximal element of X to a rational number. To complete this, just map the maximal element of X to any rational of your choice. There we go, we’ve constructed a bijection from X to ℚ!

The implication of this is that every well-ordered subset of the reals is only countably large. In other words, even though ℝ is uncountably large, we can’t embed uncountable ordinals inside it! The set of ordinals we can embed within ℝ is exactly the set of ordinals we can embed within ℚ! (This set of ordinals is exactly ω1: the set of all countable ordinals).

Final Note

Notice that the previous proof relied on the fact that between any two reals you can find a rational. So this same proof would NOT go through for the hyper-reals! There’s no rational number (or real number, at that!) in between 1 and 1+ϵ. And in fact, you CAN embed ω1 into the hyperreals! This is especially interesting because the hyperreals have the same cardinality as the reals! So the embeddability of ω1 here is really a consequence of the order type of the hyperreals being much larger than the reals. And if we want to take a step towards even crazier extensions of ℝ, EVERY SINGLE ordinal can be embedded within the surreal numbers!

Visualizing the rationals

I made some pretty pictures of the rational numbers! Take a look.

This first plot shows the denominators of rational numbers between -1 and 1. It’s obtained by searching through every rational number in fully reduced form with numerator and denominator between -1000 and 1000.

Now here’s a plot of the numerators of rational numbers between -1 and 1.

Here’s the same thing, but only looking at rationals from 0 to 1:

And for kicks, here’s the inverses of the numerators:

The same, for rationals between 0 and 1:

These pictures are quite pretty to look at and interesting to think about. Some features of the image are artifacts of the particular way I chose rational numbers (i.e. looking only at rationals with numerator and denominator in a fixed range), but some other features might reflect some deeper structure of the rationals.

The order type of the rational numbers

Take any set and place an order on it. If the order is a well-order (i.e. if every subset has a least element), then the set has the same order type as some particular ordinal. But the notion of order type can be extended beyond well-ordered sets. Any two ordered sets are said to have the same order type if they are order isomorphic: if there’s a bijective map f from one set to the other such that both f and f-1 preserve the ordering of elements.

One structure that has a very interesting order type is the rational numbers. After all, the rationals are a countable set, but every rational number resembles a “limit ordinal” in the sense that it has no immediate predecessor.

One question that we can ask to get some insight about the order type of the rational numbers is: what ordinals can be found within the rationals? That is, take some well-ordered subset of the rationals. Look at the order type of this subset. This order type corresponds to some ordinal. And different choices of well-ordered subsets of the rationals give us different ordinals! So which ordinals can we “find” within the rationals in this sense?

First off, every finite ordinal can obviously be found. To find the ordinal n, just take the subset {0, 1, 2, …, n-1}. We can also find ω! You can just take the subset {0, 1, 2, 3, …}. What about ω+1? Try for yourself: can you construct a subset of the rationals with the order type of ω+1?

There’s no unique way to do it, but one easy way is to take the subset {1/2, 3/4, 7/8, …, 1}. This set has a countable infinity of elements, one for each natural number, after all of which comes a single element: exactly the order type of ω+1!

If we want a subset of the rationals with order type ω+2, we can use the same trick: {1/2, 3/4, 7/8, …, 1, 2}. And clearly this extends for any ω+n. But how about ω+ω? Can you construct a subset of the rationals with this order type? (Do it yourself before reading on!)

Here’s one way: {1/2, 3/4, 7/8, …, 1+1/2, 1+3/4, 1+7/8, …}. It should be easy to see how to extend this trick to get ω⋅3, ω⋅4, and indeed ω⋅n for any finite n. You can even naturally extend this to get to ω⋅ω. But then what? Are we finished?

If you guessed no, you’re right! We can find subsets of the rationals with order type ω3, and ωn for any finite n, and ωω. (Try it!) And we can keep going beyond that as well. So how high can you go?

Turns out that EVERY countable ordinal can be embedded into the rationals, under the usual order! So in some sense the order type of the rationals is as complicated as possible while still being countable. Also, remember that the set of countable ordinals is uncountably large. So this means that the rationals are a countable ordered set that has all of the uncountably many countable ordinals embedded within it! Isn’t that great?

(One thing that makes this seem less insane is that when we’re looking at what ordinals can be embedded in the rationals, we’re searching through different subsets of the rationals. And even though the rationals are countable, the set of all subsets of the rationals is uncountably large.)

One more interesting thing: the set of all subsets of the rationals has cardinality ℶ1. But the set of all ordinals that can be embedded into the rationals is ω1, which has cardinality ℵ1. So if we start with the set of all subsets of the rationals, then strip away all but those that are well-ordered, and then choose just a single representative for each order type, we get a set of cardinality ℵ1. And there is no guarantee that this final set has the same cardinality as the original set, because this is the continuum hypothesis! This is a way to think about how to “construct” a real life set with cardinality ℵ1: look at the set of well-ordered subsets of the rationals, and split it into order-type equivalence classes.

Can you compute all the countable ordinals?

One can live without knowing the ordinals, to be sure, but not as well.

Quote from this paper

What is an ordinal number? If you read my previous post, you might be convinced that an ordinal number is a particular type of set that contains all the ordinal numbers less than it. In particular, the ordinal 0 is the empty set, the ordinal 1 is the set {0}, the ordinal 2 is the set {0,1}, and so on. This is a fine way to think about ordinals, but there’s a much deeper view of them that I want to present in this post.

The fundamental concept is that of order types. Order types are to order-preserving bijections as cardinalities are to bijections. If you’ve seen pictures like these before, then they’re good to have in mind when you think about what order types are:

Screen Shot 2020-05-21 at 12.57.01 AM

Let’s get into more detail about order types.

Take any two sets. If there is a bijection between them, then we say that they have the same cardinality. So we can think of each cardinal number as a class of sets, all in bijective correspondence with one another.

Take any two well-ordered sets. (Recall, a set is well-ordered if each of its subsets has a least element.) If there is an order-preserving bijection between them, then we say that the two sets have the same order type. So we can think of each ordinal number as a class of sets, all of the same order type.

For instance, the cardinal number “2” is the class of all sets with two elements (as all such sets are in bijective correspondence with one another). And the ordinal number “2” is the class of all well-ordered two-element sets. In particular, Von Neumann’s ordinal 2 = {0, 1} = {∅, {∅}} has this order type.

The finite cardinalities and finite order types line up nicely: any two well-ordered sets with the same finite cardinality are guaranteed to have the same order type. It’s in the realm of the infinite where order types and cardinalities come apart.

The cardinal number ℵ0 is the class of all sets that can be put into bijective correspondence with the set of natural numbers ω. This includes the set of even numbers and the set of rational numbers. So how many order types are there of sets with cardinality ℵ0?

Consider the following two ordered sets: X = {0 < 1 < 2 < 3 < …} and Y = {1 < 2 < 3 < … < 0}. They both have cardinality ℵ0, but do they have the same order types? If so, then there must be a bijection f: X → Y such that f is order-preserving (for all a ∈ X and b ∈ X, if a < b then f(a) < f(b)). But what element of X is mapped to 0 in Y? In the set Y, 0 is larger than every element besides itself, but there is no element of X that has this property! So no, X and Y do not have the same order types.

X is the order type of ω (a countably infinite list of elements with no elements that have infinitely many predecessors), and Y is the order type of ω+1 (a countably infinite list of elements with a single element coming after all of them). This tells us that there are at least two different order types within the cardinality ℵ0.

Can we construct an order on the set of natural numbers that has the same order type as ω+2? Well, all we need is for there to be two elements that are larger than all the rest. So we can write something like {2 < 3 < 4 < … < 0 < 1}. This generalizes easily to ω+n, for any finite n: {n+1 < n+2 < … < 0 < 1 < … < n} has the same order type as ω+n.

But now what about ω+ω, i.e. ω⋅2? To get a set of naturals with the same order type, we have to use a new trick. We need two countably infinite sequences of numbers that we can place beside one another. An easy way to accomplish this is by using the evens and odds: {0 < 2 < 4 < … < 1 < 3 < 5 < …}.

We can get an order on the naturals with the same order type as ω⋅2 + 1 by just choosing a single natural number to place after all the others. For instance: {2 < 4 < 6 < … < 1 < 3 < 5 < … < 0}. It should be easy to see how to get ω⋅2 + n for any finite n: just do {n < n+2 < n+4 < … < n+1 < n+3 < n+5 < … < 0 < 1 < 2 < … < n-1} And now how about ω⋅3?

(Try it out for yourself before reading on!)

To get ω⋅3, we need to place three countably infinite sequences of naturals side by side. For instance: {0 < 3 < 6 < … < 1 < 4 < 7 < … < 2 < 5 < 8 < …}.

For ω⋅4, we can use a similar trick: {0 < 4 < 8 < … < 1 < 5 < 9 < … < 2 < 6 < 10 < … < 3 < 7 < 11 < …}. Again, we can quite easily generalize this to ω⋅n for any finite n: {0 < n < 2n < … < 1 < n+1 < 2n+1 < … < 2 < n+2 < 2n+2 < … < … < n-1 < 2n-1 < 3n-1 < …}. But how about ω⋅ω? How do we deal with ω2?

This is a fun exercise to try for yourself at home. Construct an order on the set of natural numbers that has the same order type as ω2. Then try to generalize the trick to get ω3, ω4, and ωn for any finite n.

Next construct an order on the naturals with the same order type as ωω! Can you do it for ω^ω^ω? And ω^ω^ω^…? One thing that should become clear is that the larger an ordinal you’re dealing with, the harder it becomes to construct an order on the naturals with the same order type. The natural question is: is it always possible to do this type of construction, or do we at some point run out of clever tricks to use to get to higher and higher countable ordinals?

Well, if an ordinal α is countable, then it is a well-ordered set with the same cardinality as the naturals. So there always exists some order that can be placed on the naturals to mimic the order type of α. But is this order always computable?

We call an order on the naturals computable if there’s some Turing machine which takes as input two naturals x and y and outputs whether x < y according to this order. We call an order type computable if there’s some computable order on the naturals with that order type. The standard order (0 < 1 < 2 < 3 < …) is computable, and it’s easy to see how to compute all the other orders we’ve discussed so far. But are ALL the countable order types computable?

The wonderful and strange answer is no. There are countable order types that are uncomputable! There must exist orders on the naturals with such order types, but these orders are so immensely complicated and strange that they cannot be defined by ANY Turing machine! Here’s a quick and easy proof of this. (The next three paragraphs are the coolest and most important part of this whole post, so pay attention!)

Turn your mind back to Von Neumann’s construction of the ordinals, according to which each ordinal was exactly the set of all smaller ordinals. More technically, a set α is a Von Neumann ordinal iff α is well ordered with respect to ∈ AND every element of α is also a subset of α. Consider now the set of all countable ordinals. It can be seen that this set is itself an ordinal. Can it be a countable ordinal? No, because if it were then it would have to be an element of itself! (This is not allowed in ZFC by the axiom of regularity.) This means that there are uncountably many countable ordinals.

Ok, so far so good. Now we just observe that there are only countably many Turing machines. So there are only countably many Turing machines that compute orders on the naturals. And therefore there are only countably many computable ordinals.

Putting this together, we see that there are uncountably many uncomputable countable order types (say that sentence out loud). After all, there are uncountably many countable ordinals, and only countably many computable ordinals!

And if there’s only countably many computable ordinals, then there’s some countable Von Neumann ordinal consisting of the set of all computable ordinals! This set couldn’t possibly be computable, because then it would be contained within itself. So it’s the smallest uncomputable ordinal! This set is called the Church-Kleene ordinal. Its order type is literally so convoluted and complex that no Turing machine can compute it!

Constructing ordinal numbers in ZFC

Today I want to talk about ordinal numbers in ZFC set theory. VSauce does a great job introducing his viewers to the concepts of ordinal vs cardinal numbers, and giving a glimpse into the weird and wild world of mathematical infinity. I want to go a bit deeper, and show exactly how the ordinals are constructed in ZFC. Let’s begin!

Okay, so first of all, we’re talking about first-order ZFC, which is an axiomatic formalization of set theory in first-order logic. As a quick reminder, first order logic gives us access to the following alphabet of symbols: ( ) , ∧ ∨ ¬ → ↔ ∀ ∃ =, as well as an infinite store of variables (x, y, z, w, v, u, and so on). A first order language also includes a store of constant symbols, relation symbols, and function symbols.

For first-order set theory, we are going to add only a single extra-logical symbol to our alphabet: the “is-an-element-of” relation ∈. This is pretty remarkable when you consider that almost all of mathematics can be done with just ZFC! In some sense, you can give a pretty good description of mathematics as the study of the elementhood relation! Using just ∈ we can define everything we need, from ⊆ and ⋃ and ⋂ to the empty set ∅ and the power set function P(x). In fact, as we’ll see, we’re even going to define numbers using ∈!

The elementhood relation ∈ is given its intended meaning by the axiom of extensionality: ∀x∀y (x=y ↔ ∀z (z∈x ↔ z∈y)). In plain English this says that two sets are the same exactly when they have all the same elements.

The semantics of first order logic has two parts: a “universe” of individuals that are quantified over by ∀ and ∃, and an interpretation of each of the constant symbols (as individual objects in the universe), the relation symbols (as maps from the universe to truth values), and the function symbols (as maps from the universe to itself).

Our universe is going to be entirely composed of sets. This means that sets won’t be composed of non-set elements; the elements of non-empty sets are always themselves sets. And those sets themselves, if non-empty, are made out of sets. It’s sets all the way down!

Now, the topic of this essay is ordinal numbers in ZFC. So if everything in ZFC is a set, guess what ordinal numbers will be? You got it, sets! What sets? We can translate from ordinals to sets in a few words: the ordinal 0 is translated as the empty set, and every other ordinal is translated as the set of all smaller ordinals.

This tells us that 1 = {0}, 2 = {0, 1}, 3 = {0, 1, 2}, and so on. If we were to write these ordinals entirely in set notation, it would look like: ∅, {∅}, {∅,{∅}}, {∅,{∅},{∅,{∅}}}, and so on. The choice to associate these particular sets with the natural numbers is a convention introduced by John Von Neumann (it is, however, an exceedingly wise convention, and has many virtues that competing conventions do not have, as will become clearer once we ascend to the transfinite).

So, now we know that we are going to associate the finite ordinals with the empty set and supersets of the empty set. But of course, we haven’t yet even shown that the empty set exists in ZFC! To actually construct the empty set in ZFC, we have to add some more axioms. Let’s start with the obvious one: a sentence that asserts the existence of the empty set:

Axiom of Empty Set: ∃x∀y ¬(y∈x)

In other words, there’s some set x that contains no sets. Notice that we didn’t refer to the empty set by name. We can’t refer to it by name in our axioms, because we haven’t included any constant symbols in our language! To actually talk about the particular set ∅ (equivalently, 0), we can use the rule of existential instantiation, which allows us to remove any existential quantifier, as long as we change the name of the quantified variable to something that has not been previously used. So, for example, in any particular proof, we can do the following:

1.  ∃x∀y ¬(y∈x) (Axiom of Empty Set)
2. ∀y ¬(y∈0) (from 1 by existential instantiation)

This is allowed so long as the symbol 0 has not appeared anywhere previously in the proof.

Now that we have the empty set, we need to be able to construct 1={0} and 2={0,1}. To do this, we introduce the axiom of pairing:

Axiom of Pairing: ∀x∀y∃z∀w (w∈z ↔ (w=x ∨ w=y))

This says that we can take any two sets x and y and form a new set z = {x, y}. We can right away use this axiom to construct the number 1.

3. ∀x∀y∃z∀w (w∈z ↔ (w=x ∨ w=y)) (Axiom of Pairing)
4. ∃z∀w (w∈z ↔ (w=0 ∨ w=0)) (from 3 by universal instantiation)
5. ∀w (w∈1 ↔ (w=0 ∨ w=0)) (from 4 by existential instantiation)
6. ∀w (w∈1 ↔ w=0)

We went from 3 to 4 by using universal instantiation (instantiating both variables x and y as 0), and from 4 to 5 by using existential instantiation (instantiating z as 1, which is allowed because we haven’t used the symbol 1 yet). The step from 5 to 6 is technically skipping a bunch of steps. Even though it’s obvious that we can replace (w=0 ∨ w=0) with w=0 inside the formula, there isn’t any particular rule of inference in first order logic that does it in one step. But since it is obvious that 5 semantically entails 6, and since first order logic has a sound and complete proof system, we know that 5 also syntactically entails 6.

To construct 2, we can simply use pairing again with 0 and 1:

7. ∃z∀w (w∈z ↔ (w=0 ∨ w=1)) (from 3 by universal instantiation)
8. ∀w (w∈2 ↔ (w=0 ∨ w=1)) (from 4 by existential instantiation)

We might think that we could simply use pairing once more to get 3 = {0,1,2}. But this won’t quite work. If we use pairing on 1 and 2 we get {1,2}, and if we use pairing again on this and 0 we get {0,{1,2}}, not {0,1,2}. In fact, we can easily see that any usage of pairing always produces a set with exactly two elements. And the set 3 has three elements! So pairing is not enough to get us where we want to go. To get to 3, we need a stronger axiom, which will allow us to take the union of sets.

Axiom of Union: ∀x∃y∀z (z∈y ↔ ∃w(z∈w ∧ w∈x))

This says that for any set x, we can construct a new set y which consists of the union of all sets in x. In other words, if z is an element of y, then z must be contained in some set w that’s an element of x.

Now, here’s how we’re going to construct 3. First we construct the set {2} by pairing 2 with itself. Then we construct the set {2,{2}} with pairing again. Then we union all the elements of this set to get 2⋃{2}. Now remember that 2 = {0,1}, so 2⋃{2} = {0,1}⋃{2} = {0,1,2}. And that’s 3!

Constructing {2}:

9. ∃z∀w (w∈z ↔ (w=2 ∨ w=2)) (from 3 by universal instantiation)
10. ∀w (w∈{2} ↔ (w=2 ∨ w=2)) (from 9 by existential instantiation)
11. ∀w (w∈{2} ↔ w=2)

Constructing {2,{2}}:

12. ∃z∀w (w∈z ↔ (w=2 ∨ w={2})) (from 3 by universal instantiation)
13. ∀w (w∈{2,{2}} ↔ (w=2 ∨ w={2})) (from 12 by existential instantiation)

Constructing 3:

14. ∀x∃y∀z (z∈y ↔ ∃w(z∈w ∧ w∈x)) (Axiom of Union)
15. ∃y∀z (z∈y ↔ ∃w(z∈w ∧ w∈{2,{2}})) (from 9 by universal instantiation)
16. ∀z (z∈3 ↔ ∃w(z∈w ∧ w∈{2,{2}})) (from 10 by existential instantiation)

Technically, we’ve now constructed 3. But let’s neaten this up and show that this set is really what we wanted (using more of the intuitive semantic arguments from before to skip many tedious steps).

17. ∀z (z∈3 ↔ ∃w(z∈w ∧ (w=2 ∨ w={2}))) (from 13,16)
18. ∀z (z∈3 ↔ ∃w((z∈w ∧ w=2) ∨ (z∈w ∧ w = {2})))
19. ∀z (z∈3 ↔ (z∈2 ∨ z∈{2}))
20. ∀z (z∈3 ↔ (z∈2 ∨ z=2)) (from 11,19)
21. ∀z (z∈3 ↔ (z=0 ∨ z=1 ∨ z=2)) (from 8,20)

We can construct 4 in pretty much the exact same way: first use pairing to construct {3} and then {3,{3}}, and then use union to construct 3⋃{3} = {0,1,2}⋃{3} = {0,1,2,3} = 4. I’ll go through this all formally one more time, more quickly than before:

22. ∀w (w∈{3} ↔ w=3) (pair 3 with 3)
23. ∀w (w∈{3,{3}} ↔ (w=3 ∨ w={3})) (pair 3 with {3})
24. ∀z (z∈4 ↔ ∃w(z∈w ∧ w∈{3,{3}})) (union {3,{3}})
25. ∀z (z∈4 ↔ (z=0 ∨ z=1 ∨ z=2 ∨ z=3)) (neatening up)

It should be clear that this process can continue indefinitely. From the ordinal 4 we can construct 5 by taking the union of 4 and {4}, and from 5 we can construct 6 = 5⋃{5}. And so on. In fact, we can define the successor of any set x as exactly the set x⋃{x}: S(x) = x⋃{x}. And using the above construction, we know that this successor set will always exist!

Wonderful! So now we have an outline for the construction of every natural number. What’s next? What comes after 0, 1, 2, 3, 4, and so on? The first infinite ordinal, ω! Just as 4 is the set {0,1,2,3}, and 10 is the set of all the ordinals before it, ω is defined as exactly the set of all previous ordinals. In other words, ω is the set of all natural numbers! ω = {0,1,2,3,…}.

Now, how can we construct ω in ZFC? Can we do it using the axioms we have so far? You might be tempted to say something like “Sure! You’ve just demonstrated a process that constructs n+1 from any n, and we know how to construct 0 already. So don’t we already have the ability to construct all the natural numbers?”

Well, hypothetical you, it’s true that we now know how to construct each natural number. But constructing the infinite set containing all natural numbers is an entirely different matter. Remember that proofs are only allowed to be finitely long! So in any proof using only the methods we’ve used so far, we can only construct finitely many natural numbers (proportional to the length of the proof). To get ω, we need something more than the axioms we have so far. Introducing: the axiom of infinity!

But before we get there, I want to construct a handy bit of shorthand which will make what comes next a lot easier to swallow. What we’ll do is write out as a first-order sentence the assertion “x is the successor of y”, as well as the sentence “x is the empty set”, and then introduce a shorthand notation for them. Trust me, it will make life a lot easier.

First, “x is the successor of y”, which we can also write as “x = y⋃{y}”. Try this for yourself before reading on! Ok, now that you’re back, here it is: ∀z (z∈x ↔ (z∈y ∨ z=y)). We’ll call this sentence Succ(x,y). So if you ever see “Succ(x,y)” in the future, read it as “x is the successor of y” and know that if we wanted to be fully formal about it we could replace it with “∀z (z∈x ↔ (z∈y ∨ z=y))”.

Good! Now, let’s do the same with the sentence “x is the empty set”, which is the same thing as “x is 0”. Try it for yourself! And now, here it is: ∀y ¬(y∈x). We’ll call this sentence isZero(x).

Now we’re ready for the axiom of infinity!

Axiom of Infinity: ∃x∀y ((isZero(y) → y∈x) ∧ (y∈x → ∃z (Succ(z,y) ∧ z∈x)))

If this axiom looks like a lot to comprehend, imagine it without our shorthand! Conceptually, what’s going on with this axiom is actually pretty simple. We’re just asserting that there exists an infinite set x that contains 0, and that is closed under the successor operation. So this set is guaranteed to contain 0, as well as the successor of 0 (1), and the successor of the successor of 0 (2), and the successor of this (3), and so on forever. (Bonus question: what does the set theoretic universe look like if we remove the axiom of infinity and add its negation as an axiom instead? What mathematical structure is it isomorphic to?)

Quiz question for you: have we now constructed ω? That is, the axiom of of infinity does guarantee us the existence of a set, but are we sure that that set is exactly the set of natural numbers and nothing more?

The answer is no. The axiom of infinity does guarantee us the existence of an infinite set, and we know for sure that this set contains all the natural numbers, but there’s nothing guaranteeing that it doesn’t also contain other sets! To actually obtain ω, we need one more axiom. This axiom will be the most powerful one we’ve seen yet: the axiom of comprehension.

Axiom of Comprehension: ∀x∃y∀z (z∈y ↔ (z∈x ∧ φ(z)))

This tells us that for any set x, we can construct a new set y, which consists of exactly the elements of x that have a certain property φ. In set-builder notation, we can write: y = {z∈x: φ(z)}. (Bonus question: why do we have to define y as the subset of x that satisfies φ? Why not just say that there exists a set of all sets that satisfy φ? This unrestricted comprehension axiom appeared in the early formalizations of set theory, but there was a big problem with it. What was it?)

You may notice that there’s something different about this axiom than the previous ones. What’s up with that symbol φ(z)? Well, φ(z) is a stand-in for any well-formed formula in the language of ZFC, so long as φ contains only z as a free variable. What that means is that there’s actually not one single axiom of comprehension, but a countably infinite axiom schema, one for each well-formed formula φ.

For instance, we have as one instance of the axiom of comprehension that ∀x∃y∀z (z∈y ↔ (z∈x ∧ z=z)). As another instance of the axiom, we have ∀x∃y∀z (z∈y ↔ (z∈x ∧ z≠z)). Both of these are pretty trivial examples: in the first case the set y is exactly the same as x (as all sets are self-identical), and in the second case y is the empty set (as no set z satisfies the property z ≠ z). But we can do the same thing for any property whatsoever, so long as it can be encoded in a sentence of first-order ZFC. (Another bonus question: One of the axioms I’ve mentioned before has now been obviated by the introduction of these new axioms. Can you figure out which it is, and produce its derivation?)

We use the axiom of comprehension to “carve ω out” from the set whose existence is guaranteed by the axiom of infinity (remember, we already know for sure that this set contains all the natural numbers, it’s just that it might contain more elements as well). So what we need is to construct a sentence φ(z) such that the only set z that satisfies the sentence is the set of all natural numbers ω.

There are several such sentences. I’ll briefly present one simple one here. Again we’ll introduce a convenient shorthand for the sake of sanity. Take a look at this sentence that we saw earlier: “∀y ((isZero(y) → y∈x) ∧ (y∈x → ∃z (Succ(z,y) ∧ z∈x)))”. What this sentence says is that x is a superset of ω (it contains 0 and is closed under successorhood). So we’ll call this sentence “hasAllNaturals(x)”.

Now, we can write the following sentence: ∀x (hasAllNaturals(x) → z∈x).

Consider what this sentence says. It tells us that z is an element of every set that contains all the naturals. But one such set is the smallest set containing all the naturals, i.e. ω! So z must be an element of ω. In other words, z is a natural number. So this sentence will do for our definition of φ(z).

φ(z): ∀x (hasAllNaturals(x) → z∈x)

Just like with hasAllNaturals(z) and Succ(x,y) and isZero(x), you should read φ(z) as simply a shorthand for the above sentence. If we really wanted to torture ourselves with formality, we could write out the entire sentence using only the allowed symbols of first-order ZFC.

Now we can finally get ω. Let’s continue with our proof progression from earlier. We left off at 25, so:

26. ∃x∀y ((isZero(y) → y∈x) ∧ (y∈x → ∃z (Succ(z,y) ∧ z∈x))) (Axiom of Infinity)
27. ∀y ((isZero(y) → y∈inf) ∧ (y∈inf → ∃z (Succ(z,y) ∧ z∈inf)))
28. ∀x∃y∀z (z∈y ↔ (z∈x ∧ φ(z))) (Axiom of Comprehension)
29. ∃y∀z (z∈y ↔ (z∈inf ∧ φ(z)))
30. ∀z (z∈ω ↔ (z∈inf ∧ φ(z)))

In going from line 26 to 27, we gave the infinite set guaranteed us by the axiom of infinity a placeholder name, “inf”. Line 30 is what we’ve been aiming for for the last few hundred words, and it honestly looks a little underwhelming. It’s not so immediately clear from this line that ω has all the properties that we want of the natural numbers. But at the same time, we couldn’t write something like “∀z (z∈ω ↔ (z=0 ∨ z=1 ∨ z=2 ∨ …)), because first-order logic is finitary (we aren’t allowed infinitely long sentences). So we have to make do with a definition of ω that may look a little more abstract that we may like. Suffice it to say that line 30 really does serve as an adequate definition of ω. It tells us that ω is the smallest set that contains all natural numbers. From this, we can pretty easily show that any particular natural number is an element of ω, and (less easily) that any other set (say, the set {2} or {5,1}) is not an element of ω.

If you’ve followed so far, give yourself a serious pat on the back. Together we’ve ascended past the realm of the finite to our first transfinite ordinal. This is no small accomplishment. But our journey does not end here. In fact, it has only barely begun. We’re going to start picking up speed from here on out, because as you’ll see, the ground we have yet to cover is much much greater than the ground we’ve covered so far.

The first step is easy. We already saw earlier how you can construct for any set x its successor set x⋃{x}. This construction didn’t rely on our sets being finite, it works just as well for the set ω. So ω has a successor! We’ll call it ω+1! Don’t believe me? I’ll prove it to you:

31. ∀x (x∈{ω} ↔ x=ω) (pair ω with ω)
32. ∀x (x∈{ω,{ω}} ↔ (x=ω ∨ x={ω})) (pair ω with {ω})
33. ∀x (x∈ω+1 ↔ ∃y(x∈y ∧ y∈{ω,{ω}})) (union {ω,{ω}})
34. ∀x (x∈ω+1 ↔ (x∈ω ∨ x=ω)) (neatening up)

There we have it! ω+1 = {0,1,2,3,…,ω}

And it doesn’t stop there: we can construct ω+2 = {0,1,2,3,…,ω,ω+1}. And ω+3 = {0,1,2,3,…,ω,ω+1,ω+2}. And so on, forever! By allowing the existence of one infinity, we’ve actually entailed the existence of an infinity of infinities!

But what’s next? We now have all the finite ordinals, and all the infinite ordinals of the form ω+n for finite n. What comes after this? Clearly, the next ordinal is just the set of all finite ordinals as well as all ordinals of the form ω+n! This ordinal is the smallest ordinal that’s larger than ω+n any finite n. So a natural name for it is ω+ω, or ω⋅2!

So conceptually ω⋅2 makes sense, but can we actually construct it? At first glance, this may seem unlikely. The axiom of infinity contains no guarantee that the infinite set it grants us contains ω, to say nothing of ω+1, ω+2, and the rest. And no finite amount of constructing successors will allow us to make the jump to ω⋅2 (for much the same reason as we needed the axiom of infinity to make the jump to ω). So perhaps we need to have a new axiom asserting the existence of ω⋅2? And then maybe we need a new axiom for ω⋅3, and ω⋅4, and so on forever! That would be a sad situation.

Well, things aren’t quite that bad. It turns out that we do need a new axiom. But we can do better than just guaranteeing the existence of ω⋅2. We’ll introduce an axiom schema that is by far the most powerful of all the axioms of ZFC. This axiom schema will take us far beyond ω⋅2, beyond ω⋅ω even, and far far beyond ω^ω and ω^ω^ω^… with infinitely many exponents. Introducing: the Axiom of Replacement! (dramatic music plays)

But first, we’re going to need to talk a little bit about the set theoretic notion of functions. I promise, it’ll be as quick as I can manage. Remember how we chose our language for ZFC to have no function symbols, only the single relation symbol ∈? This means that we have to build in functions through different means. We’ve already seen a hint of it when we talked about the sentence Succ(x,y) which said that x was the successor of y. This sentence is true for exactly the sets x and y such that x is the successor of y. We can think of the sentence as “selecting” ordered pairs (x,y) such that x = {y}. In other words, this sentence is filling the role of defining the function y ↦ {y}.

The same applies more generally. For any function f(x), we can construct the sentence F(x,y) which asserts “y = f(x)”. For instance, the identity function id(x) = x will be defined by the sentence Id(x,y): “y=x”. Notice that not all functions from sets to sets can be defined in this way, as we only have countably many sentences in our language to work with.

Suppose we’re handed a sentence φ(x,y). How are we to tell if φ(x,y) represents a function or just an ordinary sentence? Well, functions have the property that any input is mapped to exactly one output. We can write this formally: “∀x∀y∀z ((φ(x,y) ∧ φ(x,z)) → y=z)”. For shorthand, we’ll call this sentence isAFunction(φ).

And now we’re ready for the axiom schema of replacement.

Axiom of Replacement: isAFunction(φ) → ∀x∃y∀z (z∈y ↔ ∃w (w∈x ∧ φ(w,z)))

In English, this says that for any definable function f (defined by φ), and for any set of inputs x, the image of x under f exists. Like with the axiom schema of comprehension, this is a countably infinite collection of axioms, one for each well formed formula φ(w,z).

Let’s use this axiom to construct ω⋅2. First we define a function f that takes any finite number n to the ordinal ω+n. (Challenge: Try to explicitly define this function! Notice that the most obvious method for defining the function requires second-order logic. Try to come up with a trick that allows it to be defined in first-order ZFC!) Then we prove that this really is a function. Then we apply the axiom of replacement with our domain as ω (the set of all natural numbers). The image of ω under f is the set {ω,ω+1,ω+2,ω+3,…}. Not quite what we want. But we’re almost there!

Next we use the axiom of pairing to pair this newly constructed set with ω itself, giving us {{0,1,2,…}, {ω,ω+1,ω+2,…}}. And finally, we apply the axiom of union to this set, giving us {0,1,2,…,ω,ω+1,ω+2,…} = ω⋅2!

Phew, that was a lot of work just to make the jump to from ω to ω⋅2. But it was worth it! Now we can also jump to ω⋅3 in the exact same way!

Define a function f that maps n to ω⋅2+n. Now use replacement with ω as the domain to construct the set {ω⋅2, ω⋅2+1, ω⋅2+2, …}. Pair this set with ω⋅2, and union the result. This gives ω⋅3!

In exactly the same way, we can use the axiom of replacement to get ω⋅4, ω⋅5, ω⋅6, and so on to ω⋅n for any finite n! But it doesn’t stop there. We’ve just described a procedure to get ω⋅n for any finite n. So we write it as a function!

Define f as the function that maps n to ω⋅n. Now use replacement of f with ω as the domain to get the set {0, ω, ω⋅2, ω⋅3, ω⋅4, ω⋅5, …}. Apply the axiom of union to this set, and what do you get? Well it has to contain ω⋅n for every finite n, since each ω⋅n is contained in ω⋅m for every larger m. So it’s larger than all ordinals of the form ω⋅n for finite n. This new ordinal we’ve constructed is called ω⋅ω, or ω2.

But of course, our journey doesn’t stop there. We can use replacement to generate ω2 + ω, and ω2 + ω⋅2, ω2 + ω⋅3, and so on. But we can go further and use replacement to generate ω2 + ω2, or ω2⋅2! And from there we get ω2⋅3, ω2⋅4, and so on. And applying replacement to the function from n to ω2⋅n, we get a new ordinal larger than everything before, called ω3.

As you might be suspecting, this goes on forever. We can continually apply the axiom of replacement to get mind-bogglingly large infinities, each greater than the last. But here’s the kicker: all of these infinite ordinals that we’ve created so far? They all have the same cardinality.

Yes, that’s right. You may have thought that we had transcended ω in cardinality long ago, but no. For each infinite ordinal we’ve created so far, there’s a one-to-one mapping between it and ω. Think about it: every time we used replacement, we constructed a function that mapped an existing ordinal to a new one of the same cardinality. And in applying replacement to this function, we guaranteed that the new ordinal we created cannot be a larger cardinality than the existing ordinal! So each time we use replacement with a countable domain, we get a new countable set, each of whose elements is countable. And then if we use pairing or union, we’re always going to stay in the domain of the countable (unioning two countable sets just gets you another countable set, as does unioning a countable infinity of countable sets).

So now turn your mind to the following set: the set of all countable ordinals. Is this set itself an ordinal? It is if it’s the smallest uncountable ordinal, as then it contains every ordinal smaller than itself by definition! And what’s the cardinality of this set? It can’t be countable, because then it would have to be an element of itself! (Bonus question: Why is it that no ordinal can be an element of itself? Use the axiom of extensionality! Bonus question 2: In general, no set can be an element of itself. But this is not guaranteed by the axioms I’ve presented so far. The axiom that does the job is called the axiom of regularity/foundation, and it says that every set must have an element that it is disjoint with. Why does this prevent us from having sets that contain themselves?)

So this ordinal is the first uncountable ordinal. Its name is ω1. The cardinality of ω1 is the smallest cardinality that follows the cardinality of ω, and it’s called “aleph one” and written א‎1.

Now, I’ve already said that the same old paradigm we’ve used so far can’t get us to ω1. So can we get there with ZFC? It turns out that the answer is yes, and at the moment I’m not quite sure exactly how. It turns out that not only can ZFC construct ω1, it can also construct ω2 (the first ordinal with a larger cardinality than ω1), ω3, ω4, and so on forever.

So does ZFC have any limits? Or can we in principle construct every ordinal, using ever more ingenious means to transcend the limitations of our previous paradigms? The answer is no: ZFC is limited. There are ordinals so large that even ZFC cannot prove their existence (these ordinals have what’s called inaccessible cardinalities). To construct these new infinities, we must add them in as axioms, just as we had to for infinity (and indeed, for 0).

One might think that when we get to ordinals that are this mind-bogglingly large, nothing of any consequence could follow from asserting their existence. But this is not the case! Remarkably, if you add these new infinities to your theory, you can prove the consistency of ZFC. (That is, the consistency of the theory I’ve presented so far, which does not have these large cardinal axioms.) And to prove the consistency of this new theory, you must add even larger infinities. And now to prove the consistency of this one, you must expand the universe of sets again! And so on forever.

One might ask: “So how big is the universe of sets really?” At what point do we content ourselves to stop axiomatically asserting new and larger infinities, and say that we’ve obtained an adequate formalization of what the universe of sets looks like? I’m really not sure how to think about this question. Anyway, next up will be ordinal notation, and the notion of computable ordinals!

Meaning ain’t in the brain

I don’t know if there’s a name for the position that the meanings of our terms is pinned down by facts about the brain. The closest I know is semantic internalism, but a semantic internalist could think that meaning is pinned down by facts about qualia, which happen to not be facts about the brain. So I’ll make up a name for this position: call it physicalist semantic internalism.

Now, here’s an argument against physicalist semantic internalism that seems totally right to me.

What I mean by “second-order logical concepts” is the concepts of “and”, “or”, “not”, second-order quantifiers (“for all” and “for some”, ranging over not just objects but properties of objects), and the notions of functions, relations, and concepts.

  1. The semantics of second order logic captures what I mean when I use second-order logical concepts.
  2. No finite set of rules (and correspondingly no finite machine) can pin down the semantics of second order logic.
  3. So no finite machine pins down what I mean when I use second-order logical concepts.
  4. My brain is a finite machine.
  5. So my brain does not pin down what I mean when I use second-order logical concepts.

And here’s another argument along similar lines:

  1. The truth values of sentences about integers are determined by what we mean by integers.
  2. The statement of the satisfiability of each Diophantine equation has a determinate truth value.
  3. The statement of the satisfiability of each Diophantine equation is a statement about integers.
  4. So the satisfiability of each Diophantine equation is fixed by what we mean by integers.
  5. No finite machine can fix the satisfiability of each Diophantine equation.
  6. Our brain is a finite machine.
  7. So the meaning of integers is not contained in the brain.