# Hypernaturals in all their glory (Ultra Series 3)

Previous: Hypernaturals simplified

### What is an ultrafilter? (with pretty pictures)

To define an ultrafilter we need to first define a filter. Here’s a pretty good initial intuition for what a filter is: a filter on a set X is a criterion for deciding which subsets of X are “large”. In other words, a filter provides us one way of conceptualizing the idea of large and small subsets, and it allows us to do so in a way that gives us more resolution than the cardinality approach (namely, judging the size of a set purely by its cardinality). For example, in a countably infinite set X, the cofinite subsets of X (those that contain all but finitely many elements of X) have the same cardinality as the subsets of X that are infinite but not cofinite. But there’s some intuitive sense in which a set that contains all but finitely many things is larger than a set that leaves out infinitely many things. Filters allow us to capture this distinction.

Alright, so given a set X, a filter F on X is a collection of subsets of X (i.e. it’s a subset of 𝒫(X)) that satisfies the following four conditions:

(i) X ∈ F … “X is large”
(ii) ∅ ∉ F … “the empty set is not large”
(iii) If A ⊆ B and A ∈ F, then B ∈ F … “supersets of large sets are large”
(iv) If A ∈ F and B ∈ F, then A ⋂ B ∈ F … “intersections of large sets are large”

In other words, a filter on X is a set of subsets of X that contains X, doesn’t contain the empty set, and is closed under supersets and intersection. Note that a filter is also closed under union, because of (iii) (the union of A and B is a superset of A).

An ultrafilter is a filter with one more constraint, namely that for any subset of X, either that subset or its complement is in the filter.

(v) For any A ⊆ X, either A ∈ F or (X\A) ∈ F … “a set is either large, or if not, then its complement is large”
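For finite X, conditions (i) through (v) can be checked mechanically. Here is a minimal Python sketch of such a check; the function names (`powerset`, `is_filter`, `is_ultrafilter`, `up`) are my own hypothetical helpers, and the brute-force approach only makes sense for small finite sets.

```python
from itertools import combinations

def powerset(X):
    """All subsets of the finite set X, as frozensets."""
    items = list(X)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_filter(F, X):
    """Check conditions (i)-(iv) for a collection F of subsets of X."""
    X = frozenset(X)
    F = {frozenset(A) for A in F}
    subsets = powerset(X)
    return (
        all(A <= X for A in F)                                  # F ⊆ 𝒫(X)
        and X in F                                              # (i)
        and frozenset() not in F                                # (ii)
        and all(B in F for A in F for B in subsets if A <= B)   # (iii)
        and all(A & B in F for A in F for B in F)               # (iv)
    )

def is_ultrafilter(F, X):
    """A filter that also satisfies condition (v)."""
    X = frozenset(X)
    F = {frozenset(A) for A in F}
    return is_filter(F, X) and all(A in F or X - A in F for A in powerset(X))

def up(S, X):
    """Hypothetical helper: all subsets of X that contain S."""
    S = frozenset(S)
    return [A for A in powerset(X) if S <= A]

X = {0, 1, 2, 3}
print(is_filter(up({0, 1}, X), X))                   # True
print(is_ultrafilter(up({0, 1}, X), X))              # False: neither {0} nor {1,2,3} is in it
print(is_ultrafilter(up({0}, X), X))                 # True
print(is_filter(up({0, 1}, X) + up({2, 3}, X), X))   # False: {0,1} ⋂ {2,3} = ∅ is missing
```

The last example is closed under supersets but not under intersection, which is exactly the kind of failure condition (iv) is there to rule out.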

There’s a nice way to visualize filters and ultrafilters that uses the Hasse diagram of the power set of X. For a concrete example, let X = {a, b}. We can draw the power-set of X as follows:

We draw an arrow from A to B when A is a subset of B. Now, what are the possible filters on X? There are three; see if you can find them all before reading on.

Only two of these are ultrafilters. Which two?

Remember that for an ultrafilter U, every subset or its complement is in U, and never both, since their intersection would be ∅. So an ultrafilter on a finite set always contains exactly half of all subsets. This gives an easy way to rule out the first one.

Another example: let X = {a, b, c}. Then the power-set of X looks like:

Note that we’ve left out some arrows, like the arrow from {a} to {a,b,c}. This is okay, because transitivity of the subset relation makes this arrow redundant. Anyway, what are some filters on X? Here are three of them:

Only one of these is an ultrafilter! You should be able to identify it pretty easily. See if you can pick out the other four filters, and identify which of them are ultrafilters (there should be two). And another exercise: why is the following not a filter?

Does it have any extension that’s an ultrafilter?

One thing to notice is that in all of these examples, when something is in the filter then everything it points to is also in the filter. This corresponds to filters being closed under supersets. Also, for any two things in the filter, their meet (their greatest lower bound; the highest set on the diagram that points to both of them) is also in the filter. This corresponds to closure under intersections.

Imagine that there is a stream flowing up the Hasse diagram through all the various paths represented by arrows. Choose any point on the diagram and imagine dripping green dye into the water at that point. The green color filters up through the diagram until it reaches the top. And everything that’s colored green is in the filter! This captures the idea that filters are closed under superset, but what about intersection? If X is finite, this corresponds to the dye all coming from a single source, rather than it being dripped in at multiple distinct points. The infinite case is a little trickier, as we’ll see shortly.

One other important thing to notice is that whenever we had an ultrafilter, it always contained a singleton. An ultrafilter that contains a singleton is called a principal ultrafilter, and an ultrafilter that doesn’t contain any singletons is called a free ultrafilter. So far we haven’t seen any free ultrafilters, and in fact as long as X is finite, any ultrafilter on X will be principal. (Prove this!) But the situation changes when X is an infinite set.
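If you want a sanity check before attempting the proof, here is a small brute-force sketch in Python. It enumerates every filter on X = {a, b, c}, confirms that each one is the "dye" flowing up from a single source (the intersection of all its members), and confirms that each ultrafilter contains a singleton. The helper names are mine, and of course checking one small set is not a proof.

```python
from itertools import combinations

def powerset(X):
    """All subsets of the finite set X, as frozensets."""
    items = list(X)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def filters_on(X):
    """Brute force: test every nonempty family of subsets of X."""
    X = frozenset(X)
    subsets = powerset(X)
    found = []
    for r in range(1, len(subsets) + 1):
        for family in combinations(subsets, r):
            F = set(family)
            if (X in F and frozenset() not in F
                    and all(B in F for A in F for B in subsets if A <= B)
                    and all(A & B in F for A in F for B in F)):
                found.append(F)
    return found

X = frozenset("abc")
filters = filters_on(X)
ultrafilters = [F for F in filters
                if all(A in F or X - A in F for A in powerset(X))]

for F in filters:
    # The single source of the dye: the intersection of everything in F.
    source = frozenset.intersection(*F)
    assert F == {A for A in powerset(X) if source <= A}

for U in ultrafilters:
    # Every ultrafilter on this finite set contains a singleton.
    assert any(len(A) == 1 for A in U)

print(len(filters), len(ultrafilters))  # 7 3
```

The counts match the earlier exercise: seven filters on a three-element set, three of which are ultrafilters.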

The Hasse diagram for an infinite set is a bit harder to visualize, since now we have uncountably many subsets. But let’s try anyway! What does the Hasse diagram of 𝒫(ℕ) look like? Well, we know that ∅ is at the bottom and ℕ is at the top, so let’s start there.

Next we can draw all the singleton sets. ∅ points at all of these, so we’re not going to bother drawing each individual arrow.

Next we have all the pair sets, and then the triples. Each singleton points at infinitely many pairs, and each pair points at infinitely many triples.

And so on through all finite cardinalities.

Now what? We’ve only exhausted all the finite sets. We can now start from the top with the cofinite sets, those that are missing only finitely many things. First we have the sets that contain all but a single natural number:

Then the sets containing all but a pair of naturals, and so on through all the cofinite sets.

But we’re not done yet. We haven’t exhausted all of the subsets of ℕ; for instance the set of even numbers is neither finite nor cofinite. In fact, there are only countably many finite and cofinite sets, but there are uncountably many subsets of ℕ, so there must be a thick intermediate section of infinite sets that are not cofinite (i.e. infinite sets with infinite complements).

A sanity check that this diagram makes sense: start with a finite set and then add elements until you have a cofinite set. Between the finite set and the cofinite set there’s always an intermediate set that’s infinite but not cofinite. This matches with our image: any path from the finite to the cofinite passes through the middle section.

Now, what would a filter on the naturals look like on this diagram? If our filter is principal, then we can still roughly sketch it the same way as before:

How about an ultrafilter? Depends on whether it’s principal or free. Any principal ultrafilter must look like the third image above; it must start at the “finite” section and filter upwards (remember that principal means that it contains a singleton).

Any principal ultrafilter on ℕ can be written as { A ⊆ ℕ | n ∈ A } for some n ∈ ℕ.
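Because membership in a principal ultrafilter only depends on whether n is in the set, it can be tested directly even though ℕ is infinite. Below is a rough Python sketch, assuming we represent a subset of ℕ by a predicate (a function from naturals to booleans); `principal_ultrafilter`, `complement`, and `intersection` are hypothetical names of my own.

```python
def principal_ultrafilter(n):
    """Membership test for U_n = { A ⊆ ℕ | n ∈ A }.

    Subsets of ℕ are represented as predicates k -> bool (an assumption
    of this sketch, not something the construction requires).
    """
    return lambda A: A(n)

def complement(A):
    return lambda k: not A(k)

def intersection(A, B):
    return lambda k: A(k) and B(k)

evens = lambda k: k % 2 == 0
at_least_3 = lambda k: k >= 3
only_4 = lambda k: k == 4          # the singleton {4}

U4 = principal_ultrafilter(4)

print(U4(evens))                            # True: 4 is even
print(U4(complement(evens)))                # False: exactly one of A, ℕ\A is large
print(U4(intersection(evens, at_least_3)))  # True: closed under intersection
print(U4(only_4))                           # True: U4 contains a singleton
```

Every condition (i) through (v) reduces to a statement about the single number n, which is why principal ultrafilters are so easy to describe.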

What about free ultrafilters? A free ultrafilter contains no singletons. This implies that it contains no finite set. See if you can come up with a proof, and only then read on to see mine.

Suppose that U is a free ultrafilter on X and contains some finite set F. U is free, so it contains no singletons. So for every a ∈ F, the singleton {a} ∉ U. By ultra, X\{a} ∈ U. By closure-under-finite-intersection, the intersection of {X\{a} | a ∈ F} is in U. So X\F ∈ U. But now we have F ∈ U and X\F ∈ U, and their intersection is ∅. So ∅ ∈ U, contradicting filter.

So a free ultrafilter must contain no finite sets, meaning that it contains all the cofinite sets. Since it’s ultra, it also contains “half” of all the intermediate sets. So visually it’ll look something like:

That’s what a free ultrafilter on the naturals would look like if such a thing existed. But how do we know that any such object actually does exist? This is not so trivial, and in fact the proof of existence uses the axiom of choice. Here’s a short proof using Zorn’s Lemma (which is equivalent to choice in ZF).

Let F be any filter on X. Consider the set Ω of all filters on X that extend F. (Ω, ⊆) is a partially ordered set, and for any nonempty chain of filters C ⊆ Ω, the union of C is itself a filter on X. (Prove this!) The union of C is also an upper bound on C, meaning that every nonempty chain of filters has an upper bound. Now we apply Zorn’s Lemma to conclude that there’s a maximal filter U in Ω. Maximality of U means that U is not a proper subset of V for any V ∈ Ω.

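Almost done! U is maximal, but is it an ultrafilter? Suppose not. Then there’s some A ⊆ X such that A ∉ U and (X\A) ∉ U. Every B ∈ U meets A: if A ⋂ B were empty, then B ⊆ X\A, and closure under supersets would put X\A in U. So we can extend U by adding in A, together with all intersections A ⋂ B for B ∈ U and all supersets of these. None of the added sets is empty, so this is a filter that extends F and properly contains U, contradicting maximality. So U is an ultrafilter on X!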

Now, F was a totally arbitrary filter. So we’ve shown that every filter on X has an ultrafilter extension. Now let X be infinite and take the filter on X consisting of all cofinite subsets of X (this is called the Fréchet filter). Any ultrafilter extension of the Fréchet filter also contains all cofinite subsets of X, and thus contains no singletons. So it’s free! Thus any infinite set has a free ultrafilter.

### Hypernatural numbers

Still with me? Good! Then you’re ready for the full definition of the hypernatural numbers, using ultrafilters. Take any free ultrafilter U on ℕ. U contains all cofinite sets and no finite sets, and is also decisive on all the intermediate sets. If you remember from the last post, this makes U a perfect fit for our desired “decisiveness criterion”.

Now consider the set of all countable sequences of natural numbers. Define the equivalence relation ~ on this set as follows:

(a1, a2, a3, …) ~ (b1, b2, b3, …) iff { k ∈ ℕ | ak = bk } ∈ U

Note the resemblance to our definition last post:

(a1, a2, a3, …) ~ (b1, b2, b3, …) iff { k ∈ ℕ | ak = bk } is cofinite

This previous definition corresponded to using the Fréchet filter for our criterion. But since it was not an ultrafilter, it didn’t suffice. Now, with an ultrafilter in hand, we get decisiveness!

Addition and multiplication on the hypernaturals are defined very easily:

[a1, a2, a3, …] + [b1, b2, b3, …] = [a1+b1, a2+b2, a3+b3, …]
[a1, a2, a3, …] ⋅ [b1, b2, b3, …] = [a1⋅b1, a2⋅b2, a3⋅b3, …]

Let’s now define < on the hypernaturals.

(a1, a2, a3, …) < (b1, b2, b3, …) if { k ∈ ℕ | ak < bk } ∈ U

The proof of transitivity in the previous post still works here. Now let’s prove that < is a total order.

Consider the following three sets:

X = { k ∈ ℕ | ak < bk }
Y = { k ∈ ℕ | ak > bk }
Z = { k ∈ ℕ | ak = bk }

The intersection of any pair of these sets is empty, meaning that at most one of them is in U. Could none of them be in U? Suppose X, Y, and Z are not in U. Then ℕ\X and ℕ\Y are in U. So (ℕ\X) ⋂ (ℕ\Y) is in U as well. But (ℕ\X) ⋂ (ℕ\Y) = Z! So Z is in U, contradicting our assumption.

So exactly one of these three sets is in U, meaning that a < b or b < a or a = b. This proves that using an ultrafilter really has fixed the problem we ran into previously. This problem was that the hypernaturals were quite different from the naturals in undesirable ways (like < not being a total order). The natural question to ask now is “Just how similar are the hypernaturals to the naturals?”

The answer is remarkable. It turns out that there are no first-order expressible differences between the naturals and the hypernaturals! Any first-order sentence that holds true of the natural numbers also holds true of the hypernatural numbers! This result is actually just one special case of an incredibly general result called Łoś’s theorem. And in the next post we are going to prove it!

Next up: Łoś’s theorem and ultraproducts!

# Hypernaturals Simplified (Ultra Series 2)

Previous: Introduction

What is a hypernatural number? It is a collection of infinitely long sequences of natural numbers. More precisely, it is an equivalence class of these infinite sequences.

An equivalence class under what equivalence relation? This is a little tricky to describe.

I’ll start with a slight lie to simplify the story. When we see the trouble that results from our simple definition, I will reveal the true nature of the equivalence relation that gives the hypernaturals. In the process you’ll see how the notion of an ultrafilter naturally arises.

So, hypernaturals are all about infinite sequences of natural numbers. Some examples of these:

(0,1,2,3,4,5,6,7,8,9,…)
(0,1,0,2,0,3,0,4,0,5,…)
(1,2,1,2,1,2,1,2,1,2,…)
(0,2,4,6,8,10,12,14,…)
(3,1,4,1,5,9,2,6,5,3,…)

We’ll define an equivalence relation ~ between sequences as follows:

Let x and y be infinite sequences of natural numbers.
Then x ~ y iff x and y agree in all but finitely many places.

For example, (0,1,2,3,4,5,6,…) ~ (19,1,2,3,4,5,6,…), because these two sequences only disagree at one spot (the zero index).

(1,1,2,2,4,4,…) and (1,2,4,8,…) are not equivalent, because these sequences disagree at infinitely many indices (every index besides the zeroth index).

Same with (0,1,2,3,4,5,6,…) and (1,2,3,4,5,6,7,…); even though they look similar, these sequences disagree everywhere.

(2,4,6,8,10,12,14,…) and (2,0,6,0,10,0,14,…) are not equivalent, because these sequences disagree at infinitely many indices (every odd index).

One can easily check that ~ is an equivalence relation, and thus it partitions the set of sequences of naturals into equivalence classes. We’ll denote the equivalence class of the sequence (a1, a2, a3, …) as [a1, a2, a3, …]. These equivalence classes are (our first stab at the definition of) the hypernaturals!

For instance, the equivalence class of (0,0,0,0,0,…) contains (1,4,2,0,0,0,0,0,…), as well as (0,2,4,19,0,0,0,0,…), and every other sequence that eventually agrees with (0,0,0,0,…) forever. So all of these correspond to the same hypernatural number: [0,0,0,0,…]. This object is our first hypernatural number! It is in fact the hypernatural number that corresponds exactly to the ordinary natural number 0. In other words 0 = [0,0,0,0,…].

[1,1,1,1,…] is a distinct equivalence class from [0,0,0,0,…]. After all, the sequences (0,0,0,0,…) and (1,1,1,1,…) disagree everywhere. You might guess that [1,1,1,1,…] is the hypernatural analogue to the natural number 1, and you’d be right!

For any standard natural number N, the corresponding hypernatural number is [N,N,N,N,N,…], the equivalence class of the sequence consisting entirely of Ns.

Now consider the hypernatural [0,1,2,3,4,5,6,…]. Let’s call it K. Does K = N for any standard natural number N? In other words, is (0,1,2,3,4,5,6,…) ~ (N,N,N,N,N,…) true for any finite N? No! Whatever N you choose, it will only agree with (0,1,2,3,4,5,6,…) at one location. We need cofinite agreement, and here we have merely finite agreement. Not good enough! This means that K is our first nonstandard natural number!

How does K relate to the standard naturals in terms of order? We haven’t talked about how to define < on the hypernaturals yet, but it’s much the same as our definition of =.

[a1, a2, a3, …] = [b1, b2, b3, …]
iff
{ k∈ℕ | ak = bk } is cofinite

[a1, a2, a3, …] < [b1, b2, b3, …]
iff
{ k∈ℕ | ak < bk } is cofinite

Exercise: Verify that this is in fact a well-defined relation. Every equivalence class has many different possible representatives; why does the choice of representatives not matter for the purpose of describing order?

Now we can see that K > N for every standard N. Look at (0,1,2,3,4,5,…) and (N,N,N,N,…). The elements of the first sequence are less than the elements of the second sequence only at the first N indices, and the two sequences agree at index N. After that, the elements of K’s sequence are greater than the elements of N’s sequence forever. So elements of K’s representative sequence are greater than elements of N’s representative sequence in a cofinite set of indices. Thus, K > N for every standard N. So K is an infinitely large number!

Here’s another one: K’ = [0,2,4,6,8,…]. You can see that K’ > K, because the elements of K’ are greater than those of K at all but one index (the first one). So we have another, bigger, infinite number.

Addition and multiplication are defined elementwise, so

K + K
= [0,1,2,3,4,…] + [0,1,2,3,4,…]
= [0+0, 1+1, 2+2, 3+3, 4+4, …]
= [0,2,4,6,8,…]
= K’

K’
= [0,2,4,6,8,…]
= [2⋅0, 2⋅1, 2⋅2, 2⋅3, 2⋅4, …]
= 2⋅[0,1,2,3,4,…]
= 2⋅K
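Here is a small Python sketch of the elementwise operations, assuming we model a representative sequence as a function from index to value. The names `add`, `mul`, `const`, and `prefix` are my own, and comparing finite prefixes is only a sanity check on representatives, not a test of equality between hypernaturals.

```python
def add(a, b):
    """Elementwise sum of two sequences (functions k -> a_k)."""
    return lambda k: a(k) + b(k)

def mul(a, b):
    """Elementwise product of two sequences."""
    return lambda k: a(k) * b(k)

def const(n):
    """The constant sequence (n, n, n, ...) representing the standard natural n."""
    return lambda k: n

def prefix(a, length):
    """The first `length` terms of a sequence, for inspection only."""
    return [a(k) for k in range(length)]

K = lambda k: k            # (0, 1, 2, 3, 4, ...)
K_prime = lambda k: 2 * k  # (0, 2, 4, 6, 8, ...)

print(prefix(add(K, K), 5))         # [0, 2, 4, 6, 8]
print(prefix(mul(const(2), K), 5))  # [0, 2, 4, 6, 8]
print(prefix(K_prime, 5))           # [0, 2, 4, 6, 8]
```

In this particular case the three sequences agree at every index, so they certainly represent the same hypernatural.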

Predictably, we get many many infinities. In fact, there are continuum many nonstandard hypernatural numbers!

Proof: we construct an injection f from the nonnegative reals (which have the same cardinality as ℝ) to *ℕ. If x ≥ 0 is a real number, then f(x) := [floor(x), floor(10x), floor(100x), floor(1000x), …]. For example, f(35.23957…) = [35,352,3523,35239,352395, …]. For any two distinct reals x and y, the sequences representing f(x) and f(y) will eventually disagree forever. So each real is mapped to a distinct hypernatural, meaning that there are no more reals than hypernaturals. At the same time, there are no more hypernaturals than reals, because there are only continuum many countable sequences of natural numbers. So |*ℕ| = |ℝ|.
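To see the map f in action, here is a rough Python sketch that computes the first few terms of the representative sequence. It assumes x is a nonnegative rational (so the entries are natural numbers and the arithmetic is exact); `f_prefix` is a hypothetical helper name.

```python
from fractions import Fraction
from math import floor

def f_prefix(x, length):
    """First `length` terms of (floor(x), floor(10x), floor(100x), ...).

    x may be an int, a Fraction, or a decimal string; it is assumed
    to be nonnegative so that every term is a natural number.
    """
    x = Fraction(x)
    return [floor(10**k * x) for k in range(length)]

print(f_prefix("35.23957", 5))  # [35, 352, 3523, 35239, 352395]

# Two distinct inputs: the sequences agree for a while, then disagree for good.
x = Fraction(1, 3)
y = Fraction(334, 1000)
print(f_prefix(x, 6))  # [0, 3, 33, 333, 3333, 33333]
print(f_prefix(y, 6))  # [0, 3, 33, 334, 3340, 33400]
```

Once 10^k times the gap between x and y is at least 1, the floors differ at index k and at every later index, which is why the two sequences land in different equivalence classes.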

It turns out that every nonstandard hypernatural number is also larger than every standard natural number. We’ll see why in a bit, but it’ll take a bit of subtlety that I’ve yet to introduce.

Now, > is transitive in ℕ. Is it also transitive in *ℕ? Yes! Suppose A > B and B > C. Choose any representative sequences (a1, a2, a3, …), (b1, b2, b3, …), and (c1, c2, c3, …) for A, B, and C. Then X = { k∈ℕ | ak > bk } and Y = { k∈ℕ | bk > ck } are both cofinite. The intersection of cofinite sets is also cofinite, so X⋂Y = { k∈ℕ | ak > bk and bk > ck } is cofinite. Since X⋂Y ⊆ { k∈ℕ | ak > ck }, and any superset of a cofinite set is cofinite, { k∈ℕ | ak > ck } is cofinite too. So A > C!

It’s a good sign that > is transitive. But unfortunately, the story I’ve told you thus far starts to break down here. The greater-than relation is a total order on the natural numbers. For any naturals a and b, exactly one of the following is true: a = b, a > b, b > a. But this is not true of the hypernaturals!

Consider the two hypernatural numbers n = [0,1,0,1,0,1,…] and m = [1,0,1,0,1,0,…]. Are n and m equal? Clearly not; they disagree everywhere. So n ≠ m.

Is n > m? No. The set of indices where n’s sequence is greater than m’s sequence is {1, 3, 5, 7, …}, which is not cofinite.

So is m > n? No! The set of indices where m’s sequence is greater than n’s sequence is {0, 2, 4, 6, …}, which is also not cofinite!

So as we’ve currently defined the hypernatural numbers, the > relation is not a total relation on them. This might be fine for some purposes, but we’ll be interested in defining the hypernaturals to mirror the properties of the naturals as closely as possible. So we’ll have to tweak our definition of the hypernaturals. The tweak will occur way back at the start where we defined our equivalence relation on sequences of naturals.
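The failure of totality can be checked mechanically for sequences that are eventually periodic. Here is a Python sketch under that assumption: a sequence is given as a pair (prefix, cycle), where the cycle repeats forever after the prefix. This representation and the helper names are mine; it covers examples like n and m, but not sequences like (0,1,2,3,…).

```python
from math import gcd
from operator import eq, gt, lt

def term(seq, k):
    """k-th term of an eventually periodic sequence given as (prefix, cycle)."""
    prefix, cycle = seq
    if k < len(prefix):
        return prefix[k]
    return cycle[(k - len(prefix)) % len(cycle)]

def holds_cofinitely(rel, a, b):
    """True iff { k | rel(a_k, b_k) } is cofinite.

    Past both prefixes, the pair (a_k, b_k) repeats with period equal to
    the lcm of the two cycle lengths, so checking one full period past
    the prefixes decides the question: a single failure there recurs
    infinitely often.
    """
    start = max(len(a[0]), len(b[0]))
    la, lb = len(a[1]), len(b[1])
    period = la * lb // gcd(la, lb)
    return all(rel(term(a, k), term(b, k))
               for k in range(start, start + period))

n = ([], [0, 1])   # (0,1,0,1,0,1,...)
m = ([], [1, 0])   # (1,0,1,0,1,0,...)

print(holds_cofinitely(eq, n, m))  # False: n = m fails
print(holds_cofinitely(gt, n, m))  # False: n > m fails
print(holds_cofinitely(lt, n, m))  # False: m > n fails

# By contrast, (1,4,2,0,0,0,...) and (0,0,0,...) are equivalent.
print(holds_cofinitely(eq, ([1, 4, 2], [0]), ([], [0])))  # True
```

All three comparisons between n and m come back False, which is the gap in the cofinite criterion described above.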

Recall: we said that two sequences (a1, a2, a3, …) and (b1, b2, b3, …), are equivalent if they agree in all but finitely many places. Said another way: a ~ b if { k∈ℕ | ak = bk } is cofinite. We defined > similarly: a > b if the agreement set for > is cofinite.

The problem with this definition was that it wasn’t definitive enough. There are cases where the agreement set is neither cofinite nor finite. (Like in our previous example, where the agreement set was the evens.) In such cases, our system gives us no direction as to whether a > b or b > a. We need a criterion that still deals with all the cofinite and finite cases appropriately, but also gives us a definitive answer in every other case. In other words, for ANY possible set X of indices of agreement, either X or X’s complement must be considered “large enough” to decide in its favor.

For example, maybe we say that if { k∈ℕ | ak > bk } = {0,2,4,6,8,…}, then a > b. Now our criterion for whether a > b is: the set of indices for which ak > bk is either cofinite OR it’s the evens. This implies that [1,0,1,0,1,0,…] > [0,1,0,1,0,1,…].

Once we’ve made this choice, consistency forces us to also accept other sets besides the evens as decisive. For instance, now compare (0,0,1,0,1,0,…) and (0,1,0,1,0,1,…). The set of indices where the first is greater than the second is {2,4,6,8,…}. But notice that the first differs from (1,0,1,0,…) at only finitely many indices (just the zeroth), meaning that [0,0,1,0,1,0,…] = [1,0,1,0,1,0,…]. The conclusion is that [0,0,1,0,1,0,…] > [0,1,0,1,0,1,…], which says that the set of indices {2,4,6,8,…} must also be decisive. And in general, once we’ve accepted the evens as a decisive set of indices, we must also accept the evens minus any finite set.

The three criteria we’ve seen as desirable for what sets of indices will count as decisive are (1) includes all cofinite sets, (2) for any set X, either X or X’s complement is decisive, and (3) consistency. These requirements turn out to correspond perfectly to a type of mathematical object called a free ultrafilter!

In the next post, we will define ultrafilters and finalize our definition of the hypernatural numbers!