Ultraproducts and Łoś’s Theorem (Ultra Series 4)

Previous: Hypernaturals in all their glory

First things first, you’re probably asking yourself… how is Łoś pronounced?? I’m not the most knowledgeable when it comes to Polish pronunciation, but from what I’ve seen the Ł is like a “w”, the o is like the vowel in “thought”, and the ś is like “sh”. So it’s something like “wash”. (I think.)

Ok, on to the math! This is probably going to be the hardest post in this series, so I encourage you to read through it slowly and not give up if you start getting lost. Try to work out some examples for yourself; that’s often the best way to get a grasp on an abstract concept. (In general, my biggest tip for somebody starting to dive into serious mathematical content is to not read it like fiction! In fiction, short sentences can safely be read quickly. But math is a language in which complex ideas can be expressed very compactly. So fight your natural urge to rush through short technical sentences, and don’t feel bad about taking your time in parsing them!)

What the heck is an ultraproduct

Last post we defined an ultrafilter. Now let’s define an ultraproduct.

It turns out we’ve already seen an example of an ultraproduct: the hypernaturals. The hyperreals (denoted *ℝ) are another famous example of an ultraproduct, constructed from ℝ in exactly the same way as we obtained the hypernaturals *ℕ from ℕ. For any structure M whatsoever, there exists a “hyperstructure” *M obtained from M via an ultraproduct. So what exactly is an ultraproduct?

We’ll build up to it in a series of six steps.

(1) Choose an index set I.
(2) Choose a first-order language L and a family of L-structures indexed by I (Mi)i∈I.
(3) Define sequences of elements of these structures.
(4) Define an equivalence relation between these sequences.
(5) Construct the set of equivalence classes under this equivalence relation.
(6) Define the interpretations of the symbols of L in this set.

Let’s get started!

(1) First we select an index set I. This will be the set of indices we use when constructing our sequences of elements. If we want our sequences to be countably infinite, then we choose I = ℕ.

(2) Now, fix some first-order language L = <constants, relations, functions>. Consider any family of L-structures, indexed by I: (Mi)i∈I. For instance, if I = ℕ and L is the language of group theory <{e}, ∅, {⋅}>, then our family of L-structures might be (ℤ1, ℤ2, ℤ3, ℤ4, …), where ℤk is the integers-mod-k (i.e. the cyclic group of size k).

(3) Now we consider sequences of the elements of these structures. For instance, hyperreals are built from sequences of real numbers that look like (a0, a1, a2, a3, …). The set of indices used here is {0, 1, 2, 3, …}, i.e. ℕ.

If our index set isn’t countable, then it’s not possible anymore to visualize sequences like this. For instance, if I = ℝ, then our sequence will have an element for every real number. A more general way to formulate sequences is as maps from the index set I to the elements of the component models. The hyperreal sequence (a0, a1, a2, a3, …) can be thought of as a map a: ℕ → ℝ, where a(n) = an for each n ∈ ℕ.

In general, a sequence will be defined as follows:

f: I → U(Mi)i∈I such that f(i) ∈ Mi for each i ∈ I.

In the ℤk example, a sequence might look like (0, 1, 2, 3, 4, …). Note that 0 ∈ ℤ1, 1 ∈ ℤ2, 2 ∈ ℤ3, and so on. On the other hand (2, 1, 2, 3, 4, …) would not be a valid sequence, because there’s no 2 in ℤ1.

Now, consider the set of all sequences: { f: I → U(Mi)i∈I | f(i) ∈ Mi for each i ∈ I }. This set is called the direct product of (Mi)i∈I and is denoted Π(Mi)i∈I.

(4) We want to construct an equivalence relation on this set. We do so by first defining a free ultrafilter U on I. From the previous post, we know that every infinite set has a free ultrafilter on it, so as long as our index set is infinite, then we’re good to go.

With U in hand, we define the equivalence relation on Π(Mi)i∈I:

Let f and g be sequences (f,g: I → U(Mi)i∈I).
Then f ~ g if and only if { i ∈ I | f(i) = g(i) } ∈ U

You might be getting major deja-vu from the last post. Two sequences are said to be equivalent if the set of places-of-agreement is a member of the chosen ultrafilter. Since every free ultrafilter contains all cofinite sets, any two sequences that agree in all but finitely many places will be equivalent.

(5) Now all the pieces are in place. The ultraproduct of (Mi)i∈I with respect to U is the set of equivalence classes of Π(Mi)i∈I with respect to ~. This is typically written Π(Mi)i∈I/U. For a sequence a: I → U(Mi)i∈I we’ll denote the equivalence class it belongs to as [a].

Now, it’s not necessary for our indexed sequence (Mi)i∈I of L-structures to all be distinct. In fact, all the models can be the same, in which case we have (Mi)i∈I = (M)i∈I, and the ultraproduct Π(M)i∈I/U is called an ultrapower. The ultrapower of M with index set I and ultrafilter U can be written compactly as MI/U.

Some examples: The hypernaturals are the ultrapower ℕ/U = Π(ℕ)i∈ℕ/U where U is any free ultrafilter over ℕ. Similarly, the hyperreals are ℝ/U = Π(ℝ)i∈ℕ/U. The hyperintegers are ℤ/U. And so on.

(6) So far we’ve just defined the ultraproduct as a set (the set of equivalence classes of I-indexed sequences of elements from the models (Mi)i∈I). But we want the ultraproduct to have all the same structure as the models that we used as input. In other words, the ultraproduct of a bunch of L-structures will itself be an L-structure. To make this happen, we need to specify how the relation symbols and function symbols of L work in the ultraproduct model.

Here’s how it works. If f is a unary function symbol in the language L, then we define f on Π(Mi)i∈I/U by applying the function elementwise to the sequences. So:

f: Π(Mi)i∈I/U → Π(Mi)i∈I/U is defined as f([a])(i) = f(a(i)) for every i ∈ I.

What if f is a binary function symbol? Then:

f: (Π(Mi)i∈I/U)2 → Π(Mi)i∈I/U is defined as f([a], [b])(i) = f(a(i), b(i)) for every i ∈ I.

This generalizes in the obvious way to trinary function symbols, quaternary function symbols, and so on.

What about relations? Suppose R is a unary relation symbol in the language L. We need to define R on the ultraproduct Π(Mi)i∈I/U, and we do it as follows:

Π(Mi)i∈I/U ⊨ R([a]) if and only if { i ∈ I | Mi ⊨ R(a(i)) } ∈ U.

In other words, Π(Mi)i∈I/U affirms R([a]) if and only if the set of indices i such that Mi affirms R(a(i)) is in the ultrafilter. For example, if R holds for cofinitely many members of the sequence a, then R holds of [a].

If R is binary, we define it as follows:

Π(Mi)i∈I/U ⊨ R([a], [b]) if and only if { i ∈ I | Mi ⊨ R(a(i), b(i)) } ∈ U.

And again, this generalizes in the obvious way.

This fully defines the ultraproduct Π(Mi)i∈I/U as an L-structure! (If you’re thinking ‘what about constant symbols?’, remember that constants are just 0-ary functions)

Say that again, slower

That was really abstract, so let’s go through it again with the (hopefully now-familiar) example of the hypernaturals.

We start by defining an index set I. We choose I = ℕ.

Now define the language we’ll use. This will be the standard language of Peano arithmetic: one constant symbol (0), one relation symbol (<), and three function symbols (S, +, ×).

The family of structures in this language that we’ll consider (Mi)i∈I will just be a single structure repeated: for each i, Mi will be the L-structure ℕ (the natural numbers with 0, <, +, and × defined on it). So our family of structures is just (ℕ)i∈ℕ = (ℕ, ℕ, ℕ, ℕ, …).

Our sequences are functions from I to U(Mi)i∈I such that for each i∈I, f(i) ∈ Mi. For the hypernaturals, I and U(ℕ)i∈ℕ are both just ℕ, so our sequences are functions from ℕ to ℕ. We can represent the function f: ℕ → ℕ in the familiar way: (f(0), f(1), f(2), f(3), …).

The set of all sequences is the set of all functions from ℕ to ℕ. This is the direct product Π(ℕ)i∈ℕ = ℕ.

Now, we take any ultrafilter on I = ℕ. Call it U. We use U to define the equivalence relation on the direct product ℕ:

(a(0), a(1), a(2), …) ~ (b(0), b(1), b(2), …) if and only if { i ∈ ℕ | a(i) = b(i) } ∈ U

And taking equivalence classes of this relation, we’ve recovered our original definition of the hypernatural numbers! ℕ/U = *ℕ. Now we finish up by defining all functions and relations on *ℕ.

Functions are defined pointwise:

0 = [0, 0, 0, 0, …]
S[a(0), a(1), a(2), …] = [Sa(0), Sa(1), Sa(2), …]
[a(0), a(1), a(2), …] + [b(0), b(1), b(2), …] = [a(0) + b(0), a(1) + b(1), a(2) + b(2), …]
[a(0), a(1), a(2), …] ⋅ [b(0), b(1), b(2), …] = [a(0) ⋅ b(0), a(1) ⋅ b(1), a(2) ⋅ b(2), …]

We just have one relation symbol <, and relations are defined according to the ultrafilter:

[a(0), a(1), a(2), …] < [b(0), b(1), b(2), …] iff { i ∈ ℕ | ℕ ⊨ (a(i) < b(i)) } ∈ U

And we’re done!

Łoś’s theorem

Now we’re ready to prove Łoś’s theorem in its full generality. First, let’s state the result:

Fix any index set I, any language L, and any family of L-structures (Mi)i∈I. Choose a free ultrafilter U on I and construct the ultraproduct structure Π(Mi)i∈I/U. Łoś’s theorem says:

For every L-sentence φ, Π(Mi)i∈I/U ⊨ φ if and only if { i ∈ I | Mi ⊨ φ } ∈ U.

A special case of this is where our ultraproduct is an ultrapower of M, in which case it reduces to:

For every L-sentence φ, MI/U ⊨ φ if and only if M ⊨ φ

In other words, any ultrapower of M is elementary equivalent to M!

The proof is by induction on the set of all L-formulas.

Base case: φ is atomic

Atomic sentences are either of the form R(t1, …, tn) or (t1 = t2) for an n-ary relation symbol R and terms t1, … tn.

Suppose φ is R(t1, …, tn). This case is easy: it was literally the way we defined the interpretation of relation symbols in the ultraproduct model that Π(Mi)i∈I/U ⊨ R(t1, …, tn) if and only if {i ∈ I | Mi ⊨ R(t1, …, tn)} ∈ U.

Suppose φ is (t1 = t2). t1 and t2 are terms, so the ultraproduct model (Π(Mi)i∈I/U) interprets them as I-sequences, i.e. functions from I to U(Mi)i∈I such that t1(i) and t2(i) are both in Mi. We’ll write the denotations of t1 and t2 as [t1] and [t2]. Now, [t1] = [t2] iff { i ∈ I | t1 = t2 } ∈ U iff { i ∈ I | Mi ⊨ (t1 = t2) } ∈ U, which is what we want.

Inductive step: φ is ¬ψ, ψ∧θ, or ∃x ψ

Assume that Los’s theorem holds for ψ and θ. Now we must show that it holds for φ

Suppose φ is ¬ψ.

Then Π(Mi)i∈I/U ⊨ φ
iff Π(Mi)i∈I/U ⊭ ψ
iff { i ∈ I | Mi ⊨ ψ } ∉ U (by the inductive hypothesis)
iff { i ∈ I | Mi ⊨ ψ }c ∈ U (by the ultra property of U)
iff { i ∈ I | Mi ⊭ ψ } ∈ U

iff { i ∈ I | Mi ⊨ ¬ψ } ∈ U
iff { i ∈ I | Mi ⊨ φ } ∈ U

Suppose φ is ψ∧θ.

Then Π(Mi)i∈I/U ⊨ φ
iff Π(Mi)i∈I/U ⊨ ψ and Π(Mi)i∈I/U ⊨ θ
iff { i ∈ I | Mi ⊨ ψ } ∈ U and { i ∈ I | Mi ⊨ θ } ∈ U (by the inductive hypothesis)
iff { i ∈ I | Mi ⊨ ψ } ⋂ { i ∈ I | Mi ⊨ θ } ∈ U (by closure-under- of U)
iff { i ∈ I | Mi ⊨ ψ and Mi ⊨ θ } ∈ U
iff { i ∈ I | Mi ⊨ ψ∧θ } ∈ U
iff { i ∈ I | Mi ⊨ φ } ∈ U

Suppose φ is ∃x ψ.

Then Π(Mi)i∈I/U ⊨ φ
iff Π(Mi)i∈I/U ⊨ ∃x ψ
iff Π(Mi)i∈I/U ⊨ ψ(a) for some [a] ∈ Π(Mi)i∈I/U
iff { i ∈ I | Mi ⊨ ψ(a(i)) } ∈ U (by the inductive hypothesis)
iff { i ∈ I | Mi ⊨ ∃x ψ(x) } ∈ U

And that completes the proof! We don’t need to consider ∨, →, ↔, or ∀, because these can all be defined in terms of ¬, ∧, and ∃.

Now we know that the first-order properties of ultraproducts are tied closely to those of their component structures. The ultraproduct of any collection of two-element structures is itself a two-element structure. Same with the ultraproduct of any collection of structures, cofinitely many of which are two-element structures!

The ultraproduct of any collection of PA models is itself a PA model. The ultraproduct of any collection of groups is itself a group. But the ultraproduct of all finite groups need not itself be finite, because “I am finite” isn’t first-order expressible.

And in particular, an ultrapower of a structure M perfectly mimics ALL of the first-order properties of M!

Łoś’s theorem is an incredibly powerful tool we can wield to illuminate the strange structure of the hypernatural numbers. We’re now positioned to discover nonstandard prime numbers, infinitely even numbers, numbers that are divisible by every standard natural number, and infinitely large prime gaps. All of this (and more) in the next post!

Next: Weird nonstandard numbers