Forcing and the Independence of CH (Part 2)

Part 1 here

Part 2: How big is 𝒫(Ο‰)?

Now the pieces are all in place to start applying forcing to prove some big results. Everything that follows assumes the existence of a countable transitive model M of ZFC.

First, a few notes on terminology.

The language of ZFC is very minimalistic. All it has on top of the first-order-logic connectives is a single binary relation symbol ∈. Nonetheless, people will usually talk about theorems of ZFC as if it has a much more elaborate syntax, involving symbols like βˆ… and Ο‰ and β„΅1 and terms like “bijection” and “ordinal”. For instance, the Continuum Hypothesis can be written as “There’s a bijection between 𝒫(Ο‰) and β„΅1“, which is obviously using much more than just ∈. But these terms are all simply shorthand for complicated phrases in the primitive language of ZFC: for instance a sentence “Ο†(βˆ…)” involving βˆ… could be translated as “βˆƒx (βˆ€y Β¬(y ∈ x) ∧ Ο†(x))”, or in other words “there’s some set x that is empty and Ο†(x) is true”.

Something interesting goes on when considering symbols like βˆ… and Ο‰ and β„΅1 that act like proper names. Each name picks out a unique set, but for some of these names, the set that gets picked out differs in different models of ZFC. For example, the interpretation of the name “β„΅1” is model-relative; it’s meant to be the first uncountable cardinal, but there are countable models of ZFC in which it’s actually only countably large! If this makes your head spin, it did the same for Skolem: you can find more reading here.

On the other hand, the name “βˆ…” always picks out the same set. In every model of ZFC, βˆ… is the unique set that contains nothing at all. When a name’s interpretation isn’t model-relative, it’s called absolute. Examples include “βˆ…” and “1,729”. When a name isn’t absolute, then we need to take care to distinguish between the name itself as a syntactic object, and the set which it refers to in a particular model of ZFC. So we’ll write “β„΅1” when referring to the syntactic description of the first uncountable cardinal, and β„΅1M when referring to the actual set that is picked out by this description in the model M.

With that said, let’s talk about a few names that we’ll be using throughout this post:

In M, Ο‰ is the conventional name for the set that matches the description β€œthe intersection of all inductive sets”, where inductive sets are those that contain the ordinal 0 and are closed under successor. In any transitive model of ZFC like M, Ο‰ is exactly β„•, the set of natural numbers. Since we’re restricting ourselves to only considering transitive models, we can treat Ο‰ as if it’s unambiguous; its interpretation won’t actually vary across the models we’re interested in.

Ο‰1 corresponds to the description “the smallest uncountable ordinal”. This description happens to perfectly coincide with the description “the first uncountable cardinal” in ZFC, which has the name β„΅1. But unlike with Ο‰, different transitive models pick out different sets for Ο‰1 and β„΅1. So in a model M, we’ll write Ο‰1M for the set that M believes to be the smallest uncountable ordinal and β„΅1M for the first uncountable cardinal.

Ο‰2 is the name for the first ordinal for which there’s no injection into Ο‰1 (i.e. the first ordinal that’s larger than Ο‰1). Again, this is not absolute: Ο‰2M depends on M (although again, Ο‰2M is the same as β„΅2M no matter what M is). And in a countable model like those we’ll be working with, Ο‰2M is countable. This will turn out to be very important!

𝒫(Ο‰) is the name for the the set that matches the description “the set of all subsets of Ο‰”. Perhaps predictably at this point, this is not absolute either. So we’ll have to write 𝒫(Ο‰)M when referring to M’s version of 𝒫(Ο‰).

Finally, ZFC can prove that there’s a bijection from 𝒫(Ο‰) to ℝ, meaning that this bijection exists in every model. Thus anything we prove about the size of 𝒫(Ο‰) can be carried over to a statement about the size of ℝ. Model-theoretically, we can say that in every model M, |𝒫(Ο‰)M| = |ℝM|. It will turn out to be more natural to prove things about 𝒫(Ο‰) than ℝ.

Okay, we’re ready to make the continuum hypothesis false! We’ll do this by choosing a partially ordered set (P, ≀) whose extension as a Boolean algebra (B, ≀) has the property that no matter what M-generic filter G in B you choose, M[G] will interpret its existence as proof that |𝒫(Ο‰)| β‰₯ β„΅2.

Making |𝒫(Ο‰)| β‰₯ β„΅2

Let P = { f ∈ M | f is a finite partial function from Ο‰2M Γ— Ο‰ to {0,1} }. Let’s have some examples of elements of P. The simplest is the empty function βˆ…. More complicated is the function {((Ο‰+1, 13), 1)}. This function’s domain has just one element, the ordered pair (Ο‰+1, 13), and this element is mapped to 1. The function {((14, 2), 0), ((Ο‰1Mβ€’2, 2), 1)} is defined on two elements: (14, 2) and (Ο‰1Mβ€’2, 2). And so on.

We’ll order P by reverse inclusion: f ≀ g iff f βŠ‡ g. Intuitively, f ≀ g if f is a function extension of g: f is defined everywhere that g is and they agree in those places. For instance, say f = {((Ο‰, 12), 0)}, g = {((Ο‰, 12), 0), ((13, 5), 1)}, h = {((Ο‰, 12), 1)}, and p = {((Ο‰, 11), 1)}. Check your understanding by verifying that (1) g ≀ f, (2) f and h are incomparable and have no common lower bound, and (3) f and p are incomparable but do have a common lower bound (what is it?).

The empty function βˆ… is a subset of every function in P, meaning that βˆ… is bigger than all functions. Thus βˆ… is the top element of this partial order. And every f in P is finite, allowing a nice visualization what the partial order (P, ≀) looks like:

Now, G will be an M-generic ultrafilter in the Boolean extension (B, ≀) of (P, ≀). G is the set that we’re ultimately adding onto M, so we want to know some of its properties. In particular, how will we use G to show that |𝒫(Ο‰)| = β„΅2?

What we’re going to do is construct a new set out of G as follows: first take the intersection of P with G. (remember that G is an ultrafilter in B, so it doesn’t only contain elements of P). Pβ‹‚G will be some set of finite partial functions from Ο‰2M Γ— Ο‰ to {0,1}. We’ll take the union of all these functions, and call the resulting set F. What we’ll prove of F is the following:

(1) F := ⋃(Pβ‹‚G) is a total function from Ο‰2M Γ— Ο‰ to {0,1}
(2) For every Ξ±, Ξ² ∈ Ο‰2M, F(Ξ±, β€’) is a distinct function from F(Ξ², β€’).

A word on the second clause: F takes as input two things: an element of Ο‰2M and an element of Ο‰. If we only give F its first input, then it becomes a function from Ο‰ to {0,1}. For Ξ± ∈ Ο‰2M, we’ll give the name FΞ± to the function we get by feeding Ξ± to F. Our second clause says that all of these functions are pairwise distinct.

Now, the crucial insight is that each FΞ± corresponds to some subset of Ο‰, namely {n ∈ Ο‰ | FΞ±(n) = 1}. So F defines |Ο‰2M|-many distinct subsets of Ο‰. So in M[G], it comes out as true that |𝒫(Ο‰)M| β‰₯ |Ο‰2M|. It’s also true in M[G] that |Ο‰2M| = β„΅2M, and so we get that |𝒫(Ο‰)M| β‰  β„΅1M. This is Β¬CH!

Well… almost. There’s one final subtlety: |𝒫(Ο‰)M| β‰  β„΅1M is what Β¬CH looks like in the model M. For Β¬CH to be true in M[G], it must be that that |𝒫(Ο‰)M[G]| β‰  β„΅1M[G]. So what if |𝒫(Ο‰)M| β‰  |𝒫(Ο‰)M[G]|, or β„΅1M β‰ Β β„΅1M[G]? This would throw a wrench into our proof: it would mean that M[G] believes that M’s version of 𝒫(Ο‰) is bigger than M’s version of β„΅1, but M[G] might not believe that its own version of 𝒫(Ο‰) is bigger than its version of β„΅1. This is the subject of cardinal collapse, which I will not be going into. However, M[G] does in fact believe that 𝒫(Ο‰)M = 𝒫(Ο‰)M[G] and that β„΅1M =Β β„΅1M[G].

Alright, so now all we need to do to show that M[G] believes Β¬CH is to prove (1) that F is a total function from Ο‰2M Γ— Ο‰ to {0,1}, and (2) that for every Ξ±, Ξ² ∈ Ο‰2M, FΞ± is distinct from FΞ². We do this in three steps:

  1. F is a function.
  2. F is total.
  3. For every Ξ±, Ξ² ∈ Ο‰2M, FΞ± is distinct from FΞ².

Alright so how do we know that F is a function? Remember that F = ⋃(Pβ‹‚G). What if some of the partial functions in Pβ‹‚G are incompatible with each other? In this case, their union cannot be a function. So to prove step 1 we need to prove that all the functions in Pβ‹‚G are compatible. This follows easily from two facts: that G is an ultrafilter in B and P is dense in B. The argument: G is an ultrafilter, so for any f, g ∈ P there’s some element f∧g ∈ B that’s below both f and g. All we know about f∧g is that it’s an element of B, but we have no guarantee that it’s also a member of P, our set of partial functions. In other words, we can’t say for sure that f∧g is actually a partial function from Ο‰2M Γ— Ο‰ to {0,1}. But now we use the fact that P is dense in B! By the definition of density, since f∧g is an element of B, P must contain some h ≀ f∧g. Now by transitivity, h ≀ f and h ≀ g. So f and g have a common function extension, meaning they must be compatible functions! Pretty magical right?

So F is a function. But how do we know that it’s total? We prove this by looking closely at the dense subsets of P. In particular, for any Ξ± ∈ Ο‰2M and any n ∈ Ο‰, define DΞ±,n := {f ∈ P | (Ξ±, n) ∈ dom(f)}. This is a dense subset of P. Why? Well, regardless of what Ξ± and n are, any function f in P is either already defined on (Ξ±, n), in which case we’re done, or it’s not, in which case f has a function extension with (Ξ±, n) in its domain. So from any element of P, you can follow the order downwards until you find a function with (Ξ±, n) in its domain. Since DΞ±,n is dense in P and G is M-generic, G must have some element in common with DΞ±,n. Thus G contains an element f of P that has (Ξ±, n) in its domain. This f is one of the functions that we union over to get F, so F must have (Ξ±, n) in its domain as well! And since Ξ± and n were totally arbitrary, F must be total.

Finally, why are the β€œFα”s pairwise distinct? Again we construct dense subsets for our purposes: for any Ξ±, Ξ² ∈ Ο‰2M, define DΞ±,Ξ² := {f ∈ P | fΞ± β‰  fΞ²}. This is clearly dense (you can always extend any f in P to make fΞ± and fΞ² disagree somewhere). So G contains an element f of P for which fΞ± β‰  fΞ². And thus it must also be true of F that FΞ± β‰  FΞ²!

And we’re done! We’ve shown that once we add G to M, we can construct a new set F = ⋃(Pβ‹‚G) such that F encodes |Ο‰2M|-many distinct subsets of Ο‰, and thus that M[G] ⊨ |𝒫(Ο‰)| β‰₯ β„΅2.

Making |𝒫(Ο‰)| β‰₯Β β„΅420

What’s great is that this argument barely relied on the β€œ2” in Ο‰2M. We could just as easily have started with P = {f ∈ M | f is a finite partial function from Ο‰420M Γ— Ο‰ to {0,1}}, constructed an M-generic filter G in the Boolean extension of P, then defined F to be ⋃(Pβ‹‚G).

G is still an ultrafilter in B and P is still dense in B, so F is still a function.

DΞ±,n := {f ∈ P | (Ξ±, n) ∈ dom(f)} is still a dense subset of P for any Ξ± ∈ Ο‰420M and any n ∈ Ο‰, so F is still total.

And DΞ±,Ξ² := {f ∈ P | fΞ± β‰  fΞ²} is still a dense subset of P for any Ξ±, Ξ² ∈ Ο‰420M, so all the β€œFα”s are pairwise distinct.

And now F encodes |Ο‰420M|-many distinct subsets of Ο‰! The final step, which I’m going to skip over again, is showing that M[G] believes that |Ο‰420M| = |Ο‰420M[G]|, i.e. that cardinal collapse doesn’t occur.

And there we have it: this choice of P gives us a new model M[G] of ZFC that believes that |𝒫(Ο‰)| β‰₯Β β„΅420! Note how easy and quick this argument is now that we’ve gone through the argument for how to make M[G] believe |𝒫(Ο‰)| β‰₯ β„΅2. This is a great thing about forcing: once you really understand one application, other applications become immensely simpler and easier to understand.

Making |𝒫(Ο‰)| =Β β„΅1

Okay, we’ve made CH false. Now let’s make it true! This time our choice for (P, ≀) will be the following:

P = {f ∈ M | f is an M-countable partial function from Ο‰1M to 𝒫(Ο‰)M}. Once more we order by reverse inclusion: f ≀ g iff f βŠ‡ g.

A crucial thing to notice is that I’ve merely required the functions in P to be M-countable, rather than countable. For a set to be M-countable is for there to exist an injection in M from set to Ο‰. Any M-countable set is countable, but some countable sets may not be M-countable (like M itself!). In particular, we know that Ο‰1M is actually a countable set, meaning that if we required true countability instead of just M-countability, then P would include some total functions! But M has no injection from Ο‰1M to Ο‰, so no function in P can be total. This will be important in a moment!

We get our M-generic ultrafilter G and define F := β‹‚(Pβ‹‚G). Now we want to show that F is a total and surjective function from Ο‰1M to 𝒫(Ο‰). As before, we proceed in three steps:

  1. F is a function
  2. F is total
  3. F is surjective

Step 1: G is an ultrafilter and P is dense in G, so the same argument works to show that all elements of Pβ‹‚G are compatible: (1) ultrafilter implies that any f, g in Pβ‹‚G have a least upper bound f∧g in B, (2) density of P in G implies that f∧g has a lower bound h in P, so (3) f and g have a common function extension . Thus F is a function.

Step 2: DΞ± := {f ∈ P | Ξ± ∈ dom(f)} is dense in P for any Ξ± ∈ Ο‰1M, so F is total.

Step 3: DA := {f ∈ P | A ∈ image(f)} is dense in P for any A ∈ 𝒫(Ο‰)M. Why? No f in P is total, so any f in P can be extended by adding one more point (Ξ±, A) where Ξ± βˆ‰ dom(f). Thus A ∈ image(F) for any A ∈ 𝒫(Ο‰), so F is surjective.

So F is a total surjective function from Ο‰1M to 𝒫(Ο‰)M, meaning that in M[G] it’s true that |𝒫(Ο‰)M| ≀ β„΅1M. And since ZFC proves that |𝒫(Ο‰)| > β„΅0, it follows that M[G] ⊨ |𝒫(Ο‰)| = β„΅1.

You might have noticed that each of these arguments could just as easily have been made if we had started out with defining P as the finite partial functions in M from Ο‰1M to 𝒫(Ο‰)M. This is true! If we had done this, then we still would have ended up proving that F was a total surjective function from Ο‰1M to 𝒫(Ο‰)M, and therefore that |𝒫(Ο‰)M| = β„΅1M. However, this is where cardinal collapse rears its ugly head: the final step is to prove that |𝒫(Ο‰)M| = |𝒫(Ο‰)M[G]| and that β„΅1M = β„΅1M[G]. This requires that P be the countable partial functions rather than just the finite ones.

And with that final caveat, we see that M[G] believes that |𝒫(Ο‰)| = β„΅1.

Making |𝒫(Ο‰)| ≀ ℡314

Let P = {f ∈ M | f is an M-countable partial function from Ο‰314M to 𝒫(Ο‰)} and order by reverse inclusion: f ≀ g iff f βŠ‡ g.

β‹‚(Pβ‹‚G) is again a total and surjective function from Ο‰314M to 𝒫(Ο‰), following the exact same arguments as in the previous section. Cardinal collapse doesn’t occur here either, so M[G] believes that |𝒫(Ο‰)| ≀ β„΅314.

✯✯✯

We’ve proven the independence of the Continuum Hypothesis from ZFC! We’ve done more: we’ve also shown how to construct models of ZFC in which we have lots of control over the size of 𝒫(Ο‰). Want a model of ZFC in which |𝒫(Ο‰)| is exactly β„΅1729? Just force with the right P and you’re good to go! There’s something a bit unsatisfactory about all this though, which is that in each application of forcing we’ve had to skim over some details about cardinal collapse that were absolutely essential to the proof going through. I hope to go into these details in a future post to close these last remaining gaps.

One thought on “Forcing and the Independence of CH (Part 2)

Leave a comment