The Ultra Series: Guide

Assumed background knowledge: basic set theory lingo (∅, singleton, subset, power set, cardinality), what is first order logic (structures, universes, and interpretations), what are ℕ and ℝ, what’s the difference between countable and uncountable infinities, and what “continuum many” means.

1 Introduction
Here I give a high-level description of what an ultraproduct is, and provide a few examples. Skippable if you want to jump straight to the math!

2 Hypernaturals Simplified
Here you get a first glimpse of the hypernaturals. It’s a fuzzy glimpse from afar, and our first attempt to define them is overly simplified and imperfect. Nonetheless, we get some good intuitions for how hypernatural numbers are structured, before eventually confronting the problem at the core of the definition.

3 Hypernaturals in all their glory
We draw some pretty pictures and introduce the concept of an ultrafilter. The concept is put to work immediately, allowing us to give a full definition of the hypernaturals with no simplifications. The issues with the previous definition have now been patched, and the hypernaturals are a well-defined structure ripe to be explored.

4 Ultraproducts and Łoś’s theorem
We describe how to pronounce “Łoś”, define what an ultraproduct is, and see how the hypernaturals are actually just the ultraproduct of the naturals. And then we prove Łoś’s theorem!

5 Infinitely Large Primes
With the newfound power of Łoś’s theorem at our hands, we return to the realm of the hypernaturals and start exploring its structure. We describe some infinitely large prime numbers, and prove that there are infinitely many of them. We find more strange infinitely large hypernatural numbers in our exploration: numbers that can be divided by 2 ad infinitum, numbers that are divisible by every finite number, and more. We learn that there’s a subset of the hypernaturals that is arranged just like the positive rational numbers, but that the hypernaturals are not dense.

6 Ultraproducts and Compactness
We zoom out from the hypernaturals, and show that ultraproducts can be used to give the prettiest proof of the compactness theorem for first order logic. We prove it first for countable theories, and then for all theories. We then get a little wild and discuss some meta-logical results involving ultraproducts, definability, and compactness.

7 All About Countable Saturation
We now describe the most powerful property of ultraproducts: countable saturation. And then we prove it! With our new tool, we dive back into the hypernaturals to learn more about their structure. We show that for any countable set of hypernaturals, there’s a hypernaturals that’s divisible by them all, and see that this entails the existence of uncountably many hypernatural primes. We prove that the hypernaturals have uncountable cofinality and coinitiality. And from this we see that no two hypernaturals are countably infinitely far apart; all distances are finite or uncountable! We wrap up with a quick proof that ultraproducts are always either finite or uncountable, and a mind-blowing result that relates ultraproducts to the continuum hypothesis.

7.5 Shorter Proof of Countable Saturation
I give a significantly shorter and conceptually simpler proof of countable saturation than the previous post. Then I wax philosophical for a few minutes about constructivism in the context of ultraproduct-related proofs.

No more than half of anti-sets are big

Background on anti-set theory here.

Suppose that somebody tells you that they have in their hands a model of anti-set theory containing 50 anti-sets. The axiom of anti-infinity gives us some nice immediate restrictions on how large these anti-sets can be. As a quick reminder, the axiom of anti-infinity tells us that every anti-set X must contain another anti-set Y that has a successor that’s not in X. The successor of X is defined to be X ⋃ {X} (an anti-set containing everything in X plus X itself). One big difference between successors in ZFC and successors in anti-set-theory is that anti-sets aren’t always guaranteed to have successors. Another is that anti-sets can be their own successors (this happens if the anti-set contains itself).

Ok, so anti-infinity immediately tells us that there can be no universal set (i.e. a set containing the entire universe). Since every set X must contain a set that has a successor outside X, no set can contain everything, or else it would contain all the successors!

So there are no anti-sets of size 50. How about anti-sets of size 49? It’s recently been proved that there can only be at most 25 sets of size 49. And in general, in a universe of size N no more than half of the anti-sets can be size N-1. Let’s prove it!

First, a naming convention: in a universe of size N, let’s call any sets of size N-1 big sets, and any other sets small sets. Big sets are as big as an anti-set can be. To satisfy anti-infinity, any set X must contain an element Y that has a successor that’s not in X. We’ll call Y the anti-infinity-satisfier for X.

Now, some observations. Suppose that a big set X has a successor. This successor is either size N-1 (if X contains itself) or size N (if not). But no set is size N! So any big set that has a successor, must be its own successor. But this means that big sets cannot be anti-infinity-satisfiers. If a big set X was an anti-infinity-satisfier for some other set Y, then its successor could not be in Y. But its successor is X, and X is in Y!

What this means is that every set must contain a small set as its anti-infinity-satisfier. Now, consider any big set X. X must contain some small set z that serves as an anti-infinity-satisfier for it. But for z to serve as an anti-infinity-satisfier, z must have a successor that’s not in X. What this means is that X must be missing z’s successor.

In other words, every big set must be missing at least one small set’s successor. But remember, big sets are size N-1. This means that they contain every set in the universe besides one. So now we know that the “missing set” that characterizes each big set is always a successor of a small set!

Let’s say that our universe contains k small sets. Then there are at most k successors of small sets. Any big set must be the set of all sets besides one of these successors. Thus by extensionality, there are at most k big sets. Therefore there can never be more big sets than small sets!

And that’s the result: In any finite universe, at most half of the anti-sets are big.

This resolves a conjecture that I posted earlier as an open question: in an N-element model, are there always at least three small sets? The answer is yes. If we had fewer than three small sets, then we’d have to also have fewer than three big sets. This means that our universe is at most size four (two small sets and two big sets). But no four-element models exist! QED.

Epsilon-induction and the cumulative hierarchy

The axiom of foundation says that every non-empty set must contain an element that it is disjoint with. One implication of this is that no set can contain itself. (Why? Suppose ∃x (x ∈ x). Then you can pair x with itself to form {x}. {x} is non-empty, so it must contain an element that it’s disjoint with. But its only element is x, and that element is not disjoint with {x}. In particular, they both have as an element x.)

Another implication of foundation is that there can be no infinite descending sequence of sets. A sequence of sets is a function f whose domain is ω. For each k ∈ ω we write f(k) as fk. Suppose that f is a function with the property that for each n ∈ ω, fn+1 ∈ fn. Then by the axiom of replacement, there would exist the set {f0, f1, f2, …}. Each element of this set contains the following element, so none of the elements are disjoint with the entire set. So the sequence we started with could not exist.

This allows us to prove a pretty powerful and surprising result about set theory. The result: For any sentence Φ, if Φ holds of at least one set, then there must exist a set X such that Φ(X) holds, but Φ doesn’t hold for any element of X.

Formally:

(∃A Φ(A)) → ∃X (Φ(X) ∧ ∀Y ∈ X (¬Φ(Y))).

Suppose this was false. Then Φ would be satisfied by at least one set, and every set that satisfied Φ would contain an element that satisfied Φ. We can use this to build an infinite descending sequence of sets. Take any set that satisfies Φ and call it X0. Since X0 satisfies Φ, it must contain at least one element that satisfies Φ. Define X1 to be any such element. X1 satisfies Φ as well, so it must contain at least one element that satisfies Φ, which we’ll call X2. And so on forever. Since no set can contain itself, Xn+1 is always distinct from Xn. We can use recursion and then replacement to construct the set {X0, X1, X2, …}, which is an infinite descending sequence of sets. Contradiction!

This peculiar principle turns out to be useful to prove some big results, and the proofs are always a little funny and enjoyable to me. I’m not aware of any name for it, so I’ll call it the “far-from-the-tree” principle: for any property that holds of at least one set, there’s a set that has that property, none of whose elements have the property. I’ll now show two big results that are proven by this principle.

Everything’s in the Cumulative Hierarchy

The cumulative hierarchy is constructed recursively as follows:

V0 = ∅
Vα+1 = 𝒫(Vα), for each ordinal α
Vλ = U{Vβ : β < λ}, for each limit ordinal λ

V0 = ∅
V1 = {∅}
V2 = {∅, {∅}}
V3 = {∅, {∅}, {{∅}}, {∅, {∅}}}

One can prove that Vα is transitive for each ordinal α (in other words, that every element of Vα is also a subset of Vα). Also, due to the nature of the power-set operation, for any two ordinals α < β, Vα ⊆ Vβ.

Now, the reason the cumulative hierarchy is interesting is that we can prove that every set is on some level of the hierarchy. We do so with the far-from-the-tree principle:

Suppose that there was a set X that wasn’t on the hierarchy. This means that ∀α, X ∉ Vα. We use the far-from-tree principle with the property of “not being on the hierarchy” to conclude that there must be a set Y such that Y is not on the hierarchy, but every element of Y is on the hierarchy. So for each element y ∈ Y there’s some ordinal αy such that y ∈ Vαy. Take the highest such ordinal, and now we have a level of the hierarchy that every element of Y is on. Now just go up by one level! Since every element of Y was in the previous level, the set consisting of each of those elements (i.e. Y) is in the power set of that level. So Y actually is on the hierarchy! Contradiction.

This shows that any set you choose shows up on some level of the cumulative hierarchy. Incidentally, the rank of a set is defined as the first level that the set appears on, and one can prove that for each ordinal α, rank(α) = α.

∈-Induction

∈-induction is a really strong principle of induction, because it is induction over all sets, not just the natural numbers or the ordinals. The principle of ∈-induction is the following:

Suppose that for every set X, if Φ(x) holds for every x ∈ X, then Φ(X) holds.
Then Φ holds of every set.

Suppose that this principle was false. Then it’s true that for every set X, if Φ(x) holds for every x ∈ X, then Φ(X) holds, but false that Φ holds of every set. So there’s some set Y such that ¬Φ(Y). We use the far-from-the-tree principle with the property “¬Φ”. This tells us that there’s some set Z such that ¬Φ(Z), but Φ(z) for every z in Z. But we already said that if Φ(z) for every z in Z, then Φ(Z)! Contradiction.

I hope you like these applications as much as I do. They feel to me like very strong and surprising results, especially given how simply they can be proved.

Nonstandard integers, rationals, and reals

In discussions of first-order logic and set theory, one often hears about nonstandard natural numbers and their unusual properties. But rarely do you hear about nonstandard integers, nonstandard rationals, or nonstandard reals. This post will be devoted to the ways in which these other number systems pick up unintended interpretations in first-order logic, and in particular first-order ZFC.

Nonstandard ℕ

Nonstandard interpretations of the natural numbers are the most well-known, so I won’t spend much time on them. In ZFC, the set of natural numbers is called ω, and is defined to be the intersection of all inductive sets. An inductive set is one that includes zero and is closed under successorship (if n is in the set, then so is S(n)).

This definition feels so right when you think about it for a little while that it’s hard to see how it could go wrong. How could there be anything OTHER than the standard natural numbers in the intersection of all inductive sets?

Well, one can prove from the compactness of first order logic that there must be such nonstandard numbers, and the proof is so simple and pretty that it’s probably already appeared on this blog six or seven times. Add to ZFC a constant K and the axioms “K ∈ ω”, “K > 0”, “K > 1”, “K > 2”, and so on for all the standard naturals (this is a decidable set of sentences). Every finite subset of this new theory has a model, so by compactness the theory as a whole also has a model. This model is nonstandard by construction, and is also a model of ZFC by monotonicity (removing axioms never removes models).

So how do nonstandard naturals appear in the intersection of all inductive sets? The key term to focus on is the “all” in “all inductive sets”. While ZFC guarantees the existence of an inductive set (by the axiom of infinity) and every definable subset of this set (by comprehension/separation), it does not actually guarantee the existence of ALL subsets of this set. (This is actually a general feature of first-order theories, that they are able to guarantee the existence of all definable subsets of a set, but not all subsets.) There are models where the inductive set given to us by the axiom of infinity is enormous (uncountably large), and all of its subsets contain (in addition to the standard naturals) infinitely descending membership chains of sets, each of which contains every standard natural. In these models, omega is not just the standard naturals, but also includes these infinite elements, these numbers with infinitely many predecessors. And the proof from compactness shows us that we can’t eliminate these nonstandard natural numbers.

Nonstandard ℤ

The usual way of defining the integers in ZFC is as equivalence classes of pairs of natural numbers. In particular, we define the equivalence relation ~ on ω×ω as follows: (a, b) ~ (c, d) if and only if a + d = b + c. Two ordered pairs of naturals are equivalent under this relation so long as their difference is the same. So (1, 0) is in the same class as (2, 1), which is in the same class as (15, 14).

The idea here is that the integer represented by the equivalence class of (a, b) is supposed to be what we think of as a – b. So the integer 2 is the equivalence class {(2, 0), (3, 1), (4, 2), (5, 3), …}. And the integer -2 is the equivalence class {(0, 2), (1, 3), (2, 4), (3, 5), …}. For shorthand, we’ll write the equivalence class of (a, b) as [a, b]. Thus for each positive integer n, we can write n = [n, 0], and for each negative integer n, n = [0, n].

Addition and multiplication can be defined on the integers in terms of addition and multiplication on ω.

[a, b] + [c, d] = [a + c, b + d]
[a, b] ⋅ [c, d] = [a⋅c + b⋅d, a⋅d + b⋅c]

Check to make sure that this makes sense in light of our interpretation of [a, b] as a – b. We can also define new operations on the integers that didn’t apply to the naturals, like negation.

-[a, b] = [b, a]

And we can define an order on the integers as follows:

[a, b] < [c, d] if and only if a + d < b + c

Check to make sure that this makes sense.

Now, the integers are built out of ω, so it makes sense that the nonstandard interpretations of ω carry over to nonstandard interpretations of ℤ. But there are in fact two ways in which integers can be nonstandard.

(1) The natural number 0 refers to the same object (the empty set) in every model of ZFC. But the set of natural numbers ω refers to different objects in different models, as we saw a moment ago. In this sense, ω is not categorically defined, while 0 is. But since integers are infinite sets of pairs of elements of ω, individual integers aren’t categorically defined either!

For instance, the integer 0 is the set {(n, m) ∈ ω×ω | n = m}. The number of elements in this set is the same as the number of elements of ω. So the integer 0 has as many elements as ω! This means that in a nonstandard model of ω that has nonstandard numbers like K, 0 will also contain nonstandard ordered pairs like (K, K) and (K+1, K+1).

(2) Consider the nonstandard element of ω that we’re calling K, which is larger than every standard natural number. There’s an integer [K, 0], which is the equivalence class of the ordered pair (K, 0). [K, 0] > [1, 0], because K + 0 > 0 + 1. And [K, 0] > [55, 0], because K + 0 > 0 + 55. And so on for every finite integer. Just like ω, there are models in which ℤ has integers larger than all the standard integers.

But unlike ω, in nonstandard models ℤ also has nonstandard integers less than all standard integers. Consider -[K, 0] = [0, K]. Again, you can prove that this integer is less than the integer 0, the integer -55, and so on for every standard integer you can think of.

From now on I’ll write the integer [n, 0] as n and the integer [0, n] as -n.

Nonstandard ℚ

Just as the integers were equivalence classes of pairs of naturals, the rationals are equivalence classes of pairs of integers. We define a new equivalence relation (which I’ll use the same symbol for) as follows:

For integers a, b, c, and d, (a, b) ~ (c, d) if and only if a⋅d = b⋅c.

This equates any two pairs of integers that have the same ratio. So (1, 2) ~ (2, 4) ~ (15, 30), and (-33, 17) ~ (66, -34) ~ (-99, 51). The equivalence classes of this relationship are the rational numbers. Like before, we’ll write [a, b] to represent the equivalence class that (a, b) belongs to. You can think of [a, b] as representing the rational number a/b.

Defining addition, multiplication, subtraction, and division on the rationals is pretty easy:

[a, b] + [c, d] = [a⋅d + b⋅c, b⋅d]
[a, b] ⋅ [c, d] = [a⋅c, b⋅d]
[a, b] – [c, d] = [a⋅d – b⋅c, b⋅d]
[a, b] / [c, d] = [a⋅d, b⋅c]

And the order is similarly easy to define in terms of the order on integers:

[a, b] < [c, d] if and only if a⋅d < b⋅c

Once we’ve moved to ℤ, there’s an explosion of nonstandard elements. For instance, we still have nonstandard rationals larger than every ordinary rational (like [K, 1]). But now we also have rationals like [1, K]. This rational is smaller than every ordinary positive rational [1, 10], [1, 100], [1, 1000], and so on. And it is greater than the rational number 0! So in other words we now have infinitesimal rational numbers!

What’s more, we can construct a rational like [K+1, K]. This guy is infinitesimally larger than the rational number 1. So is [K+1, K⋅2], but this one is even closer to 1 than the previous. And we also have [K+1, K2], which is even closer!

The nonstandard rational number line is crowded with these infinitesimal and infinite elements. The ordinary rationals are usually thought to be dense, but we’ve just seen that there are nonstandard rationals that are “too close together” to fit any standard rationals in. However, between any two nonstandard rationals K and K’, there’s another nonstandard rational (K + K’) / 2.

If you’ve heard of the hyperreals, you might be wondering if any of the nonstandard rationals have the same structure. The simple reason why they don’t is that the hyperreals are complete – there are no gaps – whereas the rationals are not. For instance, ZFC can prove that there is no rational [a, b] such that [a, b] ⋅ [a, b] = 2, but there is a hyperreal with that property. To get completion of the number line, we have to move on to the next step, the reals.

Nonstandard ℝ

The construction of the reals from the rationals is slightly different from the previous constructions. Instead of considering ordered pairs of rationals, we’ll consider sets of rationals. In particular, we’ll look at nonempty sets of rationals that are closed downwards (if n is in the set and m < n, then m is in the set also), and have no greatest element, but don’t include all rationals.

Each real number is one of these subsets, and ℝ is the set of all such subsets. So for instance, the real number √2 is the set {x ∈ ℚ : x2 < 2}. The order on the reals is simple to describe:

x ≤ y if and only if x ⊆ y

Addition is also easily defined:

x + y = {r + s : (r ∈ x) ∧ (s ∈ y)}

Multiplication of reals has a messier definition, but it’s nothing too crazy. And importantly, ZFC can prove that ℝ has the following feature: any non-empty subset of ℝ which is bounded above has a least upper bound.

Now, in nonstandard models, there are real numbers like K, 1/K, 1 + 1/K2 and so on. But now there’s also real numbers like √K, 2K, log(K), and any other function you can define on the reals, applied to infinite reals and infinitesimals. So in one sense, the nonstandard models have many more reals than the “ordinary reals” we learn about in calculus.

But the way we constructed the reals as subsets of the rationals opened up a new type of nonstandard phenomena, in which there end up being many fewer reals that we expect there to be. Remember that we identified reals with subsets of the rationals, and that earlier we said that there is in general no way to guarantee the existence of all subsets of an infinite set. The same applies here; when we say that ℝ = {A ⊆ ℚ | A is a Dedekind cut}, we are actually only guaranteeing the existence of those reals that can be definable as Dedekind cuts. So for instance, ZFC can prove the existence of √2 as a real number, because ZFC can define the Dedekind cut {x ∈ ℚ : x2 < 2}. Same with most real numbers you’re probably familiar with, like π and e. But considering that there are only countably many definitions in ZFC, and uncountably many reals, there must be uncountably many reals that are undefinable! These undefinable reals cannot be proven to exist, and thus there are models in which they don’t exist.

In fact, the set ℝ is the first of the sets we’ve discussed that not only has nonstandard interpretations in which it’s too large, but also nonstandard interpretations in which it’s too small. There are models of ZFC where ℝ has only countably many items! There’s a subtlety here, which is that ZFC can prove that |ℝ| > |ω|. How is this possible if there are models where ℝ is countable?

It’s possible because these models, which are missing many real numbers, are ALSO missing lots of functions, in particular the functions that would put ℝ and ω in bijective correspondence! So even though ℝ is “in reality” countable in these models, the model itself doesn’t know that, because it’s missing the functions that prove its countability.

The Anti-Set Program

In ZFC set theory, we specify a collection of sentences within a first-order language to count as our axioms. The models of this collection of sentences are the set-theoretic universes (and many of these models are “unintended” – pesky perversions in which the set of naturals ω is uncountably large, as one example – but we’ll put this aside for this post). Most of the axioms act as constraints that all sets must follow. For instance, the axiom of pairing says that “For any sets x and y, there must exist another set containing as elements just x and y and nothing else”. This is an axiom that begins with universal quantification over the sets of the universe, and then states some requirement that must hold of all these sets.

The anti-set program is what you get when you take each of these restriction axioms, and negate the restriction it imposes on all sets. So, for instance, the axiom of anti-pairing says that for any sets x and y, there must NOT exist any set {x, y}. Contrast this with the simple negation of pairing, which would tell us only that there exist two sets x and y such that their pair doesn’t exist. The anti-axiom is much stronger than the negated axiom, in that it requires NO pairs to exist.

Not all axioms begin with universal quantifiers, in particular the axiom of infinity, which simply asserts that a set exists that satisfies a certain property. To form the axiom of anti-infinity, we simply negate the original axiom (so that no sets with that property exist).

As it turns out, the anti-set program, if applied to ALL the axioms of ZFC, ends in disaster and paradox. In particular, a contradiction can be derived from anti-comprehension, from anti-replacement, and from anti-extensionality. We don’t handle these cases the same way. Anti-comprehension and anti-replacement are simply discarded, being too difficult to patch. By contrast, anti-extensionality is replaced by ordinary extensionality. What’s up with that? The philosophical justification is simply that extensionality, being the most a priori of the bunch, is needed to justify us calling the objects in our universe sets at all.

There’s one last consideration we must address, which regards the axiom of anti-choice. I am currently uncertain as to whether adding this axiom makes the theory inconsistent. One thing that is currently known about the axiom of anti-choice is that with its addition, out go all finite models (there’s a really pretty proof of this that I won’t include here). In the rest of this post, I will be excluding anti-choice from the axioms and only exploring models of anti-ZF.

With that background behind us, let me list the axioms of anti-ZF.

Anti-Pairing
∀x∀y∀z∃w (w ∈ z ∧ w ≠ x ∧ w ≠ y)
No set is the pair of two others.

Anti-Union
∀x∀y∃z (z ∈ y ∧ ¬∃w (z ∈ w ∧ w ∈ x))
No set is the union of another.

Anti-Powerset
∀x∀y∃z (z ∈ y ∧ ∃w (w ∈ z ∧ w ∉ x))
No set contains all existing subsets of another.

Anti-Foundation
∀x∀y (y ∈ x → ∃z (z ∈ x ∧ z ∈ y))
Every set’s members have at least one element in common with it.

Anti-Infinity
∀x∃y ((y is empty ∧ y ∉ x) ∨ (y ∈ x ∧ ∃z (z = S(y) ∧ z ∉ x)))
Every set either doesn’t contain all empty sets, or has an element whose successor is outside the set.

If the axioms of ZF are considered to be a maximally nice setting for mathematics, then perhaps the axioms of anti-ZF can be considered to be maximally bad for mathematics.

We need to now address some issues involving the axiom of anti-infinity. First, the abbreviations used: “y is empty” is shorthand for “∀z (z ∉ y)” and “z = S(y)” is shorthand for “∀w (w ∈ z ↔ (w ∈ y ∨ w = y))” (i.e. z = y ⋃ {y}).

Second, it turns out that from the other axioms we can prove that there are no empty sets. So the first part of the disjunction is always false, meaning that the second part must always be true. Thus we can simplify the axiom of anti-infinity to the following statement, which is logically equivalent in the context of the other axioms:

Anti-Infinity
∀x∃y (y ∈ x ∧ ∃z (z = S(y) ∧ z ∉ x))
Every set has an element whose successor is outside the set.

Let’s now prove some elementary consequences of these axioms.

Theorems

>> There are no empty sets.
Anti-Union ⊢ ¬∃x∀y (y ∉ x)
Suppose ∃x∀y (y ∉ x). Call this set ∅. So ∀y (y ∉ ∅) Then ⋃∅ = ∅. But this implies that the union of ∅ exists. This contradicts anti-union.

>> There are no one-element sets.
Anti-Pairing ⊢ ¬∃x∃y∀z (z ∈ x ↔ z = y)
Suppose that ∃x∃y∀z (z ∈ x ↔ z = y). Then ∃x∃y∀z (z ∈ x ↔ (z = y ∨ z = y)). But this is a violation of anti-pairing, as then x would be the pair of y and y. Contradiction.

>> There are no two-element sets.
Anti-Pairing ⊢ ¬∃x∃y∃z∀w (w ∈ x ↔ (w = y ∨ w = z))
Suppose that ∃x∃y∃z∀w (w ∈ x ↔ (w = y ∨ w = z)). Then w is the pair of y and z, so anti-pairing is violated. Contradiction.

>> No models of anti-ZF have just N-element sets (for any finite N).
Suppose that a model of anti-ZF had only N-element sets. Take any of these sets and call it X. By anti-infinity, X must contain an element with a successor that is outside the set. Call this element Y and its successor S(Y). Y cannot be its own successor, as then its successor would be inside the set. This means that Y ∉ Y. Also, Y is an N-element set by assumption. But since Y ≠ S(Y), S(Y) must contain all the elements of Y in addition to Y itself. So S(Y) contains N+1 elements. Contradiction.

>> There is no set of all sets.
Suppose that there is such a set, and call it X. By Anti-Infinity, X must contain an element Y with a successor that’s outside of X. But no set is outside of X, by assumption! Contradiction.

>> No N-element model of anti-ZF can have N-1 sets that each contain N-1 elements.
Suppose that this were true. Consider any of these sets that contain N-1 elements, and call it X. By the same argument made two above, this set must contain an element whose successor contains N elements. But there are only N sets in the model, so this is a set of all sets! But we already know that no such set can exist. Contradiction.

>> Every finite set of disjoint sets must be its own choice function.
By a choice function for a set X, I mean a set that contains exactly one element in common with each element of X. Suppose that X is a finite set of disjoint sets. Let’s give the elements of X names: A1, A2, A3, …, AN. By anti-foundation, each An must contain at least one element of X. Since the Ans are disjoint, these elements cannot be the same. Also, since there are only N elements of X, no An can contain more than one element of X. So each element of X contains exactly one element of X. Thus X is a choice function for X!

The next results come from a program I wrote that finds models of anti-ZF of a given size.

>> There are no models of size one, two, three or four.

>> There are exactly two non-isomorphic models of size five.

Here are pictures of the two:

Now, some conjectures! I’m pretty sure of each of these, especially Conjecture 2, but haven’t been able to prove them.

Conjecture 1: There’s always a set that contains itself.

Conjecture 2: There can be no sets of disjoint sets.

Conjecture 3: In an N-element model, there are never less than three sets with fewer than N-1 elements.