Polish Notation and Garden-Path Sentences

Polish notation is a mathematical notation system that allows you to eliminate parentheses without ambiguity. It’s called “Polish” because the name of its Polish creator, Jan Łukasiewicz, was too difficult for people to pronounce.

A motivating example: Suppose somebody says “p and q implies r”. There are two possible interpretations of this: “(p and q) implies r” and “p and (q implies r)”. The usual way to disambiguate these two is to simply add in parentheses like I just did. Another way is to set an order-of-operations convention, like that “and” always applies before “implies”. This is what’s used in basic algebra, and what allows you to write 2 + 2 ⋅ 4 without any fear that you’ll be interpreted as meaning (2 + 2) ⋅ 4.

Łukasiewicz’s method was to make all binary connectives into prefixes. So “A and B” becomes “and A B”, “P implies Q” becomes “implies P Q”, and so on. In this system, “(p and q) implies r” translates to “implies and p q r”, and “p and (q implies r)” translates to “and p implies q r”. Since the two expressions are different, there’s no need for parentheses! And in general, no ambiguity ever arises from lack of parentheses when using Polish notation.

If this is your first time encountering Polish notation, your first reaction might be to groan and develop a slight headache. But there’s something delightfully puzzling about reading an expression written in Polish notation and trying to understand what it means. Try figuring out what this means: “implies and not p or q s r”. Algebra can be written in Polish notation just as easily, removing the need for both parentheses AND order-of-operations. “2 + 2 = 4” becomes “+ 2 2 = 4”, or even better, “= + 2 2 4”.
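
If you’d like to see the disambiguation done mechanically, here’s a minimal Python sketch (the function name parse_prefix and the set of recognized connectives are my own choices, not part of the notation itself) that turns a prefix expression back into fully parenthesized infix form:

```python
# Recursively parse a Polish-notation expression into parenthesized infix form.
BINARY = {"and", "or", "implies"}
UNARY = {"not"}

def parse_prefix(tokens):
    """Consume one expression from the front of `tokens`; return (infix string, leftover tokens)."""
    head, rest = tokens[0], tokens[1:]
    if head in BINARY:
        left, rest = parse_prefix(rest)
        right, rest = parse_prefix(rest)
        return f"({left} {head} {right})", rest
    if head in UNARY:
        arg, rest = parse_prefix(rest)
        return f"({head} {arg})", rest
    return head, rest  # an atomic proposition like p, q, r

for s in ["implies and p q r", "and p implies q r"]:
    infix, leftover = parse_prefix(s.split())
    assert not leftover, "extra tokens left over"
    print(f"{s!r}  ->  {infix}")
# 'implies and p q r'  ->  ((p and q) implies r)
# 'and p implies q r'  ->  (p and (q implies r))
```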

Other binary connectives can be treated in Polish notation as well, creating gems like: “If and you’re happy you know it clap your hands!” “When life is what happens you’re busy making plans.” “And keep calm carry on.” “Therefore I think, I am.” (This last one is by of the author the Meditations). Hopefully you agree with me that these sentences have a nice ring to them, though the meaning is somewhat obscured.

But putting connectives in front of the two things being connected is not unheard of. Some examples in English: “ever since”, “because”, “notwithstanding”, “whenever”, “when”, “until”, “unless”. Each of these connects two sentences, and yet can appear in front of both. When we hear a sentence like “Whenever he cheated on a test the professor caught him”, we don’t have any trouble parsing it. (And presumably you had no trouble parsing that entire last sentence either!) One could imagine growing up in a society where “and” and “or” are treated the same way as “ever since” and “until”, and perhaps in this society Polish notation would seem much more natural!

Slightly related to sentential connectives are verbs, which connect subjects and objects. English places its verbs squarely between the subject and the object, as do Chinese, French, and Spanish. But in fact the most common ordering is subject-object-verb! 45% of languages, including Hindi, Japanese, Korean, Latin, and Ancient Greek, use this pattern. So for instance, instead of “She burned her hand”, one would say “she her hand burned”. This is potentially weirder to English-speakers than Polish notation; it’s reverse Polish notation!

9% of languages use Polish notation for verbs (the verb-subject-object pattern). These include Biblical Hebrew, Arabic, Irish, and Filipino. In such languages, it would be grammatical to say “Loves she him” but not “She loves him”. (3% of languages are VOS – loves him she – 1% are OVS – him loves she – and just a handful are OSV – him she loves).

Let’s return to English. Binary prepositions like “until” appear out front, but they also swap the order of the two things that they connect. For instance, “Until you do your homework, you cannot go outside” is the same as “You cannot go outside until you do your homework”, not “You do your homework until you cannot go outside”, which sounds a bit more sinister.

I came up with some examples of sentences with several layers of these binary prepositions, to see whether the same type of confusion we get when examining Polish notation for “and” or “implies” sets in here too. And oh boy, does it.

Single connective
Since when the Americans dropped the bomb the war ended, some claimed it was justified.

Two connectives, unlayered
Since when the Americans dropped the bomb the war ended, when some claimed it was an atrocity others argued it was justified.

Still pretty readable, no? Now let’s layer the connectives.

One layer
Whenever he was late she would weep.
She would weep whenever he was late.

Two layers
Since whenever he was late she would weep, he hurried over.
He hurried over, since she would weep whenever he was late.

Three layers
Because since whenever he was late she would weep he hurried over, he left his wallet at home.
He left his wallet at home, because he hurried over since she would weep whenever he was late.

Four layers
Because because since whenever he was late she would weep he hurried over he left his wallet at home, when he was pulled over the officer didn’t give him a ticket.
The officer didn’t give him a ticket when he was pulled over, because he left his wallet at home because he hurried over since she would weep whenever he was late.

Five layers
When he heard because because since whenever he was late she would weep he hurried over he left his wallet at home, when he was pulled over the officer didn’t give the man a ticket, the mayor was outraged at the lawlessness.
The mayor was outraged at the lawlessness when he heard the officer didn’t give the man a ticket when he was pulled over because he left his wallet at home because he hurried over since she would weep whenever he was late.

Read that last one out loud to a friend and see if they believe you that it makes grammatical sense! With each new layer, things become more and more… Polish. That is, indecipherable. (Incidentally, Polish is SVO just like English). Part of the problem is that when we have multiple layers like this, phrases that are semantically connected can become more and more distant in the sentence. It reminds me of my favorite garden-path sentence pattern:

The mouse the cat the dog chased ate was digested.
(The mouse that (the cat that the dog chased) ate) was digested.
The mouse (that the cat (that the dog chased) ate) was digested.

The phrases that are meant to be connected, like “the mouse” and “was digested” are sandwiched on either side of the sentence, and can be made arbitrarily distant by the addition of more “that the X verbed” clauses.

Does anybody know of any languages where “and” comes before the two conjuncts? What about “or”? English does this with “if”, so it might not be too much of a stretch.

A Self-Interpreting Book

A concept: a book that starts by assuming the understanding of the reader and using concepts freely, and as you go on it introduces a simple formal procedure for defining words. As you proceed, more and more words are defined in terms of the basic formal procedure, so that halfway through, half of the words being used are formally defined, and by the end the entire thing is formally defined. Once you’ve read through the whole book, you can start it over and read from the beginning with no problem.

I just finished a set theory textbook that felt kind of like that. It started with the extremely sparse language of ZFC: first-order logic with a single non-logical symbol, ∈. So the alphabet of the formal language consisted of the following symbols: ∈ ( ) ∧ ∨ ¬ → ↔ ∀ ∃ x ′. It could have even started with a sparser formal language if it were optimizing for alphabet economy: ∈ ( ) ¬ ∧ ∀ x ′ would suffice. As time passed and you got through more of the book, more and more things were defined in terms of the alphabet of ZFC: subsets, ordered pairs, functions from one set to another, transitivity, partial orders, finiteness, natural numbers, order types, induction, recursion, countability, real numbers, and limits. By the last chapter it was breathtaking to read a sentence filled with complex concepts and realize that every single one of these concepts was ultimately grounded in this super simple formal language we started with, with a finitistic sound and complete system of rules for how to use each one.

But could it be possible to really fully define ALL the terms used by the end of the book? And even if it were, could the book be written in such a way as to allow an alien that begins by understanding nothing of your language to read it and, by the end, understand everything in the book? Even worse, what if the alien not only understands nothing of your language, but starts out understanding none of the concepts involved? This might be a nonsensical notion; an alien that can read a book and do any level of sophisticated reasoning but doesn’t understand concepts like “and” and “or”.

One way that language is learned is by “pointing”: somebody asks me what a tree is, so I point to some examples of trees and some examples of non-trees, clarifying which is and which is not. It would be helpful if in this book we could point to simple concepts by means of interactive programs. So, for instance, an e-book where an alien reading the book encounters some exceedingly simple programs that they can experiment with, putting in inputs and seeing what results. So for instance, we might have a program that takes as input either 00, 01, 10, or 11, and outputs the ∧ operation applied to the two input digits. Nothing else would be allowed as inputs, so after playing with the program for a little bit you learn everything that it can do.
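
For concreteness, here’s roughly what such a program could look like as a few lines of Python (this is my own sketch of the idea, not something from an actual book):

```python
# A toy "pointing" program: it accepts exactly the four strings 00, 01, 10, 11
# and returns the ∧ of the two digits; any other input is simply refused.

def and_program(s: str) -> str:
    if s not in {"00", "01", "10", "11"}:
        raise ValueError("the only allowed inputs are 00, 01, 10, 11")
    a, b = (c == "1" for c in s)
    return "1" if (a and b) else "0"

for s in ["00", "01", "10", "11"]:
    print(s, "->", and_program(s))   # 00 -> 0, 01 -> 0, 10 -> 0, 11 -> 1
```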

One feature of such a book would be that it would probably use nothing above first-order logical concepts. The reason is that the semantics of second-order logic cannot be captured by any sound and complete proof system, meaning that there’s no finitistic set of rules one could explain to an alien so that they know how to use the concepts involved correctly. Worse, the set of second-order tautologies is not even recursively enumerable (worse than the set of first-order tautologies, which is merely undecidable), so no amount of pointing-to-programs would suffice. First-order ZFC can define a lot, but can it define enough to write a book on what it can define?

Finiteness can’t be captured in a sound, complete, finitary proof system

Consider the sentence “This blog has finitely many posts.” Do you understand what this sentence means? Then no set of rules (even an infinite set, even an uncomputable one!) can fully capture your reasoning about it in finite-length proofs. This claim may sound shocking, but it can be justified on solid metamathematical grounds.

Another example: the sentence “There are finitely many planets in the universe.” You don’t have to think it’s true, you just have to think you understand what it means. What’s the common theme? It’s the notion of there being ‘finitely many’ of some class of objects. Let’s imagine building a language that has the expressive resources of first-order logic (which are quite modest), plus an additional quantifier F, whose semantics are given by the following rule: Fx φ(x) is satisfied by a model iff there are only finitely many objects in that model that satisfy φ(x).

It turns out that any language consisting of first-order logic plus the quantifier F can’t be axiomatized in any sound, complete, and finitary proof system. Notice that effective enumerability of the rules of the proof system is not a requirement here! So long as the language is strong enough to express the semantics of {∧, ¬, ∀, F, variables xₙ, and relations Rₙ}, no set of sentences and sentence-manipulation rules in that language will suffice to capture these semantics.

Here’s a proof: consider the first-order theory of Peano arithmetic. This theory has nonstandard models (as any theory of arithmetic must have in a logic that is compact). All of these nonstandard models have the following feature: there are numbers that are larger than infinitely many numbers. Think about what this means: this is a common feature of all nonstandard models, so if we could write a sentence that rules out this feature, then we could rule them all out! This is where the quantifier F steps in. With F, we can write the following simple sentence and add it to PA as an axiom:

∀x Fy (y < x)

In English: every number has finitely many numbers less than it. And with that, we’ve ruled out all nonstandard models! So now we have a theory that is categorical for ℕ. And that’s a big deal, metamathematically speaking!

Why? Well, as I’ve talked about in a previous post, you can prove some pretty strong limitative results about any logic that can produce a theory that’s categorical for ℕ. In particular, if we can produce such a theory then its logic cannot be compact. Quick proof: suppose a set of sentences Σ has ℕ as a model. Add to Σ a new constant c and the axioms “c ≠ 0”, “c ≠ 1”, “c ≠ 2”, and so on, and call this new set Σ’. Every finite subset of Σ’ is satisfied by ℕ (just interpret c as a large enough natural number). So by compactness, Σ’ has a model. But this model is nonstandard – it contains an element, the interpretation of c, that isn’t any natural number. And since Σ is a subset of Σ’, this nonstandard model is also a model of Σ. So Σ isn’t categorical for ℕ after all.

So compactness implies that no theory is categorical for ℕ. But compactness follows from three properties of a proof system: soundness (if Σ ⊢ α then Σ ⊨ α), completeness (if Σ ⊨ α then Σ ⊢ α), and the requirement that all proofs are only finitely long (try expressing this last property without F!). Quick proof: If a set of sentences is finitely satisfiable, then every finite subset of it has a model (by definition), so no finite subset of it can be refuted (by soundness), so the entire set can’t be refuted (since any refutation is itself finite and so uses only finitely many of the sentences), so the entire set is satisfiable (by completeness).

So soundness + completeness + finite proofs ⇒ compactness ⇒ the existence of nonstandard models of arithmetic for any theory that has ℕ as a model. Which means that the semantics of F cannot be captured in any sound, complete, and finitary proof system!

Take your pick: either you don’t really understand the semantics of the “finitely many” quantifier F, or no set of rules (not even requiring this set to be finite or computable) can fully capture your reasoning in finite-length proofs.

More information about related extensions of first-order logic and their limitations can be found here. The result I describe here is a rephrasing of results discussed there.

Meaning ain’t in the brain

I don’t know if there’s a name for the position that the meanings of our terms are pinned down by facts about the brain. The closest I know is semantic internalism, but a semantic internalist could think that meaning is pinned down by facts about qualia, which happen to not be facts about the brain. So I’ll make up a name for this position: call it physicalist semantic internalism.

Now, here’s an argument against physicalist semantic internalism that seems totally right to me.

What I mean by “second-order logical concepts” is the concepts of “and”, “or”, “not”, second-order quantifiers (“for all” and “for some”, ranging over not just objects but properties of objects), and the notions of functions, relations, and concepts.

  1. The semantics of second order logic captures what I mean when I use second-order logical concepts.
  2. No finite set of rules (and correspondingly no finite machine) can pin down the semantics of second order logic.
  3. So no finite machine pins down what I mean when I use second-order logical concepts.
  4. My brain is a finite machine.
  5. So my brain does not pin down what I mean when I use second-order logical concepts.

And here’s another argument along similar lines:

  1. The truth values of sentences about integers are determined by what we mean by integers.
  2. The statement of the satisfiability of each Diophantine equation has a determinate truth value.
  3. The statement of the satisfiability of each Diophantine equation is a statement about integers.
  4. So the satisfiability of each Diophantine equation is fixed by what we mean by integers.
  5. No finite machine can fix the satisfiability of each Diophantine equation (see the search sketch after this list).
  6. Our brain is a finite machine.
  7. So the meaning of integers is not contained in the brain.
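
Premise 5 leans on the undecidability of Hilbert’s tenth problem: there is no algorithm that decides, for every Diophantine equation, whether it is satisfiable. Brute-force search illustrates one half of the asymmetry – it can eventually confirm a solvable equation, but it can never certify unsolvability. A rough Python sketch of that (the particular equations and the helper search_solution are my own choices for illustration):

```python
from itertools import count, product

def search_solution(equation, max_value=None):
    """Search triples of positive integers for a solution to equation(x, y, z) == 0."""
    bounds = count(1) if max_value is None else range(1, max_value + 1)
    for bound in bounds:
        for x, y, z in product(range(1, bound + 1), repeat=3):
            if equation(x, y, z) == 0:
                return (x, y, z)
    return None  # only reachable when max_value is set; no general "unsatisfiable" verdict exists

# x^2 + y^2 - z^2 = 0 is satisfiable, and the search finds a witness quickly:
print(search_solution(lambda x, y, z: x**2 + y**2 - z**2))                # (3, 4, 5)

# x^3 + y^3 - z^3 = 0 has no positive solutions (Fermat), so an unbounded search would
# run forever; capping it only tells us "none found so far", not "unsatisfiable":
print(search_solution(lambda x, y, z: x**3 + y**3 - z**3, max_value=30))  # None
```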

On philosophical progress

A big question in the philosophy of philosophy is whether philosophers make progress over time. One relevant piece of evidence that gets brought up in these discussions is the lack of consensus on age-old questions like free will, normative ethics, and the mind-body problem. If a discipline is progressing steadily towards truth with time, the argument goes, then we should expect that questions that have been discussed for thousands of years should be more or less settled by now. After all, that is what we see in the hard sciences; there are no lingering disputes over the validity of vitalism or the realm of applicability of Newtonian mechanics.

There are a few immediate responses to this line of argument. It might be that the age old questions of philosophy are simply harder than the questions that get addressed by physicists or biologists. “Harder” doesn’t mean “requires more advanced mathematics to grapple with” here, but something more like “it’s unclear what even would count as a conclusive argument for one position or another, and therefore much less clear how to go about building consensus.” Try to imagine what sort of argument would convince you of the nonexistence of libertarian free will with the same sort of finality as a demonstration of time dilation convinces you of the inadequacy of nonrelativistic mechanics.

A possible rejoinder at this point would be to take after the logical positivists and deny the meaningfulness or at least truth-aptness of the big questions of philosophy as a whole. This may go too far; it may well be that a query is meaningful but, due to certain epistemic limitations of ours, forever beyond our ability to decide. (We know for sure that such queries can exist, due to Gödelian discoveries in mathematics. For instance, we know of the existence of a series of numbers that are perfectly well defined, but for which no algorithm can exist to enumerate all of them. The later numbers in this sequence will forever be a mystery to us, and not for lack of meaningfulness.)

I think that the roughly correct position to take is that science is largely about examining empirical facts-of-the-matter, whereas philosophy is largely about analyzing and refining our conceptual framework. While we have a fairly clear set of standards for how to update theories about the empirical world, we are not in possession of such a set of standards for evaluating different conceptual frameworks. The question of “what really are the laws governing the behavior of stuff out there” has much clearer truth conditions than a question like “what is the best way to think about the concepts of right and wrong”; i.e., it’s clearer what counts as a good answer and what counts as a bad answer.

When we’re trying to refine our concepts, we are taking into account our pre-theoretical intuitions (e.g. any good theory of the concept of justice must have something to do with our basic intuitive conception of justice). But we’re not just satisfied to describe the concept solely as the messy inconsistent bundle of intuitions that constitute our starting position on it. We also aim to describe the concept simply, by developing a “theory of justice” that relies on a small set of axioms and from which (the hope is) the rest of our conclusions about justice follow. We want our elaboration of the concept to be consistent, in that we shouldn’t simultaneously affirm that A is an instance of the concept and that A is not an instance of the concept. Often we also want our theory to be precise, even when the concept itself has vague boundaries.

Maybe there are other standards besides these four: intuitiveness, simplicity, consistency, and precision. And the application of these standards is very rarely made explicit. But one thing that’s certain is that different philosophers have different mixes of these values. One philosopher might value simplicity more or less than another, and it’s not clear that one of them is doing something wrong by having different standards. Put another way, I’m not convinced that there is one unique right set of standards for conceptual refinement.

We may want to be subjectivists to some degree about philosophy, and say that there are a range of rationally permissible standards for conceptual refinement, none better than any other. This would have the result that on some philosophical questions, multiple distinct answers may be acceptable but some crazy enough answers are not. Maybe compatibilism and nihilism are acceptable stances on free will but libertarianism is not. Maybe dualism and physicalism are okay but not epiphenomenalism. And so on.

This view allows for a certain type of philosophical progress, namely the gradual ruling out of some philosophical positions as TOO weird. It also allows for formation of consensus, through the discovery of philosophical positions that are the best according to all or most of the admissible sets of standards. I think that one example of this would be the relatively recent rise of Bayesian epistemology in philosophy of science, and in particular the Bayesian view of scientific evidence as something quantified by a simple ratio of credences. In brief, what does it mean to say that an observation O gives evidence for a hypothesis H? The Bayesian not only has an answer to this, but to the more detailed question of to what degree O gives evidence for H. The quantity is cr(O | H) / cr(O), where cr(.) is a credence function encoding somebody’s beliefs before observing O. (By Bayes’ theorem, this ratio equals cr(H | O) / cr(H), the factor by which observing O multiplies your credence in H.) If this quantity is equal to 1, then O is no evidence for H. If it is greater than 1, then O is evidence for H. And if it’s less than 1, then O is evidence against H.
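
To make the arithmetic concrete, here’s a toy calculation in Python (all the numbers are invented purely for illustration):

```python
# Toy illustration of the evidence measure cr(O | H) / cr(O).
cr_H = 0.2             # prior credence in hypothesis H
cr_O_given_H = 0.9     # credence in observing O if H is true
cr_O_given_notH = 0.3  # credence in observing O if H is false

# cr(O) by the law of total probability: 0.9*0.2 + 0.3*0.8 = 0.42
cr_O = cr_O_given_H * cr_H + cr_O_given_notH * (1 - cr_H)

evidence_ratio = cr_O_given_H / cr_O      # ≈ 2.14 > 1, so O is evidence for H

# By Bayes' theorem, this is exactly the factor by which observing O boosts cr(H):
cr_H_given_O = cr_O_given_H * cr_H / cr_O                 # ≈ 0.43
assert abs(cr_H_given_O / cr_H - evidence_ratio) < 1e-9
print(round(evidence_ratio, 3), round(cr_H_given_O, 3))   # 2.143 0.429
```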

Not everything in Bayesian epistemology is perfectly uncontroversial, but I would argue that on this particular issue – the issue of how to best formalize the notion of scientific evidence – the Bayesian definition survives all its challenges unscathed. What are some other philosophical questions on which you think there has been definite progress?

Logic on Planet Zorko

A group of Zorkan mathematicians are sitting around having a conversation in a language that you are unfamiliar with. You are listening in with a translator. This translator is an expert in formal logic, and has decided to play the following game with you. He says:

“After listening to the full conversation, I will translate all the sentences that were said for you. But I won’t translate them into English; I want something more universal. Instead, I will choose a formal language that captures the mathematical content of all the sentences said, while leaving out the vagaries and subtleties of the Zorkan language. I will describe to you the semantics of the formal language I choose, if you don’t already know it.”

“Furthermore,” (says the translator) “I happen to be intimately familiar with Zorkan society and culture. The Zorkans are having a discussion about one particular mathematical structure, and I know which one that is. The mathematicians are all fantastically precise reasoners, such that none of them ever says a sentence that is false of the structure that they are discussing.”

(So for instance if they are talking about the natural numbers, then no mathematician will say “0 = 1”, and if they are talking about abelian groups, then no mathematician will say “∃x∃y (xy ≠ yx)”. But they could say “∃x∃y (xy ≠ yx)” if they are talking about non-abelian groups.)

You know nothing about Zorkan psychology, besides that the Zorkan way of life is so utterly foreign to you that you cannot reliably assume that the mathematical structures that come most naturally to you will also come naturally to them. It might be, for instance, that nonstandard models of arithmetic are much more intuitive to them than the natural numbers. You cannot assume that the structure they are discussing is the one that you think is “most natural”; you can only conclude this if one of them says a sentence that is true of that model and no others.

The conversation finishes, and you are tasked with answering the following two questions:

(1) What structure are they talking about?
(2) Can you come up with a verification procedure for the mathematicians’ sentences (including possible future sentences they might say on the topic)?

So, that’s the setup. Now, the question I want you to consider is the following: Suppose that the structure that the mathematicians have in mind is actually the natural numbers. Is there some conversation, any conversation at all (even allowing infinitely long conversations, and uncomputable conversations – conversations which cannot be produced as the output of any Turing machine), that the mathematicians could have, and some translation of this conversation, such that you can successfully answer both (1) and (2)? If so, what is that conversation? And if not, then why not?

✯✯✯

Let’s work out some simple examples.

Example 1

Suppose the conversation is translated into a propositional language with three atomic propositions {P, Q, R}.

Mathematician A: “P ∨ Q”
Mathematician B: “(Q ∨ R) → (¬P)”
Mathematician C: “R”

From this conversation, you can deduce that the model they are talking about is the one that assigns “False” to P, “True” to Q, and “True” to R. (C’s statement makes R true; then B’s statement forces ¬P, so P is false; and then A’s statement forces Q to be true.)

M: {P is false, Q is true, R is true}

This is the answer to question 1!

As for the second question, we want to know if there’s some general procedure that produces all the future statements the mathematicians could make. For instance, the set generated by our procedure should include (Q ∧ R) but not (Q ∧ P).

It turns out that such a procedure does exist, and is not too difficult to write out and implement.
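
For the curious, here’s one way such a procedure might look – a minimal Python sketch (the formula representation and the helper names are my own choices) that enumerates formulas over {P, Q, R} up to a given depth and keeps exactly the ones true in the deduced model:

```python
from itertools import product

M = {"P": False, "Q": True, "R": True}   # the model deduced in Example 1
SYMBOLS = {"and": "∧", "or": "∨", "imp": "→"}

def evaluate(f, model):
    """Evaluate a formula, represented as a nested tuple, under a truth assignment."""
    kind = f[0]
    if kind == "atom": return model[f[1]]
    if kind == "not":  return not evaluate(f[1], model)
    if kind == "and":  return evaluate(f[1], model) and evaluate(f[2], model)
    if kind == "or":   return evaluate(f[1], model) or evaluate(f[2], model)
    if kind == "imp":  return (not evaluate(f[1], model)) or evaluate(f[2], model)

def formulas(depth):
    """All formulas over P, Q, R built with at most `depth` layers of connectives."""
    fs = [("atom", p) for p in M]
    for _ in range(depth):
        fs = fs + [("not", f) for f in fs] + \
             [(op, f, g) for op in ("and", "or", "imp") for f, g in product(fs, repeat=2)]
    return fs

def show(f):
    if f[0] == "atom": return f[1]
    if f[0] == "not":  return "¬" + show(f[1])
    return f"({show(f[1])} {SYMBOLS[f[0]]} {show(f[2])})"

true_in_M = [show(f) for f in formulas(1) if evaluate(f, M)]
print(true_in_M)   # includes (Q ∧ R) but not (Q ∧ P), as required
```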

Example 2

Take the above conversation and modify it slightly:

Mathematician A: “P ∨ Q”
Mathematician B: “(Q ∨ R) → (¬P)”
Mathematician C: “¬R”

If you work it out, you’ll see that question 1 can no longer be answered unambiguously. The problem is that there are multiple models of the sentences that the mathematicians are saying:

M1: {P is false, Q is true, R is false}
M2: {P is true, Q is false, R is false}

So even though they have one particular structure in mind, you don’t have enough information from their conversation to figure out exactly what that structure is.

Now let’s think about the answer to question 2. We don’t know whether the mathematicians are thinking about M1 or M2, and M1 and M2 differ in what truth value they assign the proposition P. So we can’t construct an algorithm that will generate the set of all their possible future statements, as this would require us to know, in particular, whether P is true or false in the model that they have in mind.

We might suspect that this holds true generally: if you can’t answer question 1, then you won’t be able to answer question 2 either. But we might also wonder: if we can answer question 1, then can we also always answer question 2?

The answer is no, as the next example will show.

Example 3

For this conversation, the translation is in second-order logic. This will allow us to talk about more interesting mathematical structures than before; namely, structures that have a domain of objects on which functions and predicates can act. In particular, we’re in a second-order language with one constant symbol “c” and one function symbol “f”. Here’s the conversation:

Mathematician A: ¬∃x (f(x) = c)
Mathematician B: ¬∃x∃y ((f(x) = f(y)) ∧ ¬(x = y))
Mathematician C: ∀R (R(c) ∧ ∀x(R(x) → R(f(x))) → ∀x R(x))

Notice that the only truly second-order sentence is the third one, in which we quantify over a predicate variable R rather than an individual variable x, y, z, …. But the second-order status of this sentence means that the translator could not possibly have translated this conversation into a first-order language, much less a propositional one.

This time, questions 1 and 2 are much harder to answer than before. But if you work it out, you’ll see that there is exactly one mathematical structure that satisfies all three of the mathematicians’ statements. And that structure is the natural numbers!

So, we know exactly what structure the mathematicians have in mind. But can we also answer question 2 in the positive? Can we produce some verification procedure that will allow us to generate all the future possible sentences the mathematicians could say? Unfortunately, the answer is no. There is no sound and complete proof system for second-order logic, so in particular, we have no general algorithm for producing all the truths in this second order language. So sad.

Example 4

Now let’s move to first-order logic for our final example. The language of translation will be a first-order language with a constant symbol for every natural number {0, 1, 2, 3, …}, function symbols for ordinary arithmetic {+, ×}, and a relation symbol for order {≥}.

Imagine that the conversation consists of literally all the first-order sentences in the language that are true of the natural numbers. Anything which you can say in the language, and which is true as a statement about ℕ, will be said at some point. This will obviously be a very long conversation, and in fact infinitely long, but that’s fine. It will include sentences like “0 ≠ 1”, “0 ≠ 2”, “0 ≠ 3”, and so on.  (These Zorkans are extremely thorough.)

Given this conversation, can we answer (1) and (2)? Take a guess; the answer may surprise you!

It turns out that even though we can answer (2) positively – we can actually produce an algorithm that will generate, one by one, all the possible future statements of the mathematicians (which really means all the sentences in the language that are true of the natural numbers) – we cannot answer (1) positively! There are multiple distinct mathematical structures that are compatible with the entirety of true statements about natural numbers in the language. Earlier we hypothesized that any time we have a negative answer to (1), we will also have a negative answer to (2). But this is not true! We can verify all the true statements about natural numbers in the language… without even knowing that we’re actually talking about the natural numbers! This is an important and unintuitive consequence of the expressive limitations (and in particular, of the compactness) of first-order logic.

The Takeaway

We had an example where we could answer both (1) and (2) for a simple mathematical structure (a model of propositional logic). And we saw examples for natural numbers where we could answer (1) but not (2), as well as examples where we could answer (2) but not (1). But we haven’t yet seen an example for natural numbers where we had both (1) and (2). This is no coincidence!

It is actually a consequence of the theorem I proved and discussed in my last post that no such conversation can exist. When structures at least as complicated as the natural numbers are being discussed in some language (call it L), you cannot simultaneously (1) know for sure what structure is being talked about and (2) have an algorithmic verification system for L-sentences about the structure.

Crazy conditionals

It’s well known that the material implication → of propositional logic does not do a perfect job of capturing what we mean when we make “if… then…” statements in English. The usual examples of failure rest on the fact that any material conditional with a false antecedent is vacuously true (so “if 2 is odd then 2 is even” turns out to be true). But over time, philosophers have come up with a whole lot of different ways in which → can catch us by surprise.

Here’s a list of some such cases. In each case, I will present an argument using if…then… statements that is clearly invalid, but which is actually valid in propositional logic if the if…then… statements are translated as the material conditional!

1. Harper

If I put sugar into my coffee, it will taste fine.
Therefore, if I put sugar and motor oil into my coffee, it will taste fine.

S → F
(S ∧ M) → F

2. Distributivity

If I pull both switch A and switch B, the engine will start.
Therefore, either the engine will start if I pull switch A or the engine will start if I pull switch B.

(A ∧ B) → S
(A → S) ∨ (B → S)

3. Transitivity

If Biden dies before the election, Trump will win.
If Trump wins the election, Biden will retire to his home.
Therefore, if Biden dies before the election, Biden will retire to his home.

B → T
T → R
B → R

4. Principle of Explosion

Either zombies will rise from the ground if I bury a chicken head in my backyard, or zombies will rise from the ground if I don’t bury a chicken head in my backyard.

(B → D) ∨ (¬B → D) is a tautology

5. Contraposition

If I buy a car, I won’t buy a Pontiac.
Therefore, if I buy a Pontiac, I won’t buy a car.

C → ¬P
P → ¬C

6. Simplification

If John is in London then he’s in England, and if he’s in Paris then he’s in France.
Therefore, either (1) if John’s in London he’s in France or (2) if John’s in Paris then he’s in England.

(L → E) ∧ (P → F)
(L → F) ∨ (P → E)

7. False Antecedent

It’s not the case that if God exists then human life is a product of random chance.
Therefore, God exists.

¬(G → C)
G

8. True Consequent

If I will have eternal life if I believe in God, then God must exist.
I do not believe in God.
Therefore, God exists.

(B → E) → G
¬B
G

You can check for yourself that each of these is logically valid! Can you figure out what’s going wrong in each case?
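
If you’d rather check a couple of these mechanically than by hand, here’s a small Python sketch (the helper imp and the brute-force loops are mine) that verifies two of them by truth table:

```python
from itertools import product

def imp(a, b):
    """Material conditional: a → b."""
    return (not a) or b

# 7. False Antecedent: from ¬(G → C), infer G.
valid_7 = all(G for G, C in product([False, True], repeat=2) if not imp(G, C))
print("¬(G → C) ⊨ G ?", valid_7)           # True: the premise only holds when G is true

# 5. Contraposition: from C → ¬P, infer P → ¬C.
valid_5 = all(imp(P, not C) for C, P in product([False, True], repeat=2) if imp(C, not P))
print("C → ¬P ⊨ P → ¬C ?", valid_5)        # also True
```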

Describing the world

Wittgenstein starts his Tractatus Logico-Philosophicus with the following two sentences.

1. The world is everything that is the case.

1.1 The world is the totality of facts, not of things.

Let’s take him up on this suggestion and see how far we get. In the process, we’ll discover some deep connections to theorems in mathematical logic, as well as some fascinating limitations on the expressive powers of propositional and first order logic.

We start out with a set of atomic propositions. For a very simple world, we might only need a finite number of these: “Particle 1 out of 3 has property 1 out of 50”, “Particle 2 of 3 has property 17 out of 50”, and so on. More realistically, the set of atomic propositions will be infinite (countable if the universe doesn’t have any continuous properties, and uncountable otherwise).

For simplicity, we’ll imagine labeling our set of atomic propositions P1, P2, P3, and so on (even though this entails that there are at most countably many, nothing important will rest on this assumption.) We combine these atomic propositions with the operators of propositional logic {(, ), ¬, ∧, ∨, →}. This allows us to build up more complicated propositions, like ((P7∧P2)→(¬P13)). This will be the language that we use to describe the world.

Now, the way that the world is is just a consistent assignment of truth values to the set of all grammatical sentences in our language. For example, one simple assignment of truth values is the one that assigns “True” to all atomic propositions. Once we’ve assigned truth values to all the atomic propositions, we get the truth values for the rest of the set of grammatical sentences for free, by the constraint that our truth assignment be consistent. (For instance, if P1 and P2 are both true, then (P1∧P2) must also be true.)

Alright, so the set of ways the world could be corresponds to the set of truth assignments over our atomic propositions. The final ingredient is the notion that we can encode our present knowledge of the world as a set of sentences. Maybe we know by observation that P5 is true, and either P2 or P3 is true but not both. Then to represent this state of knowledge, we can write the following set of sentences:

{P5, (P2∨P3), ¬(P2∧P3)}

Any set of sentences picks out a set of ways the world could be, such that each of these possible worlds is compatible with that knowledge. If you know nothing at all, then the set of sentences representing your knowledge will be the empty set {}, and the set of possible worlds compatible with your knowledge will be the set of all possible worlds (all possible truth assignments). On the other extreme, you might know the truth values of every atomic proposition, in which case your state of knowledge uniquely picks out one possible world.
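
To make that world-counting concrete, here’s a small Python sketch (restricted to just the five atomic propositions P1 through P5, purely for illustration) that enumerates the possible worlds compatible with the knowledge-set above:

```python
from itertools import product

compatible = []
for P1, P2, P3, P4, P5 in product([False, True], repeat=5):
    # the knowledge-set {P5, (P2 ∨ P3), ¬(P2 ∧ P3)}
    if P5 and (P2 or P3) and not (P2 and P3):
        compatible.append((P1, P2, P3, P4, P5))

# Of the 32 possible worlds over P1..P5, the knowledge-set leaves 8 open:
# P5 must be true, exactly one of P2, P3 must be true, and P1, P4 are unconstrained.
print(len(compatible))   # 8
```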

In general, as you add more sentences to your knowledge-set, you cut out more and more possible worlds. But this is not always true! Ask yourself what the set of possible worlds corresponding to the set {(P1∨¬P1), (P2∨¬P2), (P3∨¬P3)} is. Since each of these sentences is a tautology, no possible worlds are eliminated! So our set of possible worlds is still the set of all worlds.

Now we get to an interesting question: clearly for any knowledge-set of sentences, you can express a set of possible worlds consistent with that knowledge set. But is it the case that for any set of possible worlds, you can find a knowledge-set that uniquely picks it out? If I hand you a set of truth assignment functions and ask you to tell me a set of propositions which are consistent with that set of worlds and ONLY that set, is that always possible? Essentially, what we’re asking is if all sets of possible worlds are describable.

We’ve arrived at the main point of this essay. Take a minute to ponder this and think about whether it’s possible, and why/why not! For clarification, each sentence can only be finitely long. But! You’re allowed to include an infinity of sentences.

(…)

(Spoiler-hiding space…)

(…)

If there were only a finite number of atomic propositions, then you could pick out any set of possible worlds with just a single sentence in conjunctive normal form. But when we start talking about an infinity of atomic propositions, it turns out that it is not always possible! There are sets of possible worlds that are literally not describable, even though our language includes the capacity to describe each of those worlds and we’re allowed to include an infinite set of sentences.

There’s a super simple proof of this. Let’s give a name to the cardinality of the set of sentences: call it K. (We’ve been tacitly acting as if the cardinality is countable this whole time, but that doesn’t actually matter.) What’s the cardinality of the set of all truth assignments?

Well, each truth assignment is a function from all sentences to {True, False}. And there are 2^K such assignments. 2^K is strictly larger than K, so there are more possible worlds than there are sentences. Now, the cardinality of the set of sets of sentences is also 2^K. But the set of SETS of truth assignments has cardinality 2^(2^K)!

What this means is that we can’t map sets of sentences onto sets of truth assignments without leaving some things out! This proof carries over to predicate logic as well. The language for both propositional and predicate logic is unable to express all sets of possible worlds corresponding to that language!

I love this result. It’s the first hint in mathematical logic that syntax and semantics can come apart.

That result is the climax of this post. What I want to do with the rest of this post is to actually give an explicit example of a set of truth assignments that are “indescribable” by any set of sentences, and to prove it. Warning: If you want to read on, things will get a bit more technical from here.

Alright, so we’ll use a shortcut to denote truth assignments. A truth assignment will be written as a string of “T”s and “F”s, where the nth character corresponds to how the truth assignment evaluates Pn. So the all-true truth assignment will just be written “TTTTTT…” and the all-false truth assignment will be written “FFFFF…”. The truth assignment corresponding to P1 being false and everything else true will be written “FTTTTT…”. And so on.

Now, here’s our un-describable set of truth assignments. {“FFFFFF…”, “TFFFFF…”, “TTFFFF…”, “TTTFFF…”, …}. Formally, define Vn to be the truth assignment that assigns “True” to every atomic proposition up to and including Pn, and “False” to all others. Now our set of truth assignments is just {Vn | n ∈ ℕ}.

Let’s prove that no set of sentences uniquely picks out this set of truth assignments. We prove by contradiction. Suppose that we could find a set of sentences that uniquely picks out these truth assignments and no others. Let’s call this set A. Construct a new set of sentences A’ by appending all the atomic propositions to A: A’ = A ∪ {P1, P2, P3, …}.

Is there any truth assignment that is consistent with all of A’? Well, we can answer this by using the Compactness Theorem: A’ is satisfiable if and only if every finite subset of A’ is satisfiable. But every finite subset of A’ involves some sentences from A (each of which is satisfied by every Vn, by assumption), plus a finite number of atomic propositions. Since each finite subset of A’ only asserts the truth of finitely many atomic propositions, we can always find a truth assignment Vk in our set that satisfies it, by choosing k large enough that the switch to “False” happens after the last atomic proposition asserted by our finite subset.

This means that each finite subset of A’ is satisfiable, so by compactness A’ is satisfiable. But A’ asserts that every atomic proposition is true, and the only truth assignment consistent with that is the all-true assignment! So the all-true assignment satisfies A’ and therefore satisfies A. And is that truth assignment in our set? No! So A doesn’t pick out our set of truth assignments and only that set after all – and there we have it, we’ve reached our contradiction!

We cannot actually describe a set of possible worlds in which either all atomic propositions are false, or only the first is true, or only the first two are true, or only the first three are true, and so on forever. But this might prompt the question: didn’t you just describe it? How did you do that, if it’s impossible? Well, technically I didn’t describe it. I just described the first four possibilities and then said “and so on forever”, assuming that you knew what I meant. To have actually fully pinned down this set of possible worlds, I would have had to continue with this sentence forever. And importantly, since this sentence is a disjunction, I could not split this infinite sentence into an infinite set of finite sentences. This fundamental asymmetry between ∨ and ∧ is playing a big role here: while an infinite conjunction can be constructed by simply putting each clause in the conjunction as a separate sentence, an infinite disjunction cannot be. This places a fundamental limit on the ability of a language with only finite sentences to describe the world.

Moving Naturalism Forward: Eliminating the macroscopic

Sean Carroll, one of my favorite physicists and armchair philosophers, hosted a fantastic conference on philosophical naturalism and science, and did the world a great favor by recording the whole thing and posting it online. It was a three-day long discussion on topics like the nature of reality, emergence, morality, free will, meaning, and consciousness. Here are the videos for the first two discussion sections, and the rest can be found by following the YouTube links.


Having watched through the entire thing, I have updated a few of my beliefs, plan to rework some of my conceptual schema, and am puzzled about a few things.

A few of my reflections and take-aways:

  1. I am much more convinced than before that there is a good case to be made for compatibilism about free will.
  2. I think there is a set of interesting and challenging issues around the concept of representation and intentionality (about-ness) that I need to look into.
  3. I am more comfortable with strong reductionist claims, like “All facts about the macroscopic world are entailed by the fundamental laws of physics.”
  4. I am really interested in hearing Dan Dennett talk more about grounding morality, because what he said was starting to make a lot of sense to me.
  5. I am confused about the majority attitude in the room that there’s not any really serious reason to take an eliminativist stance about macroscopic objects.
  6. I want to find more details about the argument that Simon DeDeo was making for the undecidability of questions about the relationship between macroscopic theories and microscopic theories (!!!).
  7. There’s a good way to express the distinction between the type of design human architects engage in and the type of design that natural selection produces, which is about foresight and representations of reasons. I’m not going to say more about this, and will just refer you to the videos.
  8. There are reasons to suspect that animal intelligence and capacity to suffer are inversely correlated (that is, the more intelligent an animal, the less capacity to suffer it likely has). This really flips some of our moral judgements on their head. (You must deliver a painful electric shock to either a human or to a bird. Which one will you choose?)

Let me say a little more about number 5.

I think that questions about whether macroscopic objects like chairs or plants really REALLY exist, or whether there are really only just fermions and bosons are ultimately just questions about how we should use the word “exist.” In the language of our common sense intuitions, obviously chairs exist, and if you claim otherwise, you’re just playing complicated semantic games. I get this argument, and I don’t want to be that person that clings to bizarre philosophical theses that rest on a strange choice of definitions.

But at the same time, I see a deep problem with relying on our commonsense intuitions about the existence of the macro world. This is that as soon as we start optimizing for consistency, even a teeny tiny bit, these macroscopic concepts fall to pieces.

For example, here is a trilemma (three statements that can’t all be correct):

  1. The thing I am sitting on is a chair.
  2. If you subtract a single atom from a chair, it is still a chair.
  3. Empty space is not a chair.

These seem to me to be some of the most obvious things we could say about chairs. And yet they are subtly incoherent!

Number 1 is really shorthand for something like “there are chairs.” And the reason why the second premise is correct is that denying it requires that there be a chair such that if you remove a single atom, it is no longer a chair. I take it to be obvious that such things don’t exist. But accepting the first two requires us to admit that as we keep shedding atoms from a chair, it stays a chair, even down to the very last atom. (By the way, some philosophers do actually deny number 2. They take a stance called epistemicism, which says that concepts like “chair” and “heap” are actually precise and unambiguous, and there exists a precise point at which a chair becomes a non-chair. This is the type of thing that makes me giggle nervously when reflecting on the adequacy of philosophy as a field.)

As I’ve pointed out in the past, these kinds of arguments can be applied to basically everything in the macroscopic world. They wreak havoc on our common sense intuitions and, to my mind, demand rejection of the entire macroscopic world. And of course, they don’t apply to the microscopic world. “If X is an electron, and you change its electric charge a tiny bit, is it still an electron?” No! Electrons are physical substances with precise and well-defined properties, and if something doesn’t have these properties, it is not an electron! So the Standard Model is safe from this class of arguments.

Anyway, this is all just to make the case that upon close examination, our commonsense intuitions about the macroscopic world turn out to be subtly incoherent. What this means is that we can’t make true statements like “There are two cars in the garage”. Why? Just start removing atoms from the cars until you get to a completely empty garage. Since no single-atom change can make the relevant difference to “car-ness”, at each stage, you’ll still have two cars!

As soon as you start taking these macroscopic concepts seriously, you find yourself stuck in a ditch. This, to me, is an incredibly powerful argument for eliminativism, and I was surprised to find that arguments like these weren’t stressed at the conference. This makes me wonder if this argument is as powerful as I think.

Defining racism

How would you define racism?

I’ve been thinking about this lately in light of some of the scandal around research into race and IQ. It’s a harder question than I initially thought; many of the definitions that pop to mind end up being either too strong or too weak. The term also functions differently in different contexts (e.g. personal racism, institutional racism, racist policies). In this post, I’m specifically talking about personal racism – that term we use to refer to the beliefs and attitudes of those like Nazis or Ku Klux Klan members (at the extreme end).

I’m going to walk through a few possible definitions. This will be fairly stream-of-consciousness, so I apologize if it’s not incredibly profound or well-structured.

Definition 1 Racism is the belief in the existence of inherent differences between the races.

‘Inherent’ is important, because we don’t want to say that somebody is racist for acknowledging differences that can ultimately be traced back to causes like societal oppression. The problem with this definition is that, well, there are inherent differences between the races.

The Chinese are significantly shorter than the Dutch. Raising a Chinese person in a Dutch household won’t do much to equalize this difference. What’s important, it seems, is not the belief in the existence of inherent differences, but instead the belief in the existence of inherent inferiorities and superiorities. So let’s try again.

Definition 2 Racism is the belief in the existence of inherent racial differences that are normatively significant.

This is pretty much the dictionary definition of the term “racism”. While it’s better, there are still some serious problems. Let’s say that it were somehow discovered that Slavs are inherently more prone to violence than, say, Arabs. Suppose that somebody ran across this fact, and that this person also held the ethical view that violent tendencies are normatively important. That is, they think that peaceful people are ethically superior to violent people.

If they combine this factual belief with this seemingly reasonable normative belief, they’ll end up being branded as a racist, by our second definition. This is clearly undesirable… given that the word ‘racism’ is highly normatively loaded, we don’t want it to be the case that somebody is racist for believing true things. In other words, we probably don’t want our definition of racism to ever allow it to be the right attitude to take, or even a reasonable attitude to take.

Maybe the missing step is the generalization of attitudes about Slavs and Arabs to individuals. This is a sentiment that I’ve heard fairly often… racism is about applying generalizations about groups to individuals (for instance, racial profiling). Let’s formalize this:

Definition 3 Racism is about forming normative judgments about individuals’ characteristics on the basis of beliefs about normative group-level differences.

This sounds nice and all, but… you know what another term for “applying facts about groups to individuals” is? Good statistical reasoning.

If you live in a town composed of two distinct populations, the Hebbeberans and the Klabaskians, and you know that Klabaskians are on average twenty times more likely than Hebbeberans to be fatally allergic to cod, then you should be more cautious with serving your extra special cod sandwich to a Klabaskian friend than to a Hebbeberan.

Facts about populations do give you evidence about individuals within those populations, and the mere acknowledgement of this evidence is not racist, for the same reason that rationality is not racist.

So if we don’t want to call rationality racist, then maybe our way out of this is to identify racism with irrationality.

Definition 4 Racism is the holding of irrational beliefs about normative racial differences.

Say you meet somebody from Malawi (a country with an extremely low average IQ). Your first rational instinct might be to not expect too much from them in the way of cognitive abilities. But now you learn that they’re a theoretical physicist who’s recently been nominated for a Nobel prize for their work in quantum information theory. If the average IQ of Malawians is still factoring in at all to your belief about this person’s intelligence, then you’re being racist.

I like this definition a lot better than our previous ones. It combines the belief in racial superiority with irrationality. On the other hand, it has problems as well. One major issue is that there are plenty of cases of benign irrationality, where somebody is just a bad statistical reasoner, but not motivated by any racial hatred. Maybe they over-updated on some piece of information, because they failed to take into account an important base-rate.

Well, the base-rate fallacy is one of the most common cognitive biases out there. Surely this isn’t enough to make them a racist? What we want is to capture the non-benign brand of irrational normative beliefs about race – those that are motivated by hatred or prejudice.

Definition 5 Racism is the holding of irrational normative beliefs about racial differences, motivated by racial hatred or prejudice.

I think this does the best at avoiding making the category too large, but it may be too strong and keep out some plausible cases of racism. I’d like to hear suggestions for improvements on this definition, but for now I’ll leave it there. One potential take-away is that the word ‘racism’ is a nasty combination of highly negatively charged and ambiguous, and that such words are best treated with caution, especially when applying them to edge cases.