On philosophical progress

A big question in the philosophy of philosophy is whether philosophers make progress over time. One relevant piece of evidence that gets brought up in these discussions is the lack of consensus on age-old questions like free will, normative ethics, and the mind-body problem. If a discipline is progressing steadily towards truth, the argument goes, then questions that have been discussed for thousands of years should be more or less settled by now. After all, that is what we see in the hard sciences; there are no lingering disputes over the validity of vitalism or the realm of applicability of Newtonian mechanics.

There are a few immediate responses to this line of argument. It might be that the age-old questions of philosophy are simply harder than the questions that get addressed by physicists or biologists. “Harder” here doesn’t mean “requires more advanced mathematics to grapple with”, but something more like “it’s unclear what would even count as a conclusive argument for one position or another, and therefore much less clear how to go about building consensus.” Try to imagine what sort of argument would convince you of the nonexistence of libertarian free will with the same finality with which a demonstration of time dilation convinces you of the inadequacy of nonrelativistic mechanics.

A possible rejoinder at this point would be to take after the logical positivists and deny the meaningfulness, or at least the truth-aptness, of the big questions of philosophy as a whole. This may go too far; it may well be that a query is meaningful but, due to certain epistemic limitations of ours, forever beyond our ability to decide. (We know for sure that such queries can exist, due to Gödelian discoveries in mathematics. For instance, we know of the existence of a sequence of numbers that is perfectly well defined, but for which no algorithm can exist that enumerates all of its values. The later numbers in this sequence will forever be a mystery to us, and not for lack of meaningfulness.)

I think that the roughly correct position to take is that science is largely about examining empirical facts-of-the-matter, whereas philosophy is largely about analyzing and refining our conceptual framework. While we have a fairly clear set of standards for how to update theories about the empirical world, we are not in possession of such a set of standards for evaluating different conceptual frameworks. The question “what really are the laws governing the behavior of stuff out there” has much clearer truth conditions than a question like “what is the best way to think about the concepts of right and wrong”; that is, it’s clearer what counts as a good answer and what counts as a bad answer.

When we’re trying to refine our concepts, we take into account our pre-theoretical intuitions (e.g. any good theory of the concept of justice must have something to do with our basic intuitive conception of justice). But we’re not satisfied to leave the concept as the messy, inconsistent bundle of intuitions that constitutes our starting position on it. We also aim to describe the concept simply, by developing a “theory of justice” that relies on a small set of axioms and from which (the hope is) the rest of our conclusions about justice follow. We want our elaboration of the concept to be consistent, in that we shouldn’t simultaneously affirm that A is an instance of the concept and that A is not an instance of the concept. Often we also want our theory to be precise, even when the concept itself has vague boundaries.

Maybe there are other standards besides these four: intuitiveness, simplicity, consistency, and precision. And the application of these standards is very rarely made explicit. But one thing that’s certain is that different philosophers weight these values differently. One philosopher might value simplicity more or less than another, and it’s not clear that either of them is doing something wrong by having different standards. Put another way, I’m not convinced that there is one unique right set of standards for conceptual refinement.

We may want to be subjectivists to some degree about philosophy, and say that there is a range of rationally permissible standards for conceptual refinement, none better than any other. This would have the result that on some philosophical questions, multiple distinct answers may be acceptable, while sufficiently crazy answers are not. Maybe compatibilism and nihilism are acceptable stances on free will but libertarianism is not. Maybe dualism and physicalism are okay but not epiphenomenalism. And so on.

This view allows for a certain type of philosophical progress, namely the gradual ruling out of some philosophical positions as TOO weird. It also allows for the formation of consensus, through the discovery of philosophical positions that come out best according to all or most of the admissible sets of standards. I think one example of this is the relatively recent rise of Bayesian epistemology in philosophy of science, and in particular the Bayesian view that the evidential bearing of an observation on a hypothesis can be quantified. In brief, what does it mean to say that an observation O gives evidence for a hypothesis H? The Bayesian has an answer not only to this, but to the more detailed question of the degree to which O gives evidence for H. The quantity is cr(O | H) / cr(O), where cr(.) is a credence function encoding somebody’s beliefs before observing O. (By Bayes’ theorem, this ratio equals cr(H | O) / cr(H), the factor by which observing O multiplies your credence in H; the closely related Bayes factor, cr(O | H) / cr(O | ¬H), always points in the same direction.) If this quantity is equal to 1, then O is no evidence for H. If it is greater than 1, then O is evidence for H. And if it is less than 1, then O is evidence against H.
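To make the arithmetic concrete, here is a minimal sketch in Python. The coin-flipping hypothesis and all of the numbers are my own toy choices, purely for illustration.

    # Hypothesis H: the coin is biased 80% towards heads. Observation O: three heads in a row.
    cr_H = 0.5                   # prior credence in H
    cr_O_given_H = 0.8 ** 3      # credence in O given H
    cr_O_given_not_H = 0.5 ** 3  # credence in O given ¬H (a fair coin)

    # cr(O), by the law of total probability
    cr_O = cr_H * cr_O_given_H + (1 - cr_H) * cr_O_given_not_H

    print(cr_O_given_H / cr_O)   # ≈ 1.61 > 1, so O is evidence for H

    # Sanity check: by Bayes' theorem this ratio equals cr(H|O) / cr(H)
    cr_H_given_O = cr_O_given_H * cr_H / cr_O
    print(cr_H_given_O / cr_H)   # ≈ 1.61 again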

Not everything in Bayesian epistemology is perfectly uncontroversial, but I would argue that on this particular issue – the issue of how to best formalize the notion of scientific evidence – the Bayesian definition survives all its challenges unscathed. What are some other philosophical questions on which you think there has been definite progress?

Logic on Planet Zorko

A group of Zorkan mathematicians are sitting around having a conversation in a language that you are unfamiliar with. You are listening in with a translator. This translator is an expert in formal logic, and has decided to play the following game with you. He says:

“After listening to the full conversation, I will translate all the sentences that were said for you. But I won’t translate them into English; I want something more universal. Instead, I will choose a formal language that captures the mathematical content of all the sentences said, while leaving out the vagaries and subtleties of the Zorkan language. I will describe to you the semantics of the formal language I choose, if you don’t already know it.”

“Furthermore,” (says the translator) “I happen to be intimately familiar with Zorkan society and culture. The Zorkans are having a discussion about one particular mathematical structure, and I know which one that is. The mathematicians are all fantastically precise reasoners, such that none of them ever says a sentence that is false of the structure that they are discussing.”

(So for instance if they are talking about the natural numbers, then no mathematician will say “0 = 1”, and if they are talking about abelian groups, then no mathematician will say “∃x∃y (xy ≠ yx)”. But they could say “∃x∃y (xy ≠ yx)” if they are talking about non-abelian groups.)

You know nothing about Zorkan psychology, besides that the Zorkan way of life is so utterly foreign to you that you cannot reliably assume that the mathematical structures that come most naturally to you will also come naturally to them. It might be, for instance, that nonstandard models of arithmetic are much more intuitive to them than the natural numbers. You cannot assume that the structure they are discussing is the one that you think is “most natural”; you can only conclude this if one of them says a sentence that is true of that model and no others.

The conversation finishes, and you are tasked with answering the following two questions:

(1) What structure are they talking about?
(2) Can you come up with a verification procedure for the mathematicians’ sentences (including possible future sentences they might say on the topic)?

So, that’s the setup. Now, the question I want you to consider is the following: Suppose that the structure that the mathematicians have in mind is actually the natural numbers. Is there some conversation, any conversation at all (even allowing infinitely long conversations, and uncomputable conversations – conversations which cannot be produced as the output of any Turing machine), that the mathematicians could have, and some translation of this conversation, such that you can successfully answer both (1) and (2)? If so, what is that conversation? And if not, then why not?

✯✯✯

Let’s work out some simple examples.

Example 1

Suppose the conversation is translated into a propositional language with three atomic propositions {P, Q, R}.

Mathematician A: “P ∨ Q”
Mathematician B: “(Q ∨ R) → (¬P)”
Mathematician C: “R”

From this conversation, you can deduce that the model they are talking about is the one that assigns “False” to P, “True” to Q, and “True” to R.

M: {P is false, Q is true, R is true}

This is the answer to question 1!

As for the second question, we want to know if there’s some general procedure that produces all the future statements the mathematicians could make. For instance, the set generated by our procedure should include (Q ∧ R) but not (Q ∧ P).

It turns out that such a procedure does exist, and is not too difficult to write out and implement.
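Here is one way of doing it in Python; the encoding of sentences as functions on models is my own choice of representation, just for illustration.

    from itertools import product

    # Each sentence is a function from a model (a dict assigning True/False to each atom) to a truth value.
    conversation = [
        lambda m: m["P"] or m["Q"],                          # P ∨ Q
        lambda m: (not (m["Q"] or m["R"])) or (not m["P"]),  # (Q ∨ R) → ¬P
        lambda m: m["R"],                                    # R
    ]
    atoms = ["P", "Q", "R"]

    # Question 1: enumerate all 2^3 truth assignments and keep those that satisfy every sentence said.
    models = []
    for values in product([True, False], repeat=len(atoms)):
        model = dict(zip(atoms, values))
        if all(sentence(model) for sentence in conversation):
            models.append(model)
    print(models)  # one survivor: {'P': False, 'Q': True, 'R': True}

    # Question 2: a sentence is a possible future statement iff it is true in that unique model.
    def verifies(sentence):
        assert len(models) == 1, "the conversation did not pin down a unique model"
        return sentence(models[0])

    print(verifies(lambda m: m["Q"] and m["R"]))  # True:  they could say (Q ∧ R)
    print(verifies(lambda m: m["Q"] and m["P"]))  # False: they could not say (Q ∧ P)

(If you swap the third sentence for ¬R, as in Example 2 below, two models survive and the assertion inside verifies fails – which is exactly the trouble discussed next.)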

Example 2

Take the above conversation and modify it slightly:

Mathematician A: “P ∨ Q”
Mathematician B: “(Q ∨ R) → (¬P)”
Mathematician C: “¬R”

If you work it out, you’ll see that question 1 can no longer be answered unambiguously. The problem is that there are multiple models of the sentences that the mathematicians are saying:

M1: {P is false, Q is true, R is false}
M2: {P is true, Q is false, R is false}

So even though they have one particular structure in mind, you don’t have enough information from their conversation to figure out exactly what that structure is.

Now let’s think about the answer to question 2. We don’t know whether the mathematicians are thinking about M1 or M2, and M1 and M2 differ in what truth value they assign the proposition P. So we can’t construct an algorithm that will generate the set of all their possible future statements, as this would require us to know, in particular, whether P is true or false in the model that they have in mind.

We might suspect that this holds true generally: if you can’t answer question 1, then you won’t be able to answer question 2 either. But we might also wonder: if we can answer question 1, then can we also always answer question 2?

The answer is no, as the next example will show.

Example 3

For this conversation, the translation is in second-order logic. This will allow us to talk about more interesting mathematical structures than before; namely, structures that have a domain of objects on which functions and predicates can act. In particular, we’re in a second-order language with one constant symbol “c” and one function symbol “f”. Here’s the conversation:

Mathematician A: ¬∃x (f(x) = c)
Mathematician B: ¬∃x∃y ((f(x) = f(y)) ∧ ¬(x = y))
Mathematician C: ∀R (R(c) ∧ ∀x(R(x) → R(f(x))) → ∀x R(x))

Notice that the only truly second-order sentence is the third one, in which we quantify over a predicate variable R rather than an individual variable x, y, z, …. But the second-order status of this sentence means that the translator could not possibly have translated this conversation into a first-order language, much less a propositional language.

This time, questions 1 and 2 are much harder to answer than before. But if you work it out, you’ll see that there is exactly one mathematical structure (up to isomorphism) that satisfies all three of the mathematicians’ statements. And that structure is the natural numbers! (The three sentences are in effect the second-order Peano axioms, with c playing the role of zero and f the role of the successor function.)

So, we know exactly what structure the mathematicians have in mind. But can we also answer question 2 in the positive? Can we produce some verification procedure that will allow us to generate all the future possible sentences the mathematicians could say? Unfortunately, the answer is no. There is no sound and complete proof system for second-order logic, so in particular, we have no general algorithm for producing all the truths in this second-order language. So sad.

Example 4

Now let’s move to first-order logic for our final example. The language of translation will be a first-order language with a constant symbol for every natural number {0, 1, 2, 3, …}, function symbols for ordinary arithmetic {+, ×}, and a relation symbol for order {≥}.

Imagine that the conversation consists of literally all the first-order sentences in the language that are true of the natural numbers. Anything which you can say in the language, and which is true as a statement about ℕ, will be said at some point. This will obviously be a very long conversation, and in fact infinitely long, but that’s fine. It will include sentences like “0 ≠ 1”, “0 ≠ 2”, “0 ≠ 3”, and so on.  (These Zorkans are extremely thorough.)

Given this conversation, can we answer (1) and (2)? Take a guess; the answer may surprise you!

It turns out that even though we can answer (2) positively – we can actually produce an algorithm that will generate, one by one, all the possible future statements of the mathematicians (which really means all the sentences in the language that are true of the natural numbers) – we cannot answer (1) positively! There are multiple distinct mathematical structures that are compatible with the entirety of the true statements about the natural numbers expressible in this language. (Here is the reason, in brief: add a new constant c to the language, along with the sentences c ≥ 1, c ≥ 2, c ≥ 3, and so on. Every finite subset of this expanded set of sentences is satisfiable in ℕ, by interpreting c as a large enough number; so by the compactness theorem the whole set has a model, and that model satisfies everything the mathematicians said while containing an element larger than every standard natural number.) Earlier we hypothesized that any time we have a negative answer to (1), we will also have a negative answer to (2). But this is not true! We can verify all the true statements about the natural numbers in the language… without even knowing that we’re actually talking about the natural numbers! This is an important and unintuitive consequence of the expressive limitations (and in particular, of the compactness) of first-order logic.

The Takeaway

We had an example where we could answer both (1) and (2) for a simple mathematical structure (a model of propositional logic). And we saw examples for natural numbers where we could answer (1) but not (2), as well as examples where we could answer (2) but not (1). But we haven’t yet seen an example for natural numbers where we had both (1) and (2). This is no coincidence!

It is actually a consequence of the theorem I proved and discussed in my last post that no such conversation can exist. When structures at least as complicated as the natural numbers are being discussed in some language (call it L), you cannot simultaneously (1) know for sure what structure is being talked about and (2) have an algorithmic verification system for L-sentences about the structure.

Crazy conditionals

It’s well known that the material implication → of propositional logic does not do a perfect job of capturing what we mean when we make “if… then…” statements in English. The usual examples of failure rest on the fact that any material conditional with a false antecedent is vacuously true (so “if 2 is odd then 2 is even” turns out to be true). But over time, philosophers have come up with a whole lot of different ways in which → can catch us by surprise.

Here’s a list of some such cases. In each case, I will present an argument using if…then… statements that is clearly invalid, but which is actually valid in propositional logic if the if…then… statements are translated as the material conditional!

1. Harper

If I put sugar into my coffee, it will taste fine.
Therefore, if I put sugar and motor oil into my coffee, it will taste fine.

S → F
(S ∧ M) → F

2. Distributivity

If I pull both switch A and switch B, the engine will start.
Therefore, either the engine will start if I pull switch A or the engine will start if I pull switch B.

(A ∧ B) → S
(A → S) ∨ (B → S)

3. Transitivity

If Biden dies before the election, Trump will win.
If Trump wins the election, Biden will retire to his home.
Therefore, if Biden dies before the election, Biden will retire to his home.

B → T
T → R
B → R

4. Principle of Explosion

Either zombies will rise from the ground if I bury a chicken head in my backyard, or zombies will rise from the ground if I don’t bury a chicken head in my backyard.

(B → D) ∨ (¬B → D) is a tautology

5. Contraposition

If I buy a car, I won’t buy a Pontiac.
Therefore, if I buy a Pontiac, I won’t buy a car.

C → ¬P
P → ¬C

6. Simplification

If John is in London then he’s in England, and if he’s in Paris then he’s in France.
Therefore, either (1) if John’s in London he’s in France or (2) if John’s in Paris then he’s in England.

(L → E) ∧ (P → F)
(L → F) ∨ (P → E)

7. False Antecedent

It’s not the case that if God exists then human life is a product of random chance.
Therefore, God exists.

¬(G → C)
G

8. True Consequent

If I will have eternal life if I believe in God, then God must exist.
I do not believe in God.
Therefore, God exists.

(B → E) → G
¬B
G

You can check for yourself that each of these is logically valid! Can you figure out what’s going wrong in each case?
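If you would rather have a machine do the checking, here is a minimal Python sketch. The encoding is my own, and only two of the arguments are spelled out, but the others can be encoded in exactly the same way.

    from itertools import product

    def implies(a, b):
        # the material conditional: A → B is (¬A ∨ B)
        return (not a) or b

    def valid(premises, conclusion, num_atoms):
        """Classically valid iff no truth assignment makes every premise true and the conclusion false."""
        return not any(
            all(p(*values) for p in premises) and not conclusion(*values)
            for values in product([True, False], repeat=num_atoms)
        )

    # 1. Harper, with atoms (S, M, F): from S → F, infer (S ∧ M) → F
    print(valid([lambda s, m, f: implies(s, f)],
                lambda s, m, f: implies(s and m, f), 3))  # True

    # 7. False Antecedent, with atoms (G, C): from ¬(G → C), infer G
    print(valid([lambda g, c: not implies(g, c)],
                lambda g, c: g, 2))                       # True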

Describing the world

Wittgenstein starts his Tractatus Logico-Philosophicus with the following two sentences.

1. The world is everything that is the case.

1.1 The world is the totality of facts, not of things.

Let’s take him up on this suggestion and see how far we get. In the process, we’ll discover some deep connections to theorems in mathematical logic, as well as some fascinating limitations on the expressive powers of propositional and first order logic.

We start out with a set of atomic propositions. For a very simple world, we might only need a finite number of these: “Particle 1 out of 3 has property 1 out of 50”, “Particle 2 of 3 has property 17 out of 50”, and so on. More realistically, the set of atomic propositions will be infinite (countable if the universe doesn’t have any continuous properties, and uncountable otherwise).

For simplicity, we’ll imagine labeling our set of atomic propositions P1, P2, P3, and so on (even though this entails that there are at most countably many, nothing important will rest on this assumption.) We combine these atomic propositions with the operators of propositional logic {(, ), ¬, ∧, ∨, →}. This allows us to build up more complicated propositions, like ((P7∧P2)→(¬P13)). This will be the language that we use to describe the world.

Now, the way that the world is is just a consistent assignment of truth values to the set of all grammatical sentences in our language. For example, one simple assignment of truth values is the one that assigns “True” to all atomic propositions. Once we’ve assigned truth values to all the atomic propositions, we get the truth values for the rest of the set of grammatical sentences for free, by the constraint that our truth assignment be consistent. (For instance, if P1 and P2 are both true, then (P1∧P2) must also be true.)

Alright, so the set of ways the world could be corresponds to the set of truth assignments over our atomic propositions. The final ingredient is the notion that we can encode our present knowledge of the world as a set of sentences. Maybe we know by observation that P5 is true, and either P2 or P3 is true but not both. Then to represent this state of knowledge, we can write the following set of sentences:

{P5, (P2∨P3), ¬(P2∧P3)}

Any set of sentences picks out a set of ways the world could be, such that each of these possible worlds is compatible with that knowledge. If you know nothing at all, then the set of sentences representing your knowledge will be the empty set {}, and the set of possible worlds compatible with your knowledge will be the set of all possible worlds (all possible truth assignments). On the other extreme, you might know the truth values of every atomic proposition, in which case your state of knowledge uniquely picks out one possible world.

In general, as you add more sentences to your knowledge-set, you cut out more and more possible worlds. But this is not always true! Ask yourself what the set of possible worlds corresponding to the set {(P1∨¬P1), (P2∨¬P2), (P3∨¬P3)} is. Since each of these sentences is a tautology, no possible worlds are eliminated! So our set of possible worlds is still the set of all worlds.
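To make this concrete, here is a minimal Python sketch. The restriction to five atomic propositions is my own simplification (the real set is infinite), and the encoding of sentences as functions on assignments is likewise just for illustration.

    from itertools import product

    ATOMS = ["P1", "P2", "P3", "P4", "P5"]

    def worlds_compatible_with(knowledge):
        """Return every truth assignment (possible world) satisfying all sentences in the knowledge-set."""
        worlds = []
        for values in product([True, False], repeat=len(ATOMS)):
            world = dict(zip(ATOMS, values))
            if all(sentence(world) for sentence in knowledge):
                worlds.append(world)
        return worlds

    # The knowledge-set {P5, (P2 ∨ P3), ¬(P2 ∧ P3)} from above
    knowledge = [
        lambda w: w["P5"],
        lambda w: w["P2"] or w["P3"],
        lambda w: not (w["P2"] and w["P3"]),
    ]

    print(len(worlds_compatible_with([])))         # 32: knowing nothing leaves every world open
    print(len(worlds_compatible_with(knowledge)))  # 8: the knowledge-set cuts out 24 worlds
    print(len(worlds_compatible_with([lambda w: w["P1"] or not w["P1"]])))  # 32: a tautology cuts nothing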

Now we get to an interesting question: clearly any knowledge-set of sentences picks out the set of possible worlds consistent with it. But is it the case that for any set of possible worlds, you can find a knowledge-set that uniquely picks it out? If I hand you a set of truth assignment functions and ask you to tell me a set of propositions which are consistent with that set of worlds and ONLY that set, is that always possible? Essentially, what we’re asking is whether all sets of possible worlds are describable.

We’ve arrived at the main point of this essay. Take a minute to ponder this and think about whether it’s possible, and why/why not! For clarification, each sentence can only be finitely long. But! You’re allowed to include an infinity of sentences.

(…)

(Spoiler-hiding space…)

(…)

If there were only a finite number of atomic propositions, then you could pick out any set of possible worlds with just a single sentence in conjunctive normal form. But when we start talking about an infinity of atomic propositions, it turns out that it is not always possible! There are sets of possible worlds that are literally not describable, even though our language includes the capacity to describe each of those worlds and we’re allowed to include an infinite set of sentences.

There’s a super simple proof of this. Let’s give a name to the cardinality of the set of sentences: call it K. (We’ve been tacitly acting as if the cardinality is countable this whole time, but that doesn’t actually matter.) What’s the cardinality of the set of all truth assignments?

Well, each truth assignment is a function from the set of all sentences to {True, False}, and there are 2^K such assignments. 2^K is strictly larger than K, so there are more possible worlds than there are sentences. Now, the cardinality of the set of sets of sentences is also 2^K. But the cardinality of the set of SETS of truth assignments is 2^(2^K)!

What this means is that we can’t map sets of sentences onto sets of truth assignments without leaving some things out! This proof carries over to predicate logic as well. Neither propositional logic nor predicate logic is able to express every set of possible worlds for its own language!

I love this result. It’s the first hint in mathematical logic that syntax and semantics can come apart.

That result is the climax of this post. What I want to do with the rest of this post is to actually give an explicit example of a set of truth assignments that is “indescribable” by any set of sentences, and to prove it. Warning: If you want to read on, things will get a bit more technical from here.

Alright, so we’ll use a shortcut to denote truth assignments. A truth assignment will be written as a string of “T”s and “F”s, where the nth character corresponds to how the truth assignment evaluates Pn. So the all-true truth assignment will just be written “TTTTTT…” and the all-false truth assignment will be written “FFFFF…”. The truth assignment corresponding to P1 being false and everything else true will be written “FTTTTT…”. And so on.

Now, here’s our indescribable set of truth assignments: {“FFFFFF…”, “TFFFFF…”, “TTFFFF…”, “TTTFFF…”, …}. Formally, define Vn to be the truth assignment that assigns “True” to every atomic proposition up to and including Pn, and “False” to all others. Our set of truth assignments is then {Vn | n ∈ ℕ}.

Let’s prove that no set of sentences uniquely picks out this set of truth assignments. The proof is by contradiction. Suppose that we could find a set of sentences that picks out exactly these truth assignments and no others; call this set A. Construct a new set of sentences A’ by appending all the atomic propositions to A: A’ = A ∪ {P1, P2, P3, …}.

Is there any truth assignment that is consistent with all of A’? We can answer this by using the Compactness Theorem: A’ is satisfiable (i.e., some truth assignment makes all of its sentences true) if and only if every finite subset of A’ is satisfiable. Now, every finite subset of A’ consists of some sentences from A (each of which is true under every Vn, by assumption) together with finitely many atomic propositions. Since such a finite subset only asserts the truth of finitely many atomic propositions, we can always find a truth assignment Vk in our set that satisfies it: just choose k larger than the index of the last atomic proposition that the finite subset asserts.

So each finite subset of A’ is satisfiable, which by compactness means that A’ itself is satisfiable: some truth assignment makes every sentence in A’ true. But any such assignment satisfies A, so by our assumption it must be one of the Vn. And since A’ also asserts that every atomic proposition is true, the only candidate is the all-true assignment. Is the all-true assignment in our set? No! Every Vn assigns “False” to Pn+1. And there we have it: we’ve reached our contradiction!

We cannot actually describe a set of possible worlds in which either all atomic propositions are false, or only the first is true, or only the first two are true, or only the first three are true, and so on forever. But this might prompt the question: didn’t you just describe it? How did you do that, if it’s impossible? Well, technically I didn’t describe it. I just described the first four possibilities and then said “and so on forever”, assuming that you knew what I meant. To have actually fully pinned down this set of possible worlds, I would have had to continue with this sentence forever. And importantly, since this sentence is a disjunction, I could not split this infinite sentence into an infinite set of finite sentences. This fundamental asymmetry between ∨ and ∧ is playing a big role here: while an infinite conjunction can be constructed by simply putting each clause in the conjunction as a separate sentence, an infinite disjunction cannot be. This places a fundamental limit on the ability of a language with only finite sentences to describe the world.

Moving Naturalism Forward: Eliminating the macroscopic

Sean Carroll, one of my favorite physicists and armchair philosophers, hosted a fantastic conference on philosophical naturalism and science, and did the world a great favor by recording the whole thing and posting it online. It was a three-day long discussion on topics like the nature of reality, emergence, morality, free will, meaning, and consciousness. Here are the videos for the first two discussion sections, and the rest can be found by following Youtube links.

 

Having watched through the entire thing, I have updated a few of my beliefs, plan to rework some of my conceptual schema, and am puzzled about a few things.

A few of my reflections and take-aways:

  1. I am much more convinced than before that there is a good case to be made for compatibilism about free will.
  2. I think there is a set of interesting and challenging issues around the concept of representation and intentionality (about-ness) that I need to look into.
  3. I am more comfortable with intense reductionist claims, like “All facts about the macroscopic world are entailed by the fundamental laws of physics.”
  4. I am really interested in hearing Dan Dennett talk more about grounding morality, because what he said was starting to make a lot of sense to me.
  5. I am confused about the majority attitude in the room that there’s not any really serious reason to take an eliminativist stance about macroscopic objects.
  6. I want to find more details about the argument that Simon DeDeo was making for the undecidability of questions about the relationship between macroscopic theories and microscopic theories (!!!).
  7. There’s a good way to express the distinction between the type of design human architects engage in and the type of design that natural selection produces, which is about foresight and representations of reasons. I’m not going to say more about this, and will just refer you to the videos.
  8. There are reasons to suspect that animal intelligence and capacity to suffer are inversely correlated (that is, the more intelligent an animal, the less capacity to suffer it likely has). This really flips some of our moral judgements on their head. (You must deliver a painful electric shock to either a human or to a bird. Which one will you choose?)

Let me say a little more about number 5.

I think that questions about whether macroscopic objects like chairs or plants really REALLY exist, or whether there are really just fermions and bosons, are ultimately questions about how we should use the word “exist.” In the language of our common sense intuitions, obviously chairs exist, and if you claim otherwise, you’re just playing complicated semantic games. I get this argument, and I don’t want to be that person who clings to bizarre philosophical theses that rest on a strange choice of definitions.

But at the same time, I see a deep problem with relying on our commonsense intuitions about the existence of the macro world. This is that as soon as we start optimizing for consistency, even a teeny tiny bit, these macroscopic concepts fall to pieces.

For example, here is a trilemma (three statements that can’t all be correct):

  1. The thing I am sitting on is a chair.
  2. If you subtract a single atom from a chair, it is still a chair.
  3. Empty space is not a chair.

These seem to me to be some of the most obvious things we could say about chairs. And yet they are subtly incoherent!

Number 1 is really shorthand for something like “there are chairs.” And the reason why the second premise is correct is that denying it requires that there be a chair such that if you remove a single atom, it is no longer a chair. I take it to be obvious that such things don’t exist. But accepting the first two requires us to admit that as we keep shedding atoms from a chair, it stays a chair, even down to the very last atom, and even once nothing is left at all, which contradicts the third statement. (By the way, some philosophers do actually deny number 2. They take a stance called epistemicism, which says that concepts like “chair” and “heap” are actually precise and unambiguous, and that there exists a precise point at which a chair becomes a non-chair. This is the type of thing that makes me giggle nervously when reflecting on the adequacy of philosophy as a field.)

As I’ve pointed out in the past, these kinds of arguments can be applied to basically everything in the macroscopic world. They wreak havoc on our common sense intuitions and, to my mind, demand rejection of the entire macroscopic world. And of course, they don’t apply to the microscopic world. “If X is an electron, and you change its electric charge a tiny bit, is it still an electron?” No! Electrons are physical substances with precise and well-defined properties, and if something doesn’t have these properties, it is not an electron! So the Standard Model is safe from this class of arguments.

Anyway, this is all just to make the case that upon close examination, our commonsense intuitions about the macroscopic world turn out to be subtly incoherent. What this means is that we can’t make true statements like “There are two cars in the garage”. Why? Just start removing atoms from the cars until you get to a completely empty garage. Since no single-atom change can make the relevant difference to “car-ness”, at each stage, you’ll still have two cars!

As soon as you start taking these macroscopic concepts seriously, you find yourself stuck in a ditch. This, to me, is an incredibly powerful argument for eliminativism, and I was surprised to find that arguments like these weren’t stressed at the conference. This makes me wonder if this argument is as powerful as I think.

Defining racism

How would you define racism?

I’ve been thinking about this lately in light of some of the scandal around research into race and IQ. It’s a harder question than I initially thought; many of the definitions that pop to mind end up being either too strong or too weak. The term also functions differently in different contexts (e.g. personal racism, institutional racism, racist policies). In this post, I’m specifically talking about personal racism – that term we use to refer to the beliefs and attitudes of those like Nazis or Ku Klux Klan members (at the extreme end).

I’m going to walk through a few possible definitions. This will be fairly stream-of-consciousness, so I apologize if it’s not incredibly profound or well-structured.

Definition 1 Racism is the belief in the existence of inherent differences between the races.

‘Inherent’ is important, because we don’t want to say that somebody is racist for acknowledging differences that can ultimately be traced back to causes like societal oppression. The problem with this definition is that, well, there are inherent differences between the races.

The Chinese are, on average, significantly shorter than the Dutch. Raising a Chinese person in a Dutch household won’t do much to equalize this difference. What’s important, it seems, is not the belief in the existence of inherent differences, but instead the belief in the existence of inherent inferiorities and superiorities. So let’s try again.

Definition 2 Racism is the belief in the existence of inherent racial differences that are normatively significant.

This is pretty much the dictionary definition of the term “racism”. While it’s better, there are still some serious problems. Suppose somebody discovered that Slavs are inherently more prone to violence than, say, Arabs, and that this person also held the ethical view that violent tendencies are normatively important. That is, they think that peaceful people are ethically superior to violent people.

If they combine this factual belief with this seemingly reasonable normative belief, they’ll end up being branded as a racist, by our second definition. This is clearly undesirable… given that the word ‘racism’ is highly normatively loaded, we don’t want it to be the case that somebody is racist for believing true things. In other words, we probably don’t want our definition of racism to ever allow it to be the right attitude to take, or even a reasonable attitude to take.

Maybe the missing step is the generalization of attitudes about Slavs and Arabs to individuals. This is a sentiment that I’ve heard fairly often… racism is about applying generalizations about groups to individuals (for instance, racial profiling). Let’s formalize this:

Definition 3 Racism is about forming normative judgments about individuals’ characteristics on the basis of beliefs about normative group-level differences.

This sounds nice and all, but… you know what another term for “applying facts about groups to individuals” is? Good statistical reasoning.

If you live in a town composed of two distinct populations, the Hebbeberans and the Klabaskians, and you know that Klabaskians are on average twenty times more likely than Hebbeberans to be fatally allergic to cod, then you should be more cautious with serving your extra special cod sandwich to a Klabaskian friend than to a Hebbeberan.

Facts about populations do give you evidence about individuals within those populations, and the mere acknowledgement of this evidence is not racist, for the same reason that rationality is not racist.

So if we don’t want to call rationality racist, then maybe our way out of this is to identify racism with irrationality.

Definition 4 Racism is the holding of irrational beliefs about normative racial differences.

Say you meet somebody from Malawi (a country with an extremely low average IQ). Your first rational instinct might be to not expect too much from them in the way of cognitive abilities. But now you learn that they’re a theoretical physicist who’s recently been nominated for a Nobel prize for their work in quantum information theory. If the average IQ of Malawians is still factoring in at all to your belief about this person’s intelligence, then you’re being racist.

I like this definition a lot better than our previous ones. It combines the belief in racial superiority with irrationality. On the other hand, it has problems as well. One major issue is that there are plenty of cases of benign irrationality, where somebody is just a bad statistical reasoner, but not motivated by any racial hatred. Maybe they over-updated on some piece of information, because they failed to take into account an important base-rate.

Well, the base-rate fallacy is one of the most common cognitive biases out there. Surely this isn’t enough to make them a racist? What we want is to capture the non-benign brand of irrational normative beliefs about race – those that are motivated by hatred or prejudice.

Definition 5 Racism is the holding of irrational normative beliefs about racial differences, motivated by racial hatred or prejudice.

I think this does the best job of avoiding making the category too large, but it may be too strong and keep out some plausible cases of racism. I’d like to hear suggestions for improvements on this definition, but for now I’ll leave it there. One potential take-away is that the word ‘racism’ is a nasty combination of highly negatively charged and ambiguous, and that such words are best treated with caution, especially when applying them to edge cases.

Those who have forgotten words

The fish trap exists because of the fish. Once you’ve gotten the fish you can forget the trap. The rabbit snare exists because of the rabbit. Once you’ve gotten the rabbit, you can forget the snare. Words exist because of meaning. Once you’ve gotten the meaning, you can forget the words. Where can I find a man who has forgotten words so I can talk with him?

― Zhuangzi

Metaphysics and fuzziness: Why tables don’t exist and nobody’s tall

  • The tallest man in the world is tall.
  • If somebody is one nanometer shorter than a tall person, then they are themselves tall.

If the word tall is to mean anything, then it must imply at least these two premises. But from the two it follows by mathematical induction that a two-foot infant is tall, that a one-inch bug is tall, and worst of all, that a zero-inch-tall person is tall. Why? If the tallest man in the world is tall (let’s name him Fred), then he would still be tall if he were shrunk by a single nanometer. We can call this new person ‘Fred – 1 nm’. And since ‘Fred – 1 nm’ is tall, so is ‘Fred – 2 nm’. And then so is ‘Fred – 3 nm’. Et cetera until absurdity ensues.
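Spelled out, with Tall(n) as shorthand for “somebody n nanometers shorter than Fred is tall” (a notation I’m introducing just to display the structure), the two premises and the inductive conclusion are:

    Tall(0)
    ∀n (Tall(n) → Tall(n+1))
    ∴ ∀n Tall(n)

And once n reaches the billions, ‘Fred – n nm’ is already shorter than a two-foot infant.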

So what went wrong? Surely the first premise can’t be wrong – who could the word apply to if not the tallest man in the world?

The second seems to be the only candidate for denial. But this should make us deeply uneasy; the implication of such a denial is that there is a one-nanometer-wide range of heights across which somebody makes the transition from being completely not tall to being completely tall. Somebody exactly at this line could waver back and forth between being tall and not tall every time a cell dies or divides, and every time a tiny draft rearranges the tips of their hairs.

Let’s be clear just how tiny a nanometer really is: A sheet of paper is about a hundred thousand nanometers thick. That’s more than the number of inches that make up a mile. If the word ‘tall’ means anything at all, this height difference just can’t make a difference in our evaluation of tallness.


So we are led to the conclusion: Fred is not tall. And if the tallest man on the planet isn’t tall, then nobody is tall. Our concept of tallness is just a useful idea that falls apart on any close examination.

This is the infamous Sorites paradox. What else is vulnerable to versions of the Sorites paradox? Almost every concept that we use in our day to day life! Adulthood, intelligence, obesity, being cold, personhood, wealthiness, and on and on. It’s harder to look for concepts that aren’t affected than those that are!

The Sorites paradox is usually seen in discussions of properties, but it can equally well be applied to discussions of objects. This application leads us to a view of the world that differs wildly from our common sense view. Let’s take a standard philosophical case study: the table. What is it for something to be a table? What changes to a table make it no longer a table?

Whatever the answers to these questions are, they will hopefully embody our common sense notions about tables and allow us to make the statements that we ordinarily want to make about tables. One such common sense notion involves what it takes for a table to cease being a table; presumably little changes to the table are allowed, while big changes (cleaving it into small pieces) are not. But here we run into the problem of vagueness.

If X is a table, then X would still be a table if it lost a tiny bit of the matter constituting it. Like before, we’ll take this to the extreme to maximize its intuitive plausibility: If a single atom is shed from a table, it’s still a table. Denial of this is even worse than it was before; if changes by single atoms could change table-hood, we would be in a position where we should be constantly skeptical of whether objects are tables, given the microscopic changes that are happening to ordinary tables all the time.


And so we are led inevitably to the conclusion that single atoms are tables, and even that empty space is a table. (Iteratively remove single atoms from a table until it has become arbitrarily small.) Either that, or there are no tables. I take this second option to be preferable.

Fragment

How far do these arguments reach? It seems like most or all macroscopic objects are vulnerable to them. After all, we don’t change our view of macroscopic objects that undergo arbitrarily small losses of constituent material. And this leads us to a worldview in which the things that actually exist match up with almost none of the things that our common-sense intuitions tell us exist: tables, buildings, trees, planets, computers, people, and so on.

But is everything eliminated? Plausibly not. What can be said about a single electron, for instance, that would lead to a continuity premise? Probably nothing; electrons are defined by a set of intrinsic properties, none of which can differ to any degree while the particle still remains an electron. In general, all of the microscopic entities that are thought to fundamentally compose everything else in our macroscopic world will be (seemingly) invulnerable to attack by a version of the Sorites paradox.

The conclusion is that some form of eliminativism is true (objects don’t exist, but their lowest-level constituents do). I think that this is actually the right way to look at the world, and is supported by a host of other considerations besides those in this post.

Closing comments

  • The subjectivity of ‘tall’ doesn’t remove the paradox. What’s in question isn’t the agreement between multiple people about what tall means, but the coherency of the concept as used by a single person. If a single person agrees that Fred is tall, and that arbitrarily small height differences can’t make somebody go from not tall to tall, then they are led straight into the paradox.
  • The most common response to this puzzle I’ve noticed is just to balk and laugh it off as absurd, while not actually addressing the argument. Yes, the conclusion is absurd, which is exactly why the paradox is powerful! If you can resolve the paradox and erase the absurdity, you’ll be doing more than 2000 years of philosophers and mathematicians have been able to do!

Is [insert here] a religion?

(Everything I’m saying here is based on experiences in a few religious studies classes I’ve taken, some papers that I’ve read, and some conversations with religious studies people. The things I say might not actually be representative of the aggregate of religious studies scholars, though Google Scholar would seem to provide some evidence for it.)

Religious studies people tend to put a lot of emphasis on the fact that ‘religion’ is a fuzzy word. That is, while there are some organizations that everybody will agree are religions (Judaism, Christianity, Islam), there are edge cases that are less clear (Unitarian Universalism, Hare Krishnas, Christian Science). In addition, attempts to lay out a set of necessary and sufficient conditions for membership in the category “religion” tend to either let in too many things or not enough things.

For some reason this is taken to be a very significant fact, and people solemnly intone things like “Is nationalism a type of religion?” and “Isn’t atheism really just the new popular religion for the young?”. Sociologists spend hours arguing with each other about different definitions of religion, and invoking new typologies to distinguish between religions and non-religions.

The strange thing about this is that religion is not at all unique in this regard. Virtually every word that we use is similarly vague, with fuzzy edges and ambiguities. That’s just how language works. Words don’t attain meanings through careful systematic processes of defining necessary and sufficient conditions. Words attain meanings by being attached to clusters of concepts that intuitively feel connected, and evolve over time as these clusters shift and reshape themselves.

There is a cluster of important ideas about language, the realization of which can keep you from getting stuck in philosophical dead ends. The vagueness inherent to much of natural language is one of these ideas. Another is that semantic prescriptivism is wrong. Humans invent the mapping of meanings to words; we don’t pluck it out of an objective book of the Universe’s Preferred Definitions of Terms. When two people are arguing about what the word religion means, they aren’t arguing about a matter of fact. There are some reasons why such an argument might be productive – for instance, there might be pragmatic reasons for redefining words. But there is no sense in which the argument is getting closer to the truth about what the actual meaning of ‘religion’ is.

Similarly, every time somebody says that football fans are really engaging in a type of religious ritual, because look, football matches their personal favorite list of sufficient conditions for being a religion, they are confused about semantic prescriptivism. At best, such comparisons might reveal previously unrecognized features of football fanaticism. But these comparisons can also end up serving to cause mistaken associations to carry over to the new term from the old. (Hm, so football is a religion? Well, religions are about supernatural deities, so Tom Brady must be a supernatural deity of the football religion. And religious belief tends to be based on faith, so football fans must be irrationally hanging on to their football-shaped worldview.)

It seems to me that scholars of religious studies have accepted the first of these ideas, but are still in need of recognizing the second. It also seems like there is a similar phenomenon going on in sociological discussions of racial terms and gender terms, where the ordinary fuzziness of language is treated as uniquely applying to these terms, taken as exceptionally important, and analyzed to death. I would be interested to hear hypotheses for why this type of thing happens where it does.

Race, Ethnicity, and Labels

(This post is me becoming curious about the variety of different opinions on racial labels, spending far too many hours researching the topic, and writing up what I find.)

One thing that I find interesting is that basically every minority ethnic and racial group in the United States has constantly dealt with terminological disputes about their proper group name.

One possible explanation for this constant turn-over was given by disability rights activist Evan Kemp, who wrote:

As long as a group is ostracized or otherwise demeaned, whatever name is used to designate that group will eventually take on a demeaning flavor and have to be replaced. The designation will keep changing every generation or so until the group is integrated into society. Whatever name is in vogue at the point of social acceptance will be the lasting one.

If this is the right explanation, then maybe we’d be able to measure the relative degrees of discrimination faced by different groups on the basis of their ‘terminological velocity’ – how quick a turnover the name for their group has.

Regardless, looking into these issues revealed a bunch of interesting history and weird trivia. So here goes!

***

Native American vs American Indian

A 1995 Census Bureau survey of American Indians found that 49% preferred the term ‘American Indian’ and 37% preferred ‘Native American’. I couldn’t find any more recent polls on this question.

This may seem unusual if you don’t know much about American Indian culture and history. It’s a bit confusing to me; as somebody with a parent born in India, I’m pretty sure that I’m an American Indian.

Why is a term that derives from the geographical error of early European colonists the most favored of all available terms? And why not ‘Native American’? From an outside perspective, ‘Native American’ feels like a respectful term, one that pays homage to the history of American Indians as the original residents of the Americas.

It turns out the answer to these questions comes from a quick look at the history of these terms, which is super fascinating.

‘Native American’ was a term originally used by WASPs in the 1850s to differentiate themselves from Catholic Irish and German immigrants. The anti-immigrant Know-Nothing Party, whose supporters were known for violent riots in Catholic neighborhoods, burning down churches, and tarring and feathering of Catholic priests, was originally known as the Native American Party.

The term fell out of use for a century upon the rise of the anti-slavery movement and subsequent collapse of the Know-Nothings. This time gap probably indicates that the early usage of the term has little current relevance to associations with the term, but I included it anyway. I find it darkly amusing to imagine white anti-Catholic nativists running around calling themselves Native Americans.

The term ‘Native American’ was revived in the civil rights era by anthropologists eager for historical accuracy and disassociation from the negative stereotypes associated with ‘Indian’. It was adopted widely by government agencies, and apparently in the process picked up a negative connotation.

Prominent Lakota activist Russell Means described the term as “a generic government term used to describe all the indigenous prisoners of the United States.” Some American Indians emphasize a sense of lack of ownership over the term, and feel that it was a “colonial term” given to them by outsiders.

‘American Indian’ is apparently more widely favored. Widespread acceptance of this term dates back to 1968 and the rise of the American Indian Movement (AIM). At a UN conference in 1977, AIM’s International Indian Treaty Council urged collective identification of American Indians with the term.

One argument made for the term is that while the names of other races in America have ‘American’ as their second word (e.g. ‘Asian American’, ‘Arab American’), ‘American Indian’ would have American as its first word, giving American Indians a special distinction. I’m serious, this was a real argument.

‘American Indian’ is etymologically close to ‘Indian’, which dates back to early European colonists that systematically drove American Indian populations out of their homes. Some note derogatory stereotypes from old Western movies associated with ‘cowboys and Indians’, and feel that the association carries over to ‘American Indian’.

Other American Indians say that they would prefer to be identified by their specific tribal nation, feeling that terms like ‘Native American’ and ‘American Indian’ lump all tribes together and ignore important differences in heritage. The problem with this is that there are 562 federally recognized distinct tribes, making it cognitively infeasible. It’s also just useful to have a term to talk about these tribes in the aggregate.

Interestingly, when I was researching this, I found a Washington Post poll in 2016 that reported that 73% of American Indians felt that the word ‘Redskin’ was not disrespectful, and 80% would not be offended if referred to as a Redskin. A 2004 poll found similar results, with 90% of American Indians saying that the name of the Washington Redskins didn’t bother them. This is significantly more than the percentage of all Americans that don’t find the name offensive, which is around 68%.

I tried to find good arguments against these poll results, and could only find some groundless conspiracy theories suggesting the polls had been infiltrated by white people claiming to be American Indians. In the absence of alternative explanations, I really don’t know what to make of this, besides that it suggests a complete disconnect between American Indian activists and the general American Indian population.

Black vs African American

The 2010 United States Census included “Black, African Am., or Negro” as one of its racial identifications. In response to many complaints, and to black Americans refusing to select the term, the Census Bureau has since switched to the shorter ‘Black or African American’.

Something that caught my eye was their explanation of this choice, which was that apparently previous research had shown that if polls didn’t allow self-identification as ‘Negro’, a significant number of older African Americans would take the time to write it in under the ‘some other race’ category.

The term ‘Negro’ became popular in the 1920s as a polite term to replace ‘Colored’, which was in turn originally a polite alternative to ‘Nigger’ in the 1900s. An actual argument made for adopting ‘Negro’ was that it was easier to pluralize than ‘Colored’, which required the addition of another word (‘Negroes’ vs ‘Colored people’). Bizarre, but okay!

In 1890, the US Census used a four-way classification: ‘Black’ for those with at least 3/4 black blood, ‘mulatto’ for 3/8 to 5/8, ‘quadroon’ for 1/4, and ‘octoroon’ for 1/8. Unsurprisingly, this did not catch on.

‘Negro’ was simpler, and quickly became the politically correct and respectful term, used by black leaders like Booker T. Washington, Marcus Garvey, W.E.B. Du Bois, and later Martin Luther King Jr. Many black organizations replaced ‘Colored’ in their titles with ‘Negro’, with the notable exception of the NAACP.

During the civil rights era, radical and militant black organizations began to attack the term, claiming that it was associated with the history of slavery and racism. ‘Black’ became a term that identified you with radical progressive blacks (think of slogans like ‘Black Power’ and ‘Black is beautiful’), while ‘Negro’ was associated with the status quo and the old guard.

The last US president to use the term ‘Negro’ was Lyndon Johnson, and by 1980 there was a large majority of African Americans in favor of ‘Black’. And of course, in modern times the term ‘Negro’ is commonly perceived as a racial slur. Obama banned the term from usage in federal law in 2016.

Meanwhile ‘Black’ became the standard term employed in surveys and used by black organizations, and, having gained popular acceptance, lost its radical connotations.

(Quick aside: This looks to me like an instance of what’s called semantic bleaching, where a word weakens in meaning as it increases in usage. My favorite example of this is the phrase ‘God be with you’, which over the years lost its religious connotation and became… ‘goodbye’!)

This lasted until around 1990, when Jesse Jackson announced that ‘Black’ was a term disconnected from cultural heritage, and declared a switch to ‘African American’.

While some organizations changed their names and declared their support for ‘African American’, this didn’t gather the same level of universal acceptance as ‘Black’ had in the 1960s, or indeed ‘Negro’ in the 1900s. The 1995 Census found that 44% of Black Americans still preferred ‘Black’, and only 28% preferred ‘African American’. Some argued that modern African Americans have created a culture that is not tied to Africa, and indeed that there is no coherent concept of a ‘single African culture’.

One paper I read attributed Jackson’s limited success in making ‘African American’ the universal term to the absence of the confrontational intensity that had driven the Black Power movement. When Malcolm X and other radical black activists challenged the term ‘Negro’, for instance, they attacked it harshly and made its usage a social taboo.

Jackson may have lacked the political power to sufficiently mobilize Black Americans. A 2007 Gallup poll found that 61% of Black Americans didn’t care which term they were described by, reflecting a high level of apathy towards his cause. A 2005 paper found that Black Americans were nearly equally divided between the two.

Currently there’s an uneasy, shifting balance between the two terms: both are acceptable, though at times one becomes more acceptable than the other. In my personal experience, I recall a several-year period in which ‘Black’ seemed to be growing increasingly politically incorrect; since then, my sense has been that this pressure has receded and the term remains in broad public acceptance.

Hispanic vs Latino

Americans who trace their roots to Spanish-speaking countries were grouped together by the US government under the umbrella term ‘Hispanic’ in the 1970s. ‘Latino’ later became popular as well, and was first included in the 2000 Census. These terms are defined as synonyms by the U.S. Census Bureau.

Polls indicate that around half of Latinos don’t like either term, and prefer to be identified with their country of origin. When forced to choose, more than twice as many prefer ‘Hispanic’ over ‘Latino’. (Interestingly, Latino friends of mine tell me that they and their Latino friends and family overwhelmingly prefer ‘Latino’ over ‘Hispanic’, which points to some sort of selection bias around me that I don’t understand.)

The federal government officially defines ‘Latino’ not as a race, but as an ethnicity. Latinos apparently disagree – 56% claim that it is both a race and an ethnicity, and 11% that it is a race. Only 19% agree with the official definition!

Both ‘Latino’ and ‘Hispanic’ are largely unique to the United States. Terms that arose from Latino social movements, like ‘Chicano’, have never won out among Latinos. This might be partly because of the lack of a strong shared identity – about 70% of Latinos think that there is no common culture among American Latinos, and instead see a loose group composed of many individual cultures. There’s also a notable lack of widely known Latino activists and clear representatives of Latino people to champion these terms.

An older attempt to de-gender the term ‘Latino’ is ‘Latin@’, which dates to the 1990s. This was apparently not inclusive enough, as the ‘@’ represents only ‘o’ and ‘a’ and not those who identify with neither. More recently, social justice activists have tried to encourage the adoption of the term ‘Latinx’. This term breaks with the gendered nature of the Spanish language and hardly rolls off the tongue, but has become relatively popular with LGBT activists.

Asian American vs Oriental

The term ‘Oriental’ was removed from federal law by the same 2016 bill in which Obama prohibited the use of the term ‘Negro’. There is a fairly strong consensus at this point that ‘Asian American’ is the appropriate term (though there remains some academic debate about it).

‘Oriental’ is an old, old term, dating back to the late Roman Empire. Over its history, the geographical region it referred to shifted steadily eastward (ad orientem), from Morocco (yes, at some point it might have been proper to refer to Moroccans as Oriental!) to Egypt and the Levant, to India, and finally to East and Southeast Asia by the mid-1900s.

The term picked up baggage in the U.S. during the racist campaigns against Asian Americans in the late 1800s and early 1900s, and by now is fairly universally considered a pejorative term.

It was replaced by the term ‘Asian American’, which began to enter popular use in the 1960s. The US Census definition of ‘Asian American’ still includes Indians, which feels really, really wrong to me. I tried and failed to find public opinion polls on how many people feel comfortable with the term ‘Asian’ being applied to Indians.

And others…

The terminological situation of the Roma people is uniquely terrible. They are mostly referred to by the pejorative term ‘Gypsy’, which is essentially synonymous with ‘dangerous thieving wanderer’. The word ‘gypped’, meaning cheated or swindled, also has its origins in this term. They are also commonly referred to as ‘Tigan’, another pejorative, which derives from the Greek word for ‘untouchable’.

In a 2013 BBC TV interview, former Romanian prime minister Victor Ponta took care to distinguish Romanians from the Roma, noting that Romanians want to distance themselves from the Roma due to the negative connotations of the similar term.

And in 2010, the Romanian government supported a constitutional amendment legally renaming the Roma to the pejorative ‘Tigan’. (This law was later rejected by the Romanian Senate.) Another such amendment was proposed in 2013, this time seeking to ban Roma in Romania from self-identifying as Romanians.

Jewish people are also in an unusual terminological situation. The term ‘Israelite’ was apparently commonly used until the 1948 formation of Israel. While ‘Jew’ is the only remaining commonly used term, there are problems with it. From The American Heritage Dictionary:

It is widely recognized that the attributive use of the word Jew, in phrases such as Jew lawyer or Jew ethics, is both vulgar and highly offensive. In such contexts Jewish is the only acceptable possibility. Some people, however, have become so wary of this construction that they have extended the stigma to any use of Jew as a noun, a practice that carries risks of its own. In a sentence such as There are several Jews on the council, which is unobjectionable, the substitution of a circumlocution like Jewish people or persons of Jewish background may in itself cause offense for seeming to imply that Jew has a negative connotation when used as a noun.

***

All in all, it looks like a really complicated mixture of factors ends up determining how this part of the language evolves.

On the one hand there are syntactic features (like ‘American Indian’ having ‘Indian’ on the right as opposed to the standard left, or ‘Colored’ having a complicated pluralization compared to ‘Negro’).

And on the other hand there are semantic features like the ancient and automatic negative associations with words like ‘dark’ and ‘black’, or the colonial associations tied to the term ‘Indian’.

There are contemporary factors like the existence of a strong shared racial/ethnic identity, the presence of a charismatic racial/ethnic leader, and whether or not the introducer of a new term for a group is an insider or outsider to the group.

Then there are phenomena like semantic bleaching, whereby terms that enter common use have their meaning diluted and weakened, and concept creep, whereby words gradually change their meaning over long stretches of history through shifting patterns of usage.

And finally there are longer-term historical effects like the gradual inundation of language with dark undertones over decades of racism and discriminatory treatment.