# Visualizing Special Relativity

I’ve been thinking a lot about special relativity recently, and wrote up a fun program for visualizing some of its stranger implications. Before going on to these visualizations, I want to recommend the Youtube channel MinutePhysics, which made a fantastic primer on the subject. I’ll link the first few of these here, as they might help with understanding the rest of the post. I highly recommend the entire series, even if you’re already pretty familiar with the subject.

Now, on to the pretty images! I’m still trying to determine whether it’s possible to embed applets in my posts, so that you can play with the program for yourself. Until I figure that out, GIFs will have to suffice.

Let me explain what’s going on in the image.

First of all, the vertical direction is time (up is the future, down is the past), and the horizontal direction is space (which is 1D for simplicity). What we’re looking at is the universe as described by an observer at a particular point in space and time. The point that this observer is at is right smack-dab in the center of the diagram, where the two black diagonal lines meet. These lines represent the observer’s light cone: the paths through spacetime that would be taken by beams of light emitted in either direction. And finally, the multicolored dots scattered in the upper quadrant represent other spacetime events in the observer’s future.

Now, what is being varied is the velocity of the observer. Again, keep in mind that the observer is not actually moving through time in this visualization. What is being shown is the way that other events would be arranged spatially and temporally if the observer had different velocities.

Take a second to reflect on how you would expect this diagram to look classically. Obviously the temporal positions of events would not depend upon your velocity. What about the spatial positions of events? Well, if you move to the right, events in your future and to the right of you should be nearer to you than they would be had you not been in motion. And similarly, events in your future left should be further to the left. We can easily visualize this by plugging in the classical Galilean transformation:

Just as we expected, time positions stay constant and spatial positions shift according to your velocity! Positive velocity (moving to the right) moves future events to the left, and negative velocity moves them to the right. Now, technically this image is wrong. I’ve kept the light paths constant, but even these would shift under the classical transformation. In reality we’d get something like this:

Of course, the empirical falsity of this prediction that the speed of light should vary according to your own velocity is what drove Einstein to formulate special relativity. Here’s what happens with just a few particles when we vary the velocity:

What I love about this is how you can see so many effects in one short gif. First of all, the speed of light stays constant. That’s a good sign! A constant speed of light is pretty much the whole point of special relativity. Secondly, and incredibly bizarrely, the temporal positions of objects depend on your velocity!! Objects to your future right don’t just get further away spatially when you move away from them, they also get further away temporally!

Another thing that you can see in this visualization is the relativity of simultaneity. When the velocity is zero, Red and Blue are at the same moment of time. But if our velocity is greater than zero, Red falls behind Blue in temporal order. And if we travel at a negative velocity (to the left), then we would observe Red as occurring after Blue in time. In fact, you can find a velocity that makes any two of these three points simultaneous!

This leads to the next observation we can make: The temporal order of events is relative! The orderings of events that you can observe include Red-Green-Blue, Green-Red-Blue, Green-Blue-Red, and Blue-Green-Red. See if you can spot them all!

This is probably the most bonkers consequence of special relativity. In general, we cannot say without ambiguity that Event A occurred before or after Event B. The notion of an objective temporal ordering of events simply must be discarded if we are to hold onto the observation of a constant speed of light.

Are there any constraints on the possible temporal orderings of events? Or does special relativity commit us to having to say that from some valid frames of reference, the basketball going through the net preceded the throwing of the ball? Well, notice that above we didn’t get all possible orders… in particular we didn’t have Red-Blue-Green or Blue-Red-Green. It turns out that in general, there are some constraints we can place on temporal orderings.

Just for fun, we can add in the future light cones of each of the three events:

Two things to notice: First, all three events are outside each others’ light cones. And second, no event ever crosses over into another event’s light cone. This makes some intuitive sense, and gives us a constant that will hold true in all reference frames: Events that are outside each others’ light cones from one perspective, are outside each others’ light cones from all perspectives. Same thing for events that are inside each others’ light cones.

Conceptually, events being inside each others’ light cones corresponds to them being in causal contact. So another way we can say this is that all observers will agree on what the possible causal relationships in the universe are. (For the purposes of this post, I’m completely disregarding the craziness that comes up when we consider quantum entanglement and “spooky action at a distance.”)

Now, is it ever possible for events in causal contact to switch temporal order upon a change in reference frame? Or, in other words, could effects precede their causes? Let’s look at a diagram in which one event is contained inside the light cone of another:

Looking at this visualization, it becomes quite obvious that this is just not possible! Blue is fully contained inside the future light cone of Red, and no matter what frame of reference we choose, it cannot escape this. Even though we haven’t formally proved it, I think that the visualization gives the beginnings of an intuition about why this is so. Let’s postulate this as another absolute truth: If Event A is contained within the light cone of Event B, all observers will agree on the temporal order of the two events. Or, in plainer language, there can be no controversy over whether a cause precedes its effects.

I’ll leave you with some pretty visualizations of hundreds of colorful events transforming as you change reference frames:

And finally, let’s trace out the set of possible space-time locations of each event.

Try to guess what geometric shape these paths are! (They’re not parabolas.) Hint.

(Sorry, I know I promised to describe an experiment that would give evidence for different theories of consciousness, but I want to do a quick rant about something else first. The consciousness/anthropics post is coming soon.)

From The Economist: Spare the rod: Spanking makes your children stupid

The article cites “nearly 30 studies from various countries” that “show that children who are regularly spanked become more aggressive themselves” and are “more likely to be depressed or take drugs.” And most relevant to the title of the article, another large study found that “young children in homes with little or no spanking showed swifter cognitive development than their peers.”

Now ask yourself, is it likely that the studies they are citing actually provide evidence for the causal claim they are using to motivate their parenting advice?

To do so, you would need a study involving some type of intervention in parenting behavior (either natural or experimental). That is, you would want a group of researchers who randomly select a group of parents, and tell half of them to beat their kids and the other half not to. Now, do you suspect that these are the types of studies the Economist is citing?? I think not. (I hope not…)

Maybe they got lucky and found a historical circumstance in which there was a natural intervention (one not induced by the experimenters but by some natural phenomenon), and found data about outcomes before and after this intervention. But for this, we’d have to find an occasion where suddenly a random group of parents were forced to stop or start beating their children, while another random group kept at it without changing their behavior. Maybe we could find something like this (like if there were two nearby towns with very similar parenting habits, and one of them suddenly enacted legislation banning corporal punishment), but it seems pretty unlikely. And indeed, if you look at the studies themselves, you find that they are just standard correlational studies.

Maybe they were able to control for all the major confounding variables, and thus get at the real causal relationship? But… really? All the major confounding variables? There are a few ways to do this (like twin studies), but the studies cited don’t do this.

Now, maybe you’re thinking I’m nitpicking. Sure, the studies only find correlations between corporal punishment and outcomes, but isn’t the most reasonable explanation of this that corporal punishment is causing those outcomes?

Judy Rich Harris would disagree. Her famous book The Nurture Assumption looked at the best research attempting to study the causal effects of parenting style on late-life outcomes and found astoundingly little evidence for any at all. That is, when you really are able to measure how much parenting style influences kids’ outcomes down the line, it’s hard to make out any effect in most areas.  And anyway, regardless of the exact strength of the effect of parenting style on later-life outcomes, one thing that’s clear to me is that it is not as strong as we might intuitively suspect. We humans are very good at observing correlation and assuming causation, and can often be surprised at what we find when rigorous causal studies are done.

Plus, it’s not too hard to think of alternative explanations for the observed correlations. To name one, we know that intelligence is heritable, that intelligence is highly correlated with positive life outcomes, and that poor people practice corporal punishment more than rich people. Given these three facts, we’d actually be more surprised if we found no correlation between intelligence and corporal punishment, even if we had no belief that the latter causes the former. And another: aggression is heritable and correlated with general antisocial behavior, which is in turn correlated with negative life outcomes. And I’m sure you can come up with more.

This is not really the hill I want to die on. I agree that corporal punishment is a bad strategy for parenting. But this is not because of a strong belief that in the end spanking leads to depression, drug addiction, and stupidity. I actually suspect that in the long run, spanking is pretty nearly net neutral; the effects probably wash out like most everything else in a person’s childhood. (My prior in this is not that strong and could easily be swayed by seeing actual causal studies that report the opposite.) There’s a much simpler reason to not hit your kids: that it hurts them! Hurting children is bad, so hitting your kids is bad; QED.

Regardless, what does matter to me is good science journalism, especially when it involves giving behavioral advice on the basis of a misleading interpretation of the science. I used to rail against the slogan “correlation does not imply causation”, as, in fact, in some cases correlational data can prove causal claims. But I now have a better sense of why this slogan is so important to promulgate. The cases where correlation proves causation are a tiny subset of the cases where correlation is claimed to prove causation by overenthusiastic science reporters unconcerned with the dangers of misleading their audience. I can’t tell you how often I see pop-science articles making exactly this mistake to very dramatic effect (putting more books in your home will raise your child’s IQ! Climate change is making suicide rates rise!! Eating yogurt causes cancer!!!)

So this is a PSA. Watch out for science reporting that purports to demonstrate causation. Ask yourself how researchers could have established these causal claims, and whether or not it seems plausible that they did so. And read the papers themselves. You might have to work through some irritating academese, but the scientists themselves typically do a good job making disclaimers like NOTICE THAT WE HAVEN’T ACTUALLY DEMONSTRATED A CAUSAL LINK HERE. (These are then often conveniently missed by the journalists reporting on them.)

For example, in the introduction to the paper cited by the Economist the authors write that “we should be very careful about drawing any causal conclusions here, even when there are robust associations. It is very likely that there will be other factors associated with both spanking and child outcomes. If certain omitted variables are correlated with both, we may confound the two effects, that is, inappropriately attribute an effect to spanking. For example, parents who spank their children may be weaker parents overall, and spanking is simply one way in which this difference in parenting quality manifests itself.”

This is a very explicit disclaimer to miss and then to go on writing a headline that gives explicit parenting advice that relies on a causal interpretation of the data!

# Some simple probability puzzles

(Most of these are taken from Ian Hacking’s Introduction to Probability and Inductive Logic.)

1. About as many boys as girls are born in hospitals. Many babies are born every week at City General. In Cornwall, a country town, there is a small hospital where only a few babies are born every week.

Define a normal week as one where between 45% and 55% of babies are female. An unusual week is one where more than 55% or less than 45% are girls.

Which of the following is true:
(a) Unusual weeks occur equally often at City General and at Cornwall.
(b) Unusual weeks are more common at City General than at Cornwall.
(c) Unusual weeks are more common at Cornwall than at City General.

2. Pia is 31 years old, single, outspoken, and smart. She was a philosophy major. When a student, she was an ardent supporter of Native American rights, and she picketed a department store that had no facilities for nursing mothers.

Which of the following statements are most probable? Which are least probable?

(a) Pia is an active feminist.
(b) Pia is a bank teller.
(c) Pia works in a small bookstore.
(d) Pia is a bank teller and an active feminist.
(e) Pia is a bank teller and an active feminist who takes yoga classes.
(f) Pia works in a small bookstore and is an active feminist who takes yoga classes.

3. You have been called to jury duty in a town with only green and blue taxis. Green taxis dominate the market, with 85% of the taxis on the road.

On a misty winter night a taxi sideswiped another car and drove off. A witness said it was a blue cab. This witness is tested under similar conditions, and gets the color right 80% of the time.

You conclude about the sideswiping taxi:
(a) The probability that it is blue is 80%.
(b) It is probably blue, but with a lower probability than 80%.
(c) It is equally likely to be blue or green.
(d) It is more likely than not to be green.

4. You are a physician. You think that it’s quite likely that a patient of yours has strep throat. You take five swabs from the throat of this patient and send them to a lab for testing.

If the patient has strep throat, the lab results are right 70% of the time. If not, then the lab is right 90% of the time.

The test results come back: YES, NO, NO, YES, NO

You conclude:
(a) The results are worthless.
(b) It is likely that the patient does not have strep throat.
(c) It is slightly more likely than not that the patient does have strep throat.
(d) It is very much more likely than not that the patient does have strep throat.

5. In a country, all families wants a boy. They keep having babies till a boy is born. What is the expected ratio of boys and girls in the country?
6.  Answer the following series of questions:

If you flip a fair coin twice, do you have the same chance of getting HH as you have of getting HT?

If you flip the coin repeatedly until you get HH, does this result in the same average number of flips as if you repeat until you get HT?

If you flip it repeatedly until either HH emerges or HT emerges, is either outcome equally likely?

You play a game with a friend in which you each choose a sequence of three possible flips (e.g HHT and TTH). You then flip the coin repeatedly until one of the two patterns emerges, and whosever pattern it is wins the game. You get to see your friend’s choice of pattern before deciding yours. Are you ever able to bias the game in your favor?

Are you always able to bias the game in your favor?

## Solutions (and lessons)

1. The correct answer is (a): Unusual weeks occur more often at Cornwall than at City General. Even though the chance of a boy is the same at Cornwall as it is at City General, the percentage of boys from week to week is larger in the smaller city (for N patients a week, the percentage boys goes like 1/sqrt(N)). Indeed, if you think about an extreme case where Cornwall has only one birth a week, then every week will be an unusual week (100% boys or 0% boys).
2. There is room to debate the exact answer but whatever it is, it has to obey some constraints. Namely, the most probable statement cannot be (d), (e), or (f), and the least probable statement cannot be (a), (b), or (c). Why? Because of the conjunction rule of probability: each of (d), (e), and (f) are conjunctions of (a), (b), and (c), so they cannot be more likely. P(A & B) ≤ P(A).

It turns out that most people violate this constraint. Many people answer that (f) is the most probable description, and (b) is the least probable. This result is commonly interpreted to reveal a cognitive bias known as the representativeness heuristic – essentially, that our judgements of likelihood are done by considering which descriptions most closely resemble the known facts. In this case,

Another factor to consider is that prior to considering the evidence, your odds on a given person being a bank teller as opposed to working in a small bookstore should be heavily weighted towards her being a bank teller. There are just far more bank tellers than small bookstore workers (maybe a factor of around 20:1). This does not necessarily mean that (b) is more likely than (c), but it does mean that the evidence must discriminate strongly enough against her being a bank teller so as to overcome the prior odds.

This leads us to another lesson, which is to not neglect the base rate. It is easy to ignore the prior odds when it feels like we have strong evidence (Pia’s age, her personality, her major, etc.). But the base rate on small bookstore workers and bank tellers are very relevant to the final judgement.

3. The correct answer is (d) – it is more likely than not that the sideswiper was green. This is a basic case of base rate neglect – many people would see that the witness is right 80% of the time and conclude that the witness’s testimony has an 80% chance of being correct. But this is ignoring the prior odds on the content of the witness’s testimony.

In this case, there were prior odds of 17:3 (85%:15%) in favor of the taxi being green. The evidence had a strength of 1:4 (20%:80%), resulting in the final odds being 17:12 in favor of the taxi being green. Translating from odds to probabilities, we get a roughly 59% chance of the taxi having been green.

We could have concluded (d) very simply by just comparing the prior probability (85% for green) with the evidence (80% for blue), and noticing that the evidence would not be strong enough to make blue more likely than green (since 85% > 80%). Being able to very quickly translate between statistics and conclusions is a valuable skill to foster.

4. The right answer is (d). We calculate this just like we did the last time:

The results were YES, NO, NO, YES, NO.

Each YES provides evidence with strength 7:1 (70%/10%) in favor of strep, and each NO provides evidence with strength 1:3 (30%/90%).

So our strength of evidence is 7:1 ⋅ 1:3 ⋅ 1:3 ⋅ 7:1 ⋅ 1:3 = 49:27, or roughly 1.81:1 in favor of strep. This might be a little surprising… we got more NOs than YESs and the NO was correct 90% of the time for people without strep, compared to the YES being correct only 70% of the time in people with strep.

Since the evidence is in favor of strep, and we started out already thinking that strep was quite likely, in the end we should be very convinced that they have strep. If our prior on the patient having strep was 75% (3:1 odds), then our probability after getting evidence will be 84% (49:9 odds).

Again, surprising! The patient who sees these results and hears the doctor declaring that the test strengthens their belief that the patient has strep might feel that this is irrational and object to the conclusion. But the doctor would be right!

5. Supposing as before that the chance of any given birth being a boy is equal to the chance of it being a girl, we end up concluding…

The expected ratio of boys and girls in the country is 1! That is, this strategy doesn’t allow you to “cheat” – it has no impact at all on the ratio. Why? I’ll leave this one for you to figure out. Here’s a diagram for a hint:

This is important because it applies to the problem of p-hacking. Imagine that all researchers just repeatedly do studies until they get the results they like, and only publish these results. Now suppose that all the researchers in the world are required to publish every study that they do. Now, can they still get a bias in favor of results they like? No! Even though they always stop when getting the result they like, the aggregate of their studies is unbiased evidence. They can’t game the system!

If you flip a fair coin twice, do you have the same chance of getting HH as you have of getting HT? (Yes)

If you flip it repeatedly until you get HH, does this result in the same average number of flips as if you repeat until you get HT? (No)

If you flip it repeatedly until either HH emerges or HT emerges, is either outcome equally likely? (Yes)

You play a game with a friend in which you each choose a sequence of three coin flips (e.g HHT and TTH). You then flip a coin repeatedly until one of the two patterns emerges, and whosever pattern it is wins the game. You get to see your friend’s choice of pattern before deciding yours. Are you ever able to bias the game in your favor? (Yes)

Are you always able to bias the game in your favor? (Yes!)

Here’s a wiki page with a good explanation of this: LINK. A table from that page illustrating a winning strategy for any choice your friend makes:

1st player’s choice 2nd player’s choice Odds in favour of 2nd player
HHH THH 7 to 1
HHT THH 3 to 1
HTH HHT 2 to 1
HTT HHT 2 to 1
THH TTH 2 to 1
THT TTH 2 to 1
TTH HTT 3 to 1
TTT HTT 7 to 1

# Explanation is asymmetric

We all regularly reason in terms of the concept of explanation, but rarely think hard about what exactly we mean by this explanation. What constitutes a scientific explanation? In this post, I’ll point out some features of explanation that may not be immediately obvious.

Let’s start with one account of explanation that should seem intuitively plausible. This is the idea that to explain X to a person is to give that person some information I that would have allowed them to predict X.

For instance, suppose that Janae wants an explanation of why Ari is not pregnant. Once we tell Janae that Ari is a biological male, she is satisfied and feels that the lack of pregnancy has been explained. Why? Well, because had Janae known that Ari was a male, she would have been able to predict that Ari would not get pregnant.

Let’s call this the “predictive theory of explanation.” On this view, explanation and prediction go hand-in-hand. When somebody learns a fact that explains a phenomenon, they have also learned a fact that allows them to predict that phenomenon.

To spell this out very explicitly, suppose that Janae’s state of knowledge at some initial time is expressed by

K1 = “Males cannot get pregnant.”

At this point, Janae clearly cannot conclude anything about whether Ari is pregnant. But now Janae learns a new piece of information, and her state of knowledge is updated to

K2 = “Ari is a male & males cannot get pregnant.”

Now Janae is warranted in adding the deduction

K’ = “Ari cannot get pregnant”

This suggests that added information explains Ari’s non-pregnancy for the same reason that it allows the deduction of Ari’s non-pregnancy.

Now, let’s consider a problem with this view: the problem of relevance.

Suppose a man named John is not pregnant, and somebody explains this with the following two premises:

1. People who take birth control pills almost certainly don’t get pregnant.
2. John takes birth control pills regularly.

Now, these two premises do successfully predict that John will not get pregnant. But the fact that John takes birth control pills regularly gives no explanation at all of his lack of pregnancy. Naively applying the predictive theory of explanation gives the wrong answer here.

You might have also been suspicious of the predictive theory of explanation on the grounds that it relied on purely logical deduction and a binary conception of knowledge, not allowing us to accommodate the uncertainty inherent in scientific reasoning. We can fix this by saying something like the following:

What it is to explain X to somebody that knows K is to give them information I such that

(1) P(X | K) is small, and
(2) P(X | K, I) is large.

“Small” and “large’ here are intentionally vague; it wouldn’t make sense to draw a precise line in the probabilities.

The idea here is that explanations are good insofar as they (1) make their explanandum sufficiently likely, where (2) it would be insufficiently likely without them.

We can think of this as a correlational account of explanation. It attempts to root explanations in sufficiently strong correlations.

First of all, we can notice that this doesn’t suffer from a problem with irrelevant information. We can find relevance relationships by looking for independencies between variables. So maybe this is a good definition of scientific explanation?

Unfortunately, this “correlational account of explanation” has its own problems.

Take the following example.

This flagpole casts a shadow of length L because of the angle of elevation of the sun and the height of the flagpole (H). In other words, we can explain the length of the shadow with the following pieces of information:

I1 =  “The angle of elevation of the sun is θ”
I2 = “The height of the lamp post is H”
I3 = Details involving the rectilinear propagation of light and the formation of shadows

Both the predictive and correlational theory of explanation work fine here. If somebody wanted an explanation for why the shadow’s length is L, then telling them I1, I2, and I3 would suffice. Why? Because I1, I2, and Ijointly allow us to predict the shadow’s length! Easy.

X = “The length of the shadow is L.”
(I1 & I2 & I3) ⇒ X
So I1 & I2 & I3 explain X.

And similarly, P(X | I1 & I2 & I3) is large, and P(X) is small. So on the correlational account, the information given explains X.

But now, consider the following argument:

(I1 & I3 & X) ⇒ I2
So I1 & I3 & X explain I2.

The predictive theory of explanation applies here. If we know the length of the shadow and the angle of elevation of the sun, we can deduce the height of the flagpole. And the correlational account tells us the same thing.

But it’s clearly wrong to say that the explanation for the height of the flagpole is the length of the shadow!

What this reveals is an asymmetry in our notion of explanation. If somebody already knows how light propagates and also knows θ, then telling them H explains L. But telling them L does not explain H!

In other words, the correlational theory of explanation fails, because correlation possesses symmetry properties that explanation does not.

This thought experiment also points the way to a more complete account of explanation. Namely, the relevant asymmetry between the length of the shadow and the height of the flagpole is one of causality. The reason why the height of the flagpole explains the shadow length but not vice versa, is that the flagpole is the cause of the shadow and not the reverse.

In other words, what this reveals to us is that scientific explanation is fundamentally about finding causes, not merely prediction or statistical correlation. This causal theory of explanation can be summarized in the following:

An explanation of A is a description of its causes that renders it intelligible.

More explicitly, an explanation of A (relative to background knowledge K) is a set of causes of A that render X intelligible to a rational agent that knows K.

# Priors in the supernatural

A friend of mine recently told me the following anecdote.

Years back, she had visited an astrologer in India with her boyfriend, who told her the following things: (1) she would end up marrying her boyfriend at the time, (2) down the line they would have two kids, the first a girl and the second a boy, and (3) he predicted the exact dates of birth of both children.

Many years down the line, all of these predictions turned out to be true.

I trust this friend a great deal, and don’t have any reason to think that she misremembered the details or lied to me about them. But at the same time, I recognize that astrology is completely crazy.

Since that conversation, I’ve been thinking about the ways in which we can evaluate our de facto priors in supernatural events by consulting either real-world anecdotes or thought experiments. For instance, if we think that each of these two predictions gave us a likelihood ratio of 100:1 in favor of astrology being true, and if I ended up thinking that astrology was about as likely to be true as false, then I must have started with roughly 1:10,000 odds against astrology being true.

That’s not crazily low for a belief that contradicts much of our understanding of physics. I would have thought that my prior odds would be something much lower, like 1:1010 or something. But really put yourself in that situation.

Imagine that you go to an astrologer, who is able to predict an essentially unpredictable sequence of events years down the line, with incredible accuracy. Suppose that the astrologer tells you who you will marry, how many kids you’ll have, and the dates of birth of each. Would you really be totally unshaken by this experience? Would you really believe that it was more likely to have happened by coincidence?

Yes, yes, I know the official Bayesian response – I read it in Jaynes long ago. For beliefs like astrology that contradict our basic understanding of science and causality, we should always have reserved some amount of credence for alternate explanations, even if we can’t think of any on the spot. This reserve of credence will insure us against jumping in credence to 99% upon seeing a psychic continuously predict the number in your heads, ensuring sanity and a nice simple secular worldview.

But that response is not sufficient to rule out all strong evidence for the supernatural.

Here’s one such category of strong evidence: evidence for which all alternative explanations are ruled out by the laws of physics as strongly as the supernatural hypothesis is ruled out by the laws of physics.

I think that my anecdote is one such case. If it was true, then there is no good natural alternative explanation for it. The reason? Because the information about the dates of birth of my friend’s children did not exist in the world at the time of the prediction, in any way that could be naturally attainable by any human being.

By contrast, imagine you go to a psychic who tells you to put up some fingers behind your back and then predicts over and over again how many fingers you have up. There’s hundreds of alternative explanations for this besides “Psychics are real science has failed us.” The reason that there are these alternative explanations is that the information predicted by the psychic existed in the world at the time of the prediction.

But in the case of my friend’s anecdote, the information predicted by the astrologer was lost far in the chaotic dynamics of the future.

What this rules out is the possibility that the astrologer somehow obtained the information surreptitiously by any natural means. It doesn’t rule out a host of other explanations, such as that my friend’s perception at the time was mistaken, that her memory of the event is skewed, or that she is lying. I could even, as a last resort, consider that possibility that I hallucinated the entire conversation with her. (I’d like to give the formal title “unbelievable propositions” to the set of propositions that are so unlikely that we should sooner believe that we are hallucinating than accept evidence for them.)

But each of these sources of alternative explanations, with the possible exception of the last, can be made significantly less plausible.

Let me use a thought experiment to illustrate this.

Imagine that you are a nuclear physicist who, with a group of fellow colleagues, have decided to test the predictive powers of a fortune teller. You carefully design an experiment in which a source of true quantum randomness will produce a number between 1 and N. Before the number has been produced, when it still exists only as an unrealized possibility in the wave function, you ask the fortune teller to predict its value.

Suppose that they get it correct. For what value of N would you begin to take their fortune telling abilities seriously?

Here’s how I would react to the success, for different values of N.

N = 10: “Haha, that’s a funny coincidence.”

N = 100: “Hm, that’s pretty weird.”

N = 1000: “What…”

N = 10,000: “Wait, WHAT!?”

N = 100,000: “How on Earth?? This is crazy.”

N = 1,000,000: “Ok, I’m completely baffled.”

I think I’d start taking them seriously as early as N = 10,000. This indicates prior odds of roughly 1:10,000 against fortune-telling abilities (roughly the same as my prior odds against astrology, interestingly!). Once again, this seems disconcertingly low.

But let’s try to imagine some alternative explanations.

As far as I can tell, there are only three potential failure points: (1) our understanding of physics, (2) our engineering of the experiment, (3) our perception of the fortune teller’s prediction.

First of all, if our understanding of quantum mechanics is correct, there is no possible way that any agent could do better than random at predicting the number.

Secondly, we stipulated that the experiment was designed meticulously so as to ensure that the information was truly random, and unavailable to the fortune-teller. I don’t think that such an experiment would actually be that hard to design. But let’s go even further and imagine that we’ve designed the experiment so that the fortune teller is not in causal contact with the quantum number-generator until after she has made her prediction.

And thirdly, we can suppose that the prediction is viewed by multiple different people, all of whom affirm that it was correct. We can even go further and imagine that video was taken, and broadcast to millions of viewers, all of whom agreed. Not all of them could just be getting it wrong over and over again. The only possibility is that we’re hallucinating not just the experimental result, but indeed also the public reaction and consensus on the experimental result.

But the hypothesis of a hallucination now becomes inconsistent with our understanding of how the brain works! A hallucination wouldn’t have the effect of creating a perception of a completely coherent reality in which everybody behaves exactly as normal except that they saw the fortune teller make a correct prediction. We’d expect that if this were a hallucination, it would not be so self-consistent.

Pretty much all that’s left, as far as I can tell, is some sort of Cartesian evil demon that’s cleverly messing with our brains to create this bizarre false reality. If this is right, then we’re left weighing the credibility of the laws of physics against the credibility of radical skepticism. And in that battle, I think, the laws of physics lose out. (Consider that the invalidity of radical skepticism is a precondition for the development of laws of physics in the first place.)

The point of all of this is just to sketch an example where I think we’d have a good justification for ruling out all alternative explanations, at least with an equivalent degree of confidence that we have for affirming any of our scientific knowledge.

Let’s bring this all the way back to where we started, with astrology. The conclusion of this blog post is not that I’m now a believer in astrology. I think that there’s enough credence in the buckets of “my friend misremembered details”, “my friend misreported details”, and “I misunderstood details” so that the likelihood ratio I’m faced with is not actually 10,000 to 1. I’d guess it’s something more like 10 to 1.

But I am now that much less confident that astrology is wrong. And I can imagine circumstances under which my confidence would be drastically decreased. While I don’t expect such circumstances to occur, I do find it instructive (and fun!) to think about them. It’s a good test of your epistemology to wonder what it would take for your most deeply-held beliefs to be overturned.

# Constructing the world

In this six and a half hour lecture series by David Chalmers, he describes the concept of a minimal set of statements from which all other truths are a priori “scrutable” (meaning, basically, in-principle knowable or derivable).

What are the types of statements in this minimal set required to construct the world? Chalmers offers up four categories, and abbreviates this theory PQIT.

# P

P is the set of physical facts (for instance, everything that would be accessible to a Laplacean demon). It can be thought of as essentially the initial conditions of the universe and the laws governing their changes over time.

# Q

Q is the set of facts about qualitative experience. We can see Chalmers’ rejection of physicalism here, as he doesn’t think that Q is eclipsed within P. Example of a type of statement that cannot be derived from P without Q: “There is a beige region in the bottom right of my visual field.”

# I

Here’s a true statement: “I’m in the United States.” Could this be derivable from P and Q? Presumably not; we need another set of indexical truths that allows us to have “self-locating” beliefs and to engage in anthropic reasoning.

# T

Suppose that P, Q, and I really are able to capture all the true statements there are to be captured. Well then, the statement “P, Q, and I really are able to capture all the true statements there are to be captured” is a true statement, and it is presumably not captured by P, Q, and I! In other words, we need some final negative statements that tell us that what we have is enough, and that there are no more truths out there. These “that’s all”-type statements are put into the set T.

⁂⁂⁂

So this is a basic sketch of Chalmer’s construction. I like that we can use these tags like PQIT or PT or QIT as a sort of philosophical zip-code indicating the core features of a person’s philosophical worldview. I also want to think about developing this further. What other possible types of statements are there out there that may not be captured in PQIT? Here is a suggestion for a more complete taxonomy:

p    microphysics
P    macrophysics (by which I mean all of science besides fundamental physics)
Q    consciousness
R    normative rationality
E
normative ethics
C    counterfactuals
L
mathematical / logical truths
I     indexicals
T    “that’s-all” statements

I’ve split P into big-P (macrophysics) and little-p (microphysics) to account for the disagreements about emergence and reductionism. Normativity here is broad enough to include both normative epistemic statements (e.g. “You should increase your credence in the next coin toss landing H after observing it land H one hundred times in a row”) and ethical statements. The others are fairly self-explanatory.

The most ontologically extravagant philosophical worldview would then be characterized as pPQRECLIT.

My philosophical address is pRLIT (with the addendum that I think C comes from p, and am really confused about Q). What’s yours?

# Moving Naturalism Forward: Eliminating the macroscopic

Sean Carroll, one of my favorite physicists and armchair philosophers, hosted a fantastic conference on philosophical naturalism and science, and did the world a great favor by recording the whole thing and posting it online. It was a three-day long discussion on topics like the nature of reality, emergence, morality, free will, meaning, and consciousness. Here are the videos for the first two discussion sections, and the rest can be found by following Youtube links.

Having watched through the entire thing, I have updated a few of my beliefs, plan to rework some of my conceptual schema, and am puzzled about a few things.

A few of my reflections and take-aways:

1. I am much more convinced than before that there is a good case to be made for compatibilism about free will.
2. I think there is a set of interesting and challenging issues around the concept of representation and intentionality (about-ness) that I need to look into.
3. I am more comfortable with intense reductionism claims, like “All fact about the macroscopic world are entailed by the fundamental laws of physics.”
4. I am really interested in hearing Dan Dennett talk more about grounding morality, because what he said was starting to make a lot of sense to me.
5. I am confused about the majority attitude in the room that there’s not any really serious reason to take an eliminativist stance about macroscopic objects.
6. I want to find more details about the argument that Simon DeDeo was making for the undecidability of questions about the relationship between macroscopic theories and microscopic theories (!!!).
7. There’s a good way to express the distinction between the type of design human architects engage in and the type of design that natural selection produces, which is about foresight and representations of reasons. I’m not going to say more about this, and will just refer you to the videos.
8. There are reasons to suspect that animal intelligence and capacity to suffer are inversely correlated (that is, the more intelligent an animal, the less capacity to suffer it likely has). This really flips some of our moral judgements on their head. (You must deliver a painful electric shock to either a human or to a bird. Which one will you choose?)

Let me say a little more about number 5.

I think that questions about whether macroscopic objects like chairs or plants really REALLY exist, or whether there are really only just fermions and bosons are ultimately just questions about how we should use the word “exist.” In the language of our common sense intuitions, obviously chairs exist, and if you claim otherwise, you’re just playing complicated semantic games. I get this argument, and I don’t want to be that person that clings to bizarre philosophical theses that rest on a strange choice of definitions.

But at the same time, I see a deep problem with relying on our commonsense intuitions about the existence of the macro world. This is that as soon as we start optimizing for consistency, even a teeny tiny bit, these macroscopic concepts fall to pieces.

For example, here is a trilemma (three statements that can’t all be correct):

1. The thing I am sitting on is a chair.
2. If you subtract a single atom from a chair, it is still a chair.
3. Empty space is not a chair.

These seem to me to be some of the most obvious things we could say about chairs. And yet they are subtly incoherent!

Number 1 is really shorthand for something like “there are chairs.” And the reason why the second premise is correct is that denying it requires that there be a chair such that if you remove a single atom, it is no longer a chair. I take it to be obvious that such things don’t exist. But accepting the first two requires us to admit that as we keep shedding atoms from a chair, it stays a chair, even down to the very last atom. (By the way, some philosophers do actually deny number 2. They take a stance called epistemicism, which says that concepts like “chair” and “heap” are actually precise and unambiguous, and there exists a precise point at which a chair becomes a non-chair. This is the type of thing that makes me giggle nervously when reflecting on the adequacy of philosophy as a field.)

As I’ve pointed out in the past, these kinds of arguments can be applied to basically everything in the macroscopic world. They wreak havoc on our common sense intuitions and, to my mind, demand rejection of the entire macroscopic world. And of course, they don’t apply to the microscopic world. “If X is an electron, and you change its electric charge a tiny bit, is it still an electron?” No! Electrons are physical substances with precise and well-defined properties, and if something doesn’t have these properties, it is not an electron! So the Standard Model is safe from this class of arguments.

Anyway, this is all just to make the case that upon close examination, our commonsense intuitions about the macroscopic world turn out to be subtly incoherent. What this means is that we can’t make true statements like “There are two cars in the garage”. Why? Just start removing atoms from the cars until you get to a completely empty garage. Since no single-atom change can make the relevant difference to “car-ness”, at each stage, you’ll still have two cars!

As soon as you start taking these macroscopic concepts seriously, you find yourself stuck in a ditch. This, to me, is an incredibly powerful argument for eliminativism, and I was surprised to find that arguments like these weren’t stressed at the conference. This makes me wonder if this argument is as powerful as I think.

I’m starting to get a sense of why people like David Chalmers and Daniel Dennett call consciousness the most mysterious thing known to humans. I’m currently just really confused, and think that pretty much every position available with respect to consciousness is deeply unsatisfactory. In this post, I’ll just walk through my recent thinking.

## Against physicalism

In a previous post, I imagined a scientist from the future who told you they had a perfected theory of consciousness, and asked how we could ask for evidence confirming this. This theory of consciousness could presumably be thought of as a complete mapping from physical states to conscious states – a set of psychophysical laws. Questions about the nature of consciousness are then questions about the nature of these laws. Are they ultimately the same kind of laws as chemical laws (derivable in principle from the underlying physics)? Or are they logically distinct laws that must be separately listed on the catalogue of the fundamental facts about the universe?

I take physicalism to be the stance that answers ‘yes’ to the first question and ‘no’ to the second. Dualism and epiphenomenalism answer ‘no’ to the first and ‘yes’ to the second, and are distinguished by the character of the causal relationships between the physical and the conscious entailed by the psychophysical laws.

So, is physicalism right? Imagining that we had a perfect mapping from physical states to conscious states, would this mapping be in principle derivable from the Schrodinger equation? I think the answer to this has to be no; whatever the psychophysical laws are, they are not going to be in principle derivable from physics.

To see why, let’s examine what it looks like when we derive macroscopic laws from microscopic laws. Luckily, we have a few case studies of successful reduction. For instance, you can start with just the Schrodinger equation and derive the structure of the periodic table. In other words, the structure and functioning of atoms and molecules naturally pops out when you solve the equation for systems of many particles.

You can extrapolate this further to larger scale systems. When we solve the Schrodinger equation for large systems of biomolecules, we get things like enzymes and cell membranes and RNA, and all of the structure and functioning corresponding to our laws of biology. And extending this further, we should expect that all of our behavior and talk about consciousness will be ultimately fully accounted for in terms of purely physical facts about the structure of our brain.

The problem is that consciousness is something more than just the words we say when talking about consciousness. While it’s correlated in very particular ways with our behavior (the structure and functioning of our bodies), it is by its very nature logically distinct from these. You can tell me all about the structure and functioning of a physical system, but the question of whether or not it is conscious is a further fact that is not logically entailed. The phrase LOGICALLY entailed is very important here – it may be that as a matter of fact, it is a contingent truth of our universe that conscious facts always correspond to specific physical facts. But this is certainly not a relationship of logical entailment, in the sense that the periodic table is logically entailed by quantum mechanics.

In summary, it looks like we have a problem on our hands if we want to try to derive facts about consciousness from facts about fundamental physics. Namely, the types of things we can derive from something like the Schrodinger equation are facts about complex macroscopic structure and functioning. This is all well and good for deriving chemistry or solid-state physics from quantum mechanics, as these fields are just collections of facts about structure and functioning. But consciousness is an intrinsic property that is logically distinct from properties like macroscopic structure and functioning. You simply cannot expect to start with the Schrodinger equation and naturally arrive at statements like “X is experiencing red” or “Y is feeling sad”, since these are not purely behavioral statements.

Here’s a concise rephrasing of the argument I’ve made, in terms of a trilemma. Out of the following three postulates, you cannot consistently accept all three:

1. There are facts about consciousness.
2. Facts about consciousness are not logically entailed by the Schrodinger equation (substitute in whatever the fundamental laws of physics end up being).

Denying (1) makes you an eliminativist. Presumably this is out of the question; consciousness is the only thing in the universe that we can know with certainty exists, as it is the only thing that we have direct first-person access to. Indeed, all the rest of our knowledge comes to us by means of our conscious experience, making it in some sense the root of all of our knowledge. The only charitable interpretations I have of eliminativism involve semantic arguments subtly redefining what we mean by “consciousness” away from “that thing which we all know exists from first-hand experience” to something whose existence can actually be cast doubt on.

Denying (2) seems really implausible to me for the considerations given above.

So denying (3) looks like our only way out.

Okay, so let’s suppose physicalism is wrong. This is already super important. If we accept this argument, then we have a worldview in which consciousness is of fundamental importance to the nature of reality. The list of fundamental facts about the universe will be (1) the laws of physics and (2) the laws of consciousness. This is really surprising for anybody like me that professes a secular worldview that places human beings far from the center of importance in the universe.

But “what about naturalism?” is not the only objection to this position. There’s a much more powerful argument.

## Against non-physicalism

Suppose we now think that the fundamental facts about the universe fall into two categories: P (the fundamental laws of physics, plus the initial conditions of the universe) and Q (the facts about consciousness). We’ve already denied that P = Q or that there is a logical entailment relationship from P to Q.

Now we can ask about the causal nature of the psychophysical laws. Does P cause Q? Does Q cause P? Does the causation go both ways?

First, conditional on the falsity of physicalism, we can quickly rule out theories that claim that Q causes P (i.e. dualist theories). This is the old Cartesian picture that is unsatisfactory exactly because of the strength of the physical laws we’ve discovered. In short, physics appears to be causally complete. If you fix the structure and functioning on the microscopic level, then you fix the structure and functioning on the macroscopic level. In the language of philosophy, macroscopic physical facts supervene upon microscopic physical facts.

But now we have a problem. If all of our behavior and functioning is fully causally accounted for by physical facts, then what is there for Q (consciousness) to play a causal role in? Precisely nothing!

We can phrase this in the following trilemma (again, all three of these cannot be simultaneously true):

1. Physicalism is false.
2. Physics is causally closed.
3. Consciousness has a causal influence on the physical world.

Okay, so now we have ruled out any theories in which Q causes P. But now we reach a new and even more damning conclusion. Namely, if facts about consciousness have literally no causal influence on any aspect of the physical world, then they have no causal influence, in particular, on your thoughts and beliefs about your consciousness.

Stop to consider for a moment the implications of this. We take for granted that we are able to form accurate beliefs about our own conscious experiences. When we are experiencing red, we are able to reliably produce accurate beliefs of the form “I am experiencing red.” But if the causal relationship goes from P to Q, then this becomes extremely hard to account for.

What would we expect to happen if our self-reports of our consciousness fell out of line with our actual consciousness? Suppose that you suddenly noticed yourself verbalizing “I’m really having a great time!” when you actually felt like you were in deep pain and discomfort. Presumably the immediate response you would have would be confusion, dismay, and horror. But wait! All of these experiences must be encoded in your brain state! In other words, to experience horror at the misalignment of your reports about your consciousness and your actual consciousness, it would have to be the case that your physical brain state would change in a particular way. And a necessary component of the explanation for this change would be the actual state of your consciousness!

This really gets to the heart of the weirdness of epiphenomenalism (the view that P causes Q, but Q doesn’t causally influence P). If you’re an epiphenomenalist, then all of your beliefs and speculations about consciousness are formed exactly as they would be if your conscious state were totally different. The exact same physical state of you thinking “Hey, this coffee cake tastes delicious!” would arise even if the coffee cake actually tasted like absolute shit.

This is completely absurd. On the epiphenomenalist view, any correlation between the beliefs you form about consciousness and the actual facts about your conscious state couldn’t possibly be explained by the actual facts about your consciousness. So they must be purely coincidental.

In other words, the following two statements cannot be simultaneously accepted:

• Consciousness does not causally influence our behavior.
• Our beliefs about our conscious states are more accurate than random guessing.

## So where does that leave us?

It leaves us in a very uncomfortable place. First of all, we should deny physicalism. But the denial of physicalism leaves us with two choices: either Q causes P or it does not.

We should deny the first, because otherwise we are accepting the causal incompleteness of physics.

And we should deny the second, because it leads us to conclude that essentially all of our beliefs about our conscious experiences are almost certainly wrong, undermining all of our reasoning that led us here in the first place.

So here’s a summary of this entire post so far. It appears that the following four statements cannot all be simultaneously true. You must pick at least one to reject.

1. There are facts about consciousness.
2. Facts about consciousness are not logically entailed by the Schrodinger equation (substitute in whatever the fundamental laws of physics end up being).
3. Physics is causally closed.
4. Our beliefs about our conscious states are more accurate than random guessing.

Eliminativists deny (1).

Physicalists deny (2).

Dualists deny (3).

And epiphenomenalists must deny (4).

I find that the easiest to deny of these four is (2). This makes me a physicalist, but not because I think that physicalism is such a great philosophical position that everybody should hold. I’m a physicalist because it seems like the least horrible of all the horrible positions available to me.

## Counters and counters to those counters

A response that I would have once given when confronted by these issues would be along the lines of: “Look, clearly consciousness is just a super confusing topic. Most likely, we’re just thinking wrong about the whole issue and shouldn’t be taking the notion of consciousness so seriously.”

Part of this is right. Namely, consciousness is a super confusing topic. But it’s important to clearly delineate between which parts of consciousness are confusing and which parts are not. I’m super confused about how to make sense of the existence of consciousness, how to fit consciousness into my model of reality, and how to formalize my intuitions about the nature of consciousness. But I’m definitively not confused about the existence of consciousness itself. Clearly consciousness, in the sense of direct first-person experience, exists, and is a property that I have. The confusion arises when we try to interpret this phenomenon.

In addition, “X is super confusing” might be a true statement and a useful acknowledgment, but it doesn’t necessarily push us in one direction over another when considering alternative viewpoints on X. So “X is super confusing” isn’t evidence for “We should be eliminativists about X” over “We should be realists about X.” All it does is suggest that something about our model of reality needs fixing, without pointing to which particular component it is that needs fixing.

One more type of argument that I’ve heard (and maybe made in the past, to my shame) is a “scientific optimism” style of argument. It goes:

Look, science is always confronted with seemingly unsolvable mysteries.  Brilliant scientists in each generation throw their hands up in bewilderment and proclaim the eternal unsolvability of the deep mystery of their time. But then a few generations later, scientists end up finding a solution, and putting to shame all those past scientists that doubted the power of their art.

Consciousness is just this generation’s “great mystery.” Those that proclaim that science can never explain the conscious in terms of the physical are wrong, just as Lord Kelvin was wrong when he affirmed that the behavior of living organisms cannot be explained in terms of purely physical forces, and required a mysterious extra element (the ‘vital principle’ as he termed it).

I think that as a general heuristic, “Science is super powerful and we should be cautious before proclaiming the existence of specific limits on the potential of scientific inquiry” is pretty damn good.

But at the same time, I think that there are genuinely good reasons, reasons that science skeptics in the past didn’t have, for affirming the uniqueness of consciousness in this regard.

Lord Kelvin was claiming that there were physical behaviors that could not be explained by appeal to purely physical forces. This is a very different claim from the claim that there are phenomenon that are not purely logically reducible to structural properties of matter, that cannot be explained by purely physical forces. This, it seems to me, is extremely significant, and gets straight to the crux of the central mystery of consciousness.

# The problem with philosophy

(Epistemic status: I have a high credence that I’m going to disagree with large parts of this in the future, but it all seems right to me at present. I know that’s non-Bayesian, but it’s still true.)

Philosophy is great. Some of the clearest thinkers and most rational people I know come out of philosophy, and many of my biggest worldview-changing moments have come directly from philosophers. So why is it that so many scientists seem to feel contempt towards philosophers and condescension towards their intellectual domain? I can actually occasionally relate to the irritation, and I think I understand where some of it comes from.

Every so often, a domain of thought within philosophy breaks off from the rest of philosophy and enters the sciences. Usually when this occurs, the subfield (which had previously been stagnant and unsuccessful in its attempts to make progress) is swiftly revolutionized and most of the previous problems in the field are promptly solved.

Unfortunately, what also often happens is that the philosophers that were previously working in the field are often unaware of or ignore the change in their field, and end up wasting a lot of time and looking pretty silly. Sometimes they even explicitly challenge the scientists at the forefront of this revolution, like Henri Bergson did with Einstein after he came out with his pesky new theory of time that swept away much of the past work of philosophers in one fell swoop.

Next you get a generation of philosophy students that are taught a bunch of obsolete theories, and they are later blindsided when they encounter scientists that inform them that the problems they’re working on have been solved decades ago. And by this point the scientists have left the philosophers so far in the dust that the typical philosophy student is incapable of understanding the answers to their questions without learning a whole new area of math or something. Thus usually the philosophers just keep on their merry way, asking each other increasingly abstruse questions and working harder and harder to justify their own intellectual efforts. Meanwhile scientists move further and further beyond them, occasionally dropping in to laugh at their colleagues that are stuck back in the Middle Ages.

Part of why this happens is structural. Philosophy is the womb inside which develops the seeds of great revolutions of knowledge. It is where ideas germinate and turn from vague intuitions and hotbeds of conceptual confusion into precisely answerable questions. And once these questions are answerable, the scientists and mathematicians sweep in and make short work of them, finishing the job that philosophy started.

I think that one area in which this has happened is causality.

Statisticians now know how to model causal relationships, how to distinguish them from mere regularities, how to deal with common causes and causal pre-emption, how to assess counterfactuals and assign precise probabilities to these statements, and how to compare different causal models and determine which is most likely to be true.

(By the way, guess where I came to be aware of all of this? It wasn’t in the metaphysics class in which we spent over a month discussing the philosophy of causation. No, it was a statistician friend of mine who showed me a book by Judea Pearl and encouraged me to get up to date with modern methods of causal modeling.)

Causality as a subject has firmly and fully left the domain of philosophy. We now have a fully fleshed out framework of causal reasoning that is capable of answering all of the ancient philosophical questions and more. This is not to say that there is no more work to be done on understanding causality… just that this work is not going to be done by philosophers. It is going to be done by statisticians, computer scientists, and physicists.

Another area besides causality where I think this has happened is epistemology. Modern advances in epistemology are not coming out of the philosophy departments. They’re coming out of machine learning institutes and artificial intelligence researchers, who are working on turning the question of “how do we optimally come to justified beliefs in a posteriori matters?” into precise code-able algorithms.

I’m thinking about doing a series of posts called “X for philosophers”, in which I take an area of inquiry that has historically been the domain of philosophy, and explain how modern scientific methods have solved or are solving the central questions in this area.

For instance, here’s a brief guide to how to translate all the standard types of causal statements philosophers have debated for centuries into simple algebra problems:

 Causal model An ordered triple of exogenous variables, endogenous variables, and structural equations for each endogenous variable Causal diagram A directed acyclic graph representing a causal model, whose nodes represent the endogenous variables and whose edges represent the structural equations Causal relationship A directed edge in a causal diagram Causal intervention A mutilated causal diagram in which the edges between the intervened node and all its parent nodes are removed Probability of A if B P(A | B) Probability of A if we intervene on B P(A | do B) = P(AB) Probability that A would have happened, had B happened P(AB | -B) Probability that B is a necessary cause of A P(-A-B | A, B) Probability that B is a sufficient cause of A P(AB | -A, -B)

Right there is the guide to understanding the nature of causal relationships, and assessing the precise probabilities of causal conditional statements, counterfactual statements, and statements of necessary and sufficient causation.

To most philosophy students and professors, what I’ve written is probably chicken-scratch. But it is crucially important for them in order to not become obsolete in their causal thinking.

There’s an unhealthy tendency amongst some philosophers to, when presented with such chicken-scratch, dismiss it as not being philosophical enough and then go back to reading David Lewis’s arguments for the existence of possible worlds. It is this that, I think, is a large part of the scientist’s tendency to dismiss philosophers as outdated and intellectually behind the times. And it’s hard not to agree with them when you’ve seen both the crystal-clear beauty of formal causal modeling, and also the debates over things like how to evaluate the actual “distance” between possible worlds.

Artificial intelligence researcher extraordinaire Stuart Russell has said that he knew immediately upon reading Pearl’s book on causal modeling that it was going to change the world. Philosophy professors should either teach graph theory and Bayesian networks, or they should not make a pretense of teaching causality at all.

# Galileo and the Schelling point improbability principle

An alternative history interaction between Galileo and his famous statistician friend

＊＊＊

In the year 1609, when Galileo Galilei finished the construction of his majestic artificial eye, the first place he turned his gaze was the glowing crescent moon. He reveled in the crevices and mountains he saw, knowing that he was the first man alive to see such a sight, and his mind expanded as he saw the folly of the science of his day and wondered what else we might be wrong about.

For days he was glued to his telescope, gazing at the Heavens. He saw the planets become colorful expressive spheres and reveal tiny orbiting companions, and observed the distant supernova which Kepler had seen blinking into existence only five years prior. He discovered that Venus had phases like the Moon, that some apparently single stars revealed themselves to be binaries when magnified, and that there were dense star clusters scattered through the sky. All this he recorded in frantic enthusiastic writing, putting out sentences filled with novel discoveries nearly every time he turned his telescope in a new direction. The universe had opened itself up to him, revealing all its secrets to be uncovered by his ravenous intellect.

It took him two weeks to pull himself away from his study room for long enough to notify his friend Bertolfo Eamadin of his breakthrough. Eamadin was a renowned scholar, having pioneered at age 15 his mathematical theory of uncertainty and created the science of probability. Galileo often sought him out to discuss puzzles of chance and randomness, and this time was no exception. He had noticed a remarkable confluence of three stars that were in perfect alignment, and needed the counsel of his friend to sort out his thoughts.

Eamadin arrived at the home of Galileo half-dressed and disheveled, obviously having leapt from his bed and rushed over immediately upon receiving Galileo’s correspondence. He practically shoved Galileo out from his viewing seat and took his place, eyes glued with fascination on the sky.

Galileo allowed his friend to observe unmolested for a half-hour, listening with growing impatience to the ‘oohs’ and ‘aahs’ being emitted as the telescope swung wildly from one part of the sky to another. Finally, he interrupted.

Galileo: “Look, friend, at the pattern I have called you here to discuss.”

Galileo swiveled the telescope carefully to the position he had marked out earlier.

Eamadin: “Yes, I see it, just as you said. The three stars form a seemingly perfect line, each of the two outer ones equidistant from the central star.”

Galileo: “Now tell me, Eamadin, what are the chances of observing such a coincidence? One in a million? A billion?”

Eamadin frowned and shook his head. “It’s certainly a beautiful pattern, Galileo, but I don’t see what good a statistician like myself can do for you. What is there to be explained? With so many stars in the sky, of course you would chance upon some patterns that look pretty.”

Galileo: “Perhaps it seems only an attractive configuration of stars spewed randomly across the sky. I thought the same myself. But the symmetry seemed too perfect. I decided to carefully measure the central angle, as well as the angular distance distended by the paths from each outer star to the central one. Look.”

Galileo pulled out a sheet of paper that had been densely scribbled upon. “My calculations revealed the central angle to be precisely 180.000º, with an error of ± .003º. And similarly, I found the difference in the two angular distances to be .000º, with a margin of error of ± .002º.”

Galileo handed over the sheets to Eamadin. “I checked over my calculations a dozen times before writing you. I found the angular distances by approaching and retreating from this thin paper, which I placed between the three stars and me. I found the distance at which the thin paper just happened to cover both stars on one extreme simultaneously, and did the same for the two stars on the other extreme. The distance was precisely the same, leaving measurement error only for the thickness of the paper, my distance from it, and the resolution of my vision.”

Eamadin: “I see, I see. Yes, what you have found is a startlingly clear pattern. A similarity in distance and precision of angle this precise is quite unlikely to be the result of any natural phenomenon… ”

Galileo: “Exactly what I thought at first! But then I thought about the vast quantity of stars in the sky, and the vast number of ways of arranging them into groups of three, and wondered if perhaps in fact such coincidences might be expected. I tried to apply your method of uncertainty to the problem, and came to the conclusion that the chance of such a pattern having occurred through random chance is one in a thousand million! I must confess, however, that at several points in the calculation I found myself confronted with doubt about how to progress and wished for your counsel.”

Eamadin stared at Galileo’s notes, then pulled out a pad of his own and began scribbling intensely. Eventually, he spoke. “Yes, your calculations are correct. The chance of such a pattern having occurred to within the degree of measurement error you have specified by random forces is 10-9.”

Galileo: “Aha! Remarkable. So what does this mean? What strange forces have conspired to place the stars in such a pattern? And, most significantly, why?”

Eamadin: “Hold it there, Galileo. It is not reasonable to jump from the knowledge that the chance of an event is remarkably small to the conclusion that it demands a novel explanation.”

Galileo: “How so?”

Eamadin: “I’ll show you by means of a thought experiment. Suppose that we found that instead of the angle being 180.000º with an experimental error of .003º, it was 180.001º with the same error. The probability of this outcome would be the same as the outcome we found – one in a thousand million.”

Galileo: “That can’t be right. Surely it’s less likely to find a perfectly straight line than a merely nearly perfectly straight line.”

Eamadin: “While that is true, it is also true that the exact calculation you did for 180.000º ± .003º would apply for 180.001º ± .003º. And indeed, it is less likely to find the stars at this precise angle, than it is to find the stars merely near this angle. We must compare like with like, and when we do so we find that 180.000º is no more likely than any other angle!”

Galileo: “I see your reasoning, Eamadin, but you are missing something of importance. Surely there is something objectively more significant about finding an exactly straight line than about a nearly straight line, even if they have the same probability. Not all equiprobable events should be considered to be equally important. Think, for instance, of a sequence of twenty coin tosses. While it’s true that the outcome HHTHTTTTHTHHHTHHHTTH has the same probability as the outcome HHHHHHHHHHHHHHHHHHHH, the second is clearly more remarkable than the first.”

Eamadin: “But what is significance if disentangled from probability? I insist that the concept of significance only makes sense in the context of my theory of uncertainty. Significant results are those that either have a low probability or have a low conditional probability given a set of plausible hypotheses. It is this second class that we may utilize in analyzing your coin tossing example, Galileo. The two strings of tosses you mention are only significant to different degrees in that the second more naturally lends itself to a set of hypotheses in which the coin is heavily biased towards heads. In judging the second to be a more significant result than the first, you are really just saying that you use a natural hypothesis class in which probability judgments are only dependent on the ratios of heads and tails, not the particular sequence of heads and tails. Now, my question for you is: since 180.000º is just as likely as 180.001º, what set of hypotheses are you considering in which the first is much less likely than the second?”

Galileo: “I must confess, I have difficulty answering your question. For while there is a simple sense in which the number of heads and tails is a product of a coin’s bias, it is less clear what would be the analogous ‘bias’ in angles and distances between stars that should make straight lines and equal distances less likely than any others. I must say, Eamadin, that in calling you here, I find myself even more confused than when I began!”

Eamadin: “I apologize, my friend. But now let me attempt to disentangle this mess and provide a guiding light towards a solution to your problem.”

Eamadin: “Perhaps we may find some objective sense in which a straight line or the equality of two quantities is a simpler mathematical pattern than a nearly straight line or two nearly equal quantities. But even if so, this will only be a help to us insofar as we have a presumption in favor of less simple patterns inhering in Nature.”

Galileo: “This is no help at all! For surely the principle of Ockham should push us towards favoring more simple patterns.”

Eamadin: “Precisely. So if we are not to look for an objective basis for the improbability of simple and elegant patterns, then we must look towards the subjective. Here we may find our answer. Suppose I were to scribble down on a sheet of paper a series of symbols and shapes, hidden from your view. Now imagine that I hand the images to you, and you go off to some unexplored land. You explore the region and draw up cartographic depictions of the land, having never seen my images. It would be quite a remarkable surprise were you to find upon looking at my images that they precisely matched your maps of the land.”

Galileo: “Indeed it would be. It would also quickly lend itself to a number of possible explanations. Firstly, it may be that you were previously aware of the layout of the land, and drew your pictures intentionally to capture the layout of the land – that is, that the layout directly caused the resemblance in your depictions. Secondly, it could be that there was a common cause between the resemblance and the layout; perhaps, for instance, the patterns that most naturally come to the mind are those that resemble common geographic features. And thirdly, included only for completion, it could be that your images somehow caused the land to have the geographic features that it did.”

Eamadin: “Exactly! You catch on quickly. Now, this case of the curious coincidence of depiction and reality is exactly analogous to your problem of the straight line in the sky. The straight lines and equal distances are just like patterns on the slips of paper I handed to you. For whatever reason, we come pre-loaded with a set of sensitivities to certain visual patterns. And what’s remarkable about your observation of the three stars is that a feature of the natural world happens to precisely align with these patterns, where we would expect no such coincidence to occur!”

Galileo: “Yes, yes, I see. You are saying that the improbability doesn’t come from any objective unusual-ness of straight lines or equal distances. Instead, the improbability comes from the fact that the patterns in reality just happen to be the same as the patterns in my head!”

Eamadin: “Precisely. Now we can break down the suitable explanations, just as you did with my cartographic example. The first explanation is that the patterns in your mind were caused by the patterns in the sky. That is, for some reason the fact that these stars were aligned in this particular way caused you to by psychologically sensitive to straight lines and equal quantities.”

Galileo: “We may discard this explanation immediately, for such sensitivities are too universal and primitive to be the result of a configuration of stars that has only just now made itself apparent to me.”

Eamadin: “Agreed. Next we have a common cause explanation. For instance, perhaps our mind is naturally sensitive to visual patterns like straight lines because such patterns tend to commonly arise in Nature. This natural sensitivity is what feels to us on the inside as simplicity. In this case, you would expect it to be more likely for you to observe simple patterns than might be naively thought.”

Galileo: “We must deny this explanation as well, it seems to me. For the resemblance to a straight line goes much further than my visual resolution could even make out. The increased likelihood of observing a straight line could hardly be enough to outweigh our initial naïve calculation of the probability being 10-9. But thinking more about this line of reasoning, it strikes me that you have just provided an explanation the apparent simplicity of the laws of Nature! We have developed to be especially sensitive to patterns that are common in Nature, we interpret such patterns as ‘simple’, and thus it is a tautology that we will observe Nature to be full of simple patterns.”

Eamadin: “Indeed, I have offered just such an explanation. But it is an unsatisfactory explanation, insofar as one is opposed to the notion of simplicity as a purely subjective feature. Most people, myself included, would strongly suggest that a straight line is inherently simpler than a curvy line.”

Galileo: “I feel the same temptation. Of course, justifying a measure of simplicity that does the job we want of it is easier said than done. Now, on to the third explanation: that my sensitivity to straight lines has caused the apparent resemblance to a straight line. There are two interpretations of this. The first is that the stars are not actually in a straight line, and you only think this because of your predisposition towards identifying straight lines. The second is that the stars aligned in a straight line because of these predispositions. I’m sure you agree that both can be reasonably excluded.”

Eamadin: “Indeed. Although it may look like we’ve excluded all possible explanations, notice that we only considered one possible form of the common cause explanation. The other two categories of explanations seem more thoroughly ruled out; your dispositions couldn’t be caused by the star alignment given that you have only just found out about it and the star alignment couldn’t be caused by your dispositions given the physical distance.”

Galileo: “Agreed. Here is another common cause explanation: God, who crafted the patterns we see in Nature, also created humans to have similar mental features to Himself. These mental features include aesthetic preferences for simple patterns. Thus God causes both the salience of the line pattern to humans and the existence of the line pattern in Nature.”

Eamadin: “The problem with this is that it explains too much. Based solely on this argument, we would expect that when looking up at the sky, we should see it entirely populated by simple and aesthetic arrangements of stars. Instead it looks mostly random and scattershot, with a few striking exceptions like those which you have pointed out.”

Galileo: “Your point is well taken. All I can imagine now is that there must be some sort of ethereal force that links some stars together, gradually pushing them so that they end up in nearly straight lines.”

Eamadin: “Perhaps that will be the final answer in the end. Or perhaps we will discover that it is the whim of a capricious Creator with an unusual habit for placing unsolvable mysteries in our paths. I sometimes feel this way myself.”

Galileo: “I confess, I have felt the same at times. Well, Eamadin, although we have failed to find a satisfactory explanation for the moment, I feel much less confused about this matter. I must say, I find this method of reasoning by noticing similarities between features of our mind and features of the world quite intriguing. Have you a name for it?”

Eamadin: “In fact, I just thought of it on the spot! I suppose that it is quite generalizable… We come pre-loaded with a set of very salient and intuitive concepts, be they geometric, temporal, or logical. We should be surprised to find these concepts instantiated in the world, unless we know of some causal connection between the patterns in our mind and the patterns in reality. And by Eamadin’s rule of probability-updating, when we notice these similarities, we should increase our strength of belief in these possible causal connections. In the spirit of anachrony, let us refer to this as the Schelling point improbability principle!”

Galileo: “Sounds good to me! Thank you for your assistance, my friend. And now I must return to my exploration of the Cosmos.”