# What is integrated information?

Integrated information theory relates consciousness to degrees of integrated information within a physical system. I recently became interested in IIT and found it surprisingly hard to locate a good simple explanation of the actual mathematics of integrated information online.

Having eventually just read through all of the original papers introducing IIT, I discovered that integrated information is closely related to some of my favorite bits of mathematics, involving information theory and causal modeling.  This was exciting enough to me that I decided to write a guide to understanding integrated information. My goal in this post is to introduce a beginner to integrated information in a rigorous and (hopefully!) intuitive way.

I’ll describe it in increasing levels of complexity, so that even if you eventually get lost somewhere in the post, you’ll be able to walk away having learned something. If you get to the end of this post, you should be able to sit down with a pencil and paper and calculate the amount of integrated information in simple systems, as well as understand how to calculate it in principle for any system.

## Level 1

So first, integrated information is a measure of the degree to which the components of a system are working together to produce outputs.

A system composed of many individual parts that are not interacting with each other in any way is completely un-integrated – it has an integrated information ɸ = 0. On the other hand, a system composed entirely of parts that are tightly entangled with one another will have a high amount of integrated information, ɸ >> 0.

For example, consider a simple model of a camera sensor.

This sensor is composed of many independent parts functioning completely separately. Each pixel stores a unit of information about the outside world, regardless of what its neighboring pixels are doing. If we were to somehow sever the causal connections between the two halves of the sensor, each half would still capture and store information in exactly the same way.

Now compare this to a human brain.

The nervous system is a highly entangled mesh of neurons, each interacting with many many neighbors in functionally important ways. If we tried to cut the brain in half, severing all the causal connections between the two sides, we would get an enormous change in brain functioning.

Make sense? Okay, on to level 2.

## Level 2

So, integrated information has to do with the degree to which the components of a system are working together to produce outputs. Let’s delve a little deeper.

We just said that we can tell that the brain is integrating lots of information, because the functioning would be drastically disrupted if you cut it in half. A keen reader might have realized that the degree to which the functioning is disrupted will depend a lot on how you cut it in half.

For instance, cut off the front half of somebody’s brain, and you will end up with total dysfunction. But you can entirely remove somebody’s cerebellum (which holds the majority of the brain’s neurons), and end up with a person who has difficulty with coordination and is a slow learner, but is otherwise a pretty ordinary person.

What this is really telling us is that different parts of the brain are integrating information differently. So how do we quantify the total integration of information of the brain? Which cut do we choose when evaluating the decrease in functioning?

Simple: We look at every possible way of partitioning the brain into two parts. For each one, we see how much the brain’s functioning is affected. Then we locate the minimum information partition, that is, the partition that results in the smallest change in brain functioning. The change in functioning that results from this particular partition is the integrated information!

Okay. Now, what exactly do we mean by “changes to the system’s functioning”? How do we measure this?

Answer: The functionality of a system is defined by the way in which the current state of the system constrains the past and future states of the system.

To make full technical sense of this, we have to dive a little deeper.

## Level 3

How many possible states are there of a Connect Four board?

(I promise this is relevant)

The board is 6 by 7, and each spot can be either a red piece, a black piece, or empty.

So a simple upper bound on the number of total possible board states is 3^42 (of course, the actual number of possible states will be much lower than this, since some positions are impossible to get into).
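A quick sanity check of that bound in code (Python, used for illustration throughout):

```python
# 6 rows × 7 columns = 42 cells, each empty, red, or black
upper_bound = 3 ** 42
print(upper_bound)  # 109418989131512359209, about 1.1 × 10^20
```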

Now, consider what you know about the possible past and future states of the board if the board state is currently…

Clearly there’s only one possible past state:

And there are seven possible future states:

What this tells us is that the information about the current state of the board constrains the possible past and future states, selecting exactly one possible board out of the 3^42 possibilities for the past, and seven out of the 3^42 possibilities for the future.

More generally, for any given system S we have a probability distribution over past and future states, given that the current state is X.

Pfuture(X, S) = Pr( Future state of S | Present state of S is X )
Ppast(X, S) = Pr( Past state of S | Present state of S is X )

For any partition of the system into two components, S1 and S2, we can consider the future and past distributions given that the states of the components are, respectively, X1 and X2, where X = (X1, X2).

Pfuture(X, S1, S2) = Pr( Future state of S1 | Present state of S1 is X1 )･Pr( Future state of S2 | Present state of S2 is X2 )
Ppast(X, S1, S2) = Pr( Past state of S1 | Present state of S1 is X1 )･Pr( Past state of S2 | Present state of S2 is X2 )

Now, we just need to compare our distributions before the partition to our distributions after the partition. For this we need some type of distance function D that assesses how far apart two probability distributions are. Then we define the cause information and the effect information for the partition (S1, S2).

Cause information = D( Ppast(X, S), Ppast(X, S1, S2) )
Effect information = D( Pfuture(X, S), Pfuture(X, S1, S2) )

In short, the cause information is how much the distribution over past states changes when you partition the system into two separate components. And the effect information is the change in the distribution over future states when you partition the system.

The cause-effect information CEI is then defined as the minimum of the cause information CI and effect information EI.

CEI = min{ CI, EI }

We’ve almost made it all the way to our full definition of ɸ! Our last step is to calculate the CEI for every possible partition of S into two pieces, and then select the partition that minimizes CEI (the minimum information partition MIP).

The integrated information is just the cause effect information of the minimum information partition!

ɸ = CEI(MIP)

## Level 4

We’ve now semi-rigorously defined ɸ. But to really get a sense of how to calculate ɸ, we need to delve into causal diagrams. At this point, I’m going to assume familiarity with causal modeling. The basics are covered in a series of posts I wrote starting here.

Here’s a simple example system:

This diagram tells us that the system is composed of two variables, A and B. Each of these variables can take on the values 0 and 1. The system follows the following simple update rule:

A(t + 1) = A(t) XOR B(t)
B(t + 1) = A(t) AND B(t)

We can redraw this as a causal diagram from A and B at time 0 to A and B at time 1:

What this amounts to is the following system evolution rule:

ABt → ABt+1
00 → 00
01 → 10
10 → 10
11 → 01
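This evolution rule is quick to verify in a couple of lines of Python (where ^ is XOR and & is AND):

```python
def step(a, b):
    """One time step of the example system: A' = A XOR B, B' = A AND B."""
    return (a ^ b, a & b)

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    a1, b1 = step(a, b)
    print(f"{a}{b} -> {a1}{b1}")
# 00 -> 00, 01 -> 10, 10 -> 10, 11 -> 01
```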

Now, suppose that we know that the system is currently in the state AB = 00. What does this tell us about the future and past states of the system?

Well, since the system evolution is deterministic, we can say with certainty that the next state of the system will be 00. And since there’s only one way to end up in the state 00, we know that the past state of the system was also 00.

We can plot the probability distributions over the past and future states as follows:

This is not too interesting a distribution… no information is lost or gained going into the past or future. Now we partition the system:

The causal diagram, when cut, looks like:

Why do we have the two “noise” variables? Well, both A and B take two variables as inputs. Since one of these causal inputs has been cut off, we replace it with a random variable that’s equally likely to be a 0 or a 1. This procedure is called “noising” the causal connections across the partition.

According to this diagram, we now have two independent distributions over the two parts of the system, A and B. In particular, the distribution over the total future state of the system now factorizes:

P(A1, B1 | A0, B0) = P(A1 | A0) P(B1 | B0)

We can compute the two distributions P(A1 | A0) and P(B1 | B0) straightforwardly, by looking at how each variable evolves in our new causal diagram.

Future:
A0 = 0 ⇒ A1 = 0 or 1 (½ probability each)
B0 = 0 ⇒ B1 = 0

Past:
A0 = 0 ⇒ A-1 = 0 or 1 (½ probability each)
B0 = 0 ⇒ B-1 = 0 or 1 (probabilities ⅔ and ⅓)

This implies the following probability distribution for the partitioned system:

I recommend you go through and calculate this for yourself. Everything follows from the updating rules that define the system and the noise assumption.

Good! Now we have two distributions, one for the full system and one for the partitioned system. How do we measure the difference between these distributions?

There are a few possible measures we could use. My favorite of these is the Kullback-Leibler divergence DKL. Technically, this measure is only used in IIT 2.0, not IIT 3.0 (which uses the earth-mover’s distance). I prefer DKL, as it has a nice interpretation as the amount of information lost when the system is partitioned. I have a post describing DKL here.

Here’s the definition of DKL:

DKL(P, Q) = ∑ Pi log(Pi / Qi)
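In code, with distributions represented as plain dicts, DKL is a one-liner. Here's a sketch (log base 2, so the answer comes out in bits; the example pair is a certain state versus a 50/50 split, which is the kind of comparison the partition will produce):

```python
import math

def d_kl(p, q):
    """Kullback-Leibler divergence DKL(P, Q) in bits.
    Assumes q[s] > 0 wherever p[s] > 0."""
    return sum(p_s * math.log2(p_s / q[s]) for s, p_s in p.items() if p_s > 0)

p = {"00": 1.0, "10": 0.0}   # a point distribution
q = {"00": 0.5, "10": 0.5}   # a uniform distribution over the same support
print(d_kl(p, q))  # 1.0 (one bit lost)
```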

We can use this quantity to calculate the cause information and the effect information:

Cause information = log2(3) ≈ 1.6 bits
Effect information = log2(2) = 1 bit

These values tell us that our partition destroys about 0.6 more bits of information about the past than it does about the future. For the purpose of integrated information, we only care about the smaller of these two (for reasons that I don’t find entirely convincing).

Cause-effect information = min{ 1, 1.6 } = 1

Now, we’ve calculated the cause-effect information for this particular partition.

The integrated information is the cause-effect information of the minimum information partition. Since our system only has two components, the partition we’ve examined is the only possible partition, meaning that it must be the minimum information partition. And thus, we’ve calculated ɸ for our system!

ɸ = 1
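The whole Level 4 calculation can be checked numerically. Here's a minimal Python sketch, assuming (as above) a uniform prior over past states, per-wire noising of the cut connections, and KL divergence in bits:

```python
import math
from itertools import product

STATES = list(product([0, 1], repeat=2))  # all states (A, B)

def f(a, b):
    """Update rule: A' = A XOR B, B' = A AND B."""
    return (a ^ b, a & b)

def kl(p, q):
    """KL divergence in bits (q must be nonzero wherever p is)."""
    return sum(p[s] * math.log2(p[s] / q[s]) for s in STATES if p[s] > 0)

current = (0, 0)

# Whole-system distributions given the current state
p_future = {s: 1.0 if f(*current) == s else 0.0 for s in STATES}
preimage = [s for s in STATES if f(*s) == current]
p_past = {s: (1 / len(preimage) if s in preimage else 0.0) for s in STATES}

# Partitioned dynamics: the cross connection into each variable is
# replaced ("noised") by a uniform random bit
def a_next(a):  # P(A' | A) with B noised: A' = A XOR noise
    d = {0: 0.0, 1: 0.0}
    for noise in (0, 1):
        d[a ^ noise] += 0.5
    return d

def b_next(b):  # P(B' | B) with A noised: B' = noise AND B
    d = {0: 0.0, 1: 0.0}
    for noise in (0, 1):
        d[noise & b] += 0.5
    return d

def posterior(next_dist, observed):
    """P(past value | present value), via Bayes with a uniform prior."""
    w = {x: 0.5 * next_dist(x)[observed] for x in (0, 1)}
    z = sum(w.values())
    return {x: w[x] / z for x in (0, 1)}

a0, b0 = current
q_future = {(a, b): a_next(a0)[a] * b_next(b0)[b] for a, b in STATES}
post_a, post_b = posterior(a_next, a0), posterior(b_next, b0)
q_past = {(a, b): post_a[a] * post_b[b] for a, b in STATES}

effect_info = kl(p_future, q_future)  # 1 bit
cause_info = kl(p_past, q_past)       # log2(3) ≈ 1.585 bits
phi = min(cause_info, effect_info)    # 1 bit
print(cause_info, effect_info, phi)
```

Going through this by hand first, as suggested above, makes the code easy to follow.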

## Level 5

Let’s now define ɸ in full generality.

Our system S consists of a vector of N variables X = (X1, X2, X3, …, XN), each an element of some space 𝒳. Our system also has an updating rule, which is a function f: 𝒳^N → 𝒳^N. In our previous example, 𝒳 = {0, 1}, N = 2, and f(x, y) = (x XOR y, x AND y).

More generally, our updating rule f can map each state to a probability distribution over states, f: 𝒳^N → 𝒫(𝒳^N). We’ll denote P(Xt+1 | Xt) as the distribution over the possible future states, given the current state. P is defined by our updating rule: P(Xt+1 | Xt) = f(Xt). The distribution over possible past states will be denoted P(Xt-1 | Xt). We’ll obtain this using Bayes’ rule: P(Xt-1 | Xt) = P(Xt | Xt-1) P(Xt-1) / P(Xt), where P(Xt | Xt-1) is the probability that the distribution f(Xt-1) assigns to the state Xt.

A partition of the system is specified by a subset of {1, 2, 3, …, N}, which we’ll label A. We define B = {1, 2, 3, …, N} \ A. Now we can define XA = (Xa)a∈A and XB = (Xb)b∈B. Loosely speaking, we can say that X = (XA, XB), i.e. that the total state is just the combination of the states of the two parts A and B.

We now define the distributions over future and past states in our partitioned system:

Q(Xt+1 | Xt) = P(XA, t+1 | XA, t) P(XB, t+1 | XB, t)
Q(Xt-1 | Xt) = P(XA, t-1 | XA, t) P(XB, t-1 | XB, t).

The effect information EI of the partition defined by A is the distance between P(Xt+1 | Xt) and Q(Xt+1 | Xt), and the cause information CI is defined similarly. The cause-effect information is defined as the minimum of these two.

CI(f, A, Xt) = D( P(Xt-1 | Xt), Q(Xt-1 | Xt) )
EI(f, A, Xt) = D( P(Xt+1 | Xt), Q(Xt+1 | Xt) )

CEI(f, A, Xt) = min{ CI(f, A, Xt), EI(f, A, Xt) }

And finally, we define the minimum information partition (MIP) and the integrated information:

MIP = argminA CEI(f, A, Xt)
ɸ(f, Xt) = minA CEI(f, A, Xt)
= CEI(f, MIP, Xt)
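For small systems, this definition can be implemented by brute force. The sketch below assumes a deterministic updating rule over n binary variables, a uniform prior over past states, KL divergence in bits as the distance D, and implements noising by averaging the rule over uniform settings of the variables across the cut:

```python
import math
from itertools import product

def integrated_information(f, x):
    """Brute-force ɸ(f, x) for a deterministic rule f on n binary variables.
    Assumptions: uniform prior over past states, KL divergence in bits,
    cut connections noised by averaging over the complementary variables.
    Assumes the current state x is reachable under f."""
    n = len(x)
    states = list(product([0, 1], repeat=n))

    def kl(p, q):
        return sum(pv * math.log2(pv / q[s]) for s, pv in p.items() if pv > 0)

    # Whole-system future and past distributions given current state x
    p_future = {s: 1.0 if f(x) == s else 0.0 for s in states}
    pre = [s for s in states if f(s) == x]
    p_past = {s: (1 / len(pre) if s in pre else 0.0) for s in states}

    def part_transition(part):
        """P(X_part(t+1) | X_part(t)), with all other variables noised."""
        table = {}
        for s in states:
            xa = tuple(s[i] for i in part)
            ya = tuple(f(s)[i] for i in part)
            table.setdefault(xa, {}).setdefault(ya, 0.0)
            table[xa][ya] += 1.0
        for xa, d in table.items():
            z = sum(d.values())
            for ya in d:
                d[ya] /= z
        return table

    best = math.inf
    # Enumerate bipartitions {A, B}: odd masks are subsets containing
    # variable 0; the range excludes the trivial full set
    for mask in range(1, 2 ** n - 1, 2):
        A = [i for i in range(n) if mask >> i & 1]
        B = [i for i in range(n) if not mask >> i & 1]
        dists = []
        for part in (A, B):
            t = part_transition(part)
            xp = tuple(x[i] for i in part)
            fut = t[xp]  # future marginal for this part
            # Bayes posterior over this part's past, uniform prior
            w = {xa: t[xa].get(xp, 0.0) for xa in t}
            z = sum(w.values())
            past = {xa: w[xa] / z for xa in w}
            dists.append((fut, past))
        (fut_a, past_a), (fut_b, past_b) = dists

        def joint(da, db):
            return {s: da.get(tuple(s[i] for i in A), 0.0)
                       * db.get(tuple(s[i] for i in B), 0.0) for s in states}

        ei = kl(p_future, joint(fut_a, fut_b))
        ci = kl(p_past, joint(past_a, past_b))
        best = min(best, min(ci, ei))  # CEI for this partition
    return best

# The Level 4 example: ɸ = 1 bit at state (0, 0)
rule = lambda s: (s[0] ^ s[1], s[0] & s[1])
print(integrated_information(rule, (0, 0)))
```

This recovers the hand calculation from Level 4 and works unchanged for any small deterministic binary system.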

And we’re done!

Notice that our final result is a function of f (the updating function) as well as the current state of the system. What this means is that the integrated information of a system can change from moment to moment, even if the organization of the system remains the same.

By itself, this is not enough for the purposes of integrated information theory. Integrated information theory uses ɸ to define gradations of consciousness of systems, but the relationship between ɸ and consciousness isn’t exactly one-to-one (briefly, consciousness resides in non-overlapping local maxima of integrated information).

But this post is really meant to just be about integrated information, and the connections to the theory of consciousness are actually less interesting to me. So for now I’ll stop here! 🙂

# How to sort a list of numbers

It’s funny how some of the simplest questions you can think of actually have interesting and nontrivial answers.

Take this question, for instance. If I hand you a list of N numbers, how can you quickly sort this list from smallest to largest? Suppose that you are only capable of sorting two numbers at a time.

You might immediately think of some obvious ways to solve this problem. For instance, you could just start at the beginning of the list and repeatedly traverse it, swapping numbers if they’re in the wrong order. This will guarantee that each pass-through, the largest number out of those not yet sorted will “bubble up” to the top. This algorithm is called Bubble Sort.

The problem is that for a sizable list, this will take a long long time. On average, for a list of size N, Bubble Sort takes O(N^2) steps to finish sorting. Can you think of a faster algorithm?
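The pass-through-and-swap procedure described above can be sketched in a few lines of Python:

```python
def bubble_sort(xs):
    """Repeatedly traverse the list, swapping adjacent out-of-order pairs.
    Each pass bubbles the largest unsorted value to the top: O(N^2) on average."""
    xs = list(xs)  # sort a copy
    for end in range(len(xs) - 1, 0, -1):
        swapped = False
        for i in range(end):
            if xs[i] > xs[i + 1]:
                xs[i], xs[i + 1] = xs[i + 1], xs[i]
                swapped = True
        if not swapped:  # no swaps this pass: list is already sorted
            break
    return xs

print(bubble_sort([5, 1, 4, 2, 8]))  # [1, 2, 4, 5, 8]
```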

It turns out that we can go much faster with some clever algorithms – from O(N^2) to O(N log N). If you have a little time to burn, here are some great basic videos describing and visualizing different sorting techniques:

## Visualizations

Beautiful visualization of Bubble, Shell, and Quick Sort

(I’d be interested to know why the visualization of Bubble Sort in the final frame gave a curve of unsorted values that looked roughly logarithmic)

Visualization of 9 sorting algorithms

Auditory presentation of 15 sorting algorithms

## Bonus clip…

Cool proof of the equivalence of Selection Sort and Insertion Sort

## Searching a list

By the way, how about if you have a sorted list and want to find a particular value in it? If we’re being dumb, we could just ignore the fact that our list came pre-sorted and search through every element on the list in order. We’d find our value in O(N). We can go much faster by just starting in the middle of our list, looking at the size of the middle value to determine which half of the list to look at next, and then repeating with the appropriate half of the list. This is called binary search and takes O(log N), a massive speedup.
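The halving procedure looks like this in Python (returning -1 when the value is absent, a common convention):

```python
def binary_search(sorted_xs, target):
    """Find an index of target in a sorted list in O(log N) comparisons."""
    lo, hi = 0, len(sorted_xs) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_xs[mid] == target:
            return mid
        elif sorted_xs[mid] < target:
            lo = mid + 1   # target can only be in the upper half
        else:
            hi = mid - 1   # target can only be in the lower half
    return -1

print(binary_search([2, 3, 5, 7, 11, 13], 7))  # 3
```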

Now, let’s look back at the problem of finding a particular value in an unsorted list. Can we think of any techniques to accomplish this more quickly than O(N)?

No. Try as you might, as long as the list has no structure for you to exploit, your optimal performance will be limited to O(N).

Or so everybody thought until quantum mechanics showed up and blew our minds.

It turns out that a quantum computer would be able to solve this exact problem – searching through an unsorted list – in only O(√N) steps!

This should seem impossible to you. If a friend hands you a random jumbled list of 10,000 names and asks you to locate one particular name on the list, you’re going to end up looking through 5,000 names on average. There’s no clever trick you can use to speed up the process. Except that quantum mechanics says you’re wrong! In fact, if you were a quantum computer, you’d only have to search through ~100 names to find your name.

This quantum algorithm is called Grover’s algorithm, and I want to write up a future post about it. But that’s not even the coolest part! It turns out that if a non-local hidden variable interpretation of quantum mechanics is correct, then a quantum computer could search an N-item database in O(∛N)! You can imagine the triumphal feeling of the hidden variables people if one day they managed to build a quantum computer that simultaneously proved them right and broke the record for search algorithm speed!

# Defining racism

How would you define racism?

I’ve been thinking about this lately in light of some of the scandal around research into race and IQ. It’s a harder question than I initially thought; many of the definitions that pop to mind end up being either too strong or too weak. The term also functions differently in different contexts (e.g. personal racism, institutional racism, racist policies). In this post, I’m specifically talking about personal racism – that term we use to refer to the beliefs and attitudes of those like Nazis or Ku Klux Klan members (at the extreme end).

I’m going to walk through a few possible definitions. This will be fairly stream-of-consciousness, so I apologize if it’s not incredibly profound or well-structured.

Definition 1 Racism is the belief in the existence of inherent differences between the races.

‘Inherent’ is important, because we don’t want to say that somebody is racist for acknowledging differences that can ultimately be traced back to causes like societal oppression. The problem with this definition is that, well, there are inherent differences between the races.

The Chinese are significantly shorter than the Dutch. Raising a Chinese person in a Dutch household won’t do much to equalize this difference. What’s important, it seems, is not the belief in the existence of inherent differences, but instead the belief in the existence of inherent inferiorities and superiorities. So let’s try again.

Definition 2 Racism is the belief in the existence of inherent racial differences that are normatively significant.

This is pretty much the dictionary definition of the term “racism”. While it’s better, there are still some serious problems. Suppose it were discovered that Slavs are inherently more prone to violence than, say, Arabs. And suppose that somebody ran across this fact, and that this person also held the ethical view that violent tendencies are normatively important. That is, they think that peaceful people are ethically superior to violent people.

If they combine this factual belief with this seemingly reasonable normative belief, they’ll end up being branded as a racist, by our second definition. This is clearly undesirable… given that the word ‘racism’ is highly normatively loaded, we don’t want it to be the case that somebody is racist for believing true things. In other words, we probably don’t want our definition of racism to ever allow it to be the right attitude to take, or even a reasonable attitude to take.

Maybe the missing step is the generalization of attitudes about Slavs and Arabs to individuals. This is a sentiment that I’ve heard fairly often… racism is about applying generalizations about groups to individuals (for instance, racial profiling). Let’s formalize this:

Definition 3 Racism is about forming normative judgments about individuals’ characteristics on the basis of beliefs about normative group-level differences.

This sounds nice and all, but… you know what another term for “applying facts about groups to individuals” is? Good statistical reasoning.

If you live in a town composed of two distinct populations, the Hebbeberans and the Klabaskians, and you know that Klabaskians are on average twenty times more likely than Hebbeberans to be fatally allergic to cod, then you should be more cautious with serving your extra special cod sandwich to a Klabaskian friend than to a Hebbeberan.

Facts about populations do give you evidence about individuals within those populations, and the mere acknowledgement of this evidence is not racist, for the same reason that rationality is not racist.

So if we don’t want to call rationality racist, then maybe our way out of this is to identify racism with irrationality.

Definition 4 Racism is the holding of irrational beliefs about normative racial differences.

Say you meet somebody from Malawi (a region with an extremely low average IQ). Your first rational instinct might be to not expect too much from them in the way of cognitive abilities. But now you learn that they’re a theoretical physicist who’s recently been nominated for a Nobel prize for their work in quantum information theory. If the average IQ of Malawians is still factoring in at all to your belief about this person’s intelligence, then you’re being racist.

I like this definition a lot better than our previous ones. It combines the belief in racial superiority with irrationality. On the other hand, it has problems as well. One major issue is that there are plenty of cases of benign irrationality, where somebody is just a bad statistical reasoner, but not motivated by any racial hatred. Maybe they over-updated on some piece of information, because they failed to take into account an important base-rate.

Well, the base-rate fallacy is one of the most common cognitive biases out there. Surely this isn’t enough to make them a racist? What we want is to capture the non-benign brand of irrational normative beliefs about race – those that are motivated by hatred or prejudice.

Definition 5 Racism is the holding of irrational normative beliefs about racial differences, motivated by racial hatred or prejudice.

I think this does the best at avoiding making the category too large, but it may be too strong and keep out some plausible cases of racism. I’d like to hear suggestions for improvements on this definition, but for now I’ll leave it there. One potential take-away is that the word ‘racism’ is a nasty combination of highly negatively charged and ambiguous, and that such words are best treated with caution, especially when applying them to edge cases.

I’m starting to get a sense of why people like David Chalmers and Daniel Dennett call consciousness the most mysterious thing known to humans. I’m currently just really confused, and think that pretty much every position available with respect to consciousness is deeply unsatisfactory. In this post, I’ll just walk through my recent thinking.

## Against physicalism

In a previous post, I imagined a scientist from the future who told you they had a perfected theory of consciousness, and asked how we could ask for evidence confirming this. This theory of consciousness could presumably be thought of as a complete mapping from physical states to conscious states – a set of psychophysical laws. Questions about the nature of consciousness are then questions about the nature of these laws. Are they ultimately the same kind of laws as chemical laws (derivable in principle from the underlying physics)? Or are they logically distinct laws that must be separately listed on the catalogue of the fundamental facts about the universe?

I take physicalism to be the stance that answers ‘yes’ to the first question and ‘no’ to the second. Dualism and epiphenomenalism answer ‘no’ to the first and ‘yes’ to the second, and are distinguished by the character of the causal relationships between the physical and the conscious entailed by the psychophysical laws.

So, is physicalism right? Imagining that we had a perfect mapping from physical states to conscious states, would this mapping be in principle derivable from the Schrodinger equation? I think the answer to this has to be no; whatever the psychophysical laws are, they are not going to be in principle derivable from physics.

To see why, let’s examine what it looks like when we derive macroscopic laws from microscopic laws. Luckily, we have a few case studies of successful reduction. For instance, you can start with just the Schrodinger equation and derive the structure of the periodic table. In other words, the structure and functioning of atoms and molecules naturally pops out when you solve the equation for systems of many particles.

You can extrapolate this further to larger scale systems. When we solve the Schrodinger equation for large systems of biomolecules, we get things like enzymes and cell membranes and RNA, and all of the structure and functioning corresponding to our laws of biology. And extending this further, we should expect that all of our behavior and talk about consciousness will be ultimately fully accounted for in terms of purely physical facts about the structure of our brain.

The problem is that consciousness is something more than just the words we say when talking about consciousness. While it’s correlated in very particular ways with our behavior (the structure and functioning of our bodies), it is by its very nature logically distinct from these. You can tell me all about the structure and functioning of a physical system, but the question of whether or not it is conscious is a further fact that is not logically entailed. The phrase LOGICALLY entailed is very important here – it may be that as a matter of fact, it is a contingent truth of our universe that conscious facts always correspond to specific physical facts. But this is certainly not a relationship of logical entailment, in the sense that the periodic table is logically entailed by quantum mechanics.

In summary, it looks like we have a problem on our hands if we want to try to derive facts about consciousness from facts about fundamental physics. Namely, the types of things we can derive from something like the Schrodinger equation are facts about complex macroscopic structure and functioning. This is all well and good for deriving chemistry or solid-state physics from quantum mechanics, as these fields are just collections of facts about structure and functioning. But consciousness is an intrinsic property that is logically distinct from properties like macroscopic structure and functioning. You simply cannot expect to start with the Schrodinger equation and naturally arrive at statements like “X is experiencing red” or “Y is feeling sad”, since these are not purely behavioral statements.

Here’s a concise rephrasing of the argument I’ve made, in terms of a trilemma. Out of the following three postulates, you cannot consistently accept all three:

1. There are facts about consciousness.
2. Facts about consciousness are not logically entailed by the Schrodinger equation (substitute in whatever the fundamental laws of physics end up being).
3. Physicalism is true.

Denying (1) makes you an eliminativist. Presumably this is out of the question; consciousness is the only thing in the universe that we can know with certainty exists, as it is the only thing that we have direct first-person access to. Indeed, all the rest of our knowledge comes to us by means of our conscious experience, making it in some sense the root of all of our knowledge. The only charitable interpretations I have of eliminativism involve semantic arguments subtly redefining what we mean by “consciousness” away from “that thing which we all know exists from first-hand experience” to something whose existence can actually be cast doubt on.

Denying (2) seems really implausible to me for the considerations given above.

So denying (3) looks like our only way out.

Okay, so let’s suppose physicalism is wrong. This is already super important. If we accept this argument, then we have a worldview in which consciousness is of fundamental importance to the nature of reality. The list of fundamental facts about the universe will be (1) the laws of physics and (2) the laws of consciousness. This is really surprising for anybody like me that professes a secular worldview that places human beings far from the center of importance in the universe.

But “what about naturalism?” is not the only objection to this position. There’s a much more powerful argument.

## Against non-physicalism

Suppose we now think that the fundamental facts about the universe fall into two categories: P (the fundamental laws of physics, plus the initial conditions of the universe) and Q (the facts about consciousness). We’ve already denied that P = Q or that there is a logical entailment relationship from P to Q.

Now we can ask about the causal nature of the psychophysical laws. Does P cause Q? Does Q cause P? Does the causation go both ways?

First, conditional on the falsity of physicalism, we can quickly rule out theories that claim that Q causes P (i.e. dualist theories). This is the old Cartesian picture that is unsatisfactory exactly because of the strength of the physical laws we’ve discovered. In short, physics appears to be causally complete. If you fix the structure and functioning on the microscopic level, then you fix the structure and functioning on the macroscopic level. In the language of philosophy, macroscopic physical facts supervene upon microscopic physical facts.

But now we have a problem. If all of our behavior and functioning is fully causally accounted for by physical facts, then what is there for Q (consciousness) to play a causal role in? Precisely nothing!

We can phrase this in the following trilemma (again, all three of these cannot be simultaneously true):

1. Physicalism is false.
2. Physics is causally closed.
3. Consciousness has a causal influence on the physical world.

Okay, so now we have ruled out any theories in which Q causes P. But now we reach a new and even more damning conclusion. Namely, if facts about consciousness have literally no causal influence on any aspect of the physical world, then they have no causal influence, in particular, on your thoughts and beliefs about your consciousness.

Stop to consider for a moment the implications of this. We take for granted that we are able to form accurate beliefs about our own conscious experiences. When we are experiencing red, we are able to reliably produce accurate beliefs of the form “I am experiencing red.” But if the causal relationship goes from P to Q, then this becomes extremely hard to account for.

What would we expect to happen if our self-reports of our consciousness fell out of line with our actual consciousness? Suppose that you suddenly noticed yourself verbalizing “I’m really having a great time!” when you actually felt like you were in deep pain and discomfort. Presumably the immediate response you would have would be confusion, dismay, and horror. But wait! All of these experiences must be encoded in your brain state! In other words, to experience horror at the misalignment of your reports about your consciousness and your actual consciousness, it would have to be the case that your physical brain state would change in a particular way. And a necessary component of the explanation for this change would be the actual state of your consciousness!

This really gets to the heart of the weirdness of epiphenomenalism (the view that P causes Q, but Q doesn’t causally influence P). If you’re an epiphenomenalist, then all of your beliefs and speculations about consciousness are formed exactly as they would be if your conscious state were totally different. The exact same physical state of you thinking “Hey, this coffee cake tastes delicious!” would arise even if the coffee cake actually tasted like absolute shit.

This is completely absurd. On the epiphenomenalist view, any correlation between the beliefs you form about consciousness and the actual facts about your conscious state couldn’t possibly be explained by the actual facts about your consciousness. So they must be purely coincidental.

In other words, the following two statements cannot be simultaneously accepted:

• Consciousness does not causally influence our behavior.
• Our beliefs about our conscious states are more accurate than random guessing.

## So where does that leave us?

It leaves us in a very uncomfortable place. First of all, we should deny physicalism. But the denial of physicalism leaves us with two choices: either Q causes P or it does not.

We should deny the first, because otherwise we are accepting the causal incompleteness of physics.

And we should deny the second, because it leads us to conclude that essentially all of our beliefs about our conscious experiences are almost certainly wrong, undermining all of our reasoning that led us here in the first place.

So here’s a summary of this entire post so far. It appears that the following four statements cannot all be simultaneously true. You must pick at least one to reject.

1. There are facts about consciousness.
2. Facts about consciousness are not logically entailed by the Schrödinger equation (substitute in whatever the fundamental laws of physics end up being).
3. Physics is causally closed.
4. Our beliefs about our conscious states are more accurate than random guessing.

Eliminativists deny (1).

Physicalists deny (2).

Dualists deny (3).

And epiphenomenalists must deny (4).

I find that the easiest to deny of these four is (2). This makes me a physicalist, but not because I think that physicalism is such a great philosophical position that everybody should hold. I’m a physicalist because it seems like the least horrible of all the horrible positions available to me.

## Counters and counters to those counters

A response that I would have once given when confronted by these issues would be along the lines of: “Look, clearly consciousness is just a super confusing topic. Most likely, we’re just thinking wrong about the whole issue and shouldn’t be taking the notion of consciousness so seriously.”

Part of this is right. Namely, consciousness is a super confusing topic. But it’s important to clearly delineate between which parts of consciousness are confusing and which parts are not. I’m super confused about how to make sense of the existence of consciousness, how to fit consciousness into my model of reality, and how to formalize my intuitions about the nature of consciousness. But I’m definitively not confused about the existence of consciousness itself. Clearly consciousness, in the sense of direct first-person experience, exists, and is a property that I have. The confusion arises when we try to interpret this phenomenon.

In addition, “X is super confusing” might be a true statement and a useful acknowledgment, but it doesn’t necessarily push us in one direction over another when considering alternative viewpoints on X. So “X is super confusing” isn’t evidence for “We should be eliminativists about X” over “We should be realists about X.” All it does is suggest that something about our model of reality needs fixing, without pointing to which particular component it is that needs fixing.

One more type of argument that I’ve heard (and maybe made in the past, to my shame) is a “scientific optimism” style of argument. It goes:

Look, science is always confronted with seemingly unsolvable mysteries.  Brilliant scientists in each generation throw their hands up in bewilderment and proclaim the eternal unsolvability of the deep mystery of their time. But then a few generations later, scientists end up finding a solution, and putting to shame all those past scientists that doubted the power of their art.

Consciousness is just this generation’s “great mystery.” Those who proclaim that science can never explain the conscious in terms of the physical are wrong, just as Lord Kelvin was wrong when he affirmed that the behavior of living organisms could not be explained in terms of purely physical forces, and required a mysterious extra element (the ‘vital principle’, as he termed it).

I think that as a general heuristic, “Science is super powerful and we should be cautious before proclaiming the existence of specific limits on the potential of scientific inquiry” is pretty damn good.

But at the same time, I think that there are genuinely good reasons, reasons that science skeptics in the past didn’t have, for affirming the uniqueness of consciousness in this regard.

Lord Kelvin was claiming that there were physical behaviors that could not be explained by appeal to purely physical forces. The claim about consciousness is very different: that there are phenomena that are not even logically reducible to the structural properties of matter, and so cannot be explained by purely physical forces in principle. This difference, it seems to me, is extremely significant, and gets straight to the crux of the central mystery of consciousness.

# Getting evidence for a theory of consciousness

I’ve been reading about the integrated information theory of consciousness lately, and wondering about the following question. In general, what are the sources of evidence we have for a theory of consciousness?

One way to think about this is to imagine yourself teleported hundreds of years into the future and talking to a scientist in this future world. This scientist tells you that in his time, consciousness is fully understood. What sort of experiments would you expect to be able to run to verify for yourself that the future’s theory of consciousness really is sufficient?

One thing you could do is point to a bunch of different physical systems, ask the scientist what his theory of consciousness says about them, and compare the answers to your intuitions. So, for instance, does the theory say that you are conscious? What about humans in general? What about people in deep sleep? How about dogs? Chickens? Frogs? Insects? Bacteria? Are Siri-style computer programs conscious? What about a rock? And so on.

The obvious problem with this is that it assumes the validity of your intuitions about consciousness. Sure, it seems obvious that a rock is not conscious, that humans generally are, and that dogs are conscious but less so than humans. But how do we know that these intuitions are trustworthy?

I think the validity of these intuitions is necessarily grounded in our phenomenology and our observations of how it correlates with our physical substance. So, for instance, I notice that when I fall asleep, my consciousness fades in and out. On the other hand, when I wiggle my big toe, this has an effect on the character of my conscious experience, but doesn’t shut it off entirely. This tells me that something about what happens to my body when I fall asleep is relevant to the maintenance of my consciousness, while the angle of my big toe is not.

In general, we make many observations like these and piece together a general theory of how consciousness relates to the physical world, not just in terms of the existence of consciousness, but also in terms of what specific conscious experiences we expect for a given change to our physical system. It tells us, for instance, that receiving a knock on the head or drinking too much alcohol is sometimes sufficient to temporarily suspend consciousness, while breaking a finger or cutting your hair is not.

Now, since we are able to intervene on our physical body at will and observe the results, our model is a causal model. An implication of this is that it should be able to handle counterfactuals. So, for instance, it can give us an answer to the question “Would I still be conscious if I cut my hair off, changed my skin color, shrunk several inches in height, and got a smaller nose?” This answer is presumably yes, because our theory distinguishes between physical features that are relevant to the existence of consciousness and those that are not.

Extending this further, we can ask if we would still be conscious if we gradually morphed into another human being, with a different brain and body. Again, the answer would appear to be yes, as long as nothing essential to the existence of consciousness is severed along the way. But now we are in a position to be able to make inferences about the existence of consciousness in bodies outside our own! For if I think that I would be conscious if I slowly morphed into my boyfriend, then I should also believe that my boyfriend is conscious himself. I could deny this by denying that the same physical states give rise to the same conscious states, but while this is logically possible, it seems quite implausible.

This gives rational grounds for our belief in the existence of consciousness in other humans, and allows us justified access to all of the work in neuroscience analyzing the connection between the brain and consciousness. It also allows us to have a baseline level of trust in the self-reports of other people about their conscious experiences, given the observation that we are generally reliable reporters of our conscious experience.

Bringing this back to our scientist from the future, I can think of some much more convincing tests I would do than the ‘tests of intuition’ that we did at first. Namely, suppose that the scientist was able to take any description of an experience, translate that into a brain state, and then stimulate your brain in such a way as to produce that experience for you. So over and over you submit requests – “Give me a new color experience that I’ve never had before, but that feels vaguely pinkish and bluish, with a high pitch whine in the background”, “Produce in me an emotional state of exaltation, along with the sensation of warm wind rushing through my hair and a feeling of motion”, etc – and over and over the scientist is able to excellently match your request. (Also, wow imagine how damn cool this would be if we could actually do this.)

You can also run the inverse test: you tell the scientist the details of an experience you are having while your brain is being scanned (in such a way that the scientist cannot see it). Then the scientist runs some calculations using their theory of consciousness and makes some predictions about what they’ll see on the brain scan. Now you check the brain scan to see if their predictions have come true.

To me, repeated success in experiments of this kind would be supremely convincing. If a scientist of the future was able to produce at will any experience I asked for (presuming my requests weren’t so far out as to be physically impossible), and was able to accurately translate facts about my consciousness into facts about my brain, and could demonstrate this over and over again, I would be convinced that this scientist really does have a working theory of consciousness.

And note that since this is all rooted in phenomenology, it’s entirely uncoupled from our intuitive convictions about consciousness! It could turn out that the exact framework the scientist is using to calculate the connections between my physical body and my consciousness ends up necessarily entailing that rocks are conscious and that dolphins are not. And if the framework’s predictive success had been demonstrated with sufficient robustness before, I would just have to accept this conclusion as unintuitive but true. (Of course, it would be really hard to imagine how any good theory of consciousness could end up coming to this conclusion, but that’s beside the point.)

So one powerful source of evidence we have for testing a theory of consciousness is the correlations between our physical substance and our phenomenology. Is that all, or are there other sources of evidence out there?

We can straightforwardly adopt some principles from the philosophy of science, such as the importance of simplicity and avoiding overfitting in formulating our theories. So for instance, one theory of consciousness might just be an exhaustive list of every physical state of the brain and what conscious experience this corresponds to. In other words, we could imagine a theory in which all of the basic phenomenological facts of consciousness are taken as individual independent axioms. While this theory will be fantastically accurate, it will be totally worthless to us, and we’d have no reason to trust its predictive validity.

So far, we really just have three criteria for evidence:

1. Correlations between phenomenology and physics
2. Simplicity
3. Avoiding overfitting

As far as I’m concerned, this is all that I’m really comfortable with counting as valid evidence. But these are very much not the only sources of evidence that get referenced in the philosophical literature. There are a lot of arguments that get thrown around concerning the nature of consciousness that I find really hard to classify neatly, although often these arguments feel very intuitively appealing. For instance, one of my favorite arguments for functionalism is David Chalmers’ ‘Fading Qualia’ argument. It goes something like this:

Imagine that scientists of the future are able to produce silicon chips that are functionally identical to neurons and can replicate all of their relevant biological activity. Now suppose that you undergo an operation in which gradually, every single part of your nervous system is substituted out for silicon. If the biological substrate implementing the functional relationships is essential to consciousness, then by the end of this procedure you will no longer be conscious.

But now we ask: when did the consciousness fade out? Was it a sudden or a gradual process? Both seem deeply implausible. Firstly, we shouldn’t expect a sudden drop-out of consciousness from the removal of a single neuron or cluster of neurons, as this would be a highly unusual level of discreteness. This would also imply the ability to switch on and off the entirety of your consciousness with seemingly insignificant changes to the biological structure of your nervous system.

And secondly, if it is a gradual process, then this implies the existence of “pseudo-conscious” states in the middle of the procedure, where your experiences are markedly distinct from those of the original being but you are pretty much always wrong about your own experiences. Why? Well, the functional relationships have stayed the same! So your beliefs about your conscious states, the memories you form, the emotional reactions you have, will all be exactly as if there has been no change to your conscious states. This seems totally bizarre and, in Chalmers’ words, “we have little reason to believe that consciousness is such an ill-behaved phenomenon.”

Now, this is a fairly convincing argument to me. But I have a hard time understanding why it should be. The argument’s convincingness seems to rely on some very high-level abstract intuitions about the types of conscious experiences we imagine organisms could be having, and I can’t think of a great reason for trusting these intuitions. Maybe we could chalk it up to simplicity, and argue that the notion of consciousness entailed by substrate-dependence must be extremely unparsimonious. But even this connection is not totally clear to me.

A lot of the philosophical argumentation about consciousness feels this way to me; convincing and interesting, but hard to make sense of as genuine evidence.

One final style of argument that I’m deeply skeptical of is arguments from pure phenomenology. This is, for instance, how Giulio Tononi likes to argue for his integrated information theory of consciousness. He starts from five supposedly self-evident truths about the character of conscious experience, then attempts to infer facts about the structure of the physical systems that could produce such experiences.

I’m not a big fan of Tononi’s observations about the character of consciousness. They seem really vaguely worded and hard enough to make sense of that I have no idea if they’re true, let alone self-evident. But it is his second move that I’m deeply skeptical of. The history of philosophers trying to move from “self-evident intuitive truths” to “objective facts about reality” is pretty bad. While we might be plenty good at detailing our conscious experiences, trying to make the inferential leap to the nature of the connection between physics and consciousness is not something you can do just by looking at phenomenology.

# The Scourge of Our Time

Human life must be respected and protected absolutely from the moment of conception. From the first moment of his existence, a human being must be recognized as having the rights of a person – among which is the inviolable right of every innocent being to life.

Since it must be treated from conception as a person, the embryo must be defended in its integrity, cared for, and healed, as far as possible, like any other human being.

Catechism of the Catholic Church, #2270, 2274

In this paper, Toby Ord advances a strong reductio ad absurdum of the standard pro-life position that life begins at conception. I’ve heard versions of this argument before, but hadn’t seen it laid out so clearly.

Here’s the argument:

1. The majority (~62%) of embryos die within a few weeks of conception (mostly from failure to implant in the lining of the uterine wall). A mother of three children can be expected to also have had about five spontaneous abortions.

2. The Catholic Church holds that an embryo at conception has the same moral worth as a developed human. On this view, more than 60% of the world population dies in its first month of life, making spontaneous abortion more deadly than anything else in human history. Saving even 5% of embryos would save more lives than a cure for cancer.

3. Given the 200 million lives per year at stake, those that think life begins at conception should be directing massive amounts of resources towards ending spontaneous abortion and see it as the Scourge of our time.
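The “five spontaneous abortions” figure in premise 1 is a quick expected-value calculation. Here’s a sketch, using the ~62% mortality figure quoted above (i.e. a ~38% survival rate, treating each conception as independent):

```python
# Expected spontaneous abortions for a mother of three, assuming ~62% of
# embryos die shortly after conception (so each conception has a ~38%
# chance of producing a live birth).
survival_rate = 0.38
children = 3

# Expected number of conceptions needed to produce `children` live births
expected_conceptions = children / survival_rate   # ~7.9

# The rest are expected spontaneous abortions
expected_losses = expected_conceptions - children  # ~4.9, i.e. about five

print(f"{expected_losses:.1f}")
```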

Here are two graphs of the US survival curve: first, as we ordinarily see it, and second, as the pro-lifer is obligated to see it:

This is of course a really hard bullet for the pro-life camp to bite. If you’re like me, you see spontaneous abortions as morally neutral. Most of the time they happen before a pregnancy has even been detected, leaving the mother unaware that anything happened. It’s hard, then, to see a moral distinction between the enormous number of spontaneous abortions occurring naturally and the comparatively minuscule number of intentional abortions.

I have previously had mixed feelings about abortion (after all, if our moral decision making ultimately comes down to maximizing some complicated expected value, it will likely be blind to whether something is a real living being or just a “potential” living being), but this argument pretty much clinches the deal for me.

# The problem with philosophy

(Epistemic status: I have a high credence that I’m going to disagree with large parts of this in the future, but it all seems right to me at present. I know that’s non-Bayesian, but it’s still true.)

Philosophy is great. Some of the clearest thinkers and most rational people I know come out of philosophy, and many of my biggest worldview-changing moments have come directly from philosophers. So why is it that so many scientists seem to feel contempt towards philosophers and condescension towards their intellectual domain? I can actually occasionally relate to the irritation, and I think I understand where some of it comes from.

Every so often, a domain of thought within philosophy breaks off from the rest of philosophy and enters the sciences. Usually when this occurs, the subfield (which had previously been stagnant and unsuccessful in its attempts to make progress) is swiftly revolutionized and most of the previous problems in the field are promptly solved.

Unfortunately, what often happens is that the philosophers who were previously working in the field are unaware of or ignore the change in their field, and end up wasting a lot of time and looking pretty silly. Sometimes they even explicitly challenge the scientists at the forefront of this revolution, as Henri Bergson did with Einstein after Einstein’s pesky new theory of time swept away much of the past work of philosophers in one fell swoop.

Next you get a generation of philosophy students who are taught a bunch of obsolete theories, and who are later blindsided when they encounter scientists informing them that the problems they’re working on were solved decades ago. And by this point the scientists have left the philosophers so far in the dust that the typical philosophy student is incapable of understanding the answers to their questions without learning a whole new area of math. So usually the philosophers just keep on their merry way, asking each other increasingly abstruse questions and working harder and harder to justify their own intellectual efforts. Meanwhile scientists move further and further beyond them, occasionally dropping in to laugh at their colleagues stuck back in the Middle Ages.

Part of why this happens is structural. Philosophy is the womb inside which develops the seeds of great revolutions of knowledge. It is where ideas germinate and turn from vague intuitions and hotbeds of conceptual confusion into precisely answerable questions. And once these questions are answerable, the scientists and mathematicians sweep in and make short work of them, finishing the job that philosophy started.

I think that one area in which this has happened is causality.

Statisticians now know how to model causal relationships, how to distinguish them from mere regularities, how to deal with common causes and causal pre-emption, how to assess counterfactuals and assign precise probabilities to these statements, and how to compare different causal models and determine which is most likely to be true.

(By the way, guess where I came to be aware of all of this? It wasn’t in the metaphysics class in which we spent over a month discussing the philosophy of causation. No, it was a statistician friend of mine who showed me a book by Judea Pearl and encouraged me to get up to date with modern methods of causal modeling.)

Causality as a subject has firmly and fully left the domain of philosophy. We now have a fully fleshed out framework of causal reasoning that is capable of answering all of the ancient philosophical questions and more. This is not to say that there is no more work to be done on understanding causality… just that this work is not going to be done by philosophers. It is going to be done by statisticians, computer scientists, and physicists.

Another area besides causality where I think this has happened is epistemology. Modern advances in epistemology are not coming out of the philosophy departments. They’re coming out of machine learning institutes and artificial intelligence researchers, who are working on turning the question of “how do we optimally come to justified beliefs in a posteriori matters?” into precise code-able algorithms.

I’m thinking about doing a series of posts called “X for philosophers”, in which I take an area of inquiry that has historically been the domain of philosophy, and explain how modern scientific methods have solved or are solving the central questions in this area.

For instance, here’s a brief guide to how to translate all the standard types of causal statements philosophers have debated for centuries into simple algebra problems:

| Concept | Formalization |
| --- | --- |
| Causal model | An ordered triple of exogenous variables, endogenous variables, and structural equations for each endogenous variable |
| Causal diagram | A directed acyclic graph representing a causal model, whose nodes represent the endogenous variables and whose edges represent the structural equations |
| Causal relationship | A directed edge in a causal diagram |
| Causal intervention | A mutilated causal diagram in which the edges between the intervened node and all its parent nodes are removed |
| Probability of A if B | P(A \| B) |
| Probability of A if we intervene on B | P(A \| do B) = P(A_B) |
| Probability that A would have happened, had B happened | P(A_B \| -B) |
| Probability that B is a necessary cause of A | P(-A_-B \| A, B) |
| Probability that B is a sufficient cause of A | P(A_B \| -A, -B) |

Right there is the guide to understanding the nature of causal relationships, and assessing the precise probabilities of causal conditional statements, counterfactual statements, and statements of necessary and sufficient causation.
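To make the distinction between conditioning and intervening concrete, here is a minimal simulation of a structural causal model in Python. All the numbers and variable names are invented for illustration: a hidden common cause U confounds B and A, so the observational quantity P(A | B) overstates the causal quantity P(A | do B), which we get by mutilating the diagram:

```python
import random

random.seed(0)

def sample(do_b=None):
    """One draw from a toy causal model with a hidden common cause U."""
    u = random.random() < 0.5                      # exogenous confounder
    if do_b is None:
        b = random.random() < (0.9 if u else 0.1)  # structural equation for B
    else:
        b = do_b                                   # do(B): cut the U -> B edge
    a = random.random() < 0.2 + 0.3 * b + 0.3 * u  # structural equation for A
    return b, a

N = 100_000

# P(A | B): condition on B happening on its own (confounded by U)
obs = [a for b, a in (sample() for _ in range(N)) if b]
p_a_given_b = sum(obs) / len(obs)

# P(A | do B): force B = true by mutilating the diagram
intv = [a for _, a in (sample(do_b=True) for _ in range(N))]
p_a_do_b = sum(intv) / len(intv)

print(f"P(A | B)    = {p_a_given_b:.2f}")  # ≈ 0.77, inflated by the confounder
print(f"P(A | do B) = {p_a_do_b:.2f}")     # ≈ 0.65, the causal effect alone
```

The gap between the two numbers is exactly the kind of thing the formal framework lets you reason about, and that centuries of informal argument about causation could not.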

To most philosophy students and professors, what I’ve written is probably chicken-scratch. But understanding it is crucial if they are to avoid becoming obsolete in their causal thinking.

There’s an unhealthy tendency amongst some philosophers, when presented with such chicken-scratch, to dismiss it as insufficiently philosophical and go back to reading David Lewis’s arguments for the existence of possible worlds. It is this, I think, that accounts for a large part of the scientist’s tendency to dismiss philosophers as outdated and intellectually behind the times. And it’s hard not to agree with them when you’ve seen both the crystal-clear beauty of formal causal modeling and the debates over things like how to evaluate the actual “distance” between possible worlds.

Artificial intelligence researcher extraordinaire Stuart Russell has said that he knew immediately upon reading Pearl’s book on causal modeling that it was going to change the world. Philosophy professors should either teach graph theory and Bayesian networks, or they should not make a pretense of teaching causality at all.

# Argument screens off identity

In his post Argument Screens Off Authority, Eliezer Yudkowsky argues that while a person’s credentials can give you evidence as to the quality of arguments you expect to hear from them, those credentials become irrelevant to the truth of their conclusion once you actually hear and understand their arguments. A true conclusion tends to have good arguments for it, and good arguments tend to become known by well-credentialed persons. But since the causal dependency from “truth” to “credentials” is indirect, you can remove the dependency by simply conditioning on the intermediate variable (“argument quality”).

I think this is a special case of a more general principle: Argument screens off identity. Consider the following hypothetical exchange.

Person 1: I don’t support affirmative action. Here are some arguments why.

(…)

Person 1: So those are the main reasons for my position.

Person 2: Hmm, while I don’t see anything wrong with your arguments, I think you’re forgetting that you’re not a member of <insert protected identity group>. This invalidates any arguments you make.

Person 1: Um, I don’t think that’s how that works…

Person 2: No, seriously. If you’re not black, Hispanic, Native American, or any other minority group that is affected by affirmative action, then you have no right to be talking about it. You don’t have any credibility on the issue unless you’ve lived it.

Yes, this is a bit of a straw man, but it’s not actually so far from real arguments that I’ve heard before. I don’t know of an exact name for the logical fallacy Person 2 is making here (it’s not exactly an ad hominem), but I think that it is covered nicely by the slogan that is the title of this post.

A white person who has spent lots of time carefully researching and thinking through issues of racial inequality may well have more to say on these issues than a black person who has not done so. A man who has studied the nature of sexism in our society and the social norms surrounding sexual harassment may know more than a woman who has not done so.

More to the point, when somebody presents an argument, the response “Yeah, well you aren’t a member of this specific identity group that your argument refers to, so your argument is not credible” is virtually always a distraction and hinders progress.

More formally, my claim is that in most contexts:

Pr(conclusion | argument, identity) = Pr(conclusion | argument)

While your identity might inform the types of arguments you are able to make, predisposed to make, or willing to make, the arguments themselves are the only pieces of evidence we have in evaluating your conclusion.
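This screening-off claim can be checked exactly in a toy chain-structured Bayesian network, in the same spirit as Yudkowsky’s authority example: truth influences argument quality, which in turn correlates with who tends to voice the argument. All the probabilities below are invented for illustration. Once we condition on the argument, the identity variable carries no further information about the conclusion:

```python
from itertools import product

# Toy chain: T (conclusion is true) -> A (argument is good) -> I (speaker's
# identity group). All probabilities are invented for illustration.
P_T = 0.5                                # prior that the conclusion is true
P_A_given = {True: 0.8, False: 0.3}      # P(good argument | T)
P_I_given = {True: 0.7, False: 0.2}      # P(identity group | A)

def joint(t, a, i):
    """Joint probability of one full assignment under the chain model."""
    p = P_T if t else 1 - P_T
    p *= P_A_given[t] if a else 1 - P_A_given[t]
    p *= P_I_given[a] if i else 1 - P_I_given[a]
    return p

def prob(query, given):
    """Conditional probability by exact enumeration over all 8 states."""
    states = list(product([True, False], repeat=3))
    num = sum(joint(*s) for s in states if query(*s) and given(*s))
    den = sum(joint(*s) for s in states if given(*s))
    return num / den

# P(conclusion | argument, identity) vs P(conclusion | argument)
p_both = prob(lambda t, a, i: t, lambda t, a, i: a and i)
p_arg = prob(lambda t, a, i: t, lambda t, a, i: a)
print(p_both, p_arg)  # identical: identity is screened off by the argument
```

Note that this equality depends on the model structure: if identity directly affected the truth of the conclusion (rather than only the arguments made for it), the screening-off would fail, which is why the claim holds only “in most contexts.”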