On philosophical progress

A big question in the philosophy of philosophy is whether philosophers make progress over time. One relevant piece of evidence that gets brought up in these discussions is the lack of consensus on age old questions like free will, normative ethics, and the mind body problem. If a discipline is progressing steadily towards truth with time, the argument goes, then we should expect that questions that have been discussed for thousands of years should be more or less settled by now. After all, that is what we see in the hard sciences; there are no lingering disputes over the validity of vitalism or the realm of applicability of Newtonian mechanics.

There are a few immediate responses to this line of argument. It might be that the age old questions of philosophy are simply harder than the questions that get addressed by physicists or biologists. “Harder” doesn’t mean “requires more advanced mathematics to grapple with” here, but something more like “it’s unclear what even would count as a conclusive argument for one position or another, and therefore much less clear how to go about building consensus.” Try to imagine what sort of argument would convince you of the nonexistence of libertarian free will with the same sort of finality as a demonstration of time dilation convinces you of the inadequacy of nonrelativistic mechanics.

A possible rejoinder at this point would be to take after the logical positivists and deny the meaningfulness or at least truth-aptness of the big questions of philosophy as a whole. This may go too far; it may well be that a query is meaningful but, due to certain epistemic limitations of ours, forever beyond our ability to decide. (We know for sure that such queries can exist, due to Gödelian discoveries in mathematics. For instance, we know of the existence of a series of numbers that are perfectly well defined, but for which no algorithm can exist to enumerate all of them. The later numbers in this sequence will forever be a mystery to us, and not for lack of meaningfulness.)

I think that the roughly correct position to take is that science is largely about examining empirical facts-of-the-matter, whereas philosophy is largely about analyzing and refining our conceptual framework. While we have a fairly clear set of standards for how to update theories about the empirical world, we are not in possession of such a set of standards for evaluating different conceptual frameworks. The question of “what really are the laws governing the behavior of stuff out there” has much clearer truth conditions than a question like “what is the best way to think about the concepts of right and wrong”; i.e. It’s clearer what counts as a good answer and what counts as a bad answer.

When we’re trying to refine our concepts, we are taking into account our pre-theoretical intuitions (e.g. any good theory of the concept of justice must have something to do with our basic intuitive conception of justice). But we’re not just satisfied to describe the concept solely as the messy inconsistent bundle of intuitions that constitute our starting position on it. We also aim to describe the concept simply, by developing a “theory of justice” that relies on a small set of axioms and from which (the hope is) the rest of our conclusions about justice follow. We want our elaboration of the concept to be consistent, in that we shouldn’t simultaneously affirm that A is an instance of the concept and that A is not an instance of the concept. Often we also want our theory to be precise, even when the concept itself has vague boundaries.

Maybe there are other standards besides these, intuitiveness, simplicity, consistency, and precision. And the application of these standards is very rarely made explicit. But one thing that’s certain is that different philosophers have different mixes of these values. One philosopher might value simplicity more or less than another, and it’s not clear that one of them is doing something wrong by having different standards. Put another way, I’m not convinced that there is one unique right set of standards for conceptual refinement.

We may want to be subjectivists to some degree about philosophy, and say that there are a range of rationally permissible standards for conceptual refinement, none better than any other. This would have the result that on some philosophical questions, multiple distinct answers may be acceptable but some crazy enough answers are not. Maybe compatibilism and nihilism are acceptable stances on free will but libertarianism is not. Maybe dualism and physicalism are okay but not epiphenomenalism. And so on.

This view allows for a certain type of philosophical progress, namely the gradual ruling out of some philosophical positions as TOO weird. It also allows for formation of consensus, through the discovery of philosophical positions that are the best according to all or most of the admissible sets of standards. I think that one example of this would be the relatively recent rise of Bayesian epistemology in philosophy of science, and in particular the Bayesian view of scientific evidence as being quantified by the Bayes factor. In brief, what does it mean to say that an observation O gives evidence for a hypothesis H? The Bayesian not only has an answer to this, but to the more detailed question of to what degree O gives evidence for H. The quantity is cr(O | H) / cr(O), where cr(.) is a credence function encoding somebody’s beliefs before observing O. If this quantity is equal to 1, then O is no evidence for H. If it is greater than 1, then O is evidence for H. And if it’s less than 1, then O is evidence against H.

Not everything in Bayesian epistemology is perfectly uncontroversial, but I would argue that on this particular issue – the issue of how to best formalize the notion of scientific evidence – the Bayesian definition survives all its challenges unscathed. What are some other philosophical questions on which you think there has been definite progress?

A closer look at anthropic tests for consciousness

(This post is the culmination of my last week of posts on anthropics and conservation of expected evidence.)

In this post, I described how anthropic reasoning can apparently give you a way to update on theories of consciousness. This is already weird enough, but I want to make things a little weirder. I want to present an argument that in fact anthropic reasoning implies that we should be functionalists about consciousness.

But first, a brief recap (for more details see the post linked above):

Screen Shot 2018-08-09 at 9.09.08 AM

Thus…

Screen Shot 2018-08-09 at 9.15.37 AM.pngScreen Shot 2018-08-09 at 9.19.18 AM

Whenever this experiment is run, roughly 90% of experimental subjects observe snake eyes, and roughly 10% observe not snake eyes. What this means is that 90% of the people update in favor of functionalism (by a factor of 9), and only 10% of people update in favor of substrate dependence theory (also by a factor of 9).

Now suppose that we have a large population that starts out completely agnostic on the question of functionalism vs. substrate dependence. That is, the prior ratio for each individual is 1:

Screen Shot 2018-08-09 at 9.28.15 AM

Now imagine that we run arbitrarily many dice-killer experimental setups on the population. We would see an upwards drift in the average beliefs of the population towards functionalism. And in the limit of infinite experiments, we would see complete convergence towards functionalism as the correct theory of consciousness.

Now, the only remaining ingredient is what I’ve been going on about the past two days: if you can predict beforehand that a piece of evidence is going to make you on average more functionalist, then you should preemptively update in favor of functionalism.

What we end up with is the conclusion that considering the counterfactual infinity of experimental results we could receive, we should conclude with arbitrarily high confidence that functionalism is correct.

To be clear, the argument is the following:

  1. If we were to be members of a population that underwent arbitrarily many dice-killer trials, we would converge towards functionalism.
  2. Conservation of expected evidence: if you can predict beforehand which direction some observation would move you, then you should pre-emptively adjust your beliefs in that direction.
  3. Thus, we should preemptively converge towards functionalism.

Premise 1 follows from a basic application of anthropic reasoning. We could deny it, but doing so amounts to denying the self-sampling assumption and ensuring that you will lose in anthropic games.

Premise 2 follows from the axioms of probability theory. It is more or less the statement that you should update your beliefs with evidence, even if this evidence is counterfactual information about the possible results of future experiments.

(If this sounds unintuitive to you at all, consider the following thought experiment: We have two theories of cosmology, one in which 99% of people live in Region A and 1% in Region B, and the other in which 1% live in Region A and 99% in Region B. We now ask where we expect to find ourselves. If we expect to find ourselves in Region A, then we must have higher credence in the first theory than the second. And if we initially did not have this higher credence, then considering the counterfactual question “Where would I find myself if I were to look at which region I am in?” should cause us to update in favor of the first theory.)

Altogether, this argument looks really bullet proof to me. And yet its conclusion seems very wrong.

Can we really conclude with arbitrarily high certainty that functionalism is correct by just going through this sort of armchair reasoning from possible experimental results that we will never do? Should we now be hardcore functionalists?

I’m not quite sure yet what the right way to think about this is. But here is one objection I’ve thought of.

We have only considered one possible version of the dice killer thought experiment (in which the experimenter starts off with 1 human, then chooses 1 human and 9 androids, then 1 human and 99 androids, and so on). In this version, observing snake eyes was evidence for functionalism over substrate dependence theory, which is what causes the population-wide drift towards functionalism.

We can ask, however, if we can construct a variant of the dice killer thought experiment in which snake eyes counts as evidence for substrate dependence theory over functionalism. If so, then we could construct an experimental setup that we can predict beforehand will end up with us converging with arbitrary certainty to substrate dependence theory!

Let’s see how this might be done. We’ll imagine the set of all variants on the thought experiment (that is, the set of all choices the dice killer could make about how many humans and androids to kidnap in each round.)

Screen Shot 2018-08-10 at 12.32.28 AM

For ease of notation, we’ll abbreviate functionalism and substrate dependence theory as F and S respectively.

Screen Shot 2018-08-10 at 12.32.57 AM

And we’ll also introduce a convenient notation for calculating the total number of humans and the total number androids ever kidnapped by round N.

Screen Shot 2018-08-10 at 12.33.41 AM

Now, we want to calculate the probability of snake eyes given functionalism in this general setup, and compare it to the probability of snake eyes given substrate dependence theory. The first step will be to consider the probability of snake eyes if  the experiment happens to end on the nth round, for some n. This is just the number of individuals in the last round divided by the total number of kidnapped individuals.

Screen Shot 2018-08-10 at 12.35.06 AM

Now, we calculate the average probability of snake eyes (the average fraction of individuals in the last round).

Screen Shot 2018-08-10 at 12.36.08 AM

The question is thus if we can find a pair of sequences

Screen Shot 2018-08-10 at 12.41.24 AM

such that the first term is larger than the second.

Screen Shot 2018-08-10 at 12.45.29 AM.png

It seems hard to imagine that there are no such pairs of sequences that satisfy this inequality, but thus far I haven’t been able to find an example. For now, I’ll leave it as an exercise for the reader!

If there are no such pairs of sequences, then it is tempting to take this as extremely strong evidence for functionalism. But I am concerned about this whole line of reasoning. What if there are a few such pairs of sequences? What if there are far more in which functionalism is favored than those in which substrate dependence is favored? What if there are an infinity of each?

While I buy each step of the argument, it seems wrong to say that the right thing to do is to consider the infinite set of all possible anthropic experiments you could do, and then somehow average over the results of each to determine the direction in which we should update our theories of consciousness. Indeed, I suspect that any such averaging procedure would be vulnerable to arbitrariness in the way that the experiments are framed, such that different framings give different results.

At this point, I’m pretty convinced that I’m making some fundamental mistake here, but I’m not sure exactly where this mistake is. Any help from readers would be greatly appreciated. 🙂

Getting empirical evidence for different theories of consciousness

Previously, I described a thought experiment in which a madman kidnaps a person, then determines whether or not to kill them by rolling a pair of dice. If they both land 1 (snake eyes), then the madman kills the person. Otherwise, the madman lets them go and kidnaps ten new people. He rolls the dice again and if he gets snake eyes, kills all ten. Otherwise he lets them go and finds 100 new people. Et cetera until he eventually gets snake eyes, at which point he kills all the currently kidnapped people and stops his spree.

If you find that you have been kidnapped, then your chance of survival depends upon the dice landing snake eyes, which happens with probability 1/36. But we can also calculate the average fraction of people kidnapped that end up dying. We get the following:

Screen Shot 2018-08-02 at 1.16.15 AM

We already talked about how this is unusually high compared to the 1/36 chance of the dice landing snake eyes, and how to make sense of the difference here.

In this post, we’ll talk about a much stranger implication. To get there, we’ll start by considering a variant of the initial thought experiment. This will be a little weird, but there’s a nice payout at the end, so stick with it.

In our variant, our madman kidnaps not only people, but also rocks. (The kidnapper does not “rock”, he kidnaps pieces of stones). He starts out by kidnapping a person, then rolls his dice. Just like before, if he gets snake eyes, he kills the person. And if not, he frees the person and kidnaps a new group. This new group consists of 1 person and 9 rocks. Now if the dice come up snake eyes, the person is killed and the 9 rocks pulverized. And if not, they are all released, and 1 new person and 99 rocks are gathered.

To be clear, the pattern is:

First Round: 1 person
Second Round: 1 person, 9 rocks
Third Round: 1 person, 99 rocks
Fourth Round: 1 person, 999 rocks
and so on…

Now, we can run the same sort of anthropic calculation as before:

Screen Shot 2018-08-02 at 1.16.33 AM.png

Evidently, this time you have roughly a 10% chance of dying if you find yourself kidnapped! (Notice that this is still worse than 1/36, though a lot better than 90%).

Okay, so we have two scenarios, one in which 90% of those kidnapped die and the other in which 10% of those kidnapped die.

Now let’s make a new variant on our thought experiment, and set it in a fictional universe of my creation.

In this world there exist androids – robotic intelligences that behave, look, and feel like any ordinary human. They are so well integrated into society that most people don’t actually know if they are a biological person or an android. The primary distinction between the two groups is, of course, that one has a brain made of silicon transistors and the other has a brain made of carbon-based neurons.

There is a question of considerable philosophical and practical importance in this world, which is: Are androids conscious just like human beings? This question has historically been a source of great strife in this world. On the one hand, some biological humans argue that the substrate is essential to the existence of consciousness and that therefore non-carbon-based life forms can never be conscious, no matter how well they emulate conscious beings. This thesis is known as the substrate-dependence view.

On the other hand, many argue that we have no good reason to dismiss the androids’ potential consciousness. After all, they are completely indistinguishable from biological humans, and have the same capacity to introspect and report on their feelings and experiences. Some android philosophers even have heated debates about consciousness. Plus, the internal organization of androids is pretty much identical to that of biological humans, indicating that the same sort of computation is going on in both organisms. It is argued that clearly consciousness arises from the patterns of computation in a system, and that on that basis androids are definitely conscious. The people that support this position are called functionalists (and, no great surprise, all androids that are aware that they are androids are functionalists).

The fundamental difference between the two stances can be summarized easily: Substrate-dependence theorists think that to be conscious, you must be a carbon-based life form operating on cells. Functionalists think that to be conscious, you must be running a particular type of computation, regardless of what material that computation is running on

In this world, the debate runs on endlessly. The two sides marshal philosophical arguments to support their positions and hurl them at each other with little to no effect. Androids insist vehemently that they are as conscious as anybody else, functionalists say “See?? Look at how obviously conscious they are,” and substrate-dependence theorists say “But this is exactly what you’d expect to hear from an unconscious replica of a human being! Just because you built a machine that can cleverly perform the actions of conscious beings does not mean that it really is conscious”.

It is soon argued by some that this debate can never be settled. This camp, known as the mysterians, says that there is something fundamentally special and intrinsically mysterious about the phenomenon that bars us from ever being able to answer these types of question, or even provide evidence for them. They point to the subjective nature of experience and the fact that you can only really know whether somebody is conscious by entering their head, which is impossible. The mysterians’ arguments are convincing to many, and their following grows stronger by the day as the debates between the other parties appear ever more futile.

With this heated debate in the backdrop, we can now introduce a new variant on the dice killer setup.

The killer starts like before by kidnapping a single human (not an android). If he rolls snake eyes, this person is killed. If not, he releases them and kidnaps one new human and nine androids. (Sounding familiar?)  If he rolls snake eyes, all ten are killed, and if not, one new person and 99 new androids are kidnapped. Etc. Thus we have:

First Round: 1 person
Second Round: 1 person, 9 androids
Third Round: 1 person, 99 androids
Fourth Round: 1 person, 999 androids
and so on…

You live in this society, and are one of its many citizens that doesn’t know if they are an android or a biological human. You find yourself kidnapped by the killer. How worried should you be about your survival?

If you are a substrate dependence theorist, you will see this case as similar to the variant with rocks. After all, you know that you are conscious. So you naturally conclude that you can’t be an android. This means that there is only one possible person that you could be in each round. So the calculation runs exactly as it did before with the rocks, ending with a 10% chance of death.

If you are a functionalist, you will see this case as similar to the case we started with. You think that androids are conscious, so you don’t rule out any of the possibilities for who you might be. Thus you calculate as we did initially, ending with a 90% chance of death.

Here we pause to notice something very important! Our two different theories of consciousness have made different empirically verifiable predictions about the world! And not only are they easily testable, but they are significantly different. The amount of evidence provided by the observation of snake eyes has to do with the likelihood ratio P(snake eyes | functionalism) / P(snake eyes | substrate dependence). This ratio is roughly 90% / 10% = 9, which means that observing snake eyes tilts the balance by a factor of 9 in favor of functionalism.

More precisely, we use the likelihood ratio to update our prior credences in functionalism and substrate dependence to our posterior credences. That is,

Screen Shot 2018-08-02 at 1.27.02 AM.png

This is a significant update. It can be made even more significant by altering the details of the setup. But the most important point is that there is an update at all. If what I’ve argued is correct, then the mysterians are demonstrably wrong. We can construct setups that test theories of consciousness, and we know just how!

(There’s an interesting caveat here, which is that this is only evidence for the individual that found themselves to be kidnapped. If an experimenter was watching from the outside and saw the dice land snake eyes, they would get no evidence for functionalism over substrate dependence. This relates to the anthropic nature of the evidence; it is only evidence for the individuals for whom the indexical claims “I have been kidnapped” and “I am conscious” apply.)

So there we have it. We’ve constructed an experimental setup that allows us to test claims of consciousness that are typically agreed to be beyond empirical verification. Granted, this is a pretty destructive setup and would be monstrously unethical to actually enact. But the essential features of the setup can be preserved without the carnage. Rather than snake eyes resulting in the killer murdering everybody kept captive, it could just result in the experimenter saying “Huzzah!” and ending the experiment. Then the key empirical evidence for somebody that has been captured would be whether or not the experimenter says “Huzzah!” If so, then functionalism becomes nine times more likely than it was before relative to substrate dependence.

This would be a perfectly good experiment that we could easily run, if only we could start producing some androids indistinguishable from humans. So let’s get to it, AI researchers!

What do I find conceptually puzzling?

There are lots of things that I don’t know, like, say, what the birth rate in Sweden is or what the effect of poverty on IQ is. There are also lots of things that I find really confusing and hard to understand, like quantum field theory and monetary policy. There’s also a special category of things that I find conceptually puzzling. These things aren’t difficult to grasp because the facts about them are difficult to understand or require learning complicated jargon. Instead, they’re difficult to grasp because I suspect that I’m confused about the concepts in use.

This is a much deeper level of confusion. It can’t be adjudicated by just reading lots of facts about the subject matter. It requires philosophical reflection on the nature of these concepts, which can sometimes leave me totally confused about everything and grasping for the solid ground of mere factual ignorance.

As such, it feels like a big deal when something I’ve been conceptually puzzled about becomes clear. I want to compile a list for future reference of things that I’m currently conceptually puzzled about and things that I’ve become un-puzzled about. (This is not a complete list, but I believe it touches on the major themes.)

Things I’m conceptually puzzled about

What is the relationship between consciousness and physics?

I’ve written about this here.

Essentially, at this point every available viewpoint on consciousness seems wrong to me.

Eliminativism amounts to a denial of pretty much the only thing that we can be sure can’t be denied – that we are having conscious experiences. Physicalism entails the claim that facts about conscious experience can be derived from laws of physics, which is wrong as a matter of logic.

Dualism entails that the laws of physics by themselves cannot account for the behavior of the matter in our brains, which is wrong. And epiphenomenalism entails that our beliefs about our own conscious experience are almost certainly wrong, and are no better representations of our actual conscious experiences than random chance.

How do we make sense of decision theory if we deny libertarian free will?

Written about this here and here.

Decision theory is ultimately about finding the decision D that maximizes expected utility EU(D). But to do this calculation, we have to decide what the set of possible decisions we are searching is.

EU confusion

Make this set too large, and you end up getting fantastical and impossible results (like that the optimal decision is to snap your fingers and make the world into a utopia). Make it too small, and you end up getting underwhelming results (in the extreme case, you just get that the optimal decision is to do exactly what you are going to do, since this is the only thing you can do in a strictly deterministic world).

We want to find a nice middle ground between these two – a boundary where we can say “inside here the things that are actually possible for us to do, and outside are those that are not.” But any principled distinction between what’s in the set and what’s not must be based on some conception of some actions being “truly possible” to us, and others being truly impossible. I don’t know how to make this distinction in the absence of a robust conception of libertarian free will.

Are there objectively right choices of priors?

I’ve written about this here.

If you say no, then there are no objectively right answers to questions like “What should I believe given the evidence I have?” And if you say yes, then you have to deal with thought experiments like the cube problem, where any choice of priors looks arbitrary and unjustifiable.

(If you are going to be handed a cube, and all you know is that it has a volume less than 1 cm3, then setting maximum entropy priors over volumes gives different answers than setting maximum entropy priors over side areas or side lengths. This means that what qualifies as “maximally uncertain” depends on whether we frame our reasoning in terms of side length, areas, or cube volume. Other approaches besides MaxEnt have similar problems of concept dependence.)

How should we deal with infinities in decision theory?

I wrote about this here, here, here, and here.

The basic problem is that expected utility theory does great at delivering reasonable answers when the rewards are finite, but becomes wacky when the rewards become infinite. There are a huge amount of examples of this. For instance, in the St. Petersburg paradox, you are given the option to play a game with an infinite expected payout, suggesting that you should buy in to the game no matter how high the cost. You end up making obviously irrational choices, such as spending $1,000,000 on the hope that a fair coin will land heads 20 times in a row. Variants of this involve the inability of EU theory to distinguish between obviously better and worse bets that have infinite expected value.

And Pascal’s mugging is an even worse case. Roughly speaking, a person comes up to you and threatens you with infinite torture if you don’t submit to them and give them 20 dollars. Now, the probability that this threat is credible is surely tiny. But it is non-zero! (as long as you don’t think it is literally logically impossible for this threat to come true)

An infinite penalty times a finite probability is still an infinite expected penalty. So we stand to gain an infinite expected utility by just handing over the 20 dollars. This seems ridiculous, but I don’t know any reasonable formalization of decision theory that allows me to refute it.

Is causality fundamental?

Causality has been nicely formalized by Pearl’s probabilistic graphical models. This is a simple extension of probability theory, out of which naturally falls causality and counterfactuals.

One can use this framework to represent the states of fundamental particles and how they change over time and interact with one another. What I’m confused about is that in some ways of looking at it, the causal relations appear to be useful but un-fundamental constructs for the sake of easing calculations. In other ways of looking at it, causal relations are necessarily built into the structure of the world, and we can go out and empirically discover them. I don’t know which is right. (Sorry for the vagueness in this one – it’s confusing enough to me that I have trouble even precisely phrasing the dilemma).

How should we deal with the apparent dependence of inductive reasoning upon our choices of concepts?

I’ve written about this here. Beyond just the problem of concept-dependence in our choices of priors, there’s also the problem presented by the grue/bleen thought experiment.

This thought experiment proposes two new concepts: grue (= the set of things that are either green before 2100 or blue after 2100) and bleen (the inverse of grue). It then shows that if we reasoned in terms of grue and bleen, standard induction would have us concluding that all emeralds will suddenly turn blue after 2100. (We repeatedly observed them being grue before 2100, so we should conclude that they will be grue after 2100.)

In other words, choose the wrong concepts and induction breaks down. This is really disturbing – choices of concepts should be merely pragmatic matters! They shouldn’t function as fatal epistemic handicaps. And given that they appear to, we need to develop some criterion we can use to determine what concepts are good and what concepts are bad.

The trouble with this is that the only proposals I’ve seen for such a criterion reference the idea of concepts that “carve reality at its joints”; in other words, the world is composed of green and blue things, not grue and bleen things, so we should use the former rather than the latter. But this relies on the outcome of our inductive process to draw conclusions about the starting step on which this outcome depends!

I don’t know how to cash out “good choices of concepts” without ultimately reasoning circularly. I also don’t even know how to make sense of the idea of concepts being better or worse for more than merely pragmatic reasons.

How should we reason about self defeating beliefs?

The classic self-defeating belief is “This statement is a lie.” If you believe it, then you are compelled to disbelieve it, eliminating the need to believe it in the first place. Broadly speaking, self-defeating beliefs are those that undermine the justifications for belief in them.

Here’s an example that might actually apply in the real world: Black holes glow. The process of emission is known as Hawking radiation. In principle, any configuration of particles with a mass less than the black hole can be emitted from it. Larger configurations are less likely to be emitted, but even configurations such as a human brain have a non-zero probability of being emitted. Henceforth, we will call such configurations black hole brains.

Now, imagine discovering some cosmological evidence that the era in which life can naturally arise on planets circling stars is finite, and that after this era there will be an infinite stretch of time during which all that exists are black holes and their radiation. In such a universe, the expected number of black hole brains produced is infinite (a tiny finite probability multiplied by an infinite stretch of time), while the expected number of “ordinary” brains produced is finite (assuming a finite spatial extent as well).

What this means is that discovering this cosmological evidence should give you an extremely strong boost in credence that you are a black hole brain. (Simply because most brains in your exact situation are black hole brains.) But most black hole brains have completely unreliable beliefs about their environment! They are produced by a stochastic process which cares nothing for producing brains with reliable beliefs. So if you believe that you are a black hole brain, then you should suddenly doubt all of your experiences and beliefs. In particular, you have no reason to think that the cosmological evidence you received was veridical at all!

I don’t know how to deal with this. It seems perfectly possible to find evidence for a scenario that suggests that we are black hole brains (I’d say that we have already found such evidence, multiple times). But then it seems we have no way to rationally respond to this evidence! In fact, if we do a naive application of Bayes’ theorem here, we find that the probability of receiving any evidence in support of black hole brains to be 0!

So we have a few options. First, we could rule out any possible skeptical scenarios like black hole brains, as well as anything that could provide any amount of evidence for them (no matter how tiny). Or we could accept the possibility of such scenarios but face paralysis upon actually encountering evidence for them! Both of these seem clearly wrong, but I don’t know what else to do.

How should we reason about our own existence and indexical statements in general?

This is called anthropic reasoning. I haven’t written about it on this blog, but expect future posts on it.

A thought experiment: imagine a murderous psychopath who has decided to go on an unusual rampage. He will start by abducting one random person. He rolls a pair of dice, and kills the person if they land snake eyes (1, 1). If not, he lets them free and hunts down ten new people. Once again, he rolls his pair of die. If he gets snake eyes he kills all ten. Otherwise he frees them and kidnaps 100 new people. On and on until he eventually gets snake eyes, at which point his murder spree ends.

Now, you wake up and find that you have been abducted. You don’t know how many others have been abducted alongside you. The murderer is about to roll the dice. What is your chance of survival?

Your first thought might be that your chance of death is just the chance of both dice landing 1: 1/36. But think instead about the proportion of all people that are ever abducted by him that end up dying. This value ends up being roughly 90%! So once you condition upon the information that you have been captured, you end up being much more worried about your survival chance.

But at the same time, it seems really wrong to be watching the two dice tumble and internally thinking that there is a 90% chance that they land snake eyes. It’s as if you’re imagining that there’s some weird anthropic “force” pushing the dice towards snake eyes. There’s way more to say about this, but I’ll leave it for future posts.

Things I’ve become un-puzzled about

Newcomb’s problem – one box or two box?

To almost everyone, it is perfectly clear and obvious what should be done. The difficulty is that these people seem to divide almost evenly on the problem, with large numbers thinking that the opposing half is just being silly.

– Nozick, 1969

I’ve spent months and months being hopelessly puzzled about Newcomb’s problem. I now am convinced that there’s an unambiguous right answer, which is to take the one box. I wrote up a dialogue here explaining the justification for this choice.

In a few words, you should one-box because one-boxing makes it nearly certain that the simulation of you run by the predictor also one-boxed, thus making it nearly certain that you will get 1 million dollars. The dependence between your action and the simulation is not an ordinary causal dependence, nor even a spurious correlation – it is a logical dependence arising from the shared input-output structure. It is the same type of dependence that exists in the clone prisoner dilemma, where you can defect or cooperate with an individual you are assured is identical to you in every single way. When you take into account this logical dependence (also called subjunctive dependence), the answer is unambiguous: one-boxing is the way to go.

Summing up:

Things I remain conceptually confused about:

  • Consciousness
  • Decision theory & free will
  • Objective priors
  • Infinities in decision theory
  • Fundamentality of causality
  • Dependence of induction on concept choice
  • Self-defeating beliefs
  • Anthropic reasoning

Summarizing conscious experience

There’s a puzzle for implementation of probabilistic reasoning in human beings. This is that the start of the reasoning process in humans is conscious experience, and it’s not totally clear how we should update on conscious experiences.

Jeffreys defined a summary of an experience E as a set B of propositions {B1, B2, … Bn} such that for all other propositions in your belief system A, P(A | B) = P(A | B, E).

In other words, B is a minimal set of propositions that fully screens off your experience.

This is a useful concept because summary sentences allow you to isolate everything that is epistemically relevant about conscious experience. if you have a summary B of an experience E, then you only need to know P(A | B) and P(B | E) in order to calculate P(A | E).

Notice that the summary set is subjective; it is defined only in terms of properties of your personal belief network. The set of facts that screens off E for you might be different from the set of facts that screens it off for somebody else.

Quick example.

Consider a brief impression by candlelight of a cloth held some distance away from you. Call this experience E.

Suppose that all you could decipher from E is that the cloth was around 2 meters away from you, and that it was either blue (with probability 60%) or green (with probability 40%). Then the summary set for E might be {“The cloth is blue”, “The cloth is green”, “The cloth is 2 meters away from you”, “The cloth is 3 meters away from you”, etc.}.

If this is the right summary set, then the probabilities P(“The cloth is blue”), P(“The cloth is green”) and P(“The cloth is x meters away from you”) should screen off E from the rest of your beliefs.

One trouble is that it’s not exactly obvious how to go about converting a given experience into a set of summary propositions. We could always be leaving something out. For instance, one more thing we learned upon observing E was the proposition “I can see light.” This is certainly not screened off by the other propositions so far, so we need to add it in as well.

But how do we know that we’ve gotten everything now? If we think a little more, we realize that we have also learned something about the nature of the light given off by the candle flame. We learn that it is capable of reflecting the color of light that we saw!

But now this additional consideration is related to how we interpret the color of the cloth. In other words, not only might we be missing something from our summary set, but that missing piece might be relevant to how we interpret the others.

I’d like to think more about this question: In general, how do we determine the set of propositions that screens off a given experience from the rest of your beliefs? Ultimately, to be able to coherently assess the impact of experiences on your web of beliefs, your model of reality must contain a model of yourself as an experiencer.

The nature of this model is pretty interesting from a philosophical perspective. Does it arise organically out of factual beliefs about the physical world? Well, this is what a physicalist would say. To me, it seems quite plausible that modeling yourself as a conscious experiencer would require a separate set of rules relating physical happenings to conscious experiences. How we should model this set of rules as a set of a priori hypotheses to be updated on seems very unclear to me.

Constructing the world

In this six and a half hour lecture series by David Chalmers, he describes the concept of a minimal set of statements from which all other truths are a priori “scrutable” (meaning, basically, in-principle knowable or derivable).

What are the types of statements in this minimal set required to construct the world? Chalmers offers up four categories, and abbreviates this theory PQIT.

P

P is the set of physical facts (for instance, everything that would be accessible to a Laplacean demon). It can be thought of as essentially the initial conditions of the universe and the laws governing their changes over time.

Q

Q is the set of facts about qualitative experience. We can see Chalmers’ rejection of physicalism here, as he doesn’t think that Q is eclipsed within P. Example of a type of statement that cannot be derived from P without Q: “There is a beige region in the bottom right of my visual field.”

I

Here’s a true statement: “I’m in the United States.” Could this be derivable from P and Q? Presumably not; we need another set of indexical truths that allows us to have “self-locating” beliefs and to engage in anthropic reasoning.

T

Suppose that P, Q, and I really are able to capture all the true statements there are to be captured. Well then, the statement “P, Q, and I really are able to capture all the true statements there are to be captured” is a true statement, and it is presumably not captured by P, Q, and I! In other words, we need some final negative statements that tell us that what we have is enough, and that there are no more truths out there. These “that’s all”-type statements are put into the set T.

⁂⁂⁂

So this is a basic sketch of Chalmer’s construction. I like that we can use these tags like PQIT or PT or QIT as a sort of philosophical zip-code indicating the core features of a person’s philosophical worldview. I also want to think about developing this further. What other possible types of statements are there out there that may not be captured in PQIT? Here is a suggestion for a more complete taxonomy:

p    microphysics
P    macrophysics (by which I mean all of science besides fundamental physics)
Q    consciousness
R    normative rationality
E    
normative ethics
C    counterfactuals
L    
mathematical / logical truths
I     indexicals
T    “that’s-all” statements

I’ve split P into big-P (macrophysics) and little-p (microphysics) to account for the disagreements about emergence and reductionism. Normativity here is broad enough to include both normative epistemic statements (e.g. “You should increase your credence in the next coin toss landing H after observing it land H one hundred times in a row”) and ethical statements. The others are fairly self-explanatory.

The most ontologically extravagant philosophical worldview would then be characterized as pPQRECLIT.

My philosophical address is pRLIT (with the addendum that I think C comes from p, and am really confused about Q). What’s yours?

Moving Naturalism Forward: Eliminating the macroscopic

Sean Carroll, one of my favorite physicists and armchair philosophers, hosted a fantastic conference on philosophical naturalism and science, and did the world a great favor by recording the whole thing and posting it online. It was a three-day long discussion on topics like the nature of reality, emergence, morality, free will, meaning, and consciousness. Here are the videos for the first two discussion sections, and the rest can be found by following Youtube links.

 

Having watched through the entire thing, I have updated a few of my beliefs, plan to rework some of my conceptual schema, and am puzzled about a few things.

A few of my reflections and take-aways:

  1. I am much more convinced than before that there is a good case to be made for compatibilism about free will.
  2. I think there is a set of interesting and challenging issues around the concept of representation and intentionality (about-ness) that I need to look into.
  3. I am more comfortable with intense reductionism claims, like “All fact about the macroscopic world are entailed by the fundamental laws of physics.”
  4. I am really interested in hearing Dan Dennett talk more about grounding morality, because what he said was starting to make a lot of sense to me.
  5. I am confused about the majority attitude in the room that there’s not any really serious reason to take an eliminativist stance about macroscopic objects.
  6. I want to find more details about the argument that Simon DeDeo was making for the undecidability of questions about the relationship between macroscopic theories and microscopic theories (!!!).
  7. There’s a good way to express the distinction between the type of design human architects engage in and the type of design that natural selection produces, which is about foresight and representations of reasons. I’m not going to say more about this, and will just refer you to the videos.
  8. There are reasons to suspect that animal intelligence and capacity to suffer are inversely correlated (that is, the more intelligent an animal, the less capacity to suffer it likely has). This really flips some of our moral judgements on their head. (You must deliver a painful electric shock to either a human or to a bird. Which one will you choose?)

Let me say a little more about number 5.

I think that questions about whether macroscopic objects like chairs or plants really REALLY exist, or whether there are really only just fermions and bosons are ultimately just questions about how we should use the word “exist.” In the language of our common sense intuitions, obviously chairs exist, and if you claim otherwise, you’re just playing complicated semantic games. I get this argument, and I don’t want to be that person that clings to bizarre philosophical theses that rest on a strange choice of definitions.

But at the same time, I see a deep problem with relying on our commonsense intuitions about the existence of the macro world. This is that as soon as we start optimizing for consistency, even a teeny tiny bit, these macroscopic concepts fall to pieces.

For example, here is a trilemma (three statements that can’t all be correct):

  1. The thing I am sitting on is a chair.
  2. If you subtract a single atom from a chair, it is still a chair.
  3. Empty space is not a chair.

These seem to me to be some of the most obvious things we could say about chairs. And yet they are subtly incoherent!

Number 1 is really shorthand for something like “there are chairs.” And the reason why the second premise is correct is that denying it requires that there be a chair such that if you remove a single atom, it is no longer a chair. I take it to be obvious that such things don’t exist. But accepting the first two requires us to admit that as we keep shedding atoms from a chair, it stays a chair, even down to the very last atom. (By the way, some philosophers do actually deny number 2. They take a stance called epistemicism, which says that concepts like “chair” and “heap” are actually precise and unambiguous, and there exists a precise point at which a chair becomes a non-chair. This is the type of thing that makes me giggle nervously when reflecting on the adequacy of philosophy as a field.)

As I’ve pointed out in the past, these kinds of arguments can be applied to basically everything in the macroscopic world. They wreak havoc on our common sense intuitions and, to my mind, demand rejection of the entire macroscopic world. And of course, they don’t apply to the microscopic world. “If X is an electron, and you change its electric charge a tiny bit, is it still an electron?” No! Electrons are physical substances with precise and well-defined properties, and if something doesn’t have these properties, it is not an electron! So the Standard Model is safe from this class of arguments.

Anyway, this is all just to make the case that upon close examination, our commonsense intuitions about the macroscopic world turn out to be subtly incoherent. What this means is that we can’t make true statements like “There are two cars in the garage”. Why? Just start removing atoms from the cars until you get to a completely empty garage. Since no single-atom change can make the relevant difference to “car-ness”, at each stage, you’ll still have two cars!

As soon as you start taking these macroscopic concepts seriously, you find yourself stuck in a ditch. This, to me, is an incredibly powerful argument for eliminativism, and I was surprised to find that arguments like these weren’t stressed at the conference. This makes me wonder if this argument is as powerful as I think.

What is integrated information?

Integrated information theory relates consciousness to degrees of integrated information within a physical system. I recently became interested in IIT and found it surprisingly hard to locate a good simple explanation of the actual mathematics of integrated information online.

Having eventually just read through all of the original papers introducing IIT, I discovered that integrated information is closely related to some of my favorite bits of mathematics, involving information theory and causal modeling.  This was exciting enough to me that I decided to write a guide to understanding integrated information. My goal in this post is to introduce a beginner to integrated information in a rigorous and (hopefully!) intuitive way.

I’ll describe it increasing levels of complexity, so that even if you eventually get lost somewhere in the post, you’ll be able to walk away having learned something. If you get to the end of this post, you should be able to sit down with a pencil and paper and calculate the amount of integrated information in simple systems, as well as how to calculate it in principle for any system.

Level 1

So first, integrated information is a measure of the degree to which the components of a system are working together to produce outputs.

A system composed of many individual parts that are not interacting with each other in any way is completely un-integrated – it has an integrated information ɸ = 0. On the other hand, a system composed entirely of parts that are tightly entangled with one another will have a high amount of integrated information, ɸ >> 0.

For example, consider a simple model of a camera sensor.

tut_sensors_grid2

This sensor is composed of many independent parts functioning completely separately. Each pixel stores a unit of information about the outside world, regardless of what its neighboring pixels are doing. If we were to somehow sever the causal connections between the two halves of the sensor, each half would still capture and store information in exactly the same way.

Now compare this to a human brain.

FLARE-Technique-Offers-Snapshots-of-Neuron-Activity

The nervous system is a highly entangled mesh of neurons, each interacting with many many neighbors in functionally important ways. If we tried to cut the brain in half, severing all the causal connections between the two sides, we would get an enormous change in brain functioning.

Makes sense? Okay, on to level 2.

Level 2

So, integrated information has to do with the degree to which the components of a system are working together to produce outputs. Let’s delve a little deeper.

We just said that we can tell that the brain is integrating lots of information, because the functioning would be drastically disrupted if you cut it in half. A keen reader might have realized that the degree to which the functioning is disrupted will depend a lot on how you cut it in half.

For instance, cut off the front half of somebody’s brain, and you will end up with total dysfunction. But you can entirely remove somebody’s cerebellum (~50% of the brain’s neurons), and end up with a person that has difficulty with coordination and is a slow learner, but is otherwise a pretty ordinary person.

Human head, MRI and 3D CT scans

What this is really telling us is that different parts of the brain are integrating information differently. So how do we quantify the total integration of information of the brain? Which cut do we choose when evaluating the decrease in functioning?

Simple: We look at every possible way of partitioning the brain into two parts. For each one, we see how much the brain’s functioning is affected. Then we locate the minimum information partition, that is, the partition that results in the smallest change in brain functioning. The change in functioning that results from this particular partition is the integrated information!

Okay. Now, what exactly do we mean by “changes to the system’s functioning”? How do we measure this?

Answer: The functionality of a system is defined by the way in which the current state of the system constrains the past and future states of the system.

To make full technical sense of this, we have to dive a little deeper.

Level 3

How many possible states are there of a Connect Four board?

(I promise this is relevant)

The board is 6 by 7, and each spot can be either a red piece, a black piece, or empty.

Screen Shot 2018-04-20 at 1.03.04 AM

So a simple upper bound on the number of total possible board states is 342 (of course, the actual number of possible states will be much lower than this, since some positions are impossible to get into).

Now, consider what you know about the possible past and future states of the board if the board state is currently…

Screen Shot 2018-04-20 at 1.03.33 AM

Clearly there’s only one possible past state:

Screen Shot 2018-04-20 at 1.03.04 AM

And there are seven possible future states:

What this tells us is that the information about the current state of the board constrains the possible past and future states, selecting exactly one possible board out of the 342 possibilities for the past, and seven out of 342 possibilities for the future.

More generally, for any given system S we have a probability distribution over past and future states, given that the current state is X.

System

Pfuture(X, S) = Pr( Future state of S | Present state of S is X )
Ppast(X, S) = Pr( Past state of S | Present state of S is X )

For any partition of the system into two components, S1 and S2, we can consider the future and past distributions given that the states of the components are, respectively, X1 and X2, where X = (X1, X2).

System

Pfuture(X, S1, S2) = Pr( Future state of S1 | Present state of S1 is X1 )・Pr( Future state of S2 | Present state of S2 is X2 )
Ppast(X, S1, S2) = Pr( Past state of S1 | Present state of S1 is X1 )・Pr( Past state of S2 | Present state of S2 is X2 )

Now, we just need to compare our distributions before the partition to our distributions after the partition. For this we need some type of distance function D that assesses how far apart two probability distributions are. Then we define the cause information and the effect information for the partition (S1, S2).

Cause information = D( Ppast(X, S), Ppast(X, S1, S2) )
Effect information = D( Pfuture(X, S), Pfuture(X, S1, S2) )

In short, the cause information is how much the distribution over past states changes when you partition off your system into two separate systems And the future information is the change in the distribution over future states when you partition the system.

The cause-effect information CEI is then defined as the minimum of the cause information CI and effect information EI.

CEI = min{ CI, EI }

We’ve almost made it all the way to our full definition of ɸ! Our last step is to calculate the CEI for every possible partition of S into two pieces, and then select the partition that minimizes CEI (the minimum information partition MIP).

The integrated information is just the cause effect information of the minimum information partition!

ɸ = CEI(MIP)

Level 4

We’ve now semi-rigorously defined ɸ. But to really get a sense of how to calculate ɸ, we need to delve into causal diagrams. At this point, I’m going to assume familiarity with causal modeling. The basics are covered in a series of posts I wrote starting here.

Here’s a simple example system:

XOR AND.png

This diagram tells us that the system is composed of two variables, A and B. Each of these variables can take on the values 0 and 1. The system evolves according to the following simple update rule:

A(t + 1) = A(t) XOR B(t)
B(t + 1) = A(t) AND B(t)

We can redraw this as a causal diagram from A and B at time 0 to A and B at time 1:

[Diagram: causal graph from A and B at time 0 to A and B at time 1]

What this amounts to is the following system evolution rule:

AB_t → AB_{t+1}
00   →   00
01   →   10
10   →   10
11   →   01
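If you want to double-check this table, a few lines of Python reproduce it from the update rule:

# Reproduce the state-transition table for the XOR/AND system.
for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    a_next, b_next = a ^ b, a & b    # A(t+1) = A XOR B, B(t+1) = A AND B
    print(f"{a}{b} -> {a_next}{b_next}")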

Now, suppose that we know that the system is currently in the state AB = 00. What does this tell us about the future and past states of the system?

Well, since the system evolution is deterministic, we can say with certainty that the next state of the system will be 00. And since 00 is the only state that evolves into 00, we know that the past state of the system was also 00.

We can plot the probability distributions over the past and future states as follows:

[Chart: past and future probability distributions for the full system]

This is not too interesting a distribution… no information is lost or gained going into the past or future. Now we partition the system:

[Diagram: the system with the partition cutting the connections between A and B]

The causal diagram, when cut, looks like:

[Diagram: the cut causal graph, with noise variables replacing the severed inputs]

Why do we have the two “noise” variables? Well, both A and B take two variables as inputs. Since one of these causal inputs has been cut off, we replace it with a random variable that’s equally likely to be a 0 or a 1. This procedure is called “noising” the causal connections across the partition.

According to this diagram, we now have two independent distributions over the two parts of the system, A and B. To get the distribution over the total future state of the system, we just multiply the two together:

P(A_1, B_1 | A_0, B_0) = P(A_1 | A_0)・P(B_1 | B_0)

We can compute the two distributions P(A_1 | A_0) and P(B_1 | B_0) straightforwardly, by looking at how each variable evolves in our new causal diagram.

Future:
A_0 = 0 ⇒ A_1 = 0 or 1 (½ probability each)
B_0 = 0 ⇒ B_1 = 0

Past:
A_0 = 0 ⇒ A_{-1} = 0 or 1 (½ probability each)
B_0 = 0 ⇒ B_{-1} = 0 or 1 (probabilities ⅔ and ⅓)

This implies the following probability distribution for the partitioned system:

[Chart: past and future probability distributions for the partitioned system]

I recommend you go through and calculate this for yourself. Everything follows from the updating rules that define the system and the noise assumption.
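If you'd rather not enumerate the noise cases by hand, here is one way to do that calculation in Python. This is just my own sketch of the noising procedure described above (the variable names are mine, not anything official):

from collections import Counter
from fractions import Fraction

# Partitioned (noised) system, given the current state A_0 = 0, B_0 = 0.
# Each causal input that crosses the cut is replaced by a fair coin flip.
half = Fraction(1, 2)

# Future of each part: average over the noise on its severed input.
future_A = Counter()   # P(A_1 | A_0 = 0), with B_0 replaced by noise
future_B = Counter()   # P(B_1 | B_0 = 0), with A_0 replaced by noise
for noise in (0, 1):
    future_A[0 ^ noise] += half    # A_1 = A_0 XOR noise, with A_0 = 0
    future_B[noise & 0] += half    # B_1 = noise AND B_0, with B_0 = 0

# Past of each part: uniform prior over the previous state and the noise,
# conditioned on the observed current state.
past_A = Counter()     # P(A_{-1} | A_0 = 0)
past_B = Counter()     # P(B_{-1} | B_0 = 0)
for prev in (0, 1):
    for noise in (0, 1):
        if prev ^ noise == 0:      # consistent with A_0 = 0
            past_A[prev] += 1
        if noise & prev == 0:      # consistent with B_0 = 0
            past_B[prev] += 1

def normalize(counts):
    total = sum(counts.values())
    return {k: Fraction(v, total) for k, v in counts.items()}

print(dict(future_A), dict(future_B))        # A_1 is ½ each; B_1 is certainly 0
print(normalize(past_A), normalize(past_B))  # A_{-1} is ½ each; B_{-1} is ⅔ / ⅓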

Good! Now we have two distributions, one for the full system and one for the partitioned system. How do we measure the difference between these distributions?

There are a few possible measures we could use. My favorite of these is the Kullback-Leibler divergence D_KL. Technically, this measure is only used in IIT 2.0, not IIT 3.0 (which uses the earth-mover’s distance). I prefer D_KL, as it has a nice interpretation as the amount of information lost when the system is partitioned. I have a post describing D_KL here.

Here’s the definition of D_KL:

D_KL(P, Q) = ∑_i P_i log( P_i / Q_i )

We can use this quantity to calculate the cause information and the effect information:

Cause information = log2(3) ≈ 1.6
Effect information = log2(2) = 1

These values tell us that our partition destroys about 0.6 more bits of information about the past than about the future. For the purpose of integrated information, we only care about the smaller of these two (for reasons that I don’t find entirely convincing).

Cause-effect information = min{ 1, 1.6 } = 1

Now, we’ve calculated the cause-effect information for this particular partition. And since our system has only two variables, this is the only possible partition, which means it is automatically the minimum information partition.

The integrated information is the cause-effect information of the minimum information partition. And thus, we’ve calculated ɸ for our system!

ɸ = 1
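As a sanity check on those numbers, here is a small script that plugs the distributions from the worked example into the KL divergence (states are written as "AB" strings, and the partitioned probabilities are the ones computed above):

from math import log2

# Distributions over the joint state AB, given the current state 00.
# Full system: both the past and the future are pinned to 00.
p_past_full   = {"00": 1.0}
p_future_full = {"00": 1.0}

# Partitioned (noised) system: the past has A_{-1} at ½ each and B_{-1} at ⅔ / ⅓;
# the future has A_1 at ½ each and B_1 certainly 0.
q_past   = {"00": 1/2 * 2/3, "01": 1/2 * 1/3, "10": 1/2 * 2/3, "11": 1/2 * 1/3}
q_future = {"00": 1/2, "10": 1/2}

def dkl(p, q):
    # Kullback-Leibler divergence, in bits.
    return sum(p_x * log2(p_x / q[x]) for x, p_x in p.items())

cause_info  = dkl(p_past_full, q_past)       # log2(3) ≈ 1.585
effect_info = dkl(p_future_full, q_future)   # log2(2) = 1.0
print(cause_info, effect_info, min(cause_info, effect_info))   # phi = 1.0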

Level 5

Let’s now define ɸ in full generality.

Our system S consists of a vector of N variables X = (X_1, X_2, X_3, …, X_N), each an element of some space 𝒳. Our system also has an updating rule, which is a function f: 𝒳^N → 𝒳^N. In our previous example, 𝒳 = {0, 1}, N = 2, and f(x, y) = (x XOR y, x AND y).

More generally, our updating rule f can map X_t to a probability distribution over 𝒳^N. We’ll denote the distribution over the possible future states, given the current state, as P(X_{t+1} | X_t); it is defined by our updating rule: P(X_{t+1} | X_t) = f(X_t). The distribution over possible past states will be denoted P(X_{t-1} | X_t). We’ll obtain this using Bayes’ rule: P(X_{t-1} | X_t) = P(X_t | X_{t-1}) P(X_{t-1}) / P(X_t) = f(X_{t-1}) P(X_{t-1}) / P(X_t).

A partition of the system is a subset of {1, 2, 3, …, N}, which we’ll label A. We define B = {1, 2, 3, …, N} \ A. Now we can define X_A = (X_a)_{a∈A} and X_B = (X_b)_{b∈B}. Loosely speaking, we can say that X = (X_A, X_B), i.e. that the total state is just the combination of the states of the two parts A and B.

We now define the distributions over future and past states in our partitioned system:

Q(X_{t+1} | X_t) = P(X_{A,t+1} | X_{A,t})・P(X_{B,t+1} | X_{B,t})
Q(X_{t-1} | X_t) = P(X_{A,t-1} | X_{A,t})・P(X_{B,t-1} | X_{B,t})

The effect information EI of the partition defined by A is the distance between P(Xt+1 | Xt) and Q(Xt+1 | Xt), and the cause information CI is defined similarly. The cause-effect information is defined as the minimum of these two.

CI(f, A, X_t) = D( P(X_{t-1} | X_t), Q(X_{t-1} | X_t) )
EI(f, A, X_t) = D( P(X_{t+1} | X_t), Q(X_{t+1} | X_t) )

CEI(f, A, X_t) = min{ CI(f, A, X_t), EI(f, A, X_t) }

And finally, we define the minimum information partition (MIP) and the integrated information:

MIP = argmin_A CEI(f, A, X_t)
ɸ(f, X_t) = min_A CEI(f, A, X_t)
          = CEI(f, MIP, X_t)

And we’re done!
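For anyone who wants to play with this, below is a toy end-to-end sketch of the computation for small deterministic systems of binary variables, under the same assumptions used in this post: KL divergence as the distance measure (the IIT 2.0-style choice), uniform priors over past states, and noising of the connections severed by the cut. All the function names are my own; this is not the official IIT 3.0 algorithm, and real implementations (such as the PyPhi package) do things differently and much more carefully.

from itertools import combinations, product
from math import log2

def forward(f, part, other, state):
    """Noised next-state distribution for the variables in `part`: the
    variables in `other` are replaced by uniform coin flips."""
    dist = {}
    weight = 1 / 2 ** len(other)
    for noise in product((0, 1), repeat=len(other)):
        full = list(state)
        for i, v in zip(other, noise):
            full[i] = v
        nxt = tuple(f(tuple(full))[i] for i in part)
        dist[nxt] = dist.get(nxt, 0) + weight
    return dist

def backward(f, part, other, state):
    """Noised previous-state distribution for the variables in `part`,
    assuming a uniform prior over previous states (Bayes' rule)."""
    n = len(state)
    target = tuple(state[i] for i in part)
    dist = {}
    for prev_part in product((0, 1), repeat=len(part)):
        prev = [0] * n                     # entries in `other` get noised away anyway
        for i, v in zip(part, prev_part):
            prev[i] = v
        likelihood = forward(f, part, other, tuple(prev)).get(target, 0)
        if likelihood:
            dist[prev_part] = likelihood
    total = sum(dist.values())
    return {k: v / total for k, v in dist.items()}

def full_dists(f, state):
    """Exact past and future distributions for the whole, un-cut system
    (assumes the current state is reachable, so it has at least one preimage)."""
    n = len(state)
    future = {f(state): 1.0}
    preimages = [p for p in product((0, 1), repeat=n) if f(p) == state]
    past = {p: 1 / len(preimages) for p in preimages}
    return past, future

def combine(dist_a, dist_b, part, other, n):
    """Product of the two per-part distributions, as a distribution over full states."""
    joint = {}
    for xa, pa in dist_a.items():
        for xb, pb in dist_b.items():
            full = [None] * n
            for i, v in zip(part, xa):
                full[i] = v
            for i, v in zip(other, xb):
                full[i] = v
            joint[tuple(full)] = joint.get(tuple(full), 0) + pa * pb
    return joint

def dkl(p, q):
    """KL divergence in bits (q is never zero where p is, thanks to the noising)."""
    return sum(p_x * log2(p_x / q[x]) for x, p_x in p.items() if p_x > 0)

def phi(f, state):
    n = len(state)
    p_past, p_future = full_dists(f, state)
    best = float("inf")
    for size in range(1, n // 2 + 1):
        for part in combinations(range(n), size):
            other = tuple(i for i in range(n) if i not in part)
            if size == n - size and 0 not in part:
                continue                   # don't count the same bipartition twice
            q_past = combine(backward(f, part, other, state),
                             backward(f, other, part, state), part, other, n)
            q_future = combine(forward(f, part, other, state),
                               forward(f, other, part, state), part, other, n)
            cei = min(dkl(p_past, q_past), dkl(p_future, q_future))
            best = min(best, cei)
    return best

def xor_and(state):
    a, b = state
    return (a ^ b, a & b)

print(phi(xor_and, (0, 0)))   # 1.0, matching the worked example

Running this on the XOR/AND system from Level 4 reproduces ɸ = 1, and you can swap in any other small update rule (or a different current state) to see how the answer changes.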

Notice that our final result is a function of f (the updating function) as well as the current state of the system. What this means is that the integrated information of a system can change from moment to moment, even if the organization of the system remains the same.

By itself, this is not enough for the purposes of integrated information theory. Integrated information theory uses ɸ to define gradations of consciousness of systems, but the relationship between ɸ and consciousness isn’t exactly one-to-one (briefly, consciousness resides in non-overlapping local maxima of integrated information).

But this post is really meant to just be about integrated information, and the connections to the theory of consciousness are actually less interesting to me. So for now I’ll stop here! 🙂

Utter confusion about consciousness

I’m starting to get a sense of why people like David Chalmers and Daniel Dennett call consciousness the most mysterious thing known to humans. I’m currently just really confused, and think that pretty much every position available with respect to consciousness is deeply unsatisfactory. In this post, I’ll just walk through my recent thinking.

Against physicalism

In a previous post, I imagined a scientist from the future who told you they had a perfected theory of consciousness, and asked what sort of evidence we could demand to confirm this. This theory of consciousness could presumably be thought of as a complete mapping from physical states to conscious states: a set of psychophysical laws. Questions about the nature of consciousness are then questions about the nature of these laws. Are they ultimately the same kind of laws as chemical laws (derivable in principle from the underlying physics)? Or are they logically distinct laws that must be separately listed in the catalogue of the fundamental facts about the universe?

I take physicalism to be the stance that answers ‘yes’ to the first question and ‘no’ to the second. Dualism and epiphenomenalism answer ‘no’ to the first and ‘yes’ to the second, and are distinguished by the character of the causal relationships between the physical and the conscious entailed by the psychophysical laws.

So, is physicalism right? Imagining that we had a perfect mapping from physical states to conscious states, would this mapping be in principle derivable from the Schrodinger equation? I think the answer to this has to be no; whatever the psychophysical laws are, they are not going to be in principle derivable from physics.

To see why, let’s examine what it looks like when we derive macroscopic laws from microscopic laws. Luckily, we have a few case studies of successful reduction. For instance, you can start with just the Schrodinger equation and derive the structure of the periodic table. In other words, the structure and functioning of atoms and molecules naturally pops out when you solve the equation for systems of many particles.

You can extrapolate this further to larger scale systems. When we solve the Schrodinger equation for large systems of biomolecules, we get things like enzymes and cell membranes and RNA, and all of the structure and functioning corresponding to our laws of biology. And extending this further, we should expect that all of our behavior and talk about consciousness will be ultimately fully accounted for in terms of purely physical facts about the structure of our brain.

The problem is that consciousness is something more than just the words we say when talking about consciousness. While it’s correlated in very particular ways with our behavior (the structure and functioning of our bodies), it is by its very nature logically distinct from these. You can tell me all about the structure and functioning of a physical system, but the question of whether or not it is conscious is a further fact that is not logically entailed. The phrase LOGICALLY entailed is very important here – it may be that as a matter of fact, it is a contingent truth of our universe that conscious facts always correspond to specific physical facts. But this is certainly not a relationship of logical entailment, in the sense that the periodic table is logically entailed by quantum mechanics.

In summary, it looks like we have a problem on our hands if we want to try to derive facts about consciousness from facts about fundamental physics. Namely, the types of things we can derive from something like the Schrodinger equation are facts about complex macroscopic structure and functioning. This is all well and good for deriving chemistry or solid-state physics from quantum mechanics, as these fields are just collections of facts about structure and functioning. But consciousness is an intrinsic property that is logically distinct from properties like macroscopic structure and functioning. You simply cannot expect to start with the Schrodinger equation and naturally arrive at statements like “X is experiencing red” or “Y is feeling sad”, since these are not purely behavioral statements.

Here’s a concise rephrasing of the argument I’ve made, in the form of a trilemma. You cannot consistently accept all three of the following postulates:

  1. There are facts about consciousness.
  2. Facts about consciousness are not logically entailed by the Schrodinger equation (substitute in whatever the fundamental laws of physics end up being).
  3. Facts about consciousness are fundamentally facts about physics.

Denying (1) makes you an eliminativist. Presumably this is out of the question; consciousness is the only thing in the universe that we can know with certainty exists, as it is the only thing that we have direct first-person access to. Indeed, all the rest of our knowledge comes to us by means of our conscious experience, making it in some sense the root of all of our knowledge. The only charitable interpretations I have of eliminativism involve semantic arguments subtly redefining what we mean by “consciousness” away from “that thing which we all know exists from first-hand experience” to something whose existence can actually be cast doubt on.

Denying (2) seems really implausible to me, for the reasons given above.

So denying (3) looks like our only way out.

Okay, so let’s suppose physicalism is wrong. This is already super important. If we accept this argument, then we have a worldview in which consciousness is of fundamental importance to the nature of reality. The list of fundamental facts about the universe will be (1) the laws of physics and (2) the laws of consciousness. This is really surprising for anybody like me who professes a secular worldview that places human beings far from the center of importance in the universe.

But “what about naturalism?” is not the only objection to this position. There’s a much more powerful argument.

Against non-physicalism

Suppose we now think that the fundamental facts about the universe fall into two categories: P (the fundamental laws of physics, plus the initial conditions of the universe) and Q (the facts about consciousness). We’ve already denied that P = Q or that there is a logical entailment relationship from P to Q.

Now we can ask about the causal nature of the psychophysical laws. Does P cause Q? Does Q cause P? Does the causation go both ways?

First, conditional on the falsity of physicalism, we can quickly rule out theories that claim that Q causes P (i.e. dualist theories). This is the old Cartesian picture that is unsatisfactory exactly because of the strength of the physical laws we’ve discovered. In short, physics appears to be causally complete. If you fix the structure and functioning on the microscopic level, then you fix the structure and functioning on the macroscopic level. In the language of philosophy, macroscopic physical facts supervene upon microscopic physical facts.

But now we have a problem. If all of our behavior and functioning is fully causally accounted for by physical facts, then what is there for Q (consciousness) to play a causal role in? Precisely nothing!

We can phrase this as the following trilemma (again, all three of these cannot be simultaneously true):

  1. Physicalism is false.
  2. Physics is causally closed.
  3. Consciousness has a causal influence on the physical world.

Okay, so now we have ruled out any theories in which Q causes P. But now we reach a new and even more damning conclusion. Namely, if facts about consciousness have literally no causal influence on any aspect of the physical world, then they have no causal influence, in particular, on your thoughts and beliefs about your consciousness.

Stop to consider for a moment the implications of this. We take for granted that we are able to form accurate beliefs about our own conscious experiences. When we are experiencing red, we are able to reliably produce accurate beliefs of the form “I am experiencing red.” But if the causal relationship goes from P to Q, then this becomes extremely hard to account for.

What would we expect to happen if our self-reports of our consciousness fell out of line with our actual consciousness? Suppose that you suddenly noticed yourself verbalizing “I’m really having a great time!” when you actually felt like you were in deep pain and discomfort. Presumably the immediate response you would have would be confusion, dismay, and horror. But wait! All of these experiences must be encoded in your brain state! In other words, to experience horror at the misalignment of your reports about your consciousness and your actual consciousness, it would have to be the case that your physical brain state would change in a particular way. And a necessary component of the explanation for this change would be the actual state of your consciousness!

This really gets to the heart of the weirdness of epiphenomenalism (the view that P causes Q, but Q doesn’t causally influence P). If you’re an epiphenomenalist, then all of your beliefs and speculations about consciousness are formed exactly as they would be if your conscious state were totally different. The exact same physical state of you thinking “Hey, this coffee cake tastes delicious!” would arise even if the coffee cake actually tasted like absolute shit.

To be sure, you would still “know” on the inside, in the realm of your direct first-person experience that there was a horrible mismatch occurring between your beliefs about consciousness and your actual conscious experience. But you couldn’t know about it in any way that could be traced to any brain state of yours. So you couldn’t form beliefs about it, feel shocked or horrified about it, have any emotional reactions to it, etc. And if every part of your consciousness is traceable back to your brain state, then your conscious state must be in some sense “blind” to the difference between your conscious state and your beliefs about your conscious state.

This is completely absurd. On the epiphenomenalist view, any correlation between the beliefs you form about consciousness and the actual facts about your conscious state couldn’t possibly be explained by the actual facts about your consciousness. So they must be purely coincidental.

In other words, the following two statements cannot be simultaneously accepted:

  • Consciousness does not causally influence our behavior.
  • Our beliefs about our conscious states are more accurate than random guessing.

So where does that leave us?

It leaves us in a very uncomfortable place. First of all, we should deny physicalism. But the denial of physicalism leaves us with two choices: either Q causes P or it does not.

We should deny the first, because otherwise we are accepting the causal incompleteness of physics.

And we should deny the second, because it leads us to conclude that essentially all of our beliefs about our conscious experiences are almost certainly wrong, undermining all of our reasoning that led us here in the first place.

So here’s a summary of this entire post so far. It appears that the following four statements cannot all be simultaneously true. You must pick at least one to reject.

  1. There are facts about consciousness.
  2. Facts about consciousness are not logically entailed by the Schrodinger equation (substitute in whatever the fundamental laws of physics end up being).
  3. Physics is causally closed.
  4. Our beliefs about our conscious states are more accurate than random guessing.

Eliminativists deny (1).

Physicalists deny (2).

Dualists deny (3).

And epiphenomenalists must deny (4).

I find that the easiest to deny of these four is (2). This makes me a physicalist, but not because I think that physicalism is such a great philosophical position that everybody should hold. I’m a physicalist because it seems like the least horrible of all the horrible positions available to me.

Counters and counters to those counters

A response that I would have once given when confronted by these issues would be along the lines of: “Look, clearly consciousness is just a super confusing topic. Most likely, we’re just thinking wrong about the whole issue and shouldn’t be taking the notion of consciousness so seriously.”

Part of this is right. Namely, consciousness is a super confusing topic. But it’s important to clearly delineate between which parts of consciousness are confusing and which parts are not. I’m super confused about how to make sense of the existence of consciousness, how to fit consciousness into my model of reality, and how to formalize my intuitions about the nature of consciousness. But I’m definitively not confused about the existence of consciousness itself. Clearly consciousness, in the sense of direct first-person experience, exists, and is a property that I have. The confusion arises when we try to interpret this phenomenon.

In addition, “X is super confusing” might be a true statement and a useful acknowledgment, but it doesn’t necessarily push us in one direction over another when considering alternative viewpoints on X. So “X is super confusing” isn’t evidence for “We should be eliminativists about X” over “We should be realists about X.” All it does is suggest that something about our model of reality needs fixing, without pointing to which particular component it is that needs fixing.

One more type of argument that I’ve heard (and maybe made in the past, to my shame) is a “scientific optimism” style of argument. It goes:

Look, science is always confronted with seemingly unsolvable mysteries.  Brilliant scientists in each generation throw their hands up in bewilderment and proclaim the eternal unsolvability of the deep mystery of their time. But then a few generations later, scientists end up finding a solution, and putting to shame all those past scientists that doubted the power of their art.

Consciousness is just this generation’s “great mystery.” Those that proclaim that science can never explain the conscious in terms of the physical are wrong, just as Lord Kelvin was wrong when he affirmed that the behavior of living organisms cannot be explained in terms of purely physical forces, and required a mysterious extra element (the ‘vital principle’ as he termed it).

I think that as a general heuristic, “Science is super powerful and we should be cautious before proclaiming the existence of specific limits on the potential of scientific inquiry” is pretty damn good.

But at the same time, I think that there are genuinely good reasons, reasons that science skeptics in the past didn’t have, for affirming the uniqueness of consciousness in this regard.

Lord Kelvin was claiming that there were physical behaviors that could not be explained by appeal to purely physical forces. This is a very different claim from the claim that there are phenomena that are not even logically reducible to the structural and functional properties of matter. This difference, it seems to me, is extremely significant, and gets straight to the crux of the central mystery of consciousness.

Getting evidence for a theory of consciousness

I’ve been reading about the integrated information theory of consciousness lately, and wondering about the following question. In general, what are the sources of evidence we have for a theory of consciousness?

One way to think about this is to imagine yourself teleported hundreds of years into the future and talking to a scientist in this future world. This scientist tells you that in his time, consciousness is fully understood. What sort of experiments would you expect to be able to run to verify for yourself that the future’s theory of consciousness really is sufficient?

One thing you could do is point to a bunch of different physical systems, ask the scientist what his theory of consciousness says about them, and compare them to your intuitions. So, for instance, does the theory say that you are conscious? What about humans in general? What about people in deep sleep? How about dogs? Chickens? Frogs? Insects? Bacteria? Are Siri-style computer programs conscious? What about a rock? And so on.

The obvious problem with this is that it assumes the validity of your intuitions about consciousness. Sure, it seems obvious that a rock is not conscious, that humans generally are, and that dogs are conscious but less so than humans. But how do we know that these intuitions are trustworthy?

I think the validity of these intuitions is necessarily grounded in our phenomenology and our observations of how it correlates with our physical substance. So, for instance, I notice that when I fall asleep, my consciousness fades in and out. On the other hand, when I wiggle my big toe, this has an effect on the character of my conscious experience, but doesn’t shut it off entirely. This tells me that something about what happens to my body when I fall asleep is relevant to the maintenance of my consciousness, while the angle of my big toe is not.

In general, we make many observations like these and piece together a general theory of how consciousness relates to the physical world, not just in terms of the existence of consciousness, but also in terms of what specific conscious experiences we expect for a given change to our physical system. It tells us, for instance, that receiving a knock on the head or drinking too much alcohol is sometimes sufficient to temporarily suspend consciousness, while breaking a finger or cutting your hair is not.

Now, since we are able to intervene on our physical body at will and observe the results, our model is a causal model. An implication of this is that it should be able to handle counterfactuals. So, for instance, it can give us an answer to the question “Would I still be conscious if I cut my hair off, changed my skin color, shrunk several inches in height, and got a smaller nose?” This answer is presumably yes, because our theory distinguishes between physical features that are relevant to the existence of consciousness and those that are not.

Extending this further, we can ask if we would still be conscious if we gradually morphed into another human being, with a different brain and body. Again, the answer would appear to be yes, as long as nothing essential to the existence of consciousness is severed along the way. But now we are in a position to be able to make inferences about the existence of consciousness in bodies outside our own! For if I think that I would be conscious if I slowly morphed into my boyfriend, then I should also believe that my boyfriend is conscious himself. I could deny this by denying that the same physical states give rise to the same conscious states, but while this is logically possible, it seems quite implausible.

This gives rational grounds for our belief in the existence of consciousness in other humans, and allows us justified access to all of the work in neuroscience analyzing the connection between the brain and consciousness. It also allows us to have a baseline level of trust in the self-reports of other people about their conscious experiences, given the observation that we are generally reliable reporters of our conscious experience.

Bringing this back to our scientist from the future, I can think of some much more convincing tests I would do than the ‘tests of intuition’ that we did at first. Namely, suppose that the scientist was able to take any description of an experience, translate that into a brain state, and then stimulate your brain in such a way as to produce that experience for you. So over and over you submit requests – “Give me a new color experience that I’ve never had before, but that feels vaguely pinkish and bluish, with a high pitch whine in the background”, “Produce in me an emotional state of exaltation, along with the sensation of warm wind rushing through my hair and a feeling of motion”, etc – and over and over the scientist is able to excellently match your request. (Also, wow imagine how damn cool this would be if we could actually do this.)

You can also run the inverse test: you tell the scientist the details of an experience you are having while your brain is being scanned (in such a way that the scientist cannot see it). Then the scientist runs some calculations using their theory of consciousness and makes some predictions about what they’ll see on the brain scan. Now you check the brain scan to see if their predictions have come true.

To me, repeated success in experiments of this kind would be supremely convincing. If a scientist of the future was able to produce at will any experience I asked for (presuming my requests weren’t so far out as to be physically impossible), and was able to accurately translate facts about my consciousness into facts about my brain, and could demonstrate this over and over again, I would be convinced that this scientist really does have a working theory of consciousness.

And note that since this is all rooted in phenomenology, it’s entirely uncoupled from our intuitive convictions about consciousness! It could turn out that the exact framework the scientist is using to calculate the connections between my physical body and my consciousness ends up necessarily entailing that rocks are conscious and that dolphins are not. And if the framework’s predictive success had been demonstrated with sufficient robustness before, I would just have to accept this conclusion as unintuitive but true. (Of course, it would be really hard to imagine how any good theory of consciousness could end up coming to this conclusion, but that’s beside the point.)

So one powerful source of evidence we have for testing a theory of consciousness is the correlations between our physical substance and our phenomenology. Is that all, or are there other sources of evidence out there?

We can straightforwardly adopt some principles from the philosophy of science, such as the importance of simplicity and avoiding overfitting in formulating our theories. So for instance, one theory of consciousness might just be an exhaustive list of every physical state of the brain and what conscious experience this corresponds to. In other words, we could imagine a theory in which all of the basic phenomenological facts of consciousness are taken as individual independent axioms. While this theory will be fantastically accurate, it will be totally worthless to us, and we’d have no reason to trust its predictive validity.

So far, we really just have three criteria for evidence:

  1. Correlations between phenomenology and physics
  2. Simplicity
  3. Avoiding overfitting

As far as I’m concerned, this is all that I’m really comfortable with counting as valid evidence. But these are very much not the only sources of evidence that get referenced in the philosophical literature. There are a lot of arguments that get thrown around concerning the nature of consciousness that I find really hard to classify neatly, although often these arguments feel very intuitively appealing. For instance, one of my favorite arguments for functionalism is David Chalmers’ ‘Fading Qualia’ argument. It goes something like this:

Imagine that scientists of the future are able to produce silicon chips that are functionally identical to neurons and can replicate all of their relevant biological activity. Now suppose that you undergo an operation in which gradually, every single part of your nervous system is substituted out for silicon. If the biological substrate implementing the functional relationships is essential to consciousness, then by the end of this procedure you will no longer be conscious.

But now we ask: when did the consciousness fade out? Was it a sudden or a gradual process? Both seem deeply implausible. Firstly, we shouldn’t expect a sudden drop-out of consciousness from the replacement of a single neuron or cluster of neurons, as this would be a highly unusual level of discreteness. This would also imply the ability to switch the entirety of your consciousness on and off with seemingly insignificant changes to the biological structure of your nervous system.

And secondly, if it is a gradual process, then this implies the existence of “pseudo-conscious” states in the middle of the procedure, where your experiences are markedly distinct from those of the original being but you are pretty much always wrong about your own experiences. Why? Well, the functional relationships have stayed the same! So your beliefs about your conscious states, the memories you form, the emotional reactions you have, will all be exactly as if there has been no change to your conscious states. This seems totally bizarre and, in Chalmers’ words, “we have little reason to believe that consciousness is such an ill-behaved phenomenon.”

Now, this is a fairly convincing argument to me. But I have a hard time understanding why it should be. The argument’s convincingness seems to rely on some very high-level abstract intuitions about the types of conscious experiences we imagine organisms could be having, and I can’t think of a great reason for trusting these intuitions. Maybe we could chalk it up to simplicity, and argue that the notion of consciousness entailed by substrate-dependence must be extremely unparsimonious. But even this connection is not totally clear to me.

A lot of the philosophical argumentation about consciousness feels this way to me; convincing and interesting, but hard to make sense of as genuine evidence.

One final style of argument that I’m deeply skeptical of is arguments from pure phenomenology. This is, for instance, how Giulio Tononi likes to argue for his integrated information theory of consciousness. He starts from five supposedly self-evident truths about the character of conscious experience, then attempts to infer facts about the structure of the physical systems that could produce such experiences.

I’m not a big fan of Tononi’s observations about the character of consciousness. They seem really vaguely worded and hard enough to make sense of that I have no idea if they’re true, let alone self-evident. But it is his second move that I’m deeply skeptical of. The history of philosophers trying to move from “self-evident intuitive truths” to “objective facts about reality” is pretty bad. While we might be plenty good at detailing our conscious experiences, trying to make the inferential leap to the nature of the connection between physics and consciousness is not something you can do just by looking at phenomenology.