A closer look at anthropic tests for consciousness

(This post is the culmination of my last week of posts on anthropics and conservation of expected evidence.)

In an earlier post, I described how anthropic reasoning can apparently give you a way to update on theories of consciousness. This is already weird enough, but I want to make things a little weirder. I want to present an argument that anthropic reasoning in fact implies that we should be functionalists about consciousness.

But first, a brief recap (for more details see the post linked above):

[Image: recap of the setup from the earlier post]

Thus…

[Two images: the conclusions drawn from the recap]

Whenever this experiment is run, roughly 90% of experimental subjects observe snake eyes, and roughly 10% do not. This means that 90% of subjects update in favor of functionalism (by a factor of 9), and only 10% update in favor of substrate dependence theory (also by a factor of 9).

Now suppose that we have a large population that starts out completely agnostic on the question of functionalism vs. substrate dependence. That is, the prior ratio for each individual is 1:

P(functionalism) / P(substrate dependence) = 1

Now imagine that we run arbitrarily many dice-killer experimental setups on the population. We would see an upwards drift in the average beliefs of the population towards functionalism. And in the limit of infinite experiments, we would see complete convergence towards functionalism as the correct theory of consciousness.
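The drift can be sketched numerically. This is a minimal sketch that takes the 90%/10% split and the factor-of-9 updates stated above at face value; the function names, population size, and number of trials are illustrative assumptions, not part of the original setup.

```python
import math
import random

LIKELIHOOD_RATIO = 9.0  # snake eyes favors functionalism 9:1 (figure from the recap)
P_SNAKE_EYES = 0.9      # share of subjects who observe snake eyes (as stated above)


def expected_log_odds_drift(p=P_SNAKE_EYES, ratio=LIKELIHOOD_RATIO):
    """Average change per trial in log-odds (functionalism : substrate dependence)."""
    return p * math.log(ratio) + (1 - p) * math.log(1 / ratio)


def simulate_mean_log_odds(n_people=10_000, n_trials=20, seed=0):
    """Mean log-odds of a population that starts out agnostic (log-odds 0)."""
    rng = random.Random(seed)
    step = math.log(LIKELIHOOD_RATIO)
    total = 0.0
    for _ in range(n_people):
        log_odds = 0.0
        for _ in range(n_trials):
            # Snake eyes moves this person toward functionalism, anything else away.
            log_odds += step if rng.random() < P_SNAKE_EYES else -step
        total += log_odds
    return total / n_people


drift = expected_log_odds_drift()         # 0.8 * ln(9), positive
mean_log_odds = simulate_mean_log_odds()  # close to 20 * drift
```

Under these assumptions the population's average log-odds grows linearly with the number of trials, which is the upward drift described above.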

Now, the only remaining ingredient is what I’ve been going on about the past two days: if you can predict beforehand that a piece of evidence is going to make you on average more functionalist, then you should preemptively update in favor of functionalism.

What we end up with is the conclusion that considering the counterfactual infinity of experimental results we could receive, we should conclude with arbitrarily high confidence that functionalism is correct.

To be clear, the argument is the following:

  1. If we were to be members of a population that underwent arbitrarily many dice-killer trials, we would converge towards functionalism.
  2. Conservation of expected evidence: if you can predict beforehand which direction some observation would move you, then you should preemptively adjust your beliefs in that direction.
  3. Thus, we should preemptively converge towards functionalism.

Premise 1 follows from a basic application of anthropic reasoning. We could deny it, but doing so amounts to denying the self-sampling assumption and ensuring that you will lose in anthropic games.

Premise 2 follows from the axioms of probability theory. It is more or less the statement that you should update your beliefs with evidence, even if this evidence is counterfactual information about the possible results of future experiments.
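For reference, the textbook form of conservation of expected evidence is the identity that the prior equals the probability-weighted average of the possible posteriors. The sketch below checks that identity for a binary observation. The function names are hypothetical, and the 0.9 and 0.1 likelihoods are illustrative numbers chosen to mirror the snake-eyes figures above.

```python
def posterior(prior, p_e_given_h, p_e_given_not_h):
    """Return (P(H | E), P(E)) for a hypothesis H and an observation E."""
    p_e = prior * p_e_given_h + (1 - prior) * p_e_given_not_h
    return prior * p_e_given_h / p_e, p_e


def expected_posterior(prior, p_e_given_h, p_e_given_not_h):
    """Average of the posterior over both outcomes, weighted by how likely each is."""
    post_if_e, p_e = posterior(prior, p_e_given_h, p_e_given_not_h)
    post_if_not_e, p_not_e = posterior(prior, 1 - p_e_given_h, 1 - p_e_given_not_h)
    return p_e * post_if_e + p_not_e * post_if_not_e


# Illustrative numbers only: E is "I observe snake eyes", H is functionalism.
agnostic = expected_posterior(0.5, 0.9, 0.1)
skeptical = expected_posterior(0.3, 0.9, 0.1)
```

In both calls the expected posterior comes back equal to the prior that was passed in, whatever the likelihoods are.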

(If this sounds unintuitive to you at all, consider the following thought experiment: We have two theories of cosmology, one in which 99% of people live in Region A and 1% in Region B, and the other in which 1% live in Region A and 99% in Region B. We now ask where we expect to find ourselves. If we expect to find ourselves in Region A, then we must have higher credence in the first theory than the second. And if we initially did not have this higher credence, then considering the counterfactual question “Where would I find myself if I were to look at which region I am in?” should cause us to update in favor of the first theory.)

Altogether, this argument looks really bulletproof to me. And yet its conclusion seems very wrong.

Can we really conclude with arbitrarily high certainty that functionalism is correct by just going through this sort of armchair reasoning from possible experimental results that we will never do? Should we now be hardcore functionalists?

I’m not quite sure yet what the right way to think about this is. But here is one objection I’ve thought of.

We have only considered one possible version of the dice killer thought experiment (in which the experimenter starts off with 1 human, then chooses 1 human and 9 androids, then 1 human and 99 androids, and so on). In this version, observing snake eyes was evidence for functionalism over substrate dependence theory, which is what causes the population-wide drift towards functionalism.

We can ask, however, if we can construct a variant of the dice killer thought experiment in which snake eyes counts as evidence for substrate dependence theory over functionalism. If so, then we could construct an experimental setup that we can predict beforehand will end up with us converging with arbitrary certainty to substrate dependence theory!

Let’s see how this might be done. We’ll imagine the set of all variants on the thought experiment (that is, the set of all choices the dice killer could make about how many humans and androids to kidnap in each round.)

[Image: definition of the set of all variants]

For ease of notation, we’ll abbreviate functionalism and substrate dependence theory as F and S respectively.

[Image: the abbreviations F and S]

And we’ll also introduce a convenient notation for the total number of humans and the total number of androids ever kidnapped by round N.

[Image: notation for the running totals of humans and androids]

Now, we want to calculate the probability of snake eyes given functionalism in this general setup, and compare it to the probability of snake eyes given substrate dependence theory. The first step will be to consider the probability of snake eyes if the experiment happens to end on the nth round, for some n. This is just the number of individuals in the last round divided by the total number of kidnapped individuals.

[Image: probability of snake eyes given that the experiment ends on round n]

Now, we calculate the average probability of snake eyes (the average fraction of individuals in the last round).

[Image: average probability of snake eyes]
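This average can be evaluated numerically for any kidnapping schedule. The sketch below is one way to do it, under assumptions that go beyond the text: the names `humans`, `androids`, and `snake_eyes_probabilities` are hypothetical, the infinite sum over rounds is truncated at `max_rounds`, and "individuals" means humans plus androids under F but humans only under S.

```python
P_STOP = 1 / 36  # chance of snake eyes on any given round


def snake_eyes_probabilities(humans, androids, max_rounds=1000):
    """Return (P(snake eyes | F), P(snake eyes | S)) for a kidnapping schedule.

    humans(n) and androids(n) give how many are kidnapped in round n = 1, 2, ...
    """
    p_f = p_s = 0.0
    total_h = total_a = 0
    p_reach = 1.0  # probability that the experiment reaches round n
    for n in range(1, max_rounds + 1):
        h, a = humans(n), androids(n)
        total_h += h
        total_a += a
        p_end_here = p_reach * P_STOP
        # Divide the integers first so very large counts never overflow a float.
        p_f += p_end_here * ((h + a) / (total_h + total_a))
        if total_h > 0:
            p_s += p_end_here * (h / total_h)
        p_reach *= 1 - P_STOP
    return p_f, p_s


# The version considered so far: 1 human per round, group sizes 1, 10, 100, ...
p_f, p_s = snake_eyes_probabilities(lambda n: 1, lambda n: 10 ** (n - 1) - 1)
```

For that schedule the two values come out near 0.90 and 0.10, matching the figures used earlier. Other schedules can be tried by passing different functions for the two sequences.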

The question is thus whether we can find a pair of sequences

[Image: the two sequences, one for humans and one for androids]

such that the first term is larger than the second.

[Image: the inequality to be satisfied]

It seems hard to imagine that there are no such pairs of sequences that satisfy this inequality, but thus far I haven’t been able to find an example. For now, I’ll leave it as an exercise for the reader!

If there are no such pairs of sequences, then it is tempting to take this as extremely strong evidence for functionalism. But I am concerned about this whole line of reasoning. What if there are a few such pairs of sequences? What if there are far more in which functionalism is favored than those in which substrate dependence is favored? What if there are an infinity of each?

While I buy each step of the argument, it seems wrong to say that the right thing to do is to consider the infinite set of all possible anthropic experiments you could do, and then somehow average over the results of each to determine the direction in which we should update our theories of consciousness. Indeed, I suspect that any such averaging procedure would be vulnerable to arbitrariness in the way that the experiments are framed, such that different framings give different results.

At this point, I’m pretty convinced that I’m making some fundamental mistake here, but I’m not sure exactly where this mistake is. Any help from readers would be greatly appreciated. 🙂

Explaining anthropic reasoning

I realize that I’ve been a little unclear in my last few posts. I presupposed a degree of familiarity with anthropic reasoning that most people don’t have. I want to remedy that by providing a short explanation of what anthropic reasoning is and why it is useful.

First of all, one thing that may be confusing is that the term ‘anthropic reasoning’ is used in multiple very different ways. In particular, its most common usage is probably in arguments about the existence of God, where it is sometimes presented as an argument against the evidential force of fine tuning. I have no interest in this, so please don’t take me to be using the term this way. My usage is identical with that of Nick Bostrom, who wrote a fantastic book about anthropic reasoning. You’ll see precisely what this usage entails shortly, but I just want to plant a flag now in case I use the word ‘anthropic’ in a way that you are unfamiliar with.

Good! Now, let’s start with a few thought experiments.

  1. Suppose that the universe consists of one enormous galaxy, divided into a central region and an outer region. The outer region is densely populated with intelligent life, containing many trillions of planetary civilizations at any given moment. The central region is hostile to biology, and at any given time has only a few hundred planetary civilizations. It is impossible for life to develop beyond the outer region of the galaxy.

    Now, you are a member of a planetary civilization that knows all of this, but doesn’t know its location in the galaxy. You reason that it is:

    (a) As likely that you are in the central region as it is that you are in the outer region
    (b) More likely
    (c) Less likely

  2. Suppose that the universe consists of one galaxy that goes through life phases. In its early phase, life is very rare and the galaxy is typically populated by only a few hundred planetary civilizations. In its middle phase, life is plentiful and the galaxy is typically populated by billions of planetary civilizations. And in its final phase, which lasts for the rest of the history of the universe, it is impossible for life to evolve.

    You are born into a planetary civilization that knows all of this, but doesn’t know what life phase the galaxy is in. You reason that it is:

    (a) As likely that you are in the early phase as the middle phase
    (b) More likely
    (c) Less likely

  3. You are considering two competing theories of cosmology. In Cosmology X, 1% of life exists in Region A and 99% in Region B. In Cosmology Y, 99% of life is in Region A and 1% in Region B. You currently don’t know which region you are in, and have equal credence in Cosmology X and Cosmology Y.

    Now you perform an experiment that locates yourself in the universe. You find that you are in Region A. How should your beliefs change?

    (a) They should stay the same
    (b) Cosmology X becomes more likely than Cosmology Y
    (c) Cosmology Y becomes more likely than Cosmology X

If you answered (c) for all three, then congratulations, you’re already an expert anthropic reasoner!
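For the third question, the update can be worked out directly with Bayes’ theorem. This is a minimal sketch using the numbers from the question (equal priors, 1% vs. 99% of life in Region A); the function name is hypothetical.

```python
def posterior_given_region_a(prior_x=0.5, p_a_given_x=0.01, p_a_given_y=0.99):
    """Return (P(X | Region A), P(Y | Region A)) by Bayes' theorem."""
    prior_y = 1 - prior_x
    evidence = prior_x * p_a_given_x + prior_y * p_a_given_y  # P(Region A)
    return prior_x * p_a_given_x / evidence, prior_y * p_a_given_y / evidence


post_x, post_y = posterior_given_region_a()
```

With equal priors, finding yourself in Region A leaves Cosmology X with 1% credence and Cosmology Y with 99%, which is answer (c).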

What we want to do is explain why (c) was the right answer in all three cases, and see if we can unearth any common principles. You might think that this is unnecessary; after all, aren’t we just using a standard application of Bayes’ theorem? Sort of, but there’s a little more going on here. Consider, for instance, the following argument:

1. Most people have property X,
2. Therefore, I probably have property X.

Ignoring the base rate fallacy here, there is an implicit assumption involved in the jump from 1 to 2. This assumption can be phrased as follows:

I should reason about myself as if I am randomly sampled from the set of all people.

A similar principle turns out to be implicit in the reasoning behind our answers to the three starting questions. For question 1, it was something like

I should reason about myself as if I am randomly sampled from the set of all intelligent organisms in the universe at this moment.

For 2, it might be

I should reason about myself as if I am randomly sampled from the set of all intelligent organisms in the history of the universe.

And for 3, it is pretty much the same as 1:

I should reason about myself as if I am randomly sampled from all intelligent organisms in the universe.

These various sampling assumptions really amount to the notion that we should reason about ourselves the same way we reason about anything else. If somebody hands us a marble from an urn that contains 99% black marbles (and we have no other information), we should think this marble has a 99% chance of being black. If we learn that 99% of individuals like us exist in Region A rather than Region B (and we have no other information), then we should think that we have a 99% chance of being in Region A.

In general, we can assert the Self-Sampling Assumption (SSA):

SSA: In the absence of more information, I should reason about myself as if I am randomly sampled from the set of all individuals like me.

The “individuals like me” is what gives this principle the versatility to handle all the various cases we’ve discussed so far. It’s slightly vague, but will do for now.

And now we have our first anthropic principle! We’ve seen how eminently reasonable this principle is in the way that it handles the cases we started with. But at the same time, accepting this basic principle pretty quickly leads to some unintuitive conclusions. For instance:

  1. It’s probably not the case that there are other intelligent civilizations that have populations many times larger than ours (for instance, galactic societies).
  2. It’s probably not the case that we exist in the first part of a long and glorious history of humanity in which we expand across space and populate the galaxy (this is called the Doomsday argument).
  3. On average, you are probably pretty average in most ways. (Though there might be a selection effect to be considered in who ends up regularly reading this blog.)

These are pretty dramatic conclusions for a little bit of armchair reasoning! Can it really be that we can assert the extreme improbability of a glorious future and the greater likelihood of doomsday from simply observing our birth order in the history of humanity? Can we really draw these types of conclusions about the probable distributions of intelligent life in our universe from simply looking at facts about the size of our species?

It is tempting to just deny that this reasoning is valid. But to do so is to reject the simple and fairly obvious-seeming principle that justified our initial conclusions. Perhaps we can find some way to accept (c) as the answer for the three questions we started with while still denying the three conclusions I’ve just listed, but it’s not at all obvious how.

Just to drive the point a little further, let’s look at (2) – the Doomsday argument – again. The argument is essentially this:

Consider two theories of human history. In Theory 1, humans have a brief flash of exponential growth and planetary domination, but then go extinct not much later. In this view, we (you and me) are living in a fairly typical point in the history of humanity, existing near its last few years when its population is greatest.

In Theory 2, humans continue to expand and expand, spreading civilization across the solar system and eventually the galaxy. In this view, the future of humanity is immense and glorious, and involves many trillions of humans spread across hundreds or thousands of planets for many hundreds of thousands of years.

We’d all like Theory 2 to be the right one. But when we consider our place in history, we must admit that it seems vastly less likely for us to be in the very tiny period of human history in which we still exist on one planet than it is for us to be in the height of human history where most people live.

By analogy, imagine a bowl filled with numbered marbles. We have two theories about the number of marbles in the bowl. Theory 1 says that there are 10 marbles in the bowl. Theory 2 says that there are 10,000,000. Now we draw a marble and see that it is numbered 7. How should this update our credences in these two theories?

Well, on Theory 2, getting a 7 is one million times less likely than it is on Theory 1. So Theory 1 gets a massive evidential boost from the observation. In fact, if we consider the set of all possible theories of how many marbles there are in the bowl, the greatest update goes to the theory that says that there are exactly 7 marbles. Theories that say any fewer than 7 are made impossible by the observation, and theories that say more than 7 are progressively less likely as the number goes up.
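The marble numbers are easy to check. The sketch below assumes the marble is drawn uniformly at random from a bowl whose marbles are numbered 1 through N; the function name and the range of candidate theories (1 to 100 marbles) are illustrative choices.

```python
from fractions import Fraction


def likelihood(drawn, n_marbles):
    """P(drawing marble number `drawn` | the bowl holds marbles 1..n_marbles)."""
    return Fraction(1, n_marbles) if 1 <= drawn <= n_marbles else Fraction(0)


# How much more likely is drawing a 7 on Theory 1 (10 marbles)
# than on Theory 2 (10,000,000 marbles)?
ratio = likelihood(7, 10) / likelihood(7, 10_000_000)

# Among candidate theories "the bowl holds n marbles", which one makes a 7 most likely?
candidates = range(1, 101)
best = max(candidates, key=lambda n: likelihood(7, n))
```

The ratio is exactly one million, theories with fewer than 7 marbles get likelihood zero, and the single best-supported theory is the one with exactly 7 marbles.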

This is exactly analogous to our birth order in the history of humanity. The self-sampling assumption says that given that you are a human, you should treat yourself as if you are randomly sampled from the set of all humans there will ever be. If you are, say, the one trillionth human, then the most likely theory is that there are not many more than a trillion humans that will ever exist. And theories that say there will be fewer than a trillion humans are ruled out definitively by the observation. Comparing the theory that says there will be a trillion trillion humans throughout history to the theory that says there will be a trillion humans throughout history, the first is a trillion times less likely!

In other words, applying the self-sampling assumption to your birth order in the history of humanity, we update in favor of a shortly upcoming doomsday. To be clear, this is not the same as saying that doomsday soon is inevitable and that all other sources of evidence for doomsday or not-doomsday are irrelevant. This is just another piece of evidence to be added to the set of all evidence we have when drawing inferences about the future of humanity, albeit a very powerful one.

Okay, great! So far we’ve just waded into anthropic reasoning. The self-sampling assumption is just one of a few anthropic principles that Nick Bostrom discusses, and there are many other mind-boggling implications of this style of reasoning. But hopefully I have whetted your appetite for more, as well as given you a sense that this style of reasoning is both nontrivial to refute and deeply significant to our reasoning about our circumstance.

In favor of anthropic reasoning

Often the ideas in my recent posts regarding anthropic reasoning and the Dice Killer thought experiment are met with skepticism. The sense is that something about the reasoning process being employed is rotten, and that the simple intuitive answers are right after all.

While I understand the impulse to challenge these unusual ideas, I think they have more going for them than might be obvious. In this post, I’ll present a basic argument for why we should reason anthropically: because doing so allows you to win!

We can see this in the basic Dice Killer scenario. (I won’t rehash the details of the thought experiment here, but you can find them at this link).

The non-anthropic reasoner saw a 50% chance of death if they tried escaping and only a 3% chance of death if they didn’t. The anthropic reasoner saw a 50% chance of dying if they tried escaping and a 90% chance of death if not. Naturally, the anthropic reasoner takes the escape route, and the non-anthropic reasoner does not. Now, how do these strategies compare?

Suppose that all of those who get kidnapped are non-anthropic reasoners. Then none of them try escaping, so about 90% of them end up dying in the last round. What if they are all anthropic reasoners? Then they all try escaping, so only 50% of them die.

This is clearly a HUGE win for anthropic reasoning. Anthropic reasoners lower their chance of dying by 40 percentage points (from about 90% to 50%)! A simple explanation for this is that they’re taking advantage of all the information available to them, including indexical information about their state of being.
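The 90% figure for the no-escape strategy can be checked with a quick simulation. This is a rough sketch that assumes the group sizes 1, 10, 100, and so on from the original thought experiment, and a 1/36 chance of snake eyes each round. The function names, run count, and seed are arbitrary choices, and the 50% figure for escaping is taken from the text rather than simulated.

```python
import random


def fraction_killed(rng, p_snake_eyes=1 / 36):
    """Play one full spree; return the share of everyone kidnapped who ends up dying."""
    group, total = 1, 0
    while True:
        total += group
        if rng.random() < p_snake_eyes:
            return group / total  # only the final group is killed
        group *= 10  # everyone is released and ten times as many are taken


def average_fraction_killed(n_runs=20_000, seed=0):
    """Average the per-spree death share over many independent sprees."""
    rng = random.Random(seed)
    return sum(fraction_killed(rng) for _ in range(n_runs)) / n_runs


no_escape = average_fraction_killed()  # close to 0.90
```

In every single spree the final group makes up at least 90% of all those kidnapped, so the average cannot fall below 0.9 no matter how the dice land.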

We can also construct variants of this thought experiment in which non-anthropic reasoners end up taking bets that lose them money on average, while anthropic reasoners always avoid such losing bets. These thought experiments run on the same basic principle as the Dice Killer scenario – sometimes you can construct deals that look net positive until you take the anthropic perspective, at which point they turn net negative.

In other words, if somebody refuses to use anthropic reasoning, you can turn them into a money pump, taking more and more of their money until they change their mind! This is a pragmatic argument for why even if you find this form of reasoning to be unusual and unintuitive, you should take it seriously.

Clarifying self-defeating beliefs

In a previous post, I mentioned self-defeating beliefs as a category that I am confused about. I wrote:

How should we reason about self defeating beliefs?

The classic self-defeating belief is “This statement is a lie.” If you believe it, then you are compelled to disbelieve it, eliminating the need to believe it in the first place. Broadly speaking, self-defeating beliefs are those that undermine the justifications for belief in them.

Here’s an example that might actually apply in the real world: Black holes glow. The process of emission is known as Hawking radiation. In principle, any configuration of particles with a mass less than the black hole can be emitted from it. Larger configurations are less likely to be emitted, but even configurations such as a human brain have a non-zero probability of being emitted. Henceforth, we will call such configurations black hole brains.

Now, imagine discovering some cosmological evidence that the era in which life can naturally arise on planets circling stars is finite, and that after this era there will be an infinite stretch of time during which all that exists are black holes and their radiation. In such a universe, the expected number of black hole brains produced is infinite (a tiny finite probability multiplied by an infinite stretch of time), while the expected number of “ordinary” brains produced is finite (assuming a finite spatial extent as well).

What this means is that discovering this cosmological evidence should give you an extremely strong boost in credence that you are a black hole brain. (Simply because most brains in your exact situation are black hole brains.) But most black hole brains have completely unreliable beliefs about their environment! They are produced by a stochastic process which cares nothing for producing brains with reliable beliefs. So if you believe that you are a black hole brain, then you should suddenly doubt all of your experiences and beliefs. In particular, you have no reason to think that the cosmological evidence you received was veridical at all!

I don’t know how to deal with this. It seems perfectly possible to find evidence for a scenario that suggests that we are black hole brains (I’d say that we have already found such evidence, multiple times). But then it seems we have no way to rationally respond to this evidence! In fact, if we do a naive application of Bayes’ theorem here, we find that the probability of receiving any evidence in support of black hole brains is 0!

So we have a few options. First, we could rule out any possible skeptical scenarios like black hole brains, as well as anything that could provide any amount of evidence for them (no matter how tiny). Or we could accept the possibility of such scenarios but face paralysis upon actually encountering evidence for them! Both of these seem clearly wrong, but I don’t know what else to do.

A friend (whose blog Compassionate Equilibria you should definitely check out) left a comment in response, saying:

I think I feel somewhat less confused about self-defeating beliefs (at least when considering the black hole brain scenario maybe I would feel more confused about other cases).

It seems like the problem might be when you say “imagine discovering some cosmological evidence that the era in which life can naturally arise on planets circling stars is finite, and that after this era there will be an infinite stretch of time during which all that exists are black holes and their radiation.” Presumably, whatever experience you had that you are interpreting as this cosmological evidence is an experience that you would actually be very unlikely to have given that you exist in that universe and as a result shouldn’t be interpreted as evidence for existing in such a universe. Instead you would have to think about in what kind of universe would you be most likely to have those experiences that naively seemed to indicate living in a universe with an infinity of black hole brains.

This could be a very difficult question to answer but not totally intractable. This also doesn’t seem to rule out starting with a high prior in being a black hole brain and it seems like you might even be able to get evidence for being a black hole brain (although I’m not sure what this would be; maybe having some crazy jumble of incoherent experiences while suddenly dying?).

I think this is a really good point that clears up a lot of my confusion on the topic. My response ended up being quite long, so I’ve decided to make it its own post.

*** My response starts here ***

The key point that I was stuck on before reading this comment was the notion that this argument puts a strong a priori constraint on the types of experiences we can expect to have. This is because P(E) is near zero when E strongly implies a theory and that theory undermines E.

Your point, which seems right, is: It’s not that it’s impossible or near impossible to observe certain things that appear to strongly suggest a cosmology with an infinity of black hole brains. It’s that we can observe these things, and they aren’t actually evidence for these cosmologies (for just the reasons you laid out).

That is, there just aren’t observations that provide evidence for radical skeptical scenarios. Observations that appear to provide such evidence prove not to do so upon closer examination. It’s about the fact that the belief that you are a black hole brain is by construction unmotivateable: this is what it means to say P(E) ~ 0. (More precisely, the types of observations that actually provide evidence for black hole brains are those that are not undermined by the belief in black hole brains. Your “crazy jumble of incoherent experiences” might be a good example of this. And importantly, basically any scientific evidence of the sort that we think could adjudicate between different cosmological theories will be undermined.)

One more thing as I digest this: Previously I had been really disturbed by the idea that I’d heard mentioned by Sean Carroll and others that one criterion for a feasible cosmology is that it doesn’t end up making it highly likely that we are black hole brains. This seemed like a bizarrely strong a priori constraint on the types of theories we allow ourselves to consider. But this actually makes a lot of sense if conceived of not as an a priori constraint but as a combination of two things: (1) updating on the strong experiential evidence that we are not black hole brains (the extremely structured and self-consistent nature of our experiences) and (2) noticing that these theories are very difficult to motivate, as most pieces of evidence that intuitively seem to support them actually don’t upon closer examination.

So (1) the condition that P(E) is near zero is not necessarily a constraint on your possible experiences, and (2) it makes sense to treat cosmologies that imply that we are black hole brains as empirically unsound and nearly unmotivateable.

Now, I’m almost all the way there, but still have a few remaining hesitations.

One thing is that things get more confusing when you break an argument for black hole brains down into its component parts and try to figure out where exactly you went wrong. Like, say you already have a whole lot of evidence that after a finite length of time, the universe will be black holes forever, but don’t yet know about Hawking radiation. So far everything is fine. But now scientists observe Hawking radiation. From this they conclude that black holes radiate, though they don’t have a theory of the stochastic nature of the process that entails that it can in principle produce brains. They then notice that Hawking radiation is actually predicted by combining aspects of QM and GR, and see that this entails that black holes can produce brains. Now they have all the pieces that together imply that they are black hole brains, but at which step did they go wrong? And what should they conclude now? They appear to have developed a mountain of solid evidence that, when put together (and combined with some anthropic reasoning), straightforwardly implies that they are black hole brains. But this can’t be the case, since this would undermine the evidence they started with.

We can frame this as a multilemma. The general reasoning process that leads to the conclusion that we are black hole brains might look like:

  1. We observe nature.
  2. We generate laws of physics from these observations.
  3. We predict from the laws of physics that there is a greater abundance of black hole brains than normal brains.
  4. We infer from (3) that we are black hole brains (via anthropic reasoning).

Either this process fails at some point, or we should believe that we are black hole brains. Our multilemma (five propositions, at least one of which must be accepted) is thus:

  1. Our observations of nature were invalid.
  2. Our observations were valid, but our inference of laws of physics from them was invalid.
  3. Our inference of laws of physics from our observations was valid, but our inference from these laws of there being a greater abundance of black hole brains than normal brains was invalid.
  4. Our inference from the laws of there being a greater abundance of black hole brains than normal brains was valid, but the anthropic step was invalid.
  5. We are black hole brains.

Clearly we want to deny (5). I also would want to deny (3) and (4) – I’m imagining them to be fairly straightforward deductive steps. (1) is just some form of skepticism about our access to nature, which I also want to deny. The best choice, it looks like, is (2): our inductive inference of laws of physics from observations of nature is flawed in some way. But even this is a hard bullet to bite. It’s not sufficient to just say that other laws of physics might equally well or better explain the data. What is required is to say that in fact our observations don’t really provide compelling evidence for QM, GR, and so on.

So the end result is that I pretty much want to deny every possible way the process could have failed, while also denying the conclusion. But we have to deny something! This is clearly not okay!

Summing up: The remaining disturbing thing to me is that it seems totally possible to accidentally run into a situation where your best theories of physics inevitably imply (by a process of reasoning each step of which you accept is valid) that you are a black hole brain, and I’m not sure what to do next at that point.

Getting empirical evidence for different theories of consciousness

Previously, I described a thought experiment in which a madman kidnaps a person, then determines whether or not to kill them by rolling a pair of dice. If they both land 1 (snake eyes), then the madman kills the person. Otherwise, the madman lets them go and kidnaps ten new people. He rolls the dice again and if he gets snake eyes, kills all ten. Otherwise he lets them go and finds 100 new people. Et cetera until he eventually gets snake eyes, at which point he kills all the currently kidnapped people and stops his spree.

If you find that you have been kidnapped, then your chance of death depends upon the dice landing snake eyes, which happens with probability 1/36. But we can also calculate the average fraction of people kidnapped that end up dying. We get the following:

[Image: calculation of the average fraction of kidnapped people who die, roughly 90%]

We already talked about how this is unusually high compared to the 1/36 chance of the dice landing snake eyes, and how to make sense of the difference here.
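One way to write this calculation out is sketched below. The notation is an assumption chosen for this sketch, with round n holding 10^(n-1) people and the spree ending on round n with probability (35/36)^(n-1) (1/36).

```latex
\begin{align*}
P(\text{ends on round } n) &= \left(\tfrac{35}{36}\right)^{n-1} \tfrac{1}{36} \\[4pt]
\text{fraction killed if it ends on round } n
  &= \frac{10^{\,n-1}}{\sum_{k=1}^{n} 10^{\,k-1}}
   = \frac{9 \cdot 10^{\,n-1}}{10^{\,n} - 1} \;>\; \frac{9}{10} \\[4pt]
\text{average fraction killed}
  &= \sum_{n=1}^{\infty} \left(\tfrac{35}{36}\right)^{n-1} \tfrac{1}{36}
     \cdot \frac{9 \cdot 10^{\,n-1}}{10^{\,n} - 1} \;\approx\; 0.90
\end{align*}
```

Since the per-round fraction never drops below 9/10, the average is pinned near 90% however long the spree runs.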

In this post, we’ll talk about a much stranger implication. To get there, we’ll start by considering a variant of the initial thought experiment. This will be a little weird, but there’s a nice payout at the end, so stick with it.

In our variant, our madman kidnaps not only people, but also rocks. (The kidnapper does not “rock”; he kidnaps pieces of stone.) He starts out by kidnapping a person, then rolls his dice. Just like before, if he gets snake eyes, he kills the person. And if not, he frees the person and kidnaps a new group. This new group consists of 1 person and 9 rocks. Now if the dice come up snake eyes, the person is killed and the 9 rocks pulverized. And if not, they are all released, and 1 new person and 99 rocks are gathered.

To be clear, the pattern is:

First Round: 1 person
Second Round: 1 person, 9 rocks
Third Round: 1 person, 99 rocks
Fourth Round: 1 person, 999 rocks
and so on…

Now, we can run the same sort of anthropic calculation as before:

[Equation image: the same calculation with only one person per round, which comes out to roughly 10%]

Evidently, this time you have roughly a 10% chance of dying if you find yourself kidnapped! (Notice that this is still worse than 1/36, though a lot better than 90%).

Okay, so we have two scenarios, one in which 90% of those kidnapped die and the other in which 10% of those kidnapped die.

Now let’s make a new variant on our thought experiment, and set it in a fictional universe of my creation.

In this world there exist androids – robotic intelligences that behave, look, and feel like any ordinary human. They are so well integrated into society that most people don’t actually know if they are a biological person or an android. The primary distinction between the two groups is, of course, that one has a brain made of silicon transistors and the other has a brain made of carbon-based neurons.

There is a question of considerable philosophical and practical importance in this world, which is: Are androids conscious just like human beings? This question has historically been a source of great strife in this world. On the one hand, some biological humans argue that the substrate is essential to the existence of consciousness and that therefore non-carbon-based life forms can never be conscious, no matter how well they emulate conscious beings. This thesis is known as the substrate-dependence view.

On the other hand, many argue that we have no good reason to dismiss the androids’ potential consciousness. After all, they are completely indistinguishable from biological humans, and have the same capacity to introspect and report on their feelings and experiences. Some android philosophers even have heated debates about consciousness. Plus, the internal organization of androids is pretty much identical to that of biological humans, indicating that the same sort of computation is going on in both organisms. It is argued that clearly consciousness arises from the patterns of computation in a system, and that on that basis androids are definitely conscious. The people that support this position are called functionalists (and, no great surprise, all androids that are aware that they are androids are functionalists).

The fundamental difference between the two stances can be summarized easily: Substrate-dependence theorists think that to be conscious, you must be a carbon-based life form operating on cells. Functionalists think that to be conscious, you must be running a particular type of computation, regardless of what material that computation is running on.

In this world, the debate runs on endlessly. The two sides marshal philosophical arguments to support their positions and hurl them at each other with little to no effect. Androids insist vehemently that they are as conscious as anybody else, functionalists say “See?? Look at how obviously conscious they are,” and substrate-dependence theorists say “But this is exactly what you’d expect to hear from an unconscious replica of a human being! Just because you built a machine that can cleverly perform the actions of conscious beings does not mean that it really is conscious”.

It is soon argued by some that this debate can never be settled. This camp, known as the mysterians, says that there is something fundamentally special and intrinsically mysterious about consciousness that bars us from ever being able to answer these types of questions, or even provide evidence for them. They point to the subjective nature of experience and the fact that you can only really know whether somebody is conscious by entering their head, which is impossible. The mysterians’ arguments are convincing to many, and their following grows stronger by the day as the debates between the other parties appear ever more futile.

With this heated debate in the backdrop, we can now introduce a new variant on the dice killer setup.

The killer starts like before by kidnapping a single human (not an android). If he rolls snake eyes, this person is killed. If not, he releases them and kidnaps one new human and nine androids. (Sounding familiar?) If he rolls snake eyes, all ten are killed, and if not, one new person and 99 new androids are kidnapped. Etc. Thus we have:

First Round: 1 person
Second Round: 1 person, 9 androids
Third Round: 1 person, 99 androids
Fourth Round: 1 person, 999 androids
and so on…

You live in this society, and are one of its many citizens that doesn’t know if they are an android or a biological human. You find yourself kidnapped by the killer. How worried should you be about your survival?

If you are a substrate dependence theorist, you will see this case as similar to the variant with rocks. After all, you know that you are conscious. So you naturally conclude that you can’t be an android. This means that there is only one possible person that you could be in each round. So the calculation runs exactly as it did before with the rocks, ending with a 10% chance of death.

If you are a functionalist, you will see this case as similar to the case we started with. You think that androids are conscious, so you don’t rule out any of the possibilities for who you might be. Thus you calculate as we did initially, ending with a 90% chance of death.

Here we pause to notice something very important! Our two different theories of consciousness have made different empirically verifiable predictions about the world! And not only are they easily testable, but they are significantly different. The amount of evidence provided by the observation of snake eyes has to do with the likelihood ratio P(snake eyes | functionalism) / P(snake eyes | substrate dependence). This ratio is roughly 90% / 10% = 9, which means that observing snake eyes tilts the balance by a factor of 9 in favor of functionalism.

More precisely, we use the likelihood ratio to update our prior credences in functionalism and substrate dependence to our posterior credences. That is,

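[Equation image: posterior odds = likelihood ratio × prior odds, i.e. P(functionalism | snake eyes) / P(substrate dependence | snake eyes) ≈ 9 × P(functionalism) / P(substrate dependence)]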

This is a significant update. It can be made even more significant by altering the details of the setup. But the most important point is that there is an update at all. If what I’ve argued is correct, then the mysterians are demonstrably wrong. We can construct setups that test theories of consciousness, and we know just how!
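To make the size of the update concrete, here is a small Python sketch of the odds-form update for a kidnapped person who sees snake eyes. It assumes, purely for illustration, that functionalism and substrate dependence are the only two options on the table, so the two credences sum to 1; the function name is a hypothetical label.

```python
def update_credence(prior, likelihood_ratio):
    """Posterior credence in functionalism, given a prior credence and the ratio
    P(evidence | functionalism) / P(evidence | substrate dependence).

    Assumes the two theories are the only candidates (credences sum to 1).
    """
    prior_odds = prior / (1 - prior)
    posterior_odds = likelihood_ratio * prior_odds
    return posterior_odds / (1 + posterior_odds)


# Likelihood ratio of roughly 90% / 10% = 9 for observing snake eyes.
for prior in (0.1, 0.5, 0.9):
    posterior = update_credence(prior, 9)
    print(f"prior {prior:.2f} -> posterior {posterior:.3f}")
```

Under these assumptions, somebody who starts out evenly split ends up at 90% credence in functionalism after a single observation of snake eyes.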

(There’s an interesting caveat here, which is that this is only evidence for the individual that found themselves to be kidnapped. If an experimenter were watching from the outside and saw the dice land snake eyes, they would get no evidence for functionalism over substrate dependence. This relates to the anthropic nature of the evidence; it is only evidence for the individuals for whom the indexical claims “I have been kidnapped” and “I am conscious” apply.)

So there we have it. We’ve constructed an experimental setup that allows us to test claims of consciousness that are typically agreed to be beyond empirical verification. Granted, this is a pretty destructive setup and would be monstrously unethical to actually enact. But the essential features of the setup can be preserved without the carnage. Rather than snake eyes resulting in the killer murdering everybody kept captive, it could just result in the experimenter saying “Huzzah!” and ending the experiment. Then the key empirical evidence for somebody that has been captured would be whether or not the experimenter says “Huzzah!” If so, then functionalism becomes nine times more likely than it was before relative to substrate dependence.

This would be a perfectly good experiment that we could easily run, if only we could start producing some androids indistinguishable from humans. So let’s get to it, AI researchers!

The Anthropic Dice Killer

Today we discuss anthropic reasoning.

The Problem

Imagine the following scenario:

A mad killer has locked you in a room. You are trapped and alone, with only your knowledge of your situation to help you out.

One piece of information that you have is that you are aware of the maniacal schemes of your captor. His plans began by capturing one random person. He then rolled a pair of dice to determine their fate. If the dice landed snake eyes (both 1), then the captive would be killed. If not, then they would be let free.

But if they are let free, the killer will search for new victims, and this time bring back ten new people and lock them alone in rooms. He will then determine their fate just as before, with a pair of dice. Snake eyes means they die, otherwise they will be let free and he will search for new victims.

His murder spree will continue until the first time he rolls snake eyes. Then he will kill the group that he currently has imprisoned and retire from the serial-killer life.

Now, you become aware of a risky way out of the room you are locked in and on to freedom. The chances of surviving this escape route are only 50%. Your choices are thus either (1) to traverse the escape route with a 50% chance of survival or (2) to just wait for the killer to roll his dice, and hope that they don’t land snake eyes.

What should you do?

 

 

 

(Think about it before reading on)

 

 

 

A plausible-sounding answer

Your chance of dying if you stay and wait is just the chance that the dice land snake eyes. The probability of snake eyes is just 1/36 (1/6 for each die landing 1).

So your chance of death is only 1/36 (≈ 3%) if you wait, and it’s 50% if you try to run for it. Clearly, you are better off waiting!

But…

You guessed it, things aren’t that easy. You have extra information about your situation besides just how the dice work, and you should use it. In particular, the killing pattern of your captor turns out to be very useful information.

Ask the following question: Out of all of the people that have been captured or will be captured at some point by this madman, how many of them will end up dying? This is just the very last group, which, incidentally, is the largest group.

Consider: if the dice land snake eyes the first time they are rolled, then only one person is ever captured, and this person dies. So the fraction of those captured that die is 100%.

If they land snake eyes the second time they are rolled, then 11 people total are captured, 10 of whom die. So the fraction of those captured that die is 10/11, or ≈ 91%.

If it’s the third time, then 111 people total are captured, 100 of whom die. Now the fraction is just over 90%.

In general, no matter how many times the dice are rolled before landing snake eyes, it always ends up that over 90% of those captured end up being in the last round, and thus end up dying.
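The pattern is easy to tabulate. The following Python sketch (function name illustrative) computes, as an exact fraction, the share of everyone ever captured who is in the final group, for each possible roll on which the spree might end.

```python
from fractions import Fraction


def fraction_killed(final_round, growth=10):
    """Share of everyone ever captured who is in the final (killed) group."""
    killed = growth ** (final_round - 1)
    captured = sum(growth ** i for i in range(final_round))
    return Fraction(killed, captured)


for final_round in range(1, 7):
    share = fraction_killed(final_round)
    print(f"snake eyes on roll {final_round}: {share} = {float(share):.4f}")
```

The share starts at 1, drops to 10/11 and then 100/111, and keeps approaching 9/10 from above without ever reaching it.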

So! This looks like bad news for you… you’ve been captured, and over 90% of those that are captured always die. Thus, your chance of death is guaranteed to be greater than 90%.

The escape route with a 50% survival chance is looking nicer now, right?

Wtf is this kind of reasoning??

What we just did is called anthropic reasoning. Anthropic reasoning really just means updating on all of the information available to you, including indexical information (information about your existence, age, location, and so on). In this case, the initial argument neglected the very crucial information that you are one of the people that were captured by the killer. When updating on this information, we get an answer that is very very different from what we started with. And in this life-or-death scenario, this is an important difference!

You might still feel hesitant about the answer we got. After all, if you expect a 90% chance of death, this means that you expect a 90% chance for the dice to land snake eyes. But it’s not that you think the dice are biased or anything… Isn’t this just blatantly contradictory?

This is a convincing-sounding rebuttal, but it’s subtly wrong. The key point is that even though the dice are fair, there is a selection bias in the results you are seeing. This selection bias amounts to the fact that when the dice inevitably land snake eyes, there are more people around to see it. The fact that you are more likely than 1/36 to see snake eyes is kind of like the fact that if you are given the ticket of a random concert-goer, you have a higher chance of ending up seeing a really popular band than if you just looked at the current proportion of shows performed by really popular bands.

It’s kind of like the fact that in your life you will spend more time waiting in long lines than short lines, and that on average your friends have more friends than you. This all seems counterintuitive and wrong until you think closely about the selection biases involved.
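The selection effect can also be seen in a simulation. The sketch below is a minimal Monte Carlo version of the setup, with a fixed random seed and illustrative names; it checks that the dice really are fair (about 1 in 36 rolls is snake eyes) while also measuring the average share of captives who die per spree.

```python
import random


def simulate_spree(rng, growth=10):
    """Run one spree; return (rolls made, people captured, people killed)."""
    rolls, captured, group = 0, 0, 1
    while True:
        rolls += 1
        captured += group
        die_a, die_b = rng.randint(1, 6), rng.randint(1, 6)
        if die_a == 1 and die_b == 1:  # snake eyes: the current group is killed
            return rolls, captured, group
        group *= growth  # everyone is released and a larger group is taken


rng = random.Random(0)  # fixed seed so the run is repeatable
sprees = [simulate_spree(rng) for _ in range(10_000)]

total_rolls = sum(rolls for rolls, _, _ in sprees)
# Every spree ends with exactly one snake eyes, so this is snake eyes per roll.
snake_eyes_rate = len(sprees) / total_rolls
mean_death_share = sum(killed / captured for _, captured, killed in sprees) / len(sprees)

print(f"snake eyes per roll:          {snake_eyes_rate:.4f}")
print(f"mean share of captives killed: {mean_death_share:.4f}")
```

The first number should land near 1/36 ≈ 0.028 and the second near 0.90: the dice are unbiased, but most of the people who ever get captured are in the room when snake eyes finally comes up.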

Anyway, I want to impress upon you that 90% really is the right answer, so I’ll throw some math at you. Let’s calculate in full detail what fraction of the group ends up dying on average.

[Equation image: the expected fraction of captives who die, roughly 90%]

By the way, the discrepancy between the baseline chance of death (1/36) and the anthropic chance of death (90%) can be made as large as you like by manipulating the starting problem. Suppose that instead of 1/36, the chance of the group dying were 1/100, and instead of the group multiplying by 10 in size each round, it grew by a factor of 100. Then the baseline chance of death would be 1%, and the anthropic probability would be roughly 99%.
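Here is a short Python sketch for trying other versions of the problem numerically. It assumes the anthropic probability is the expected share of captives in the final group, and it takes two hypothetical parameters: `p_end`, the chance of death on each round, and `growth`, an integer factor by which the group grows each round.

```python
def anthropic_death_probability(p_end, growth, tol=1e-15):
    """Expected share of all captives who are in the final, killed group.

    p_end:  chance that any given round ends in death (e.g. 1/36).
    growth: integer factor by which the group grows each round (e.g. 10).
    """
    total, p_reach, n = 0.0, 1.0, 1
    while p_reach > tol:  # p_reach = chance the spree gets as far as round n
        # Last group has growth**(n-1) people out of (growth**n - 1)/(growth - 1) in total.
        share = (growth - 1) * growth ** (n - 1) / (growth ** n - 1)
        total += p_reach * p_end * share
        p_reach *= 1 - p_end
        n += 1
    return total


print(f"p = 1/36,  growth = 10:  {anthropic_death_probability(1 / 36, 10):.4f}")
print(f"p = 1/100, growth = 100: {anthropic_death_probability(1 / 100, 100):.4f}")
```

With these inputs the two results should sit just above 0.90 and 0.99 respectively, since the final group's share is always slightly more than (growth - 1)/growth.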

We can find the general formula for any such scenario:

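[Equation image: the general formula for the anthropic probability of death, in terms of the per-round chance of death and the factor by which the group grows]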

IF ANYBODY CAN SOLVE THIS, PLEASE TELL ME! I’ve been trying for too long now and would really like an analytic general solution. 🙂
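For reference, here is one way the sum in question can be written out. The symbols are labels introduced here and may not match the original formula: p is the chance of death on each round, q = 1 − p, and k is the factor by which the group grows. The last line is not a closed form, only a faster-converging rearrangement, obtained by expanding 1/(1 − k^(−n)) as a geometric series and swapping the order of summation.

```latex
\begin{aligned}
F(p, k) &= \sum_{n=1}^{\infty} p\, q^{\,n-1} \cdot \frac{k^{\,n-1}}{1 + k + \dots + k^{\,n-1}}
         = \sum_{n=1}^{\infty} p\, q^{\,n-1}\, \frac{(k-1)\, k^{\,n-1}}{k^{\,n} - 1} \\
        &= \frac{k-1}{k} \left( 1 + p \sum_{m=1}^{\infty} \frac{1}{k^{\,m} - q} \right)
\end{aligned}
```

For p = 1/36 and k = 10 the bracket is about 1.0034, giving F ≈ 0.903. Since the inner sum is positive, this form also shows directly that F is always slightly larger than (k − 1)/k.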

There is a lot more to be said about this thought experiment, but I’ll leave it there for now. In the next post, I’ll present a slight variant on this thought experiment that appears to give us a way to get direct Bayesian evidence for different theories of consciousness! Stay tuned.

Anthropic argument for common priors

(Idea from Robin Hanson and Tyler Cowen’s 2004 paper Are Disagreements Honest?)

One common argument relating to common priors is that two rational agents with all the same information (including no information at all) could have no possible grounds on which to disagree. Priors by definition refer to the state of knowledge before either agent had any evidence relevant to a given proposition. So there is no information that either agent could have that would allow a difference in priors.

A response to this is that some information that we have is inherently private and unique to us. For instance, you and I might have differences in intelligence, in ways of conceptualizing the world, or in the things we innately find intuitively plausible. All of these differences may count as important information in shaping our priors on a given subject, before we ever encounter a single piece of evidence relevant to the subject.

Here’s a really weird argument for why even these differences should not count. If we use anthropic reasoning, and treat our own existence and the details of our brain and body as just another thing to be conditioned on, then even these private intimate details are simply contingent facts about the world that are to be treated as evidence. Before you’ve conditioned on your own existence, you should be agnostic as to which set of brain/body/mind out of all the possible sets of observers “you” will end up being. You must imagine yourself behind Rawls’ veil of ignorance, a disembodied reasoner that is identical to all other such reasoners. So there is no conceivable reason why your prior should differ from anybody else’s – you must treat yourself as literally the same entity as them prior to anthropic conditioning.

In less out-there terms, if you encounter somebody with an apparently different prior from you, then you should consider “Hmm, what if I were born as this person, instead of myself?” The answer to which is, of course, you would have had the same priors as them. Which means that your difference in “priors” is actually a difference of posteriors resulting from conditioning on the arbitrary choice of body/brain/experiences you ended up with.

In addition, by Aumann’s agreement theorem, any apparent differences in priors that become common knowledge should quickly go away, once they are realized to be merely differences in posteriors. Essentially, any differences in priors that last between two rational individuals are signs that they are arbitrarily favoring their own existence in considerations of what prior they should use.