An expected value puzzle

Consider the following game setup:

Each round of the game starts with you putting in all of your money. If you currently have $10, then you must put in all of it to play. Now a coin is flipped. If it lands heads, you get back 10 times what you put in ($100). If not, then you lose it all. You can keep playing this game until you have no more money.

What does a perfectly rational expected value reasoner do?

Supposing that they value money roughly linearly with its quantity, the expected value of putting in the money is always positive. If you put in $X, then you stand a 50% chance of getting back $10X and a 50% chance of losing it all. Your expected return is thus ½ ∙ $10X = $5X, for an expected net gain of $5X − $X = $4X.

This means that the expected value reasoner would keep putting in their money until, eventually, they lose it all.
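This claim is easy to check numerically. Here's a minimal sketch (the function name `play_until_broke` is mine, not part of the puzzle) simulating a player who follows the expected value reasoner's policy of staking everything each round:

```python
import random

def play_until_broke(bankroll=10, max_rounds=1000, rng=None):
    """Stake the entire bankroll each round on a fair coin that pays 10x on heads.
    Returns how many rounds were played before going broke (or hitting the cap)."""
    rng = rng or random.Random(0)
    rounds = 0
    while bankroll > 0 and rounds < max_rounds:
        bankroll = 10 * bankroll if rng.random() < 0.5 else 0
        rounds += 1
    return rounds

# Each round has positive expected value, yet every simulated player ends at $0:
# eventually a tails arrives, and all-in betting means it wipes out everything.
results = [play_until_broke(rng=random.Random(seed)) for seed in range(1000)]
print(max(results))
```

No simulated player lasts long: the number of rounds survived is just the waiting time for the first tails.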

What’s wrong with this line of reasoning (if anything)? Does it serve as a reductio ad absurdum of expected value reasoning?

The Anthropic Dice Killer

Today we discuss anthropic reasoning.

The Problem

Imagine the following scenario:

A mad killer has locked you in a room. You are trapped and alone, with only your knowledge of your situation to help you out.

One piece of information that you have is that you are aware of the maniacal schemes of your captor. His plans began by capturing one random person. He then rolled a pair of dice to determine their fate. If the dice landed snake eyes (both 1), then the captive would be killed. If not, then they would be let free.

But if they are let free, the killer will search for new victims, and this time bring back ten new people and lock them alone in rooms. He will then determine their fate just as before, with a pair of dice. Snake eyes means they die, otherwise they will be let free and he will search for new victims.

His murder spree will continue until the first time he rolls snake eyes. Then he will kill the group that he currently has imprisoned and retire from the serial-killer life.

Now. You become aware of a risky escape route out of the room you are locked in, leading to freedom. The chances of surviving this escape route are only 50%. Your choices are thus either (1) to traverse the escape route with a 50% chance of survival, or (2) to just wait for the killer to roll his dice and hope that they don’t land snake eyes.

What should you do?




(Think about it before reading on)




A plausible-sounding answer

Your chance of dying if you stay and wait is just the chance that the dice land snake eyes. The probability of snake eyes is just 1/36 (a 1/6 chance for each die to land 1, and the two dice are independent).

So your chance of death is only 1/36 (≈ 3%) if you wait, and it’s 50% if you try to run for it. Clearly, you are better off waiting!


You guessed it, things aren’t that easy. You have extra information about your situation besides just how the dice work, and you should use it. In particular, the killing pattern of your captor turns out to be very useful information.

Ask the following question: out of all of the people that have been captured or will be captured at some point by this madman, how many of them will end up dying? The ones who die are exactly the last group captured, which, incidentally, is also the largest group.

Consider: if the dice land snake eyes the first time they are rolled, then only one person is ever captured, and this person dies. So the fraction of those captured that die is 100%.

If they land snake eyes the second time they are rolled, then 11 people total are captured, 10 of whom die. So the fraction of those captured that die is 10/11, or ≈ 91%.

If it’s the third time, then 111 people total are captured, 100 of whom die. Now the fraction is just over 90%.

In general, no matter how many times the dice are rolled before landing snake eyes, it always ends up that over 90% of those captured are in the last round, and thus end up dying.

So! This looks like bad news for you… you’ve been captured, and over 90% of those that are captured always die. Thus, your chance of death is guaranteed to be greater than 90%.

The escape route with a 50% survival chance is looking nicer now, right?

Wtf is this kind of reasoning??

What we just did is called anthropic reasoning. Anthropic reasoning really just means updating on all of the information available to you, including indexical information (information about your existence, age, location, and so on). In this case, the initial argument neglected the very crucial information that you are one of the people that were captured by the killer. When updating on this information, we get an answer that is very very different from what we started with. And in this life-or-death scenario, this is an important difference!

You might still feel hesitant about the answer we got. After all, if you expect a 90% chance of death, this means that you expect a 90% chance for the dice to land snake eyes. But it’s not that you think the dice are biased or anything… Isn’t this just blatantly contradictory?

This is a convincing-sounding rebuttal, but it’s subtly wrong. The key point is that even though the dice are fair, there is a selection bias in the results you are seeing. This selection bias amounts to the fact that when the dice inevitably land snake eyes, there are more people around to see it. The fact that you are more likely than 1/36 to see snake eyes is kind of like the fact that if you are given the ticket of a random concert-goer, you have a higher chance of ending up seeing a really popular band than you would guess by just looking at the proportion of shows performed by really popular bands.

It’s kind of like the fact that in your life you will spend more time waiting in long lines than short lines, and that on average your friends have more friends than you. This all seems counterintuitive and wrong until you think closely about the selection biases involved.

Anyway, I want to impress upon you that 90% really is the right answer, so I’ll throw some math at you. Let’s calculate in full detail what fraction of those captured end up dying, on average. If snake eyes first comes up on roll n, then 10^(n−1) of the (10^n − 1)/9 people captured so far die, and this happens with probability (1/36)(35/36)^(n−1). So:

E[fraction dying] = Σn≥1 (1/36) ∙ (35/36)^(n−1) ∙ 9∙10^(n−1)/(10^n − 1) ≈ 90%

By the way, the discrepancy between the baseline chance of death (1/36) and the anthropic chance of death (90%) can be made as large as you like by manipulating the starting problem. Suppose that instead of 1/36, the chance of the group dying was 1/100, and instead of the group multiplying by 10 in size each round, it grew by a factor of 100. Then the baseline chance of death would be 1%, and the anthropic probability would be 99%.

We can find the general formula for any such scenario:

E[fraction dying] = Σn≥1 p ∙ (1 − p)^(n−1) ∙ (m − 1)∙m^(n−1)/(m^n − 1), where p is the chance of rolling snake eyes and m is the factor by which the group grows each round.

IF ANYBODY CAN SOLVE THIS, PLEASE TELL ME! I’ve been trying for too long now and would really like an analytic general solution. 🙂
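Even without an analytic solution, the sum is easy to evaluate numerically. Here's a quick sketch (assuming the general setup above: per-round death chance p, group growth factor m) that just truncates the series:

```python
def anthropic_death_probability(p, m, terms=2000):
    """Expected fraction of all captured people who die, summing over the round
    n on which snake eyes first occurs (series truncated at `terms` rounds)."""
    total = 0.0
    for n in range(1, terms + 1):
        last_group = m ** (n - 1)               # size of the final (doomed) group
        all_captured = (m ** n - 1) // (m - 1)  # 1 + m + ... + m^(n-1)
        prob_n = p * (1 - p) ** (n - 1)         # snake eyes first on roll n
        total += prob_n * (last_group / all_captured)
    return total

print(anthropic_death_probability(1/36, 10))    # ≈ 0.903 for the dice killer
print(anthropic_death_probability(1/100, 100))  # ≈ 0.990 for the deadlier variant
```

The truncation error is negligible here, since the weights (1 − p)^(n−1) shrink geometrically.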

There is a lot more to be said about this thought experiment, but I’ll leave it there for now. In the next post, I’ll present a slight variant on this thought experiment that appears to give us a way to get direct Bayesian evidence for different theories of consciousness! Stay tuned.

What do I find conceptually puzzling?

There are lots of things that I don’t know, like, say, what the birth rate in Sweden is or what the effect of poverty on IQ is. There are also lots of things that I find really confusing and hard to understand, like quantum field theory and monetary policy. There’s also a special category of things that I find conceptually puzzling. These things aren’t difficult to grasp because the facts about them are difficult to understand or require learning complicated jargon. Instead, they’re difficult to grasp because I suspect that I’m confused about the concepts in use.

This is a much deeper level of confusion. It can’t be adjudicated by just reading lots of facts about the subject matter. It requires philosophical reflection on the nature of these concepts, which can sometimes leave me totally confused about everything and grasping for the solid ground of mere factual ignorance.

As such, it feels like a big deal when something I’ve been conceptually puzzled about becomes clear. I want to compile a list for future reference of things that I’m currently conceptually puzzled about and things that I’ve become un-puzzled about. (This is not a complete list, but I believe it touches on the major themes.)

Things I’m conceptually puzzled about

What is the relationship between consciousness and physics?

I’ve written about this here.

Essentially, at this point every available viewpoint on consciousness seems wrong to me.

Eliminativism amounts to a denial of pretty much the only thing that we can be sure can’t be denied – that we are having conscious experiences. Physicalism entails the claim that facts about conscious experience can be derived from laws of physics, which is wrong as a matter of logic.

Dualism entails that the laws of physics by themselves cannot account for the behavior of the matter in our brains, which is wrong. And epiphenomenalism entails that our beliefs about our own conscious experience are almost certainly wrong, and are no better representations of our actual conscious experiences than random chance.

How do we make sense of decision theory if we deny libertarian free will?

Written about this here and here.

Decision theory is ultimately about finding the decision D that maximizes expected utility EU(D). But to do this calculation, we have to decide on the set of possible decisions we are searching over.


Make this set too large, and you end up getting fantastical and impossible results (like that the optimal decision is to snap your fingers and make the world into a utopia). Make it too small, and you end up getting underwhelming results (in the extreme case, you just get that the optimal decision is to do exactly what you are going to do, since this is the only thing you can do in a strictly deterministic world).

We want to find a nice middle ground between these two – a boundary where we can say “inside here are the things that are actually possible for us to do, and outside are those that are not.” But any principled distinction between what’s in the set and what’s not must be based on some conception of some actions being “truly possible” for us, and others being truly impossible. I don’t know how to make this distinction in the absence of a robust conception of libertarian free will.

Are there objectively right choices of priors?

I’ve written about this here.

If you say no, then there are no objectively right answers to questions like “What should I believe given the evidence I have?” And if you say yes, then you have to deal with thought experiments like the cube problem, where any choice of priors looks arbitrary and unjustifiable.

(If you are going to be handed a cube, and all you know is that it has a volume less than 1 cm3, then setting maximum entropy priors over volumes gives different answers than setting maximum entropy priors over side areas or side lengths. This means that what qualifies as “maximally uncertain” depends on whether we frame our reasoning in terms of side length, areas, or cube volume. Other approaches besides MaxEnt have similar problems of concept dependence.)
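The conflict is easy to see numerically. Here's a small sketch (my own illustration, not from the original cube problem statement) comparing a uniform "maximally uncertain" prior over side length with one over volume; the same event gets very different probabilities under the two:

```python
import random

rng = random.Random(0)
N = 100_000

# Prior 1: maximum entropy (uniform) over side length L on (0, 1) cm.
lengths = [rng.random() for _ in range(N)]
p_by_length = sum(l ** 3 < 0.5 for l in lengths) / N

# Prior 2: maximum entropy (uniform) over volume V on (0, 1) cm^3.
volumes = [rng.random() for _ in range(N)]
p_by_volume = sum(v < 0.5 for v in volumes) / N

# P(volume < 1/2): about 0.794 under the length-uniform prior
# (since it holds whenever L < 0.5 ** (1/3) ≈ 0.794),
# but exactly 0.5 under the volume-uniform prior.
print(p_by_length, p_by_volume)
```

Both priors claim to encode "complete uncertainty" about the same cube, yet they disagree about every non-trivial event.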

How should we deal with infinities in decision theory?

I wrote about this here, here, here, and here.

The basic problem is that expected utility theory does great at delivering reasonable answers when the rewards are finite, but becomes wacky when the rewards become infinite. There are a huge number of examples of this. For instance, in the St. Petersburg paradox, you are given the option to play a game with an infinite expected payout, suggesting that you should buy in to the game no matter how high the cost. You end up making obviously irrational choices, such as spending $1,000,000 on the hope that a fair coin will land heads 20 times in a row. Variants of this involve the inability of EU theory to distinguish between obviously better and worse bets that have infinite expected value.
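To see the divergence concretely, here's a small sketch computing the expected payout of a St. Petersburg game truncated at a maximum number of flips (assuming the standard payoff scheme: $2^k if the first heads arrives on flip k):

```python
def st_petersburg_ev(max_flips):
    """Expected payout when the game is truncated at max_flips flips.
    The game pays $2^k if the first heads arrives on flip k (probability 1/2^k)."""
    return sum((0.5 ** k) * (2 ** k) for k in range(1, max_flips + 1))

# Every allowed flip adds exactly $1 of expected value, so the EV grows without
# bound as the cap is lifted: the untruncated game has infinite expected payout.
for cap in (10, 100, 1000):
    print(cap, st_petersburg_ev(cap))
```

Each term of the sum is (1/2^k) ∙ 2^k = 1, which is exactly why no finite buy-in price can exceed the uncapped game's expected value.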

And Pascal’s mugging is an even worse case. Roughly speaking, a person comes up to you and threatens you with infinite torture if you don’t submit to them and give them 20 dollars. Now, the probability that this threat is credible is surely tiny. But it is non-zero! (as long as you don’t think it is literally logically impossible for this threat to come true)

An infinite penalty times a finite probability is still an infinite expected penalty. So we stand to gain an infinite expected utility by just handing over the 20 dollars. This seems ridiculous, but I don’t know any reasonable formalization of decision theory that allows me to refute it.

Is causality fundamental?

Causality has been nicely formalized by Pearl’s probabilistic graphical models. This is a simple extension of probability theory, out of which causality and counterfactuals naturally fall.

One can use this framework to represent the states of fundamental particles and how they change over time and interact with one another. What I’m confused about is that in some ways of looking at it, the causal relations appear to be useful but un-fundamental constructs for the sake of easing calculations. In other ways of looking at it, causal relations are necessarily built into the structure of the world, and we can go out and empirically discover them. I don’t know which is right. (Sorry for the vagueness in this one – it’s confusing enough to me that I have trouble even precisely phrasing the dilemma).

How should we deal with the apparent dependence of inductive reasoning upon our choices of concepts?

I’ve written about this here. Beyond just the problem of concept-dependence in our choices of priors, there’s also the problem presented by the grue/bleen thought experiment.

This thought experiment proposes two new concepts: grue (= the set of things that are either green before 2100 or blue after 2100) and bleen (the inverse of grue). It then shows that if we reasoned in terms of grue and bleen, standard induction would have us concluding that all emeralds will suddenly turn blue after 2100. (We repeatedly observed them being grue before 2100, so we should conclude that they will be grue after 2100.)

In other words, choose the wrong concepts and induction breaks down. This is really disturbing – choices of concepts should be merely pragmatic matters! They shouldn’t function as fatal epistemic handicaps. And given that they appear to, we need to develop some criterion we can use to determine what concepts are good and what concepts are bad.

The trouble with this is that the only proposals I’ve seen for such a criterion reference the idea of concepts that “carve reality at its joints”; in other words, the world is composed of green and blue things, not grue and bleen things, so we should use the former rather than the latter. But this relies on the outcome of our inductive process to draw conclusions about the starting step on which this outcome depends!

I don’t know how to cash out “good choices of concepts” without ultimately reasoning circularly. I also don’t even know how to make sense of the idea of concepts being better or worse for more than merely pragmatic reasons.

How should we reason about self defeating beliefs?

The classic self-defeating belief is “This statement is a lie.” If you believe it, then you are compelled to disbelieve it, eliminating the need to believe it in the first place. Broadly speaking, self-defeating beliefs are those that undermine the justifications for belief in them.

Here’s an example that might actually apply in the real world: Black holes glow. The process of emission is known as Hawking radiation. In principle, any configuration of particles with a mass less than the black hole can be emitted from it. Larger configurations are less likely to be emitted, but even configurations such as a human brain have a non-zero probability of being emitted. Henceforth, we will call such configurations black hole brains.

Now, imagine discovering some cosmological evidence that the era in which life can naturally arise on planets circling stars is finite, and that after this era there will be an infinite stretch of time during which all that exists are black holes and their radiation. In such a universe, the expected number of black hole brains produced is infinite (a tiny finite probability multiplied by an infinite stretch of time), while the expected number of “ordinary” brains produced is finite (assuming a finite spatial extent as well).

What this means is that discovering this cosmological evidence should give you an extremely strong boost in credence that you are a black hole brain. (Simply because most brains in your exact situation are black hole brains.) But most black hole brains have completely unreliable beliefs about their environment! They are produced by a stochastic process which cares nothing for producing brains with reliable beliefs. So if you believe that you are a black hole brain, then you should suddenly doubt all of your experiences and beliefs. In particular, you have no reason to think that the cosmological evidence you received was veridical at all!

I don’t know how to deal with this. It seems perfectly possible to find evidence for a scenario that suggests that we are black hole brains (I’d say that we have already found such evidence, multiple times). But then it seems we have no way to rationally respond to this evidence! In fact, if we do a naive application of Bayes’ theorem here, we find that the probability of receiving any evidence in support of black hole brains is 0!

So we have a few options. First, we could rule out any possible skeptical scenarios like black hole brains, as well as anything that could provide any amount of evidence for them (no matter how tiny). Or we could accept the possibility of such scenarios but face paralysis upon actually encountering evidence for them! Both of these seem clearly wrong, but I don’t know what else to do.

How should we reason about our own existence and indexical statements in general?

This is called anthropic reasoning. I haven’t written about it on this blog, but expect future posts on it.

A thought experiment: imagine a murderous psychopath who has decided to go on an unusual rampage. He will start by abducting one random person. He rolls a pair of dice, and kills the person if they land snake eyes (1, 1). If not, he lets them free and hunts down ten new people. Once again, he rolls his pair of dice. If he gets snake eyes he kills all ten. Otherwise he frees them and kidnaps 100 new people. On and on until he eventually gets snake eyes, at which point his murder spree ends.

Now, you wake up and find that you have been abducted. You don’t know how many others have been abducted alongside you. The murderer is about to roll the dice. What is your chance of survival?

Your first thought might be that your chance of death is just the chance of both dice landing 1: 1/36. But think instead about the proportion of all people that are ever abducted by him that end up dying. This value ends up being roughly 90%! So once you condition upon the information that you have been captured, you end up being much more worried about your survival chance.

But at the same time, it seems really wrong to be watching the two dice tumble and internally thinking that there is a 90% chance that they land snake eyes. It’s as if you’re imagining that there’s some weird anthropic “force” pushing the dice towards snake eyes. There’s way more to say about this, but I’ll leave it for future posts.

Things I’ve become un-puzzled about

Newcomb’s problem – one box or two box?

To almost everyone, it is perfectly clear and obvious what should be done. The difficulty is that these people seem to divide almost evenly on the problem, with large numbers thinking that the opposing half is just being silly.

– Nozick, 1969

I’ve spent months and months being hopelessly puzzled about Newcomb’s problem. I now am convinced that there’s an unambiguous right answer, which is to take the one box. I wrote up a dialogue here explaining the justification for this choice.

In a few words, you should one-box because one-boxing makes it nearly certain that the simulation of you run by the predictor also one-boxed, thus making it nearly certain that you will get 1 million dollars. The dependence between your action and the simulation is not an ordinary causal dependence, nor even a spurious correlation – it is a logical dependence arising from the shared input-output structure. It is the same type of dependence that exists in the clone prisoner dilemma, where you can defect or cooperate with an individual you are assured is identical to you in every single way. When you take into account this logical dependence (also called subjunctive dependence), the answer is unambiguous: one-boxing is the way to go.

Summing up:

Things I remain conceptually confused about:

  • Consciousness
  • Decision theory & free will
  • Objective priors
  • Infinities in decision theory
  • Fundamentality of causality
  • Dependence of induction on concept choice
  • Self-defeating beliefs
  • Anthropic reasoning

The Monty Hall non-paradox

I recently showed the famous Monty Hall problem to a friend. This friend solved the problem right away, and we realized quickly that the standard presentation of the problem is highly misleading.

Here’s the setup as it was originally described in the magazine column that made it famous:

Suppose you’re on a game show, and you’re given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what’s behind the doors, opens another door, say No. 3, which has a goat. He then says to you, “Do you want to pick door No. 2?” Is it to your advantage to switch your choice?

I encourage you to think through this problem for yourself and come to an answer. I’ll provide some blank space so that you don’t accidentally read ahead.







Now, the writer of the column was Marilyn vos Savant, famous for having an impossible IQ of 228 according to an interpretation of a test that violated “almost every rule imaginable concerning the meaning of IQs” (psychologist Alan Kaufman). In her response to the problem, she declared that switching gives you a 2/3 chance of winning the car, as opposed to a 1/3 chance for staying. She argued by analogy:

Yes; you should switch. The first door has a 1/3 chance of winning, but the second door has a 2/3 chance. Here’s a good way to visualize what happened. Suppose there are a million doors, and you pick door #1. Then the host, who knows what’s behind the doors and will always avoid the one with the prize, opens them all except door #777,777. You’d switch to that door pretty fast, wouldn’t you?

Notice that this answer contains a crucial detail that is not contained in the statement of the problem! Namely, the answer adds the stipulation that the host “knows what’s behind the doors and will always avoid the one with the prize.”

The original statement of the problem in no way implies this general statement about the host’s behavior. All you are justified to assume in an initial reading of the problem are the observational facts that (1) the host happened to open door No. 3, and (2) this door happened to contain a goat.

When nearly a thousand PhDs wrote in to the magazine explaining that her answer was wrong, she gave further arguments that failed to reference the crucial point: that her answer was only true given additional unstated assumptions.

My original answer is correct. But first, let me explain why your answer is wrong. The winning odds of 1/3 on the first choice can’t go up to 1/2 just because the host opens a losing door. To illustrate this, let’s say we play a shell game. You look away, and I put a pea under one of three shells. Then I ask you to put your finger on a shell. The odds that your choice contains a pea are 1/3, agreed? Then I simply lift up an empty shell from the remaining other two. As I can (and will) do this regardless of what you’ve chosen, we’ve learned nothing to allow us to revise the odds on the shell under your finger.

Notice that this argument is literally just a restatement of the original problem. If one didn’t buy the conclusion initially, restating it in terms of peas and shells is unlikely to do the trick!

This problem was made even more famous by this scene in the movie “21”, in which the protagonist demonstrates his brilliance by coming to the same conclusion as vos Savant. While the problem is stated slightly better in this scene, enough ambiguity still exists that the proper response should be that the problem is underspecified, or perhaps that there is a set of different answers, one for each set of auxiliary assumptions.

The wiki page on this ‘paradox’ describes it as a veridical paradox, “because the correct choice (that one should switch doors) is so counterintuitive it can seem absurd, but is nevertheless demonstrably true.”

Later on the page, we see the following:

In her book The Power of Logical Thinking, vos Savant (1996, p. 15) quotes cognitive psychologist Massimo Piattelli-Palmarini as saying that “no other statistical puzzle comes so close to fooling all the people all the time,” and “even Nobel physicists systematically give the wrong answer, and that they insist on it, and they are ready to berate in print those who propose the right answer.”

There’s something to be said about adequacy reasoning here; when thousands of PhDs and some of the most brilliant mathematicians in the world are making the same point, perhaps we are too quick to write it off as “Wow, look at the strength of this cognitive bias! Thank goodness I’m bright enough to see past it.”

In fact, the source of all of the confusion is fairly easy to understand, and I can demonstrate it in a few lines.

Solution to the problem as presented

Initially, all three doors are equally likely to contain the car.
So Pr(1) = Pr(2) = Pr(3) = ⅓

We are interested in how these probabilities update upon the observation that 3 does not contain the car.
Pr(1 | ~3) = Pr(1)・Pr(~3 | 1) / Pr(~3)
= (⅓ ・1) / ⅔ = ½

By the same argument,
Pr(2 | ~3) = ½

Voila. There’s the simple solution to the problem as it is presented, with no additional presumptions about the host’s behavior. Accepting this argument requires only accepting three premises:

(1) Initially all doors are equally likely to be hiding the car.

(2) Bayes’ rule.

(3) There is only one car.

(3) implies that Pr(the car is not behind a door | the car is behind a different door) = 100%, which we use when we replace Pr(~3 | 1) with 1.

The answer we get is perfectly obvious; in the end all you know is that the car is either in door 1 or door 2, and that you picked door 1 initially. Since which door you initially picked has nothing to do with which door the car was behind, and the host’s decision gives you no information favoring door 1 over door 2, the probabilities should be evenly split between the two.

It is also the answer that all the PhDs gave.

Now, why does taking into account the host’s decision process change things? Simply because the host’s decision is now contingent on your decision, as well as the actual location of the car. Given that you initially opened door 1, the host is guaranteed to not open door 1 for you, and is also guaranteed to not open up a door hiding the car.

Solution with specified host behavior

Initially, all three doors are equally likely to contain the car.
So Pr(1) = Pr(2) = Pr(3) = ⅓

We update these probabilities upon the observation that 3 does not contain the car, using the likelihood formulation of Bayes’ rule.

Pr(1 | open 3) / Pr(2 | open 3)
= Pr(1) / Pr(2)・Pr(open 3 | 1) / Pr(open 3 | 2)
= ⅓ / ⅓・½ / 1 = ½

So Pr(1 | open 3) = ⅓ and Pr(2 | open 3) = ⅔

Pr(open 3 | 2) = 1, because the host has no choice of which door to open if you have selected door 1 and the car is behind door 2.

Pr(open 3 | 1) = ½, because the host has a choice of either opening 2 or 3.

In fact, it’s worth pointing out that this requires another behavioral assumption about the host that is nowhere stated in the original problem or in vos Savant’s solution. This is that if there is a choice about which of two doors to open, the host will pick randomly.

This assumption is again not obviously correct from the outset; perhaps the host chooses the larger of the two door numbers in such cases, or the one closer to themselves, or the smaller number with 25% probability. There are infinitely many possible strategies the host could be using, and this particular strategy must be explicitly stipulated to get the answer that Wiki proclaims to be correct.
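A quick simulation makes the difference between the two host models concrete. This is a sketch under the assumptions stated above: the knowing host always reveals a goat (tie-breaking at random), while the ignorant host opens one of the two unchosen doors at random, and we simply discard the runs where he happens to reveal the car:

```python
import random

def monty_trial(host_knows, rng):
    doors = [1, 2, 3]
    car = rng.choice(doors)
    pick = 1  # by symmetry, you always pick door 1
    if host_knows:
        # Knowing host: never opens your door or the car's door (random tie-break).
        opened = rng.choice([d for d in doors if d != pick and d != car])
    else:
        # Ignorant host: opens door 2 or 3 at random, possibly revealing the car.
        opened = rng.choice([2, 3])
        if opened == car:
            return None  # discard: not the situation described in the problem
    return car == pick  # True if staying with door 1 wins

def stay_win_rate(host_knows, n=200_000, seed=0):
    rng = random.Random(seed)
    outcomes = [monty_trial(host_knows, rng) for _ in range(n)]
    kept = [o for o in outcomes if o is not None]
    return sum(kept) / len(kept)

print(stay_win_rate(host_knows=True))   # ≈ 1/3: switching wins 2/3 of the time
print(stay_win_rate(host_knows=False))  # ≈ 1/2: switching doesn't help
```

The two programs differ only in the host's door-opening rule, yet they give the two different answers derived above.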

It’s also worth pointing out that once these additional assumptions are made explicit, the ⅓ answer is fairly obvious and not much of a paradox. If you know that the host is guaranteed to choose a door with a goat behind it, and not one with a car, then of course their decision about which door to open gives you information. It gives you information because it would have been less likely in the world where the car was under door 1 than in the world where the car was under door 2.

In terms of causal diagrams, the second formulation of the Monty Hall problem makes your initial choice of door and the location of the car dependent upon one another. There is a path of causal dependency that goes forwards from your decision to the host’s decision, which is conditioned upon, and then backward from the host’s decision to which door the car is behind.

Any unintuitiveness in this version of the Monty Hall problem is ultimately due to the unintuitiveness of the effects of conditioning on a common effect of two variables.

(Causal diagram: your choice of door → host’s choice of door ← location of the car)

In summary, there is no paradox behind the Monty Hall problem, because there is no single Monty Hall problem. There are two different problems, each containing different assumptions, and each with different answers. The answers to each problem are fairly clear after a little thought, and the only appearance of a paradox comes from apparent disagreements between individuals that are actually just talking about different problems. There is no great surprise when ambiguous wording turns out to have multiple plausible solutions; it’s just surprising that so many people see something deeper than mere ambiguity here.

Taxonomy of infinity catastrophes for expected utility theory

Basics of expected utility theory

I’ve talked quite a bit in past posts about the problems that infinities raise for expected utility theory. In this post, I want to systematically go through and discuss the different categories of problems.

First of all, let’s define expected utility theory.

Given an action A, we have a utility function U over the possible consequences
U = { U1, U2, U3, … UN }
and a credence distribution P over the consequences
P = { P1, P2, P3, … PN }.
We define the expected utility of A to be EU(A) = P1U1 + P2U2 + … + PNUN

Expected Utility Theory:
The rational action is that which maximizes expected utility.

Just to give an example of how this works out, suppose that we can choose between two actions A1 and A2, defined as follows:

Action A1
U1 = { 20, -10 }
P1 = { 50%, 50% }

Action A2
U2 = { 10, -20 }
P2 = { 80%, 20% }

We can compare the expected utilities of these two actions by using the above formula.

EU(A1) = 20∙50% + -10∙50% = 5
EU(A2) = 10∙80% + -20∙20% = 4

Since EU(A1) is greater than EU(A2), expected utility theory mandates that A1 is the rational act for us to take.
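The computation above can be sketched in a few lines of code:

```python
def expected_utility(utilities, probabilities):
    """EU(A) = P1*U1 + P2*U2 + ... + PN*UN"""
    return sum(p * u for p, u in zip(probabilities, utilities))

eu_a1 = expected_utility([20, -10], [0.50, 0.50])
eu_a2 = expected_utility([10, -20], [0.80, 0.20])
print(eu_a1, eu_a2)  # 5.0 4.0 -- so A1 is the rational choice
```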

Expected utility theory seems to work out fine in the case of finite payouts, but becomes strange when we begin to introduce infinities. Before even talking about the different problems that arise, though, you might be tempted to brush off this issue, thinking that infinite payouts don’t really exist in the real world.

While this is a tenable position to hold, it is certainly not obviously correct. We can easily construct games that are actually do-able that have an infinite expected payout. For instance, a friend of mine runs the following procedure whenever it is getting late and he is trying to decide whether or not he should head home: First, he flips a coin. If it lands heads, he heads home. If tails, he waits one minute and then re-flips the coin. If it lands heads this time, he heads home. If tails, then he waits two minutes and re-flips the coin. On the next flip, if it lands tails, he waits four minutes. Then eight. And so on. The danger of this procedure is that on average, he ends up staying out for an infinitely long period of time.

This is a more dangerous real-world application of the St. Petersburg Paradox (although you’ll be glad to know that he hasn’t yet been stuck hanging out with me for an infinite amount of time). We might object: Yes, in theory this has an infinite expected time. But we know that in practice, there will be some cap on the total possible time. Perhaps this cap corresponds to the limit of tolerance that my friend has before he gives up on the game. Or, more conclusively, there is certainly an upper limit in terms of his life span.
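To see the divergence concretely, here is a sketch (in Python) of the truncated expected waiting time: stopping after k tails costs 1 + 2 + … + 2^(k−1) = 2^k − 1 minutes and happens with probability 2^−(k+1), so each successive term contributes nearly half a minute and the partial sums grow without bound:

```python
def truncated_expected_wait(n):
    """Expected waiting time, truncated after n possible runs of tails.
    Each term is (2**k - 1) / 2**(k + 1), which approaches 1/2 as k grows."""
    return sum((2**k - 1) / 2**(k + 1) for k in range(n))

for n in (10, 100, 1000):
    print(n, truncated_expected_wait(n))  # roughly n/2 - 1 minutes
```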

Are there any real infinities out there that could translate into infinite utilities? Once again, plausibly not. But it doesn’t seem impossible that such infinities could arise. For instance, even if we wanted to map utilities onto positive-valence experiences and believed that there was a theoretical upper limit on the amount of positivity you could possibly experience in a finite amount of time, we could still appeal to the possibility of an eternity of happiness. If God appeared before you and offered you an eternity of existence in Heaven, then you would presumably be considering an offer with a net utility of positive infinity. Maybe you think this is implausible (I certainly do), but it is at least possible that we could be confronted with real infinities in expected utility calculations.

Reassured that infinite utilities are probably not a priori ruled out, we can now ask: How does expected utility theory handle these scenarios?

The answer is: not well.

There are three general classes of failures:

  1. Failure of dominance arguments
  2. Undefined expected utilities
  3. Nonsensical expected utilities

Failure of dominance arguments

A dominance argument is an argument that says you should prefer one action over another because it leaves you at least as well off no matter what is the case.

Here’s an example. Consider two lotteries: Lottery 1 and Lottery 2. Each one decides on whether a player wins or not by looking at some fixed random event (say, whether or not a radioactive atom decays within a fixed amount of time T), but the reward for winning differs. If the radioactive atom does decay within time T, then you would get $100,000 from Lottery 1 and $200,000 from Lottery 2. If it does not, then you lose $200 in Lottery 1 and $100 in Lottery 2. Now imagine that you can choose only one of these two lotteries.

To summarize: If the atom decays, then Lottery 1 gives you $100,000 less than Lottery 2. And if the atom doesn’t decay, then Lottery 1 charges you $100 more than Lottery 2.

In other words, no matter what ends up happening, you are better off choosing Lottery 2 than Lottery 1. This means that Lottery 2 dominates Lottery 1 as a strategy. There is no possible configuration of the world in which you would have been better off by choosing Lottery 1 instead of Lottery 2, so this choice is essentially risk-free.

So we have the following general principle, which seems to follow nicely from a simple application of expected utility theory:

Dominance: If action A1 dominates action A2, then it is irrational to choose A2 over A1.

Amazingly, this straightforward and apparently obvious rule ends up failing us when we start to talk about infinite payoffs.

Consider the following setup:

Action 1
U = { ∞, 0 }
P = { .5, .5 }

Action 2
U = { ∞, 10 }
P = { .5, .5 }

Action 2 weakly dominates Action 1. This means that no matter what consequence ends up obtaining, we always end up either better off or equally well off if we take Action 2 than if we take Action 1. But when we calculate the expected utilities…

EU(Action 1) = .5 ∙ ∞ + .5 ∙ 0 = ∞
EU(Action 2) = .5 ∙ ∞ + .5 ∙ 10 = ∞

… we find that the two actions are apparently equal in utility, so we should have no preference between them.
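Floating-point infinity reproduces this indifference directly; a minimal sketch:

```python
import math

# Action 1: 50% chance of infinite utility, 50% chance of nothing
eu1 = 0.5 * math.inf + 0.5 * 0
# Action 2: 50% chance of infinite utility, 50% chance of a small finite bonus
eu2 = 0.5 * math.inf + 0.5 * 10

print(eu1 == eu2)  # True: both evaluate to inf, so EU theory is indifferent
```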

This is pretty bizarre. Imagine the following scenario: God is about to appear in front of you and ship you off to Heaven for an eternity of happiness. In the few minutes before he arrives, you are able to enjoy a wonderfully delicious-looking Klondike bar if you so choose. Obviously the rational thing to do is to eat the Klondike bar, right? Apparently not, according to expected utility theory. The additional little burst of pleasure you get fades into irrelevance as soon as the infinities enter the calculation.

Not only do infinities make us indifferent between two actions, one of which dominates the other, but they can even make us end up choosing actions that are clearly dominated! My favorite example of this is one that I’ve talked about earlier, featuring a recently deceased Donald Trump sitting in Limbo negotiating with God.

To briefly rehash this thought experiment, every day Donald Trump is given an offer by God that he spend one day in Hell and in reward get two days in Heaven afterwards. Each day, the rational choice is for Trump to take the offer, spending one more day in Hell before being able to receive his reward. But since he accepts the offer every day, he ends up always delaying his payout in Heaven, and therefore spends all of eternity in Hell, thinking that he’s making a great deal.

We can think of Trump’s reason for accepting each day as a simple expected utility calculation: U(2 days in Heaven) + U(1 day in Hell) > 0. But iterating this decision an infinity of times ends up leaving Trump in the worst possible scenario – eternal torture.
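A toy simulation makes the trap vivid (10,000 days is an arbitrary stand-in for eternity here): the per-day EU calculation always favors accepting, so Hell days accumulate while the promised Heaven days are forever deferred:

```python
# Each day, accepting has positive EU: U(2 days in Heaven) + U(1 day in Hell) > 0.
# But since he accepts every day, the Heaven payout never actually arrives.
days_in_hell = 0
heaven_days_promised = 0
heaven_days_enjoyed = 0  # never incremented: the reward is always deferred

for day in range(10_000):  # stand-in for eternity
    days_in_hell += 1
    heaven_days_promised += 2

print(days_in_hell, heaven_days_promised, heaven_days_enjoyed)
```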

Undefined expected utilities

Now suppose that you get the following deal from God: Either (Option 1) you die and stop existing (suppose this has utility 0 to you), or (Option 2) you die and continue existing in the afterlife forever. If you choose the afterlife, then your schedule will be arranged as follows: 1,000 days of pure bliss in Heaven, then one day of misery in Hell. Suppose that each day of bliss has finite positive value to you, and each day of misery has finite negative value to you, and that these two values perfectly cancel each other out (a day in Hell is as bad as a day in Heaven is good).

Which option should you take? It seems reasonable that Option 2 is preferable, as you get a thousand to one ratio of happiness to unhappiness for all of eternity.

Option 1: 💀, 💀, 💀, 💀, …
Option 2: 😇 × 1000, 😟, 😇 × 1000, 😟, …

Since U(💀) = 0, we can calculate the expected utility of Option 1 just fine. But what about Option 2? The answer we get depends on the order in which we add up the utilities of each day. If we take the days in chronological order, then we get a total infinite positive utility. If we alternate between Heaven days and Hell days, then we get a total utility of zero. And if we add them up in the order (Hell, Hell, Heaven, Hell, Hell, Heaven, …), then we end up with an infinite negative utility.
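We can watch the rearrangement at work by summing the same days in different orders (+1 for a Heaven day, −1 for a Hell day); a sketch:

```python
from itertools import islice

def partial_sum(days, n):
    """Sum the utilities of the first n days in the given ordering."""
    return sum(islice(days, n))

def chronological():
    # 1,000 Heaven days (+1 each), then one Hell day (-1), repeating forever
    while True:
        for _ in range(1000):
            yield 1
        yield -1

def hell_hell_heaven():
    # rearrangement drawing two Hell days for every Heaven day
    while True:
        yield -1
        yield -1
        yield 1

print(partial_sum(chronological(), 100_000))    # grows without bound
print(partial_sum(hell_hell_heaven(), 100_000)) # falls without bound
```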

In other words, the expected utility of Option 2 is undefined, giving us no guidance as to which option we should prefer. Intuitively, we would want a rational theory of preference to tell us that Option 2 is preferable.

A slightly different example of this: Consider the following three lotteries:

Lottery 1
U = { ∞, -∞ }
P = { .5, .5 }

Lottery 2
U = { ∞, -∞ }
P = { .01, .99 }

Lottery 3
U = { ∞, -∞ }
P = { .99, .01 }

Lottery 1 corresponds to flipping a fair coin to determine whether you go to Heaven forever or Hell forever. Lottery 2 corresponds to picking a number between 1 and 100 to decide. And Lottery 3 corresponds to getting to pick 99 numbers between 1 and 100 to decide. It should be obvious that if you were in this situation, then you should prefer Lottery 3 over Lottery 1, and Lottery 1 over Lottery 2. But here, again, expected utility theory fails us. None of these lotteries have defined expected utilities, because ∞ – ∞ is not well defined.
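IEEE floating-point arithmetic agrees: each of these expected utilities evaluates to ∞ − ∞, which is literally not a number:

```python
import math

# EU = p * (+inf) + (1 - p) * (-inf) for each lottery's win probability p
eus = [p * math.inf + (1 - p) * (-math.inf) for p in (0.5, 0.01, 0.99)]
print(eus)  # [nan, nan, nan]: none of the three lotteries has a defined EU
```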

Nonsensical expected utilities

A stranger approaches you and demands twenty bucks, on pain of an eternity of torture. What should you do?

Expected utility theory tells us that as long as we have some non-zero credence in this person’s threat being credible, then we should hand over the twenty bucks. After all, a small but nonzero probability multiplied by -∞ is still just -∞.

Should we have a non-zero credence in the threat being credible? Plausibly so. To have a zero credence in the threat’s credibility is to hold that no possible evidence could make it any more likely. Is it really true that no experience you could have would make the threat any more credible? What if the stranger demonstrated incredible control over the world around you?

In the end, we have an inconsistent triad.

  1. The rational thing to do is that which maximizes expected utility.
  2. There is a nonzero chance that the stranger threatening you with eternal torture is actually able to follow through on this threat.
  3. It is irrational to hand over the twenty bucks to the stranger.

This is a rephrasing of Pascal’s wager, but without the same problems as that thought experiment.

Pascal’s mugging

  • You should make decisions by evaluating the expected utilities of your various options and choosing the largest one.

This is a pretty standard and uncontroversial idea. There is room for controversy about how to fill in the details about how to evaluate expected utilities, but this basic premise is hard to argue against. So let’s argue against it!

Suppose that a stranger walks up to you in the street and says to you “I have been wired in from outside the simulation to give you the following message: If you don’t hand over five dollars to me right now, your simulator will teleport you to a dungeon and torture you for all eternity.” What should you do?

The obviously correct answer is that you should chuckle, continue on with your day, and laugh about the incident later on with your friends.

The answer you get from a simple application of decision theory is that as long as you aren’t absolutely, 100% sure that they are wrong, you should give them the five dollars. And you should definitely not be 100% sure. Why?

Suppose that the stranger says next: “I know that you’re probably skeptical about the whole simulation business, so here’s some evidence. Say any word that you please, and I will instantly reshape the clouds in the sky into that word.” You do so, and sure enough the clouds reshape themselves. Would this push your credences around a little? If so, then you didn’t start at 100%. Truly certain beliefs are those that can’t be budged by any evidence whatsoever. You can never update downwards on truly certain beliefs, by the definition of ‘truly certain’.

To go more extreme, just suppose that they demonstrate to you that they’re telling you the truth by teleporting you to a dungeon for five minutes of torture, and then bringing you back to your starting spot. If you would even slightly update your beliefs about their credibility in this scenario, then you had a non-zero credence in their credibility from the start.

And after all, this makes sense. You should only have complete confidence in the falsity of logical contradictions, and it’s not literally logically impossible that we are in a simulation, or that the simulator decides to mess with our heads in this bizarre way.

Okay, so you have a nonzero credence in their ability to do what they say they can do. And any nonzero credence, no matter how tiny, will result in the rational choice being to hand over the $5. After all, if expected utility is just calculated by summing up utilities weighted by probabilities, then you have something like the following:

EU(keep $5) – EU(give $5) = ε · U(infinite torture) + U(keep $5)
where ε = P(infinite torture | keep $5) – P(infinite torture | give $5)

As long as losing $5 isn’t infinitely bad to you, you should hand over the money. This seems like a problem, either for our intuitions or for decision theory.
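Plugging illustrative numbers into the difference makes the point; the particular values of ε and the utilities below are made-up assumptions, not estimates:

```python
import math

def eu_difference(epsilon, u_torture, u_keep_5=1.0):
    # Credence gap times the torture's disutility, plus the modest
    # utility of hanging onto the money
    return epsilon * u_torture + u_keep_5

# With infinite disutility, any nonzero credence gap makes keeping infinitely bad:
print(eu_difference(1e-30, -math.inf))  # -inf
# Even a merely astronomical finite disutility swamps the $5:
print(eu_difference(1e-30, -1e60))      # about -1e30
```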


So here are four propositions, and you must reject at least one of them:

  1. There is a nonzero chance of the stranger’s threat being credible.
  2. Infinite torture is infinitely worse than losing $5.
  3. The rational thing to do is that which maximizes expected utility.
  4. It is irrational to give the stranger $5.

I’ve already argued for (1), and (2) seems virtually definitional. So our choice is between (3) and (4). In other words, we either abandon the principle of maximizing expected utility as a guide to instrumental rationality, or we reject our intuitive confidence in the correctness of (4).

Maybe at this point you feel more willing to accept (4). After all, intuitions are just intuitions, and humans are known to be bad at reasoning about very small probabilities and very large numbers. Maybe it actually makes sense to hand over the $5.

But consider where this line of reasoning leads.

The exact same argument should lead you to give in to any demand that the stranger makes of you, as long as it doesn’t have a literal negative infinity utility value. So if the stranger tells you to hand over your car keys, to go dance around naked in a public square, or to commit heinous crimes… all of these behaviors would be apparently rationally mandated.

Maybe, maybe, you might be willing to bite the bullet and say that yes, these behaviors are all perfectly rational, because of the tiny chance that this stranger is telling the truth. I’d still be willing to bet that you wouldn’t actually behave in this self-professedly “rational” manner if I now made this threat to you.

Also, notice that this dilemma is almost identical to Pascal’s wager. If you buy the argument here, then you should also be doing all that you can to ensure that you stay out of Hell. If you’re queasy about the infinities and think decision theory shouldn’t be messing around with such things, then we can easily modify the problem.

Instead of “your simulator will teleport you to a dungeon and torture you for all eternity”, make it “your simulator will teleport you to a dungeon and torture you for 3↑↑↑↑3 years.” The negative utility of this is large enough as to outweigh any reasonable credence you could place in the credibility of the threat. And if it isn’t, we can just make the number of years even larger.
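For reference, Knuth’s up-arrow notation can be defined recursively; here is a sketch that computes the small cases (3↑↑↑↑3 itself is far too large to ever evaluate):

```python
def up_arrow(a, n, b):
    """Knuth's up-arrow: a ↑^n b. One arrow is exponentiation; each
    additional arrow iterates the operation below it."""
    if n == 1:
        return a ** b
    if b == 0:
        return 1
    return up_arrow(a, n - 1, up_arrow(a, n, b - 1))

print(up_arrow(3, 1, 3))  # 3^3 = 27
print(up_arrow(3, 2, 2))  # 3↑↑2 = 3^3 = 27
print(up_arrow(2, 3, 3))  # 2↑↑↑3 = 2↑↑4 = 65536
```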

Maybe the probability of a given payout scales inversely with the size of the payout? But this seems fairly arbitrary. Is it really the case that the ability to torture you for 3↑↑↑↑3 years is twice as likely as the ability to torture you for 2 ∙ 3↑↑↑↑3 years? I can’t imagine why. It seems like the probability of these are going to be roughly equal – essentially, once you buy into the prospect of a simulator that is able to torture you for 3↑↑↑↑3 years, you’ve already basically bought into the prospect that they are able to torture you for twice that amount of time.

All we’re left with is to throw our hands up and say “I can’t explain why this argument is wrong, and I don’t know how decision theory has gone wrong here, but I just know that it’s wrong. There is no way that the actually rational thing to do is to allow myself to get mugged by anybody that has heard of Pascal’s wager.”

In other words, it seems like the correct response to Pascal’s mugging is to reject (3) and deny the expected-utility-maximizing approach to decision theory. The natural next question is: If expected utility maximization has failed us, then what should replace it? And how would it deal with Pascal’s mugging scenarios? I would love to see suggestions in the comments, but I suspect that this is a question that we are simply not advanced enough to satisfactorily answer yet.

Timeless decision theory and homogeneity

Something that seems difficult to me about timeless decision theory is how to reason in a world where most people are not TDTists. In such a world, it seems like the subjunctive dependence between you and others gets weaker and weaker the more TDT influences your decision process.

Suppose you are deciding whether or not to vote. You think through all of the standard arguments you know of: your single vote is virtually guaranteed not to swing the election, so the causal effect of your vote is essentially nothing; the cost to you of voting is tiny, and might even be outweighed by the fun of going with a friend and showing off your “I Voted” sticker all day; if you vote, you might be able to persuade others to vote as well; etc. At the end of your pondering, you decide that it’s overall not worth it to vote.

Now a TDTist pops up behind your shoulder and says to you: “Look, think about all the other people out there reasoning similarly to you. If you end up not voting as a result of this reasoning, then it’s pretty likely that they’ll all not vote as well. On the other hand, if you do end up voting, then they probably will vote too! So instead of treating your decision as if it only makes the world 1 vote different, you should treat it is if it influences all the votes of those sufficiently similar to you.”
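The TDTist’s point can be put as a toy calculation. Every number below is an invented illustration rather than an estimate, and scaling the payoff linearly with the number of correlated voters is a crude simplification:

```python
# Illustrative assumptions only:
p_pivotal = 1e-7            # chance that one vote swings the election
value_of_swing = 1e6        # utility of your side winning
cost_of_voting = 1.0        # hassle of going to the booth
correlated_voters = 10_000  # people reasoning sufficiently like you

# Causal reasoner counts only their own vote; the TDTist treats the
# decision as if it moves every subjunctively dependent vote at once.
eu_causal = p_pivotal * value_of_swing - cost_of_voting
eu_tdt = correlated_voters * p_pivotal * value_of_swing - cost_of_voting

print(eu_causal)  # negative: stay home
print(eu_tdt)     # positive: go vote
```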

Maybe you instantly find this convincing, and decide to go to the voting booth right away. But the problem is that in taking this extra argument into account, you have radically reduced the set of people whose overall reasoning process is similar to yours!

This set initially included everybody who had thought through all the same arguments and felt similarly to you about them, and most of these people ended up not voting. But as soon as the TDTist popped up and presented their argument, the set of people that were subjunctively dependent upon you shrank to just those in the initial set who had also heard this argument.

In a world in which only a single person ever had thought about subjunctive dependence, and this person was not going to vote before thinking about it, the evidential effect of not voting is basically zero. Given this, the argument would have no sway on them.

This seems like it would weaken the TDTist’s case that TDTists do better in real-world problems. At the same time, it seems right: in a case where very few people follow the same reasoning processes as you, your decisions tell you very little about the decisions of others, for the same reason that a highly neuro-atypical person should be hesitant to generalize information about their brain to other people.

Another conclusion of this is that timeless decision theory is most powerful in a community where there is homogeneity of thought and information. Propagation of the idea of timeless decision theory would amplify the coordination-inducing power of the procedure.

I’m not sure if this implies that a TDTist is motivated to spread the idea and homogenize their society, as doing so increases subjunctive dependence and thus enhances their influence. I’d guess that they would only reason this way if they thought themselves to be above average in decision-making, or to have information that others don’t, so that the expected utility of them having increased decision-making ability would outweigh the costs of homogeneity.