100 prisoners problem

I’m in the mood for puzzles, so here’s another one. This one is so good that it deserves its own post.

The setup (from wiki):

The director of a prison offers 100 death row prisoners, who are numbered from 1 to 100, a last chance. A room contains a cupboard with 100 drawers. The director randomly puts one prisoner’s number in each closed drawer. The prisoners enter the room, one after another. Each prisoner may open and look into 50 drawers in any order. The drawers are closed again afterwards.

If, during this search, every prisoner finds his number in one of the drawers, all prisoners are pardoned. If just one prisoner does not find his number, all prisoners die. Before the first prisoner enters the room, the prisoners may discuss strategy—but may not communicate once the first prisoner enters to look in the drawers. What is the prisoners’ best strategy?

Suppose that each prisoner selects 50 drawers at random, and the prisoners don’t coordinate with one another. Then the chance that any particular prisoner finds their own number is 50%. Since each prisoner’s random choices are independent of the others’, the chance that all 100 find their own number is 1/2¹⁰⁰.

Let me emphasize how crazily small this is. 1/2¹⁰⁰ is 1/1,267,650,600,228,229,401,496,703,205,376; less than one in a nonillion. If there were 100 prisoners trying exactly this setup every millisecond, it would take them 40 billion billion years to get out alive once. This is about 3 billion times the age of the universe.
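
These figures are easy to verify with a few lines of arithmetic (a quick sketch):

```python
# One attempt per millisecond; success expected once every 2^100 attempts
attempts = 2 ** 100
print(attempts)  # 1267650600228229401496703205376

seconds = attempts / 1000                # milliseconds -> seconds
years = seconds / (365.25 * 24 * 3600)  # seconds -> years
print(f"{years:.2e}")                   # about 4e19: 40 billion billion years
print(years / 13.8e9)                   # about 3 billion universe-ages
```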

Okay, so that’s a bad strategy. Can we do better?

It’s hard to imagine how… While the prisoners can coordinate beforehand, they cannot share any information. So every time a prisoner comes in for their turn at the drawers, they are in exactly the same state of knowledge as if they hadn’t coordinated with the others.

Given this, how could we possibly increase the survival chance beyond 1/2¹⁰⁰?

(…)

(Try to answer for yourself before continuing)

(…)

Let’s consider a much simpler case. Imagine we have just two prisoners, two drawers, and each one can only open one of them. Now if both prisoners choose randomly, there’s only a 1 in 4 chance that they both survive.

What if they agree to open the same drawer? Then they have reduced their survival chance from 25% to 0%! Why? Because by choosing the same drawer, they either both get the number 1, or they both get the number 2. In either case, they are guaranteed that only one of them gets their own number.

So clearly the prisoners can decrease the survival probability by coordinating beforehand. Can they increase it?

Yes! Suppose that they agree to open different drawers. Then this doubles their survival chance from 25% to 50%. Either they both get their own number, or they both get the wrong number.

The key here is to minimize the overlap between the choices of the prisoners. Unfortunately, this sort of strategy doesn’t scale well. If we have four prisoners, each allowed to open two drawers, then random drawing gives a 1/16 survival chance.

Let’s say they open according to the following scheme: 12, 34, 13, 24 (first prisoner opens drawers 1 and 2, second opens 3 and 4, and so on). Then out of the 24 possible drawer layouts, the only layouts that work are 1432 and 3124:

1234 1243 1324 1342 1423 1432
2134 2143 2314 2341 2413 2431
3124 3142 3214 3241 3412 3421
4123 4132 4213 4231 4312 4321

This gives a 1/12 chance of survival, which is better but not by much.

What if instead they open according to the following scheme: (12, 23, 34, 14)?

1234 1243 1324 1342 1423 1432
2134 2143 2314 2341 2413 2431
3124 3142 3214 3241 3412 3421
4123 4132 4213 4231 4312 4321

Same thing: this time the only layouts that work are 1234 and 4123, again a 1/12 chance of survival.
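
Both schemes are small enough to check by brute force. Here’s a quick enumeration (names are my own) over all 24 layouts:

```python
from itertools import permutations

def survives(layout, scheme):
    # layout[d] = number inside drawer d; scheme[p] = drawers prisoner p opens
    return all(p in {layout[d] for d in drawers}
               for p, drawers in scheme.items())

schemes = {
    "12/34/13/24": {1: (1, 2), 2: (3, 4), 3: (1, 3), 4: (2, 4)},
    "12/23/34/14": {1: (1, 2), 2: (2, 3), 3: (3, 4), 4: (4, 1)},
}

for name, scheme in schemes.items():
    winners = [perm for perm in permutations((1, 2, 3, 4))
               if survives(dict(zip((1, 2, 3, 4), perm)), scheme)]
    print(name, winners)  # 2 winning layouts each: 2/24 = 1/12
```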

Scaling this up to 100 prisoners, the odds of survival look pretty measly. Can they do better than this?

(…)

(Try to answer for yourself before continuing)

(…)

It turns out that yes, there is a strategy that does better at ensuring survival. In fact, it does so much better that the survival chance is over 30 percent!

Take a moment to boggle at this. Somehow we can leverage the dependency induced by the prisoners’ coordination to increase the chance of survival by a factor of about 4 × 10²⁹, even though none of their states of knowledge are any different. It’s pretty shocking to me that this is possible.

Here’s the strategy: Each time a prisoner opens a drawer, they consult the number in that drawer to determine which drawer they will open next. Thus each prisoner only has to decide on the first drawer to open, and all the rest of the drawers follow from this. Importantly, the prisoner only knows the first drawer they’ll pick; the other 49 are determined by the distribution of numbers in the drawers.

We can think of each drawer as starting a chain through the other drawers: open a drawer, then open the drawer labeled by the number inside, and so on. These chains always cycle back to the drawer they started from, the longest possible cycle having length 100 and the shortest length 1. Now, a prisoner who starts at the drawer labeled with their own number is guaranteed to be on a cycle that contains their number: the slip bearing their number sits in the drawer that points back to their starting drawer, the last stop on the cycle!

So, the strategy is that Prisoner N starts by choosing Drawer N, looks at the number within, then opens the drawer labeled with that number. Each prisoner repeats this up to 50 times, stopping early if they find their own number.
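
Here’s a minimal Monte Carlo simulation of this cycle-following strategy (function names are my own):

```python
import random

def all_survive(n=100, limit=50):
    drawers = list(range(n))      # drawers[d] = number hidden in drawer d
    random.shuffle(drawers)
    for prisoner in range(n):
        d = prisoner              # start at the drawer matching your own number
        for _ in range(limit):
            if drawers[d] == prisoner:
                break             # found it
            d = drawers[d]        # next, open the drawer named by what you saw
        else:
            return False          # 50 drawers opened, number never found
    return True

trials = 5_000
print(sum(all_survive() for _ in range(trials)) / trials)  # ≈ 0.31
```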

The wiki page has a good description of how to calculate the survival probability with this strategy:

The prison director’s assignment of prisoner numbers to drawers can mathematically be described as a permutation of the numbers 1 to 100. Such a permutation is a one-to-one mapping of the set of natural numbers from 1 to 100 to itself. A sequence of numbers which after repeated application of the permutation returns to the first number is called a cycle of the permutation. Every permutation can be decomposed into disjoint cycles, that is, cycles which have no common elements.

In the initial problem, the 100 prisoners are successful if the longest cycle of the permutation has a length of at most 50. Their survival probability is therefore equal to the probability that a random permutation of the numbers 1 to 100 contains no cycle of length greater than 50. This probability is determined in the following.

A permutation of the numbers 1 to 100 can contain at most one cycle of length l > 50. There are exactly (100 choose l) ways to select the numbers of such a cycle. Within this cycle, these numbers can be arranged in (l - 1)! ways, since by cyclic symmetry there are (l - 1)! distinct cycles on l given numbers. The remaining numbers can be arranged in (100 - l)! ways. Therefore, the number of permutations of the numbers 1 to 100 with a cycle of length l > 50 is equal to

(100 choose l) · (l - 1)! · (100 - l)! = 100!/l.

The probability that a (uniformly distributed) random permutation contains no cycle of length greater than 50 is then given, using the formulas for disjoint and complementary events, by

1 - (1/100!) · Σ_{l=51}^{100} 100!/l = 1 - Σ_{l=51}^{100} 1/l = 1 - (H₁₀₀ - H₅₀)

where Hₙ denotes the n-th harmonic number. This is about 31.1828%.
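
Since the failing permutations contribute Σ 1/l for l from 51 to 100, one line of Python confirms the number:

```python
# Survival probability = 1 minus the sum of 1/l for cycle lengths l = 51..100
p = 1 - sum(1 / l for l in range(51, 101))
print(f"{p:.6f}")  # 0.311828
```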

This formula easily generalizes to other numbers of prisoners. We can plot the survival chance using this strategy as a function of the number of prisoners:

[Plot: survival chance with this strategy as a function of the number of prisoners]

Amazingly, we can see that this value is asymptoting at a value greater than 30%. The precise limit is 1 – ln2 ≈ 30.685%.
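
We can sanity-check this limit numerically: with 2n prisoners, the failure probability is the harmonic tail Σ 1/l for l from n+1 to 2n, which tends to ln 2 as n grows:

```python
import math

# Survival chance for 2n prisoners, each allowed to open n drawers
for n in (50, 500, 5000):
    survival = 1 - sum(1 / l for l in range(n + 1, 2 * n + 1))
    print(2 * n, "prisoners:", round(survival, 6))

print("limit:", round(1 - math.log(2), 6))  # 0.306853
```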

In other words, no matter how many prisoners there are, we can always ensure that the survival probability is greater than 30%! This is pretty remarkable, and I think there are some deep lessons to be drawn from it, but I’m not sure what they are.

Some simple probability puzzles

(Most of these are taken from Ian Hacking’s Introduction to Probability and Inductive Logic.)

  1. About as many boys as girls are born in hospitals. Many babies are born every week at City General. In Cornwall, a country town, there is a small hospital where only a few babies are born every week.

    Define a normal week as one where between 45% and 55% of babies are female. An unusual week is one where more than 55% or less than 45% are girls.

    Which of the following is true:
    (a) Unusual weeks occur equally often at City General and at Cornwall.
    (b) Unusual weeks are more common at City General than at Cornwall.
    (c) Unusual weeks are more common at Cornwall than at City General.

  2. Pia is 31 years old, single, outspoken, and smart. She was a philosophy major. When a student, she was an ardent supporter of Native American rights, and she picketed a department store that had no facilities for nursing mothers.

    Which of the following statements are most probable? Which are least probable?

    (a) Pia is an active feminist.
    (b) Pia is a bank teller.
    (c) Pia works in a small bookstore.
    (d) Pia is a bank teller and an active feminist.
    (e) Pia is a bank teller and an active feminist who takes yoga classes.
    (f) Pia works in a small bookstore and is an active feminist who takes yoga classes.

  3. You have been called to jury duty in a town with only green and blue taxis. Green taxis dominate the market, with 85% of the taxis on the road.

    On a misty winter night a taxi sideswiped another car and drove off. A witness said it was a blue cab. This witness is tested under similar conditions, and gets the color right 80% of the time.

    You conclude about the sideswiping taxi:
    (a) The probability that it is blue is 80%.
    (b) It is probably blue, but with a lower probability than 80%.
    (c) It is equally likely to be blue or green.
    (d) It is more likely than not to be green.

  4. You are a physician. You think that it’s quite likely that a patient of yours has strep throat. You take five swabs from the throat of this patient and send them to a lab for testing.

    If the patient has strep throat, the lab results are right 70% of the time. If not, then the lab is right 90% of the time.

    The test results come back: YES, NO, NO, YES, NO

    You conclude:
    (a) The results are worthless.
    (b) It is likely that the patient does not have strep throat.
    (c) It is slightly more likely than not that the patient does have strep throat.
    (d) It is very much more likely than not that the patient does have strep throat.

  5. In a country, all families want a boy, so each family keeps having babies until a boy is born. What is the expected ratio of boys to girls in the country?
  6.  Answer the following series of questions:

    If you flip a fair coin twice, do you have the same chance of getting HH as you have of getting HT?

    If you flip the coin repeatedly until you get HH, does this result in the same average number of flips as if you repeat until you get HT?

    If you flip it repeatedly until either HH emerges or HT emerges, is either outcome equally likely?

    You play a game with a friend in which you each choose a sequence of three possible flips (e.g. HHT and TTH). You then flip the coin repeatedly until one of the two patterns emerges, and whoever’s pattern it is wins the game. You get to see your friend’s choice of pattern before deciding yours. Are you ever able to bias the game in your favor?

    Are you always able to bias the game in your favor?

 

Solutions (and lessons)

  1. The correct answer is (c): unusual weeks occur more often at Cornwall than at City General. Even though the chance of a boy is the same at Cornwall as at City General, the week-to-week fluctuation in the percentage of boys is larger at the smaller hospital (for N births a week, the standard deviation of the percentage of boys goes like 1/sqrt(N)). Indeed, if you think about an extreme case where Cornwall has only one birth a week, then every week will be an unusual week (100% boys or 0% boys).
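
A quick simulation makes the size effect vivid. I use 10 births a week for Cornwall and 200 for City General as stand-in numbers (the puzzle doesn’t specify them):

```python
import random

def unusual_fraction(births_per_week, weeks=20_000):
    # Fraction of simulated weeks where the share of girls leaves [45%, 55%]
    unusual = 0
    for _ in range(weeks):
        girls = sum(random.random() < 0.5 for _ in range(births_per_week))
        share = girls / births_per_week
        unusual += share < 0.45 or share > 0.55
    return unusual / weeks

print("Cornwall  (10/week):", unusual_fraction(10))   # ≈ 0.75
print("City Gen (200/week):", unusual_fraction(200))  # ≈ 0.14
```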
  2. There is room to debate the exact answer, but whatever it is, it has to obey some constraints. Namely, the most probable statement cannot be (d), (e), or (f), and the least probable statement cannot be (a), (b), or (c). Why? Because of the conjunction rule of probability: each of (d), (e), and (f) is a conjunction that includes one of (a), (b), or (c), so it cannot be more likely. P(A & B) ≤ P(A).

    It turns out that most people violate this constraint. Many people answer that (f) is the most probable description and (b) the least probable. This result is commonly interpreted as revealing a cognitive bias known as the representativeness heuristic – essentially, that we judge likelihood by considering which descriptions most closely resemble the known facts. In this case, the richer descriptions sound more representative of Pia, even though as conjunctions they can be no more probable than their conjuncts.

    Another factor to consider is that prior to considering the evidence, your odds on a given person being a bank teller as opposed to working in a small bookstore should be heavily weighted towards her being a bank teller. There are just far more bank tellers than small bookstore workers (maybe a factor of around 20:1). This does not necessarily mean that (b) is more likely than (c), but it does mean that the evidence must discriminate strongly enough against her being a bank teller so as to overcome the prior odds.

    This leads us to another lesson: do not neglect the base rate. It is easy to ignore the prior odds when it feels like we have strong evidence (Pia’s age, her personality, her major, etc.). But the base rates for small bookstore workers and bank tellers are very relevant to the final judgement.

  3. The correct answer is (d) – it is more likely than not that the sideswiper was green. This is a basic case of base rate neglect – many people would see that the witness is right 80% of the time and conclude that the witness’s testimony has an 80% chance of being correct. But this is ignoring the prior odds on the content of the witness’s testimony.

    In this case, there were prior odds of 17:3 (85%:15%) in favor of the taxi being green. The evidence had a strength of 1:4 (20%:80%), resulting in the final odds being 17:12 in favor of the taxi being green. Translating from odds to probabilities, we get a roughly 59% chance of the taxi having been green.

    We could have concluded (d) very simply by just comparing the prior probability (85% for green) with the evidence (80% for blue), and noticing that the evidence would not be strong enough to make blue more likely than green (since 85% > 80%). Being able to very quickly translate between statistics and conclusions is a valuable skill to foster.
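
The odds arithmetic here is mechanical enough to script (a sketch using the numbers above):

```python
# Prior odds green:blue, and the witness's likelihoods of reporting "blue"
prior_green, prior_blue = 85, 15
p_report_blue_given_green = 0.20  # witness wrong
p_report_blue_given_blue = 0.80   # witness right

post_green = prior_green * p_report_blue_given_green  # 17
post_blue = prior_blue * p_report_blue_given_blue     # 12
print(round(post_green / (post_green + post_blue), 3))  # 0.586: probably green
```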

  4. The right answer is (d). We calculate this just like we did the last time:

    The results were YES, NO, NO, YES, NO.

    Each YES provides evidence with strength 7:1 (70%/10%) in favor of strep, and each NO provides evidence with strength 1:3 (30%/90%).

    So our strength of evidence is 7:1 ⋅ 1:3 ⋅ 1:3 ⋅ 7:1 ⋅ 1:3 = 49:27, or roughly 1.81:1 in favor of strep. This might be a little surprising… we got more NOs than YESs and the NO was correct 90% of the time for people without strep, compared to the YES being correct only 70% of the time in people with strep.

    Since the evidence is in favor of strep, and we started out already thinking that strep was quite likely, in the end we should be very convinced that they have strep. If our prior on the patient having strep was 75% (3:1 odds), then our probability after getting evidence will be 84% (49:9 odds).

    Again, surprising! The patient who sees these results and hears the doctor declaring that the test strengthens their belief that the patient has strep might feel that this is irrational and object to the conclusion. But the doctor would be right!
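
Here is the same odds bookkeeping in code, using exact fractions:

```python
from fractions import Fraction

YES = Fraction(7, 1)  # 70% / 10%: evidence for strep
NO = Fraction(1, 3)   # 30% / 90%: evidence against strep

evidence = YES * NO * NO * YES * NO  # results: YES, NO, NO, YES, NO
prior = Fraction(3, 1)               # 75% prior = 3:1 odds of strep

posterior = prior * evidence
print(evidence)                            # 49/27
print(posterior)                           # 49/9
print(float(posterior / (posterior + 1)))  # ≈ 0.845
```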

  5. Supposing as before that the chance of any given birth being a boy is equal to the chance of it being a girl, we end up concluding…

    The expected ratio of boys and girls in the country is 1! That is, this strategy doesn’t allow you to “cheat” – it has no impact at all on the ratio. Why? I’ll leave this one for you to figure out.

    This is important because it applies to the problem of p-hacking. Imagine that researchers repeatedly run studies until they get the results they like, but are required to publish every study that they do. Can they still bias the published record in favor of the results they like? No! Even though each researcher stops upon getting the result they like, the aggregate of their studies is unbiased evidence. They can’t game the system!
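
A simulation of the stopping rule shows the ratio staying put (a quick sketch):

```python
import random

boys = girls = 0
for _ in range(100_000):          # 100,000 families
    while random.random() < 0.5:  # each birth: girl with probability 1/2
        girls += 1                # a girl, so the family tries again
    boys += 1                     # first boy: the family stops
print(boys / girls)  # ≈ 1.0
```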

  6.  Answers, in order:

    If you flip a fair coin twice, do you have the same chance of getting HH as you have of getting HT? (Yes)

    If you flip it repeatedly until you get HH, does this result in the same average number of flips as if you repeat until you get HT? (No)

    If you flip it repeatedly until either HH emerges or HT emerges, is either outcome equally likely? (Yes)

    You play a game with a friend in which you each choose a sequence of three coin flips (e.g. HHT and TTH). You then flip a coin repeatedly until one of the two patterns emerges, and whoever’s pattern it is wins the game. You get to see your friend’s choice of pattern before deciding yours. Are you ever able to bias the game in your favor? (Yes)

    Are you always able to bias the game in your favor? (Yes!)

    Here’s a wiki page with a good explanation of this (the game is known as Penney’s game): LINK. A table from that page illustrating a winning strategy for any choice your friend makes:

    1st player’s choice   2nd player’s choice   Odds in favour of 2nd player
    HHH                   THH                   7 to 1
    HHT                   THH                   3 to 1
    HTH                   HHT                   2 to 1
    HTT                   HHT                   2 to 1
    THH                   TTH                   2 to 1
    THT                   TTH                   2 to 1
    TTH                   HTT                   3 to 1
    TTT                   HTT                   7 to 1
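
Both the waiting-time asymmetry and the pattern game are easy to simulate (a sketch; the "H"/"T" string encoding is my own):

```python
import random

def flips_until(pattern):
    # Flip until the sequence ends with the given pattern
    seq = ""
    while not seq.endswith(pattern):
        seq += random.choice("HT")
    return len(seq)

def winner(pat_a, pat_b):
    # Flip until one of the two patterns appears; return the winner
    seq = ""
    while True:
        seq += random.choice("HT")
        if seq.endswith(pat_a):
            return pat_a
        if seq.endswith(pat_b):
            return pat_b

n = 30_000
print(sum(flips_until("HH") for _ in range(n)) / n)  # ≈ 6 flips on average
print(sum(flips_until("HT") for _ in range(n)) / n)  # ≈ 4 flips on average

# One row of the table: THH beats HHT 3 to 1
print(sum(winner("HHT", "THH") == "THH" for _ in range(n)) / n)  # ≈ 0.75
```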

Deriving the Schrodinger equation

This video contains a really short and sweet derivation of the form of the Schrodinger equation from some fundamental principles. I want to present it here because I like it a lot.

I’m going to assume a lot of background knowledge of quantum mechanics for the purposes of this post, so as to keep it from getting too long. If you want to know more QM, I highly highly recommend Leonard Susskind’s online video lectures.

So! Very brief review of basic QM:

In quantum mechanics, the state of a system is described by a vector in a complex vector space. These vectors all have unit length, and encode all of the observable information about the system. The notation used for a state vector is |Ψ⟩, read as “the state vector psi”. By analogy with complex conjugation of numbers, you can also conjugate vectors. These conjugated vectors are written like ⟨φ|. Similarly, any operator A has a conjugate operator A*.

Inner products between vectors are expressed like ⟨φ|Ψ⟩, and represent the “closeness” between these states. If ⟨φ|Ψ⟩ = 0, then the states φ and Ψ are called orthogonal, and are as different as can be. (In particular, there is zero probability of either state being observed as the other.) And if ⟨φ|Ψ⟩ = 1, then the states are indistinguishable, and |Ψ⟩ =|φ⟩.

Now, we’re interested in the dynamics of quantum systems. How do they change in time?

Well, since we’re dealing with vectors, we can very generally suppose that there exists some operator that will take any state vector to the state vector that it evolves into after some amount of time t. Let’s just give this operator a name: U(t). We express the notion of U(t) as a time-evolution operator by writing

U(t)|Ψ(0)⟩ = |Ψ(t)⟩

In other words, take the state Ψ at time 0, apply the operator U(t) to it, and you get back the state Ψ at time t.

Now, what are some basic things we can say about the time-evolution operator?

First: if we evolve forwards in time by a length of time equal to zero, the state will not change. (This is basically definitional.)

I.e. U(0) = I (where I is the identity operator).

Second: Time evolution is always continuous, in that an evolution forwards by an arbitrarily small time period ε will change the state by an amount proportional to ε.

I.e. U(ε) = I + εG (where G is some other operator).

Third: Time evolution preserves orthogonality. If two states are ever orthogonal, then they are always orthogonal. (This is an assumption of conservation of information – the laws of physics don’t cause information to disappear or new information to pop up out of nowhere.)

I.e. ⟨φ(0)|Ψ(0)⟩ = 0  ⇒ ⟨φ(t)|Ψ(t)⟩ = 0

From this we can actually derive a stronger statement, which is that all inner products are conserved over time. (The intuition for this is that if all our orthogonal basis vectors stay orthogonal when we evolve forward in time, then time evolution is something like a rotation, and rotations preserve all inner products.)

I.e. ⟨φ(0)|Ψ(0)⟩ = ⟨φ(t)|Ψ(t)⟩

So our starting point is:

  1. U(t)|Ψ(0)⟩ = |Ψ(t)⟩
  2. U(0) = I
  3. U(ε) = I + εG
  4. ⟨φ(0)|Ψ(0)⟩ = ⟨φ(t)|Ψ(t)⟩

From (1) and (4), we get

U(t)|Ψ(0)⟩ = |Ψ(t)⟩
U(t)|φ(0)⟩ = |φ(t)⟩
so…
⟨φ(t)|Ψ(t)⟩ = ⟨φ(0)|U*(t) U(t)|Ψ(0)⟩
= ⟨φ(0)|Ψ(0)⟩
and therefore…
U*U = I

Operators that satisfy the identity on the final line are called unitary; they are analogous to complex numbers of unit length.

Let’s use this identity together with (3):

U*(ε) U(ε) = I
(I + εG*)(I + εG) = I
I + ε(G + G*) + ε² G*G = I
ε(G + G*) + ε² G*G = 0
G + G* ≈ 0

In the last line, I’ve used the assumption that ε is arbitrarily small, so that we can throw out factors of ε².

Now, what does this final line tell us? Well, it says that the operator G (which dictates the change in state over an infinitesimal time) is purely imaginary. By analogy, any purely imaginary number y = ix satisfies the identity:

y + y* =  ix + (ix)* = ix – ix = 0

So if G is purely imaginary, it is convenient to define a new purely real operator H = iG. This operator is Hermitian by construction: since G* = -G, we have H* = (iG)* = -iG* = iG = H. Substituting this operator into our infinitesimal time evolution equation, we get

U(ε) = I – iεH

Now, let’s consider the derivative of a quantum state.

d|Ψ⟩/dt = (|Ψ(t + ε)⟩ – |Ψ(t)⟩) / ε
= (U(ε) – I)|Ψ(t)⟩ / ε
= -iεH|Ψ(t)⟩ / ε
= -iH|Ψ(t)⟩

Thus we get…

d/dt|Ψ⟩ = -iH|Ψ⟩

This is the time-dependent Schrodinger equation, although we haven’t yet specified what this operator H is supposed to be. However, since we know H is Hermitian, we also know that H corresponds to some observable quantity.

It turns out that if we multiply this operator by the reduced Planck constant ħ, it becomes the Hamiltonian – the operator that corresponds to the observable energy. We’ll just change notation subtly by taking H to be the Hamiltonian – that is, what we would previously have called ħH. Then we get the more familiar form of the time-dependent Schrodinger equation:

iħ d/dt|Ψ⟩ = H|Ψ⟩
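
As a numerical sanity check (my own sketch, using NumPy and a random 4-dimensional “Hamiltonian”), we can verify that U(t) = exp(-iHt) is unitary, conserves inner products, and reproduces d|Ψ⟩/dt = -iH|Ψ⟩:

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 4

# Build a random Hermitian H
A = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
H = (A + A.conj().T) / 2

def U(t):
    # exp(-iHt), built from the eigendecomposition of the Hermitian matrix H
    evals, V = np.linalg.eigh(H)
    return V @ np.diag(np.exp(-1j * evals * t)) @ V.conj().T

# Unitarity: U*U = I
assert np.allclose(U(0.7).conj().T @ U(0.7), np.eye(dim))

# Inner products are conserved: <phi(t)|psi(t)> = <phi(0)|psi(0)>
psi = rng.normal(size=dim) + 1j * rng.normal(size=dim)
phi = rng.normal(size=dim) + 1j * rng.normal(size=dim)
assert np.allclose(np.vdot(U(0.7) @ phi, U(0.7) @ psi), np.vdot(phi, psi))

# Infinitesimal step: (U(eps) - I)|psi>/eps  ->  -iH|psi>
eps = 1e-6
assert np.allclose((U(eps) @ psi - psi) / eps, -1j * H @ psi, atol=1e-4)

print("all checks pass")
```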