The Central Paradox of Statistical Mechanics: The Problem of The Past

This is the third part in a three-part series on the foundations of statistical mechanics.

  1. The Necessity of Statistical Mechanics for Getting Macro From Micro
  2. Is The Fundamental Postulate of Statistical Mechanics A Priori?
  3. The Central Paradox of Statistical Mechanics: The Problem of The Past

— — —

What I’ve argued for so far is the following set of claims:

  1. To successfully predict the behavior of macroscopic systems, we need something above and beyond the microphysical laws.
  2. This extra thing we need is the fundamental postulate of statistical mechanics, which assigns a uniform distribution over the region of phase space consistent with what you know about the system. This postulate allows us to prove all the things we want to say about the future, such as “gases expand”, “ice cubes melt”, “people age” and so on.
  3. This fundamental postulate is not justifiable on a priori grounds, as it is fundamentally an empirical claim about how frequently different microstates pop up in our universe. Different initial conditions give rise to different such frequencies, so that a claim to a priori access to the fundamental postulate is a claim to a priori access to the precise details of the initial condition of the universe.

 There’s just one problem with all this… apply our postulate to the past, and everything breaks.

 Notice that I said that the fundamental postulate allows us to prove all the things we want to say about the future. That wording was chosen carefully. What happens if you try to apply the microphysical laws + the fundamental postulate to predict the past of some macroscopic system? It turns out that all hell breaks loose. Gases spontaneously contract, ice cubes form from puddles of water, and brains pop out of thermal equilibrium.

Why does this happen? Very simply, we start with two fully time-reversible premises (the microphysical laws and the fundamental postulate). We apply them to present knowledge of some state, the description of which does not specify a special time direction. So any conclusion we get must, as a matter of logic, be time-reversible as well! You can’t start with premises that treat the past as the mirror image of the future and, using just the rules of logic, derive a conclusion that treats the past as fundamentally different from the future. And what this means is that if you conclude that entropy increases towards the future, then you must also conclude that entropy increases towards the past. Which is to say that we came from a higher-entropy state, and ultimately (over a long enough time scale, and insofar as you think that our universe is headed to thermal equilibrium) from thermal equilibrium.

Let’s flesh this argument out a little more. Consider a half-melted ice cube sitting in the sun. The microphysical laws + the fundamental postulate tell us that the region of phase space consisting of states in which the ice cube is entirely melted is much much much larger than the region of phase space in which it is fully unmelted. So much larger, in fact, that it’s hard to express using ordinary English words. This is why we conclude that any trajectory through phase space that passes through the present state of the system (the half-melted cube) is almost certainly going to quickly move towards the regions of phase space in which the cube is fully melted. But for the exact same reason, if we look at the set of trajectories that pass through the present state of the system, the vast vast vast majority of them will have come from the fully-melted regions of phase space. And what this means is that the inevitable result of our calculation of the ice cube’s history will be that a few moments ago it was a puddle of water, and then it spontaneously solidified and formed into a half-melted ice cube.
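To make the shape of this argument vivid, here’s a tiny toy simulation (just an illustrative sketch with made-up numbers, not part of the argument itself): sample a “clumped” present configuration of a noninteracting gas with time-symmetric random velocities, and evolve it both forward and backward in time. The spread grows in both directions.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1000                               # toy number of gas particles

# Present macrostate: particles clumped in the middle of a 1D box,
# with velocities drawn from a time-symmetric (zero-mean) distribution.
x0 = rng.uniform(0.4, 0.6, N)
v0 = rng.normal(0.0, 1.0, N)

def spread(t):
    """Spatial spread after evolving freely for time t (negative t = retrodiction).
    Walls and collisions are ignored; this is free motion only."""
    return np.std(x0 + v0 * t)

for t in [-0.5, -0.1, 0.0, 0.1, 0.5]:
    print(f"t = {t:+.1f}   spread = {spread(t):.3f}")

# The spread (standing in for entropy) is smallest at t = 0 and grows
# symmetrically in both time directions: the uniform distribution over
# present-compatible microstates "retrodicts" a higher-entropy past just as
# confidently as it predicts a higher-entropy future.
```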

This argument generalizes! What’s the most likely past history of you, according to statistical mechanics? It’s not that the solar system coalesced from a haze of gases strewn through space by a past supernova, such that a planet would form in the Goldilocks zone and develop life, which would then gradually evolve through natural selection to the point where you are sitting in whatever room you’re sitting in reading this post. This trajectory through phase space is enormously unlikely. The much much much more likely past trajectory of you through phase space is that a little while ago you were a bunch of particles dispersed through a universe at thermal equilibrium, which happened to spontaneously coalesce into a brain that has time to register a few moments of experience before dissipating back into chaos. “What about all of my memories of the past?” you say. As it happens, the most likely explanation of these memories is not that they are veridical copies of real happenings in the universe, but that they are illusions, manufactured from randomness.

Basically, if you buy everything I’ve argued in the first two parts, then you are forced to conclude that the universe is most likely near thermal equilibrium, with your current experience of it arising as a spontaneous dip in entropy, just enough to produce a conscious brain but no more. There are at least two big problems with this view.

Problem 1: This conclusion is, we think, extremely empirically wrong! The ice cube in front of you didn’t spontaneously form from a puddle of water, uncracked eggs weren’t a moment ago scrambled, and your memories are to some degree veridical. If you really believe that you are merely a spontaneous dip in entropy, then your prediction for the next minute will be the gradual dissolution of your brain and loss of consciousness. Now, wait a minute and see if this happens. Still here? Good!

Problem 2: The conclusion cannot be simultaneously believed and justified. If you think that you’re a thermal fluctuation, then you shouldn’t credit any of your memories as telling you anything about the world. But then your whole justification for coming to the conclusion in the first place (the experiments that led us to conclude that physics is time-reversible and that the fundamental postulate is true) is undermined! Either you believe it without justification, or you don’t believe it despite the justification. Said another way, no reflective equilibrium exists at an entropy minimum. David Albert calls this peculiar epistemic state cognitively unstable, as it’s not clear where exactly it should leave you.

Reflect for a moment on how strange of a situation we are in here. Starting from very basic observations of the world, involving its time-reversibility on the micro scale and the increase in entropy of systems, we see that we are inevitably led to the conclusion that we are almost certainly thermal fluctuations, brains popping out of the void. I promise you that no trick has been pulled here, this really is the state of the philosophy of statistical mechanics! The big issue is how to deal with this strange situation.

One approach is to say the following: Our problem is that our predictions work towards the future but not the past. So suppose that we simply add as a new fundamental postulate the proposition that long long ago the universe had an incredibly low entropy. That is, suppose that instead of just starting with the microphysical laws and the fundamental postulate of statistical mechanics, we add a third claim: the Past Hypothesis.

The Past Hypothesis should be understood as an augmentation of our Fundamental Postulate. Taken together, the two postulates say that our probability distribution over possible microstates should not be uniform over phase space. Instead, it should be what you get when you take the uniform distribution, and then condition on the distant past being extremely low entropy. This process of conditioning clearly preferences one direction of time over the other, and so the symmetry is broken.

 It’s worth reflecting for a moment on the strangeness of the epistemic status of the Past Hypothesis. It happens that we have over time accumulated a ton of observational evidence for the occurrence of the Big Bang. But none of this evidence has anything to do with our reasons for accepting the Past Hypothesis. If we buy the whole line of argument so far, our conclusion that something like a Big Bang occurred becomes something that we are forced to believe for deep logical reasons, on pain of cognitive instability and self-undermining belief. Anybody that denies that the Big Bang (or some similar enormously low-entropy past state) occurred has to contend with their view collapsing in self-contradiction upon observing the physical laws!

Is The Fundamental Postulate of Statistical Mechanics A Priori?

This is the second part in a three-part series on the foundations of statistical mechanics.

  1. The Necessity of Statistical Mechanics for Getting Macro From Micro
  2. Is The Fundamental Postulate of Statistical Mechanics A Priori?
  3. The Central Paradox of Statistical Mechanics: The Problem of The Past

— — —

The fantastic empirical success of the fundamental postulate gives us a great amount of assurance that the postulate is a good one. But it’s worth asking whether that’s the only reason that we should like this postulate, or if it has some solid a priori justification. The basic principle of “when you’re unsure, just distribute credences evenly over phase space” certainly strikes many people as highly intuitive and justifiable on a priori grounds. But there are some huge problems with this way of thinking, one of which I’ve already hinted at. Here’s a thought experiment that illustrates the problem.

There is a factory in your town that produces cubic boxes. All you know about this factory is that the boxes that they produce all have a volume between 0 m³ and 1 m³. You are going to be delivered a box produced by this factory, and are asked to represent your state of knowledge about the box with a probability distribution. What distribution should you use?

Suppose you say “I should be indifferent over all the possible boxes. So I should have a uniform distribution over the volumes from 0 m³ to 1 m³.” This might seem reasonable at first blush. But what if somebody else said “Yes, you should be indifferent over all the possible boxes, but actually the uniform distribution should be over the side lengths from 0 m to 1 m, not volumes.” This would be a very different probability distribution! For example, if the probability that the side length is greater than 0.5 m is 50%, then the probability that the volume is greater than (0.5 m)³ = 1/8 m³ is also 50%! Uniform over side length is not the same as uniform over volume (or surface area, for that matter). Now, how do you choose between a uniform distribution over volumes and a uniform distribution over side lengths? After all, you know nothing about the process that the factory is using to produce the boxes, and whether it is based off of volume or side length (or something else); all you know is that all boxes are between 0 m³ and 1 m³.
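To see the disagreement concretely, here’s a quick Monte Carlo sketch (illustrative numbers only): sample boxes under each of the two “indifference” priors and compare the probability that the volume exceeds 1/8 m³.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10**6

# Prior 1: "indifference" as a uniform distribution over side lengths in [0, 1] m.
side = rng.uniform(0.0, 1.0, n)
volume_from_side = side ** 3

# Prior 2: "indifference" as a uniform distribution over volumes in [0, 1] m^3.
volume = rng.uniform(0.0, 1.0, n)

print("P(volume > 1/8 m^3):")
print("  uniform over side lengths:", np.mean(volume_from_side > 0.125))  # ~0.50
print("  uniform over volumes:     ", np.mean(volume > 0.125))            # ~0.875

# Two ways of "being indifferent over all the boxes" assign very different
# probabilities to the very same physical event.
```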

The lesson of this thought experiment is that the statement we started with (“I should be indifferent over all possible boxes”) was actually not even well-defined. There’s not just one unique measure over a continuous space, and in general the notion that “all possibilities are equally likely” is highly language-dependent.

The exact same applies to phase space, as position and momentum are continuous quantities. Imagine that somebody instead of talking about phase space, only talked about “craze space”, in which all positions become positions cubed, and all momentum values become natural logs of momentum. This space would still contain all possible microstates of your system. What’s more, the fundamental laws of nature could be rewritten in a way that uses only craze space quantities, not phase space quantities. And needless to say, being indifferent over phase space would not be the same as being indifferent over craze space.

Spend enough time looking at attempts to justify a unique interpretation of the statement “All states are equally likely”, when your space of states is a continuous infinity, and you’ll realize that all such attempts are deeply dependent upon arbitrary choices of language. The maximum information entropy probability distribution is afflicted with the exact same problem, because the entropy of your distribution is going to depend on the language you’re using to describe it! The entropy of a distribution in phase space is NOT the same as the entropy of the equivalent distribution transformed to craze space.
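Here’s a small sketch of that point (using the standard change-of-variables formula for differential entropy, h(U) = h(X) + E[ln |du/dx|] for a monotonic reparameterization u(x)): one and the same state of knowledge has entropy 0 when described in x and entropy ln 3 − 2 when described in u = x³.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 10**6)   # uniform credences over x

# Differential entropy of the uniform distribution on [0, 1], in nats:
h_x = 0.0

# Re-describe the same credal state in the coordinate u = x^3. Under a smooth
# monotonic change of variables, h(U) = h(X) + E[ln |du/dx|].
h_u = h_x + np.mean(np.log(3 * x**2))

print("entropy, described in x:      ", h_x)    # 0
print("entropy, described in u = x^3:", h_u)    # about ln(3) - 2, i.e. -0.90

# The entropy of one and the same state of knowledge depends on the coordinates
# used to describe it, so "maximize entropy" is not a language-independent rule.
```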

Let’s summarize this section. If somebody tells you that the fundamental postulate says that all microstates compatible with what you know about the macroscopic features of your system are equally likely, the proper response is something like “Equally likely? That sounds like you’re talking about a uniform distribution. But uniform over what? Oh, position and momentum? Well, why’d you make that choice?” And if they point out that the laws of physics are expressed in terms of position and momentum, you just disagree and say “No, actually I prefer writing the laws of physics in terms of position cubed and log momentum!” (Substitute in any choice of monotonic functions).

If they object on the grounds of simplicity, point out that position and momentum are only simple as measured from a standpoint that takes them to be the fundamental concepts, and that from your perspective, getting position and momentum requires applying complicated inverse transformations to your monotonic transformation of the chosen coordinates.

And if they object on the grounds of naturalness, the right response is probably something like “Tell me more about this ’naturalness’. How do you know what’s natural or unnatural? It seems to me that your choice of what physical concepts count as natural is a manifestation of deep selection pressures that push any beings whose survival depends on modeling and manipulating their surroundings towards forming an empirically accurate model of the macroscopic world. So that when you say that position is more natural than log(position), what I hear is that the fundamental postulate is a very useful tool. And you can’t use the naturalness of the choice of position to justify the fundamental postulate, when your perception of the naturalness of position is the result of the empirical success of the fundamental postulate!”

In my judgement, none of the a priori arguments work, and fundamentally the reason is that the fundamental postulate is an empirical claim. There’s no a priori principle of rationality that tells us that boxes of gases tend to equilibrate, because you can construct a universe whose initial microstate is such that its entire history is one of entropy radically decreasing, gases concentrating, eggs unscrambling, ice cubes unmelting, and so on. Why is this possible? Because it’s consistent with the microphysical laws that the universe started in an enormously low entropy configuration, so it’s gotta also be consistent with the microphysical laws for the entire universe to spend its entire lifetime decreasing in entropy. The general principle is: If you believe that something is physically possible, then you should believe its time-inverse is possible as well.

Let’s pause and take stock. What I’ve argued for so far is the following set of claims:

  1. To successfully predict the behavior of macroscopic systems, we need something above and beyond the microphysical laws.
  2. This extra thing we need is the fundamental postulate of statistical mechanics, which assigns a uniform distribution over the region of phase space consistent with what you know about the system. This postulate allows us to prove all the things we want to say about the future, such as “gases expand”, “ice cubes melt”, “people age” and so on.
  3. This fundamental postulate is not justifiable on a priori grounds, as it is fundamentally an empirical claim about how frequently different microstates pop up in our universe. Different initial conditions give rise to different such frequencies, so that a claim to a priori access to the fundamental postulate is a claim to a priori access to the precise details of the initial condition of the universe.

There’s just one problem with all this… apply our postulate to the past, and everything breaks.

Up next: Why does statistical mechanics give crazy answers about the past? Where did we go wrong?

The Necessity of Statistical Mechanics for Getting Macro From Micro

This is the first part in a three-part series on the foundations of statistical mechanics.

  1. The Necessity of Statistical Mechanics for Getting Macro From Micro
  2. Is The Fundamental Postulate of Statistical Mechanics A Priori?
  3. The Central Paradox of Statistical Mechanics: The Problem of The Past

— — —

Let’s start this out with a thought experiment. Imagine that you have access to the exact fundamental laws of physics. Suppose further that you have unlimited computing power: for instance, an oracle that can instantly complete any computable task. What then do you know about the world?

The tempting answer: Everything! But of course, upon further consideration, you are missing a crucial ingredient: the initial conditions of the universe. The laws themselves aren’t enough to tell you about your universe, as many different universes are compatible with the laws. By specifying the state of the universe at any one time (which incidentally does not have to be an “initial” time), though, you should be able to narrow down this set of compatible universes. So let’s amend our question:

Suppose that you have unlimited computing power, that you know the exact microphysical laws, and that you know the state of the universe at some moment. Then what do you know about the world?

The answer is: It depends! What exactly do you know about the state of the universe? Do you know its exact microstate? As in, do you know the position and momentum of every single particle in the universe? If so, then yes, the entire past and future of the universe are accessible to you. But suppose that instead of knowing the exact microstate, you only have access to a macroscopic description of the universe. For example, maybe you have a temperature map as well as a particle density function over the universe. Or perhaps you know the exact states of some particles, just not all of them.

Well, if you only have access to the macrostate of the system (which, notice, is the epistemic situation that we find ourselves in, being that full access to the exact microstate of the universe is as technologically remote as can be), then it should be clear that you can’t specify the exact microstate at all other times. This is nothing too surprising or interesting… starting with imperfect knowledge you will not arrive at perfect knowledge. But we might hope that in the absence of a full description of the microstate of the universe at all other times, you could at least give a detailed macroscopic description of the universe at other times.

That is, here’s what seems like a reasonable expectation: If I had infinite computational power, knew the exact microphysical laws, and knew, say, that a closed box was occupied by a cloud of noninteracting gas in its corner, then I should be able to draw the conclusion that “The gas will disperse.” Or, if I knew that an ice cube was sitting outdoors on a table in the sun, then I should be able to apply my knowledge of microphysics to conclude that “The ice cube will melt”. And we’d hope that in addition to being able to make statements like these, we’d also be able to produce precise predictions for how long it would take for the gas to become uniformly distributed over the box, or for how long it would take for the ice cube to melt.

Here is the interesting and surprising bit. It turns out that this is in principle impossible to do. Just the exact microphysical laws and an infinity of computing power is not enough to do the job! In fact, the microphysical laws will in general tell us almost nothing about the future evolution or past history of macroscopic systems!

Take this in for a moment. You might not believe me (especially if you’re a physicist). For one thing, we don’t know the exact form of the microphysical laws. It would seem that such a bold statement about their insufficiencies would require us to at least first know what they are, right? No, it turns out that the statement that microphysics is far too weak to tell us about the behavior of macroscopic systems holds for an enormously large class of possible laws of physics, a class that we are very sure that our universe belongs to.

Let’s prove this. We start out with the following observation that will be familiar to physicists: the microphysical laws appear to be time-reversible. That is, it appears to be the case that for every possible evolution of a system compatible with the laws of physics, the time-reverse of that evolution (obtained by simply reversing the trajectories of all particles) is also perfectly compatible with the laws of physics.*

This is surprising! Doesn’t it seem like there are trajectories that are physically possible for particles to take, such that their time reverse is physically impossible? Doesn’t it seem like classical mechanics would say that a ball sitting on the ground couldn’t suddenly bounce up to your hand? An egg unscramble? A gas collect in the corner of a room? The answer to all of the above is no. Classical mechanics, and fundamental physics in general, admits the possibilities of all these things. A fun puzzle for you is to think about why the first example (the ball initially at rest on the ground bouncing up higher and higher until it comes to rest in your hand) is not a violation of the conservation of energy.

Now here’s the argument: Suppose that you have a box that you know is filled with an ideal gas at equilibrium (uniformly spread through the volume). There are many many (infinitely many) microstates that are compatible with this description. We can conclusively say that in 15 minutes the gas will still be dispersed only if all of these microstates, when evolved forward 15 minutes, end up dispersed.

But, and here’s the crucial step, we also know that there exist very peculiar states (such as the macrostate in which all the gas particles have come together to form a perfect statuette of Michael Jackson) such that these states will in 15 minutes evolve to the dispersed state. And by time reversibility, this tells us that there is another perfectly valid history of the gas that starts uniformly dispersed and evolves over 15 minutes into a perfect statuette of Michael Jackson. That is, if we believe that complicated configurations of gases disperse, and believe that physics is time-reversible, then you must also believe that there are microstates compatible with dispersed states of gas that will in the next moment coalesce into some complicated configuration.

  1. A collection of gas shaped exactly like Michael Jackson will disperse uniformly across its container.
  2. Physics is time reversible.
  3. So uniformly dispersed gases can coalesce into a collection of gases shaped exactly like Michael Jackson.
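Here is a minimal numerical sketch of that syllogism (free, noninteracting particles only, so treat it as an illustration rather than a proof): a clumped configuration disperses, and the velocity-reversed version of the dispersed end state, which is itself a perfectly ordinary-looking dispersed microstate, evolves right back into the clump.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T = 2000, 5.0

# A highly ordered microstate: every particle bunched into a tiny blob.
x = rng.uniform(0.49, 0.51, N)
v = rng.normal(0.0, 1.0, N)

print("initial spread:           ", np.std(x))

# Evolve freely for time T: the blob disperses.
x_dispersed = x + v * T
print("spread after dispersing:  ", np.std(x_dispersed))

# Time-reverse: keep the dispersed positions, flip every velocity.
# This reversed state is itself a dispersed-looking microstate...
x_rev, v_rev = x_dispersed.copy(), -v

# ...and evolving it forward for the same time T re-forms the original blob.
x_recohered = x_rev + v_rev * T
print("spread after re-evolving: ", np.std(x_recohered))   # back to the initial spread
```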

At this point you might be thinking “yeah, sure, microphysics doesn’t in principle rule out the possibility that a uniformly dispersed gas will coalesce into Michael Jackson, or any other crazy configuration. But who cares? It’s so incredibly unlikely!” To which the response is: Yes, exactly, it’s extremely unlikely. But nothing in the microphysical laws says this! Look as hard as you can at the laws of motion; you will not find a probability distribution over the likelihood of the different microstates compatible with a given macrostate. And indeed, different initial conditions of the universe will give different such frequency distributions! To make any statements about the relative likelihood of some microstates over others, you need some principle above and beyond the microphysical laws.

To summarize. All that microphysics + infinite computing power allows you to say about a macrostate is the following: Here are all the microstates that are compatible with that macrostate, and here are all the past and future histories of each of these microstates. And given time reversibility, these future histories cover an enormously diverse set of predictions about the future, from “the gas will disperse” to “the gas will form into a statuette of Michael Jackson”. To get reasonable predictions about how the world will actually behave, we need some other principle, a principle that allows us to disregard these “perverse” microstates. And microphysics contains no such principle.

Statistical mechanics is thus the study of the necessary augmentation to a fundamental theory of physics that allows us to make predictions about the world, given that we are not in a position to know its exact microstate. This necessary augmentation is known as the fundamental postulate of statistical mechanics, and it takes the form of a probability distribution over microstates. Some people describe the postulate as saying that “all microstates are equally likely”, but that phrasing is a big mistake, as the sentence “all states are equally likely” is not well defined over a continuous set of states. (More on that in a bit.) To really understand the fundamental postulate, we have to introduce the notion of phase space.

The phase space for a system is a mathematical space in which every point represents a full specification of the positions and momenta of all particles in the system. So, for example, a system consisting of 1000 classical particles swimming around in an infinite universe would have 6000 degrees of freedom (three position coordinates and three momentum coordinates per particle). Each of these degrees of freedom is isomorphic to the real numbers. So phase space for this system must be ℝ^6000, and a point in phase space is a specification of the values of all 6000 degrees of freedom. In general, for N classical particles, phase space is ℝ^6N.

With the concept of phase space in hand, we can define the fundamental postulate of statistical mechanics. This is: the probability distribution over microstates compatible with a given macrostate is uniform over the corresponding volume of phase space.

It turns out that if you just measure the volume of the “perverse states” in phase space, you end up finding that it composes approximately 0% of the volume of compatible microstates in phase space. This of course allows us to say of perverse states, “Sure they’re there, and technically it’s possible that my system is in such a state, but it’s so incredibly unlikely that it makes virtually no impact on my prediction of the future behavior of my system.” And indeed, when you start going through the math and seeing the way that systems most likely evolve given the fundamental postulate, you see that the predictions you get match beautifully with our observations of nature.
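To get a feel for just how small that volume is, here’s a back-of-the-envelope calculation (positions only, ignoring momenta): the probability, under the uniform distribution, that every one of N particles happens to sit in a region occupying 1% of the box.

```python
import numpy as np

# Probability, under the fundamental postulate, that all N particles of a gas
# are found in a sub-region occupying 1% of the box's volume: 0.01**N.
for N in [10, 100, 6 * 10**23]:            # the last is roughly a mole of particles
    log10_p = N * np.log10(0.01)           # log10 of the probability (the number itself underflows)
    print(f"N = {N:g}:  P = 10^({log10_p:g})")

# For a mole of gas the probability is about 10^(-1.2e24). Not zero, but so
# small that the "perverse" microstates make no practical difference to our
# predictions about the system's behavior.
```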

Next time: What is the epistemic status of the fundamental postulate? Do we have good a priori reasons to believe it?

— — —

* There are some subtleties here. For one, we think that there actually is a very small time asymmetry in the weak nuclear force. And some collapse interpretations of quantum mechanics have the collapse of the wave function as an irreversible process, although Everettian quantum mechanics denies this. For the moment, let’s disregard all of that. The time asymmetry in the weak nuclear force is not going to have any relevant effect on the proof made here, besides making it uglier and more complicated. What we need is technically not exact time-reversibility, but very-approximate time-reversibility. And that we have. Collapsing wave functions are a more troubling business, and are a genuine way out of the argument made in this post.

A Cognitive Instability Puzzle, Part 2

This is a follow-up to this previous post, in which I present three unusual cases of belief updating. Read it before you read this.

I find these cases very puzzling, and I don’t have a definite conclusion for any of them. They share some deep similarities. Let’s break all of them down into their basic logical structure:

Joe
Joe initially believes in classical logic and is certain of some other stuff, call it X.
An argument A exists that concludes that X can’t be true if classical logic is true.
If Joe believes classical logic, then he believes A.
If Joe believes intuitionist logic, then he doesn’t believe A.

Karl
Karl initially believes in God and is certain of some other stuff about evil, call it E.
An argument A exists that concludes that God can’t exist if E is true.
If Karl believes in God, then he believes A.
If Karl doesn’t believe in God, then he doesn’t believe A.

Tommy
Tommy initially believes in her brain’s reliability and is certain of some other stuff about her experiences, call it Q.
An argument A exists that concludes that her brain can’t be reliable if Q is true.
If Tommy believes in her brain’s reliability, then she believes A.
If Tommy doesn’t believe in her brain’s reliability, then she doesn’t believe A.

First of all, note that all three of these cases are ones in which Bayesian reasoning won’t work. Joe is uncertain about the law of the excluded middle, without which you don’t have probability theory. Karl is uncertain about the meaning of the term ‘evil’, such that the same proposition switches from being truth-apt to being meaningless when he updates his beliefs. Probability theory doesn’t accommodate such variability in its language. And Tommy is entertaining a hypothesis according to which she no longer accepts any deductive or inductive logic, which is inconsistent with Bayesianism in an even more extreme way than Joe.

The more important general theme is that in all three cases, the following two things are true: 1) If an agent believes A, then they also believe an argument that concludes -A. 2) If that agent believes -A, then they don’t believe the argument that concludes -A.

Notice that if an agent initially doesn’t believe A, then they have no problem. They believe -A, and also happen to not believe that specific argument concluding -A, and that’s fine! There’s no instability or self-contradiction there whatsoever. So that’s really not where the issue lies.

The mystery is the following: If the only reason that an agent changed their mind from A to -A is the argument that they no longer buy, then what should they do? Once they’ve adopted the stance that A is false, should they stay there, reasoning that if they accept A they will be led to a contradiction? Or should they jump back to A, reasoning that the initial argument that led them there was flawed?

Said another way, should they evaluate the argument against A from their own standards, or from A’s standards? If they use their own standards, then they are in an unstable position, where they jump back and forth between A and -A. And if they always use A’s standards… well, then we get the conclusion that Tommy should believe herself to be a Boltzmann brain. In addition, if they are asked why they don’t believe A, then they find themselves in the weird position of giving an explanation in terms of an argument that they believe to be false!

I find myself believing that either Joe should be an intuitionist, Karl an atheist, and Tommy a radical skeptic, OR Joe a classical logician, Karl a theist, and Tommy a believer in the reliability of her brain. That is, it seems like there aren’t any significant enough disanalogies between these three cases to warrant concluding one thing in one case and then going the other direction in another.

Logic, Theism, and Boltzmann Brains: On Cognitively Unstable Beliefs

First case

Propositional logic accepts that the proposition A ∨ ¬A is necessarily true. This is called the law of the excluded middle. Intuitionist logic differs in that it denies this axiom.

Suppose that Joe is a believer in propositional logic (but also reserves some credence for intuitionist logic). Joe also believes a set of other propositions, whose conjunction we’ll call X, and has total certainty in X.

One day Joe discovers that a contradiction can be derived from X, in a proof that uses the law of the excluded middle. Since Joe is certain that X is true, he knows that X isn’t the problem, and instead it must be the law of the excluded middle. So Joe rejects the law of the excluded middle and becomes an intuitionist.

The problem is, as an intuitionist, Joe now no longer accepts the validity of the argument that starts at X and concludes -X! Why? Because it uses the law of the excluded middle, which he doesn’t accept.

Should Joe believe in propositional logic or intuitionism?

Second case

Karl is a theist. He isn’t absolutely certain that theism is correct, but holds a majority of his credence in theism (and the rest in atheism). Karl is also 100% certain in the following claim: “If atheism is true, then the concept of ‘evil’ is meaningless”, and believes that logically valid arguments cannot be made using meaningless concepts.

One day somebody presents the problem of evil to Karl, and he sees it as a crushing objection to theism. He realizes that theism, plus some other beliefs about evil that he’s 100% confident in, leads to a contradiction. So since he can’t deny these other beliefs, he is led to atheism.

The problem is, as an atheist, Karl no longer accepts the validity of the argument that starts at theism and concludes atheism! Why? Because the arguments rely on using the concept of ‘evil’, and he is now certain that this concept is meaningless, and thus cannot be used in logically valid arguments.

Should Karl be a theist or an atheist?

Third case

Tommy is a scientist, and she believes that her brain is reliable. By this, I mean that she trusts her ability to reason both deductively and inductively. However, she isn’t totally certain about this, and holds out a little credence for radical skepticism. She is also totally certain about the content of her experiences, though not its interpretation (i.e. if she sees red, she is 100% confident that she is experiencing red, although she isn’t necessarily certain about what in the external world is causing the experience).

One day Tommy discovers that reasoning deductively and inductively from her experiences leads her to a model of the world that entails that her brain is actually a quantum fluctuation blipping into existence outside the event horizon of a black hole. She realizes that this means that with overwhelmingly high probability, her brain is not reliable and is just producing random noise uncorrelated with reality.

The problem is, if Tommy believes that her brain is not reliable, then she can no longer accept the validity of the argument that led her to this position! Why? Well, she no longer trusts her ability to reason deductively or inductively. So she can’t accept any argument, let alone this particular one.

What should Tommy believe?

— — —

How are these three cases similar and different? If you think that Joe should be an intuitionist, or Karl an atheist, then should Tommy believe herself to be a black hole brain? Because it turns out that many cosmologists have found themselves to be in a situation analogous to Case 3! I have my own thoughts on this, but I won’t share them for now.

Group Theory: The Mathematics of Symmetry?

Often you’ll hear people describe group theory as being “the mathematics of symmetry.” If you’ve been exposed to a little bit of group theory, you might find this a little mystifying. While there are some subsets of group theory that explicitly pertain to symmetry (in particular, the dihedral groups, representing the symmetries of regular polygons), there are many more groups that seem to have nothing to do with symmetry (the rational numbers, for example, or the dicyclic groups). Much of group theory involves proving theorems about the structure of groups in abstract, and many of these theorems again seem to tell us little to nothing about symmetry (what does “for every prime p that divides the size of a group G, there is a subgroup of G with order p” mean as a statement about the nature of symmetry?). So what’s up? Why this identification between groups and symmetry?

Well, I think we can start with asking what we even mean by ‘symmetry’. The first thing I think of when I hear the word ‘symmetry’ is geometric symmetry – the rotations, translations, and reflections that hold fixed a given shape. This is indeed one big part of symmetry, but not all of it. Symmetry is huge in physics as well, and the meaning in this context is rarely geometric symmetry. Instead, ‘symmetry’ finds its deepest application in the notion of symmetries of nature, meaning transformations that hold fixed the form of the laws of nature. For instance, the behavior of physical systems would be identical if we were to uniformly “slide” the universe in one direction by some fixed amount. So the laws of nature must be invariant with respect to this type of uniform translation, and we say that uniform spatial translation is a symmetry of nature. But the behavior of physical systems would be affected if we, say, began uniformly accelerating all of space in some direction. So uniform acceleration is not a symmetry of nature, and the laws of nature will be sensitive to these types of transformations. In this context, symmetry is more or less synonymous with invariance.

What’s the similarity between these two senses of symmetry? Invariance really is the key. Geometric symmetries are transformations that hold fixed certain geometric properties; physical symmetries are transformations that hold fixed certain physical properties. Fundamentally, symmetries are transformations that hold fixed some quantity. When you choose strange abstract qualities, your symmetries can get pretty weird (like the invariance of magnetic fields with respect to addition of curl-less vector fields to the magnetic vector potential, one of the two gauge symmetries of Maxwell’s equations). But they are symmetries nonetheless!

Now, how does this expansive concept of symmetries relate to groups? Well, looking at the group axioms, it becomes fairly obvious that they do actually fit pretty well with the concept. Let’s think of the elements of a group as transformations one could perform upon a system, and the group operation as being ‘transformation composition’ for successive transformations. And finally, we think of the group as being a complete set of symmetries. Now we’ll look at what the group axioms say, one at a time.

  1. Closure
    For any two elements a, b in G, a•b is in G.
    – Translation: If two transformations each individually hold fixed some quantity, then doing the two transformations successively will also hold fixed that quantity.
  2. Associativity
    For any a, b, c in G, (a•b)•c = a•(b•c)
    – Translation: The order in which transformations are applied to an object is all that determines the nature of the net transformation. The notion of parentheses doesn’t make sense in this context; there is only the left-to-right order of the application of transformations.
  3. Identity
    There is an element e in G such that for any a in G, a•e = e•a = a.
    – Translation: Doing nothing at all holds fixed whatever quantity you choose.
  4. Inverses
    For any a in G, there is an element a’ in G such that a•a’ = e
    – Translation: Any transformation that holds fixed some quantity will also hold fixed that quantity if done in reverse.

Put together, we see that it makes sense to think about a group as the complete set of reversible transformations that hold fixed a certain quantity. (To be clear, this indicates that any such set of symmetries can be treated as a group, not necessarily that any group can be regarded as a set of symmetries. However, this does turn out to be the case! And in fact, any group whatsoever can be considered as a set of geometric symmetries of some abstract object!)

“Complete” needs to be fleshed out a bit more. When talking about a set of transformations, we need to be picking from a well-defined starting set of ‘allowed transformations’. For example, sets of geometric symmetries of objects are usually drawn from the set of all isometries (transformations that keep all distances the same, i.e. rotations, reflections, and translations). Instead of this, we could also consider the allowed transformations to be restricted to the set of all rotations, in which case we’d just be looking at the set of all rotations that don’t change the object. But in general, we are not allowed to talk about the set of possible transformations as just being some subset of the set of all rotations. Why? Because we require that the set of allowed transformations be closed. If a certain rotation is in consideration as an allowed transformation of an object, then so must be twice that rotation, thrice it, and so on. And so must be the reverse of that rotation. Not all sets of rotations satisfy this constraint!

The general statement is that the set of allowed transformations from which we draw our set of symmetries must be closed. And if it is, then the resulting set of symmetries is well described as some group!

So there is a fairly simple sense in which group theory is the study of symmetries. There are the obvious geometric symmetries that I already mentioned (the dihedral groups). But now we can think further about what other familiar groups are actually symmetries of. (ℤ, +) is a group, so what are the integers the symmetries of? There are actually a few answers that work here. One is that ℤ is the set of translational symmetries of an infinite line of evenly spaced dots. Translating to the right corresponds to positive numbers, to the left is negatives, and no translation at all is 0. What about (ℤn, +)? It is the set of rotational symmetries of a circle with n dots placed evenly along its perimeter. I encourage you to try to think about other examples of groups (symmetric and alternating groups, ℚ, and so on), and how to think about them as symmetry groups.
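Here’s a small computational check of the ℤn example (a sketch, with the dots represented as angles in degrees and n = 8 chosen so the arithmetic stays exact): the n rotations each map the figure to itself, and they compose exactly like addition mod n.

```python
n = 8
step = 360 // n                                   # 45 degrees per rotation step (exact for n = 8)
dots = {(k * step) % 360 for k in range(n)}       # n evenly spaced dots on a circle

def rotate(points, k):
    """Rotate a set of points (angles in degrees) by k steps of 360/n degrees."""
    return {(p + k * step) % 360 for p in points}

# 1. Each of the n rotations is a symmetry: it maps the set of dots onto itself.
assert all(rotate(dots, k) == dots for k in range(n))

# 2. On an arbitrary (asymmetric) configuration, the rotations compose exactly
#    like addition mod n, which is the group Z_n.
probe = {10, 37}
assert all(rotate(rotate(probe, a), b) == rotate(probe, (a + b) % n)
           for a in range(n) for b in range(n))

# 3. k = 0 acts as the identity, and k and n - k are inverses of each other.
assert rotate(probe, 0) == probe
assert all(rotate(rotate(probe, k), (n - k) % n) == probe for k in range(n))

print(f"The rotational symmetries of {n} evenly spaced dots form Z_{n}.")
```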

This way of thinking about groups is more than just visually appealing, it actually helps clarify some otherwise opaque results in group theory. For example, why is the stabilizer of a group action always a subgroup? Well, the group action defines a set of allowed transformations, and the stabilizer of x is exactly the set of transformations that hold x fixed! So we have an invariant quantity x, and thus it is perfectly reasonable that the stabilizer should be a group.

Thinking of groups as sets of symmetries also allows us a nice physical interpretation of the idea of conjugacy classes, a hugely important tool in group theory. Remember, the conjugacy class of an element x in G is just the set of all conjugates of x: Cl(x) = {gxg⁻¹ for all g in G}. This ABA⁻¹ pattern might be familiar to you if you’ve studied a little bit of matrix representations of coordinate systems in physics. In this context, ABA⁻¹ just represents a coordinate transformation of B! First we apply some transformation A that changes our reference frame, then we apply B, and then we undo A to come back to the original reference frame. The result is basically that we get a transformation that is just like B, but in a different reference frame! So it makes perfect sense to think of the conjugacy class of an element x as just the set of all elements of the same “type” as x.

For the symmetry group of an equilateral triangle, we get one conjugacy class for the identity, another for the two rotations, and another for the three flips. Why? Well, every flip can be represented as any other flip in a rotated reference frame! And the two rotations are just each other viewed in a flipped (mirror-image) reference frame. And of course, the do-nothing transformation will look the same in all reference frames, which is why the identity has its own conjugacy class. (For larger dihedral groups the breakdown is a little finer: a rotation is only conjugate to its reverse rotation, and for even-sided polygons the flips split into two classes, but the “same type of transformation” picture is exactly the same.) For another example, you can see some great visualizations of the conjugacy class breakdown of the cube’s symmetry group here, under Examples at the top. In general, looking at conjugacy classes is a good way to get a sense of the natural breakdown of categories in your set of symmetries.
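And here’s a brute-force check of the triangle case (a sketch, with each symmetry written as a permutation of the three vertex labels): conjugating every element by every other element sorts the six symmetries into exactly the three classes described above.

```python
# The symmetry group of an equilateral triangle (D3), each symmetry written as
# a permutation of the vertex labels 0, 1, 2.
identity  = (0, 1, 2)
rotations = [(1, 2, 0), (2, 0, 1)]             # rotations by 120 and 240 degrees
flips     = [(0, 2, 1), (2, 1, 0), (1, 0, 2)]  # the three reflections
group     = [identity] + rotations + flips

def compose(p, q):
    """The symmetry 'first q, then p', as a vertex permutation."""
    return tuple(p[q[i]] for i in range(3))

def inverse(p):
    inv = [None] * 3
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def conjugacy_class(x):
    """All elements of the form g x g^-1: x, as seen from every reference frame."""
    return frozenset(compose(compose(g, x), inverse(g)) for g in group)

for cl in {conjugacy_class(x) for x in group}:
    print(sorted(cl))
# Prints three classes: the identity alone, the two rotations, the three flips.
```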

Now, the famous theorems about group structure (Lagrange, Cauchy, Sylow) all have intriguing interpretations as applied to symmetries. Why must it be that any object with a prime number of symmetries cannot have any non-trivial smaller complete sets of symmetries? And doesn’t it seem totally non-obvious that an object that has a complete set of symmetries whose size is divisible by 3, say, must also have a smaller complete set of symmetries (one drawn from a smaller closed set of allowed transformations) whose size is 3? But it’s true! Think about the symmetries of a regular triangle. There are six of them, and correspondingly there’s a closed set of symmetries of size 3, namely, the do-nothing transformation and the two rotations! And the same thing applies for any symmetry group of size 3N.

Furthermore, Sylow’s theorems give us really strong constraints on the number of subgroups of various sizes, none of which seem at all intuitive. Why must it be that if an object has 9N symmetries, for some N that’s not divisible by 3, then the number of complete sub-collections of exactly 9 of those symmetries must divide N? Thinking about and trying to make intuitive sense of these connections is quite mind-boggling to me.

The EPR Paradox

The Paradox

I only recently realized how philosophical the original EPR paper was. It starts out by providing a sufficient condition for something to be an “element of reality”, and proceeds from there to try to show the incompleteness of quantum mechanics. Let’s walk through this argument here:

The EPR Reality Condition: If at time t we can know the value of a measurable quantity with certainty without in any way disturbing the system, then there is an element of reality corresponding to that measurable quantity at time t. (i.e. this is a sufficient condition for a measurable property of a system at some moment to be an element of the reality of that system at that moment.)

Example 1: If you measure an electron spin to be up in the z direction, then quantum mechanics tells you that you can predict with certainty that the spin in the z direction will be up at any future measurement. Since you can predict this with certainty, there must be an aspect of reality corresponding to the electron z-spin after you have measured it to be up the first time.

Example 2: If you measure an electron spin to be up in the z-direction, then QM tells you that you cannot predict the result of measuring the spin in the x-direction at a later time. So the EPR reality condition does not entail that the x-spin is an element of the reality of this electron. It also doesn’t entail that the x-spin is NOT an element of the reality of this electron, because the EPR reality condition is merely a sufficient condition, not a necessary condition.

Now, what does the EPR reality condition have to say about two particles with entangled spins? Well, suppose the state of the system is initially

|Ψ> = (|↑↓> – |↓↑>) / √2

This state has the unusual property that it has the same form no matter what basis you express it in. You can show for yourself that in the x-spin basis, the state is equal to

|Ψ> = (|→←> – |←→>) / √2
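If you’d like a numerical check (a sketch, using the standard convention |→> = (|↑> + |↓>)/√2 and |←> = (|↑> − |↓>)/√2): the singlet written in the z-basis agrees, up to an overall phase, with the same antisymmetric combination written in the x-basis, and the outcomes of measuring both spins along the same axis are perfectly anti-correlated.

```python
import numpy as np

up    = np.array([1.0, 0.0])
down  = np.array([0.0, 1.0])
right = (up + down) / np.sqrt(2)          # x-spin "right"
left  = (up - down) / np.sqrt(2)          # x-spin "left"

def pair(a, b):
    """Two-particle state |a>|b>."""
    return np.kron(a, b)

# The singlet written in the z-basis...
psi_z = (pair(up, down) - pair(down, up)) / np.sqrt(2)
# ...and the same antisymmetric combination written in the x-basis.
psi_x = (pair(right, left) - pair(left, right)) / np.sqrt(2)

# The two expressions describe the same state, up to an overall phase:
print("overlap:", abs(np.dot(psi_z, psi_x)))              # 1.0

# Probabilities of the four outcomes when both spins are measured along z:
for label, outcome in [("up,up", pair(up, up)), ("up,down", pair(up, down)),
                       ("down,up", pair(down, up)), ("down,down", pair(down, down))]:
    print(label, round(float(np.dot(outcome, psi_z))**2, 3))
# Only the anti-correlated outcomes (up,down) and (down,up) occur, each with
# probability 1/2, and the same is true for measurements along x.
```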

Now, suppose that you measure the first electron in the z-basis and find it to be up. If you do this, then you know with certainty that the other electron will be measured to be down (the spins are perfectly anti-correlated). This means that after measuring electron 1 in the z-basis, the EPR reality condition says that electron 2 has z-spin down as an element of reality.

What if you instead measure the first electron in the x-basis and find it to be right? Well, then the EPR reality condition will tell you that electron 2 has x-spin left as an element of reality.

Okay, so we have two claims:

  1. That after measuring the z-spin of electron 1, electron 2 has a definite z-spin, and
  2. that after measuring the x-spin of electron 1, electron 2 has a definite x-spin.

But notice that these two claims are not necessarily inconsistent with the quantum formalism, since they refer to the state of the system after a particular measurement. What’s required to bring out a contradiction is a further assumption, namely the assumption of locality.

For our purposes here, locality just means that it’s possible to measure the spin of electron 1 in such a way as to not disturb the state of electron 2. This is a really weak assumption! It’s not saying that any time you measure the spin of electron 1, you will not have disturbed electron 2. It’s just saying that it’s possible in principle to set up a measurement of the first electron in such a way as to not disturb the second one. For instance, take electrons 1 and 2 to opposite sides of the galaxy, seal them away in totally closed off and causally isolated containers, and then measure electron 1. If you agree that this should not disturb electron 2, then you agree with the assumption of locality.

Now, with this additional assumption, Einstein Podolsky and Rosen realized that our earlier claims (1) and (2) suddenly come into conflict! Why? Because if it’s possible to measure the z-spin of electron 1 in a way that doesn’t disturb electron 2 at all, then electron 2 must have had a definite z-spin even before the measurement of electron 1!

And similarly, if it’s possible to measure the x-spin of electron 1 in a way that doesn’t disturb electron 2, then electron 2 must have had a definite x-spin before the first electron was measured!

What this amounts to is that our two claims become the following:

  1. Electron 2 has a definite z-spin at time t before the measurement.
  2. Electron 2 has a definite x-spin at time t before the measurement.

And these two claims are in direct conflict with quantum theory! Quantum mechanics refuses to assign a simultaneous x and z spin to an electron, since these are incompatible observables. This entails that if you buy into locality and the EPR reality condition, then you must believe that quantum mechanics is an incomplete description of nature, or in other words that there are elements of reality that cannot be described by quantum mechanics.

The Resolution(s)

Our argument rested on two premises: the EPR reality condition and locality. Its conclusion was that quantum mechanics was incomplete. So naturally, there are three possible paths you can take to respond: accept the conclusion, deny the second premise, or deny the first premise.

To accept the conclusion is to agree that quantum mechanics is incomplete. This is where hidden variable approaches fall, and was the path that Einstein dearly hoped would be vindicated. For complicated reasons that won’t be covered in this post, but which I talk about here, the prospects for any local realist hidden variables theory (which was what Einstein wanted) look pretty dim.

To deny the second premise is to say that in fact, measuring the spin of the first electron necessarily disturbs the state of the second electron, no matter how you set things up. This is in essence a denial of locality, since the two electrons can be space-like separated, meaning that this disturbance must have propagated faster than the speed of light. This is a pretty dramatic conclusion, but it is what orthodox quantum mechanics in fact says. (It’s implied by the collapse postulate.)

To deny the first premise is to say that in fact there can be some cases in which you can predict with certainty a measurable property of a system, but where nonetheless there is no element of reality corresponding to this property. I believe that this is where Many-Worlds falls, since measurement of z-spin doesn’t result in an electron in an unambiguous z-spin state, but in a combined superposition of yourself, your measuring device, the electron, and the environment. Needless to say, in this complicated superposition there is no definite fact about the z-spin of the electron.

I’m a little unsure about the right place to put psi-epistemic approaches like Quantum Bayesianism, which resolve the paradox by treating the wave function not as a description of reality, but solely as a description of our knowledge. In this way of looking at things, it’s not surprising that learning something about an electron at one place can instantly tell you something about an electron at a distant location. This does not imply any faster-than-light communication, because all that’s being described is the way that information-processing occurs in a rational agent’s brain.

Measurement without interaction in quantum mechanics

In front of you is a sealed box, which either contains nothing OR an incredibly powerful nuclear bomb, the explosion of which threatens to wipe out humanity permanently. Even worse, this bomb is incredibly unstable and will blow up at the slightest contact with a single photon. This means that anybody that opens the box to look inside and see if there really is a bomb in there would end up certainly activating it and destroying the world. We don’t have any way to deactivate the bomb, but we could maintain it in isolation for arbitrarily long, despite the prohibitive costs of totally sealing it off from all contact.

Now, for obvious reasons, it would be extremely useful to know whether or not the bomb is actually active. If it’s not, the world can breathe a sigh of relief and not worry about spending lots of money on keeping it sealed away. And if it is, we know that the money is worth spending.

The obvious problem is that any attempt to test whether there is a bomb inside will involve in some way interacting with the box’s contents. And as we know, any such interaction will cause the bomb to detonate! So it seems that we’re stuck in this unfortunate situation where we have to act in ignorance of the full details of the situation. Right?

Well, it turns out that there’s a clever way that you can use quantum mechanics to do an “interaction-free measurement” that extracts some information from the system without causing the bomb to explode!

To explain this quantum bomb tester, we have to first start with a simpler system, a classic quantum interferometer setup:

[Figure: the basic interferometer setup, with a laser, two beam splitters, and detectors A and B.]

At the start, a photon is fired from the laser on the left. This photon then hits a beam splitter, which deflects the path of the photon with probability 50% and otherwise does nothing. It turns out that a photon that gets deflected by the beam splitter will pick up a 90º phase, which corresponds to multiplying the state vector by exp(iπ/2) = i. Each path is then redirected to another beam splitter, and then detectors are aligned across the two possible trajectories.

What do we get? Well, let’s just go through the calculation:

[Figure: the amplitude calculation for the empty interferometer.]

We get destructive interference, which results in all photons arriving at detector B.
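Here’s that calculation in code (a sketch, using the stated convention that each beam-splitter reflection multiplies the amplitude by i, with the detector labels chosen so that A sits on the dark output port):

```python
import numpy as np

# Two spatial modes for the photon. After the second beam splitter, mode 0
# feeds detector A and mode 1 feeds detector B.
# A 50/50 beam splitter: transmission amplitude 1/sqrt(2), reflection i/sqrt(2).
BS = np.array([[1, 1j],
               [1j, 1]]) / np.sqrt(2)

photon = np.array([1, 0], dtype=complex)     # photon enters in mode 0

# First beam splitter, then the mirrors (which contribute only an overall
# phase, dropped here), then the second beam splitter.
out = BS @ (BS @ photon)

print("P(detector A) =", abs(out[0])**2)     # 0.0: complete destructive interference
print("P(detector B) =", abs(out[1])**2)     # 1.0: every photon ends up at detector B
```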

Now, what happens if you add a detector along one of the two paths? It turns out that the interference vanishes, and you find half the photons at detector A and the other half at detector B! That’s pretty weird… the observed frequencies appear to depend on whether or not you look at which path the photon went on. But that’s not quite right, because it turns out that you still get the 50/50 statistics whenever you place anything along one path whose state is changed by the passing photon!

Huh, that’s interesting… it indicates that by just looking for a photon at detector A, we can get evidence as to whether or not something interacted with the photon on the way to the detector! If we see a photon show up at the detector, then we know that there must have been some device which changed in state along the bottom path. Maybe you can already see where we’re going with this…

[Figure: the interferometer with the box placed along the bottom path.]

We have to put the box in the bottom path in such a way that if the box is empty, then when the photon passes by, nothing will change about either its state or the state of the photon. And if the box contains the bomb, then it will function like a detector (where the detection corresponds to whether or not the bomb explodes)!

Now, assuming that the box is empty, we get the same result as above. Let’s calculate the result we get assuming that the box contains the bomb:

[Figure: the amplitude calculation with the box, and possible bomb, in the bottom path.]

Something really cool happens here! We find that if the bomb is active, there is a 25% chance that the photon arrives at A without the bomb exploding. And remember, the photon arriving at detector A allows us to conclude with certainty that the bomb is active! In other words, this setup gives us a 25% chance of safely extracting that information!
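Here’s the same toy calculation with a live bomb in the bottom arm, treated as a perfect which-path detector (again just a sketch, using the conventions from the earlier snippet):

```python
import numpy as np

BS = np.array([[1, 1j],
               [1j, 1]]) / np.sqrt(2)            # same beam splitter convention as before

photon = np.array([1, 0], dtype=complex)
after_bs1 = BS @ photon                          # amplitudes on the two arms
                                                 # (mode 0 = bottom arm, mode 1 = top arm)

# A live bomb in the bottom arm acts as a perfect which-path measurement.
p_boom = abs(after_bs1[0])**2
print("P(explosion) =", p_boom)                  # 0.5

# If the bomb does not explode, the photon is definitely in the top arm, and
# that (no longer superposed) state hits the second beam splitter.
survivor = np.array([0, 1], dtype=complex)
out = BS @ survivor

print("P(no explosion, photon at A) =", (1 - p_boom) * abs(out[0])**2)  # 0.25: certifies a live bomb, safely
print("P(no explosion, photon at B) =", (1 - p_boom) * abs(out[1])**2)  # 0.25: inconclusive (an empty box also sends photons to B)
```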

25% is not that good, you might object. But it sure is better than 0%! And in fact, it turns out that you can strengthen this result, using a more complicated interferometer setup to learn with certainty whether the bomb is active with an arbitrarily small chance of setting off the bomb!
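The post doesn’t spell out the improved setup, but for the curious, here is a rough sketch of the standard refinement (the “quantum Zeno” version due to Kwiat and collaborators; the details below are my summary, not the author’s construction). Replace the single 50/50 beam splitter with N passes through a very weak one that rotates the photon between the two arms by an angle of π/2N per pass. If the box is empty, the small rotations add up to a full π/2 after N passes, so the photon ends up entirely in the second arm. If the bomb is live, each pass is immediately “measured” back into the first arm, exploding with probability sin²(π/2N) per pass, so:

\[
P(\text{explosion}) \;\le\; N\,\sin^2\!\Big(\frac{\pi}{2N}\Big) \;\approx\; \frac{\pi^2}{4N} \;\xrightarrow{\ N \to \infty\ }\; 0,
\]

while the photon (if no explosion occurs) stays in the first arm. Checking which arm the photon ends up in then reveals whether the bomb is live, with a detonation probability that shrinks as N grows.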

There are so many weird little things about quantum mechanics that defy our classical intuitions, and this “interaction-free measurement” trick is one of my new favorites.

Is the double slit experiment evidence that consciousness causes collapse?

No! No no no.

This might be surprising to those who know the basics of the double slit experiment. For those who don’t, very briefly:

A bunch of tiny particles are thrown one by one at a barrier with two thin slits in it, with a detector sitting on the other side. The pattern the particles form on the detector is an interference pattern, which appears to imply that each particle went through both slits in some sense, like a wave would do. Now, if you peek really closely at each slit to see which one each particle passes through, the results change! The pattern on the detector is no longer an interference pattern, but instead looks like the pattern you’d classically expect from particles each passing through only one slit!

When you first learn about this strange dependence of the experimental results on, apparently, whether you’re looking at the system or not, it appears to be good evidence that your conscious observation is significant in some very deep sense. After all, observation appears to lead to fundamentally different behavior, collapsing the wave to a particle! Right?? This animation does a good job of explaining the experiment in a way that really pumps the intuition that consciousness matters:

(Fair warning, I find some aspects of this misleading and just plain factually wrong. I’m linking to it not as an endorsement, but so that you get the intuition behind the arguments I’m responding to in this post.)

The feeling that consciousness is playing an important role here is a fine intuition to have before you dive deep into the details of quantum mechanics. But now consider that the exact same behavior would be produced by a very simple process that is very clearly not a conscious observation. Namely, just put a single spin qubit at one of the slits in such a way that if the particle passes through that slit, it flips the spin upside down. Guess what you get? The exact same results as you got by peeking at the slits. You never need to look at the particle as it travels through the slits to the detector in order to destroy the wave-like behavior. Apparently a single qubit is sufficient to do this!

It turns out that what’s really going on here has nothing to do with the collapse of the wave function and everything to do with the phenomenon of decoherence. Decoherence is what happens when a quantum superposition becomes entangled with the degrees of freedom of its environment in such a way that the branches of the superposition end up orthogonal to each other. Interference can only occur between the different branches if they are not orthogonal, which means that decoherence is sufficient to destroy interference effects. This is all stuff that all interpretations of quantum mechanics agree on.
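Here is a minimal way to write that out (the notation is mine; the post itself doesn’t include the formula). A superposition coupled to its environment, or to any measuring device, evolves as

\[
\big(\alpha\,|\text{top}\rangle + \beta\,|\text{bottom}\rangle\big)\otimes|E_0\rangle \;\longrightarrow\; \alpha\,|\text{top}\rangle|E_{\text{top}}\rangle + \beta\,|\text{bottom}\rangle|E_{\text{bottom}}\rangle,
\]

and the interference term on the screen is proportional to ⟨E_bottom|E_top⟩. If the interaction leaves the environment unchanged, the two environment states are identical and interference is untouched. If it leaves even one perfectly correlated record, such as a flipped qubit, the two environment states are orthogonal, the inner product is zero, and the interference term vanishes.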

Once you know that decoherence destroys interference effects (which all interpretations of quantum mechanics agree on), and also that a conscious being observing the state of a system is a process that results in extremely rapid and total decoherence (which everybody also agrees on), then the fact that observing the position of the particle causes interference effects to vanish becomes totally independent of the question of what causes wave function collapse. Whether or not consciousness causes collapse is 100% irrelevant to the results of the experiment, because regardless of which of these is true, quantum mechanics tells us to expect observation to result in the loss of interference!

This is why whether or not consciousness causes collapse has no real impact on what pattern shows up on the screen. All interpretations of quantum mechanics agree that decoherence is a thing that can happen, and decoherence is all that is required to explain the experimental results. The double slit experiment provides no evidence for consciousness causing collapse, but it also provides no evidence against it. It’s just irrelevant to the question! That said, given that people often hear the experiment presented in a way that makes it seem like evidence for consciousness causing collapse, hearing that qubits do the same thing should make them update downwards on this theory.

Decoherence is not wave function collapse

In the double slit experiment, particles travelling through a pair of thin slits exhibit wave-like behavior, forming an interference pattern on the detector that indicates that each particle in some sense travelled through both slits.

[Figure: the double slit interference pattern]

Now, suppose that you place a single spin bit at the top slit, which starts off in the state |↑⟩ and flips to |↓⟩ iff a particle travels through the top slit. We fire off a single particle at a time, and then each time swap out that spin bit for a new spin bit that also starts off in the state |↑⟩. This serves as an extremely simple measuring device which encodes the information about which slit each particle went through.

Now what will you observe on the screen? It turns out that you’ll observe the classically expected distribution, which is a simple average over the two individual possibilities without any interference.

[Figure: the classically expected two-slit pattern, with no interference]

Okay, so what happened? Remember that the first pattern we observed was the result of each particle being in a superposition over the two possible paths, with the two branches interfering on the way to the detector screen. So it looks like simply having one bit of information recording the path of the particle was sufficient to collapse the superposition! But wait! Doesn’t this mean that the “consciousness causes collapse” theory is wrong? The spin bit was apparently able to cause collapse all by itself, so assuming that it isn’t a conscious system, it looks like consciousness isn’t necessary for collapse! Theory disproved!

No. As you might be expecting, things are not this simple. For one thing, notice that this would also prove false any other theory of wave function collapse on which a single bit is not enough to cause collapse (including any theory that requires complex systems, macroscopic systems, or complex information processing). We should be suspicious of any simple argument that claims to conclusively prove a significant proportion of experts wrong.

To see what’s going on here, let’s look at what happens if we don’t assume that the spin bit causes the wave function to collapse. Instead, we’ll just model it as becoming fully entangled with the path of the particle, so that the state evolution over time looks like the following:

[Figure: the joint state of the particle and the spin bit evolving into an entangled superposition]

Now if we observe the particle’s position on the screen, the probability distribution we’ll observe is given by the Born rule. Assuming that we don’t observe the states of the spin bits, there are now two qualitatively indistinguishable branches of the wave function for each possible position on the screen. This means that the total probability for any given landing position will be given by the sum of the probabilities of each branch:

[Figure: applying the Born rule to the entangled state]

But hold on! Our final result is identical to the classically expected result! We just get the probability of the particle getting to |j⟩ from |A⟩, multiplied by the probability of being at |A⟩ in the first place (50%), plus the probability of the particle going from |B⟩ to |j⟩, multiplied by the same 50% probability of being at |B⟩.
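Concretely, here is a sketch of the calculation being described, with |A⟩ and |B⟩ the two slit states and |j⟩ a position on the screen (the notational details are my own). The entangling interaction and the resulting Born-rule probability are:

\[
\tfrac{1}{\sqrt{2}}\big(|A\rangle + |B\rangle\big)\otimes|\uparrow\rangle \;\longrightarrow\; \tfrac{1}{\sqrt{2}}\big(|A\rangle|\downarrow\rangle + |B\rangle|\uparrow\rangle\big)
\]

\[
P(j) \;=\; \tfrac{1}{2}\,\big|\langle j|A\rangle\big|^2 \;+\; \tfrac{1}{2}\,\big|\langle j|B\rangle\big|^2 \qquad (\text{the cross term carries a factor of } \langle\uparrow|\downarrow\rangle = 0),
\]

whereas with no spin bit the amplitudes add before squaring and the cross term survives:

\[
P(j) \;=\; \Big|\tfrac{1}{\sqrt{2}}\big(\langle j|A\rangle + \langle j|B\rangle\big)\Big|^2 \;=\; \tfrac{1}{2}\big|\langle j|A\rangle\big|^2 + \tfrac{1}{2}\big|\langle j|B\rangle\big|^2 + \operatorname{Re}\big(\langle j|A\rangle\,\langle j|B\rangle^{*}\big).
\]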

[Figure: the resulting classical pattern]

In other words, our prediction is that we’d observe the classical pattern of a bunch of individual particles, each going through exactly one slit, with 50% going through the top slit and 50% through the bottom. The interference has vanished, even though we never assumed that the wave function collapsed!

What this shows is that wave function collapse is not required to get particle-like behavior. All that’s necessary is that the different branches of the superposition end up not interfering with each other. And all that’s necessary for that is environmental decoherence, which is exactly what we had with the single spin bit!

In other words, environmental decoherence is sufficient to produce the same type of behavior that we’d expect from wave function collapse. This is because interference will only occur between non-orthogonal branches of the wave function, and the branches become orthogonal upon decoherence (by definition). A particle can be in a superposition of multiple states but still act as if it has collapsed!
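If you want to see this numerically, here is a small sketch (mine, not from the original post) that models the amplitude from each slit as a toy Gaussian wave packet and compares adding amplitudes (no which-path record) against adding probabilities (a which-path record exists, so the branches are orthogonal). The wave functions and parameters are arbitrary illustrative choices:

```python
import numpy as np

# Screen positions (arbitrary units)
x = np.linspace(-10, 10, 1000)

def slit_amplitude(x, center, k=3.0, width=2.0, distance=5.0):
    """Toy amplitude at screen position x for a wave coming from one slit:
    a Gaussian envelope times a phase set by the path length from the slit."""
    path_length = np.sqrt((x - center) ** 2 + distance ** 2)
    return np.exp(-(x - center) ** 2 / (2 * width ** 2)) * np.exp(1j * k * path_length)

psi_top = slit_amplitude(x, center=-1.0)      # branch: particle went through the top slit
psi_bottom = slit_amplitude(x, center=+1.0)   # branch: particle went through the bottom slit

# No which-path record: both branches end in the SAME environment state,
# so we add amplitudes first and then square -> cross term -> interference fringes.
p_coherent = 0.5 * np.abs(psi_top + psi_bottom) ** 2

# Spin bit records the path: the branches are tagged by orthogonal spin states,
# so the cross term vanishes and we add probabilities -> no fringes.
p_decohered = 0.5 * (np.abs(psi_top) ** 2 + np.abs(psi_bottom) ** 2)

# The difference between the two patterns is exactly the interference term.
interference_term = p_coherent - p_decohered
print("largest fringe contribution:", np.max(np.abs(interference_term)))
```

Plotting p_coherent shows fringes, while p_decohered is just the smooth 50/50 mixture of the two single-slit humps, which is exactly the “classical pattern” described above.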

Now, maybe we want to say that the particle’s wave function is collapsed when its position is measured by the screen. But this isn’t necessary either! You could just say that the detector enters into a superposition and quickly decoheres, such that the different branches of the wave function (one for each possible detector state) very suddenly become orthogonal and can no longer interfere. And then you could say that the collapse only really happens once a conscious being observes the detector! Or you could be a Many-Worlder and say that the collapse never happens (although then you’d have to figure out where the probabilities are coming from in the first place).

You might be tempted to say at this point: “Well, then all the different theories of wave function collapse are empirically equivalent! At least, all the theories that say ‘wave function collapse = total decoherence, possibly plus some other necessary conditions’. Since total decoherence removes all interference effects, the results of all experiments will be indistinguishable from the results predicted by saying that the wave function collapsed at some point!”

But hold on! This is forgetting a crucial fact: decoherence is reversible, while wave function collapse is not!!! 

[Figure: pretty picture from doi: 10.1038/srep15330]

Let’s say that you run the same setup as before, with the spin bit recording the information about which slit the particle went through, but then destroy that information before it interacts with the environment in any way, thereby removing any trace of the measurement. Now the two branches of the wave function have “recohered,” meaning that what we’ll observe is back to the interference pattern! (There’s a VERY IMPORTANT caveat: the information stored in the spin bit has to be destroyed before the particle hits the detector screen and the screen’s state couples to its environment, at which point the record of which slit the particle went through would be irreversibly decohered.)
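One minimal way to make “destroying that information” precise (my formalization; the post doesn’t spell it out): before the spin bit interacts with anything else, apply a unitary to the particle–spin system that undoes the entangling interaction, which must exist because Schrödinger evolution is invertible:

\[
\tfrac{1}{\sqrt{2}}\big(|A\rangle|\downarrow\rangle + |B\rangle|\uparrow\rangle\big) \;\xrightarrow{\ U_{\text{erase}}\ }\; \tfrac{1}{\sqrt{2}}\big(|A\rangle + |B\rangle\big)\otimes|\uparrow\rangle.
\]

After this, the two branches once again share the same spin state, the cross term in the Born rule comes back, and so does the interference pattern on the screen.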

If you’re a collapse purist who says that wave function collapse = total decoherence (i.e. orthogonality of the relevant branches of the wave function), then you’ll end up making the wrong prediction! Why? Well, because according to you, the wave function collapsed as soon as the information was recorded, so there was no “other branch of the wave function” left to recohere with once the information was destroyed!

This has some pretty fantastic implications. Since IN PRINCIPLE even the type of decoherence that occurs when your brain registers an observation is reversible (after all, the Schrödinger equation is reversible), you could IN PRINCIPLE recohere after an observation, allowing the branches of the wave function to interfere with each other again. These are big “in principle”s, which is why I wrote them big. But if you could somehow do this, then the “Consciousness Causes Collapse” theory would give different predictions from Many-Worlds! If your final observation shows evidence of interference, then “consciousness causes collapse” is wrong, since apparently conscious observation was not sufficient to make the other branches of the wave function vanish. Otherwise, if you observe the classical pattern, then Many-Worlds is wrong, since the observation indicates that the other branches of the wave function were gone for good and couldn’t come back to recohere.

This suggests a general way to IN PRINCIPLE test any theory of wave function collapse: Look at processes right beyond the threshold where the theory says wave functions collapse. Then implement whatever is required to reverse the physical process that you say causes collapse, thus recohering the branches of the wave function (if they still exist). Now look to see if any evidence of interference exists. If it does, then the theory is proven wrong. If it doesn’t, then it might be correct, and any theory of wave function collapse that demands a more stringent standard for collapse (including Many-Worlds, the most stringent of them all) is proven wrong.