Deriving the Lorentz transformation

My last few posts have been all about visualizing the Lorentz transformation, the coordinate transformation in special relativity. But where does this transformation come from? In this post, I’ll derive it from basic principles. I first saw this derivation roughly a year ago, and have since tried unsuccessfully to re-find the source. It isn’t the algebraically simplest derivation I’ve seen, but it is the conceptually simplest. The principles we’ll use to derive the transformation should all seem extremely obvious to you.

So let’s dive straight in!

The Lorentz transformation in full generality is a 4D matrix that tells you how to transform spacetime coordinates in one inertial reference frame to spacetime coordinates in another inertial reference frame. It turns out that once you’ve found the Lorentz transformation for one spatial dimension, it’s quite simple to generalize it to three spatial dimensions, so for simplicity we’ll just stick to the 1D case. The Lorentz transformation also allows you to transform to a coordinate system that is both translated some distance and rotated some angle. Both of these are pretty straightforward, and work the way we intuitively think rotation and translation should work, so I won’t consider them either. The interesting part of the Lorentz transformation is what happens when we transform between reference frames that are in relative motion (moving with respect to one another). Strictly speaking, this is called a Lorentz boost. That’s what I’ll be deriving for you: the 1D Lorentz boost.

So, we start by imagining some reference frame, in which an event is labeled by its temporal and spatial coordinates: t and x. Then we look at a new reference frame moving at velocity v with respect to the starting reference frame. We describe the temporal and spatial coordinates of the same event in the new coordinate system: t’ and x’. In general, these new coordinates can be any function whatsoever of the starting coordinates and the velocity v.

Screen Shot 2018-12-09 at 10.31.11 PM.png

To narrow down what these functions f and g might be, we need to postulate some general relationship between the primed and unprimed coordinate systems.

So, our first postulate!

1. Straight lines stay straight.

Our first postulate is that all observers in inertial reference frames will agree about whether an object is moving at a constant velocity. Since objects moving at constant velocities are straight lines on diagrams of position vs time, this is equivalent to saying that a straight path through spacetime in one reference frame is a straight path through spacetime in all reference frames.

More formally, if x is proportional to t, then x’ is proportional to t’ (though the constant of proportionality may differ).

Screen Shot 2018-12-09 at 10.41.03 PM.png

This postulate turns out to be immensely powerful. There is a special name for the types of transformations that keep straight lines straight: they are linear transformations. (Note, by the way, that the linearity is only in the coordinates t and x, since those are the things that retain straightness. There is no guarantee that the dependence on v will be linear, and in fact it will turn out not to be.)

 These transformations are extremely simple, and can be represented by a matrix. Let’s write out the matrix in full generality:

Screen Shot 2018-12-09 at 10.45.02 PM.png

We’ve gone from two functions (f and g) to four (A, B, C, and D). But in exchange, each of these four functions is now only a function of one variable: the velocity v. For ease of future reference, I’ve chosen to name the matrix T(v).
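Spelled out, this general linear form can be sketched as follows. The convention here is an assumption on my part: T(v) acts on the column vector (t, x), and A, B, C, D label its entries row by row.

```latex
\begin{pmatrix} t' \\ x' \end{pmatrix}
= T(v) \begin{pmatrix} t \\ x \end{pmatrix},
\qquad
T(v) = \begin{pmatrix} A(v) & B(v) \\ C(v) & D(v) \end{pmatrix}
```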

So, our first postulate gives us linearity. On to the second!

2. An object at rest in the starting reference frame is moving with velocity -v in the moving reference frame.

This is more or less definitional. If somebody tells you that they have a function that transforms coordinates from one reference frame to a moving reference frame, then the most basic check you can do to see if they’re telling the truth is to verify that objects at rest in the starting reference frame end up moving in the final reference frame. And again, it seems to follow from what it means for the reference frame to be moving right at 1 m/s that the initially stationary objects should end up moving left at 1 m/s.

Let’s consider an object sitting at rest at x = 0 in the starting frame of reference. Then we have:

Screen Shot 2018-12-09 at 10.52.06 PM.png

We can plug this into our matrix to get a constraint on the functions A and C:

Screen Shot 2018-12-09 at 10.54.59 PM.png

Great! We’ve gone from four functions to three!

Screen Shot 2018-12-09 at 10.56.02 PM.png
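In symbols, this step can be sketched as below, assuming the convention that T(v) acts on the column vector (t, x) with entries A, B, C, D labeled row by row. An object at rest at x = 0 lands on t’ = At and x’ = Ct, and requiring x’ = -vt’ fixes C in terms of A.

```latex
\begin{pmatrix} t' \\ x' \end{pmatrix}
= \begin{pmatrix} A & B \\ C & D \end{pmatrix}
  \begin{pmatrix} t \\ 0 \end{pmatrix}
= \begin{pmatrix} A t \\ C t \end{pmatrix},
\qquad
x' = -v t' \;\Rightarrow\; C = -vA,
\qquad
T(v) = \begin{pmatrix} A & B \\ -vA & D \end{pmatrix}
```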

3. Moving to the left at velocity v and to the right at the same velocity is the same as not moving at all.

More specifically: Start with any reference frame. Now consider a new reference frame that is moving at velocity v with respect to the starting reference frame. Now, from this new reference frame, consider a third reference frame that is moving at velocity -v. This third reference frame should be identical to the one we started with. Got it?

Formally, this is simply saying the following:

Screen Shot 2018-12-09 at 11.01.36 PM.png

(I is the identity matrix.)

To make this equation useful, we need to say more about T(-v). In particular, it would be best if we could express T(-v) in terms of our three functions A(v), B(v), and D(v). We do this with our next postulate:

4. Moving at velocity -v is the same as turning 180°, then moving at velocity v, then turning 180° again.

Again, this is quite self-explanatory. As a geometric fact, the reference frame you end up with by turning around, moving at velocity v, and then turning back has got to be the same as the reference frame you’d end up with by moving at velocity -v. All we need to formalize this postulate is the matrix corresponding to rotating 180°.

Screen Shot 2018-12-09 at 11.07.28 PM.png

There we go! Rotating by 180° is the same as taking every position in the starting reference frame and flipping its sign. Now we can write our postulate more precisely:

Screen Shot 2018-12-09 at 11.09.47 PM

Screen Shot 2018-12-09 at 11.10.44 PM.png

Now we can finally use Postulate 3!

Screen Shot 2018-12-09 at 11.11.56 PM

Doing a little algebra, we get…

Screen Shot 2018-12-09 at 11.12.42 PM.png

(You might notice that we can only conclude that A = D if we reject the possibility that A = B = 0. We are allowed to do this because allowing A = B = 0 gives us a trivial result in which a moving reference frame experiences no time. Prove this for yourself!)

Now we have managed to express all four of our starting functions in terms of just one!

Screen Shot 2018-12-09 at 11.18.23 PM.png
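The algebra behind Postulates 3 and 4 can be sketched as follows. This assumes T(v) acts on the column vector (t, x), that C = -vA from Postulate 2, that v is nonzero, and it writes R for the 180° rotation (my label, not one used above), which flips the sign of x.

```latex
R = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},
\qquad
T(-v) = R\,T(v)\,R = \begin{pmatrix} A & -B \\ vA & D \end{pmatrix}

T(-v)\,T(v)
= \begin{pmatrix} A^2 + vAB & B(A - D) \\ vA(A - D) & D^2 + vAB \end{pmatrix}
= \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}

\Rightarrow\quad D = A,
\qquad
B = \frac{1 - A^2}{vA},
\qquad
T(v) = \begin{pmatrix} A & \dfrac{1 - A^2}{vA} \\ -vA & A \end{pmatrix}
```

The off-diagonal entries are where the A = D conclusion comes from: both vanish if A = D, and the only other way out is A = B = 0.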

So far our assumptions have been grounded almost entirely in a priori considerations about what we mean by velocity. It’s pretty amazing how far we got with so little! But to progress, we need to include one final a posteriori postulate, the one that motivated Einstein to develop special relativity in the first place: the invariance of the speed of light.

5. Light’s velocity is c in all reference frames.

The motivation for this postulate comes from mountains of empirical evidence, as well as good theoretical arguments from the nature of light as an electromagnetic phenomenon. We can write it quite simply as:

Screen Shot 2018-12-09 at 11.43.23 PM

Plugging in our transformation, we get:

Screen Shot 2018-12-09 at 11.43.28 PM

Multiplying the time coordinate by c must give us the space coordinate:

Screen Shot 2018-12-10 at 3.27.16 AM
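Here is a sketch of this last step. It assumes the one-function form of T(v) acting on (t, x), with entries A, (1 - A²)/(vA), -vA, and A, that follows from the first four postulates. Put x = ct in, demand x’ = ct’, and solve for A, taking the positive root so that T(0) is the identity.

```latex
t' = A t + \frac{1 - A^2}{vA}\, c t,
\qquad
x' = -vA\, t + A\, c t

x' = c t'
\;\Rightarrow\;
A(c - v) = cA + \frac{c^2 (1 - A^2)}{vA}
\;\Rightarrow\;
A^2 \left(1 - \frac{v^2}{c^2}\right) = 1

A = \frac{1}{\sqrt{1 - v^2/c^2}} \equiv \gamma,
\qquad
t' = \gamma \left(t - \frac{v x}{c^2}\right),
\qquad
x' = \gamma\,(x - v t)
```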

And we’re done with the derivation!

Summarizing our five postulates:

Screen Shot 2018-12-10 at 12.37.23 AM.png

And our final result:

Screen Shot 2018-12-10 at 3.29.09 AM.png
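As a sanity check, here is a minimal Python sketch that tests the standard 1D boost against Postulates 2, 3, and 5 numerically. The function names (boost, apply, matmul), the choice of units with c = 1, and the test velocity 0.6 are all assumptions made for illustration.

```python
import math

C = 1.0  # speed of light, in units where c = 1 (an assumption for convenience)

def boost(v, c=C):
    """Return the 2x2 boost matrix acting on (t, x), as nested tuples."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return ((gamma, -gamma * v / c**2),
            (-gamma * v, gamma))

def apply(m, event):
    """Apply a 2x2 matrix to an event (t, x)."""
    t, x = event
    return (m[0][0] * t + m[0][1] * x, m[1][0] * t + m[1][1] * x)

def matmul(a, b):
    """Multiply two 2x2 matrices."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

v = 0.6
T = boost(v)

# Postulate 2: an object at rest at x = 0 moves at -v in the new frame.
t1, x1 = apply(T, (1.0, 0.0))
rest_velocity = x1 / t1

# Postulate 3: boosting by v and then by -v is the identity.
round_trip = matmul(boost(-v), T)

# Postulate 5: a light ray x = c t still satisfies x' = c t'.
t2, x2 = apply(T, (1.0, C))
light_speed = x2 / t2

print(rest_velocity, round_trip, light_speed)
```

Up to floating-point error, the rest velocity comes out as -v, the round trip as the identity matrix, and the light speed as c.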

Swapping the past and future

There are a few more cool things you can visualize with the special relativity program from my last post.

First of all, a big theme of the last post was the ambiguity of temporal orderings. It’s easy to see the temporal ordering of events when there are only three, but gets harder when you have many many events. Let’s actually display the temporal order on the visualization, so that we can see how it changes for different frames of reference.

Display Order Of Three Events

Order of Many Events.gif

Looking at this second GIF, you can see the immense ambiguity that there is in the temporal order of events.

Now, where things get even more interesting is when we consider the spacetime coordinates of events that are not in your future light cone. Check this out:

Outside the Light Cone.gif

Here’s a more detailed image of the paths traced out by events as you change your velocity:

Screen Shot 2018-12-06 at 10.22.20 PM.png

Instead of just looking at events in your future light cone, we’re now also looking at events outside of your light cone!

We chose to look at a bunch of events that are initially all in your future (in the frame of reference where v = 0). Notice now that as we vary the velocity, some of these events end up at earlier times than you! In other words, by changing your frame of reference, events that were in your future can end up in your past. And vice versa; events in the past of one frame of reference can be in the future in the other.

We can see this very clearly by considering just two events.

Future Past Swap.gif

In the v = 0 frame, Red and Green are simultaneous with you. But for v > 0, Green is before Red is before you, and for v < 0, Green is after Red is after you. The lesson is the following: when considering events outside of your light cone, there is no fact of the matter about which events are in your future and which ones are in your past.
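The swap can be reproduced with a few lines of Python. This is only a sketch: the event coordinates are hypothetical (both at t = 0, with Green further to the right than Red), units have c = 1, and boosted_time is a name I made up.

```python
import math

def boosted_time(t, x, v, c=1.0):
    """Time coordinate of event (t, x) in a frame moving at velocity v."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (t - v * x / c**2)

# Hypothetical events, both simultaneous with the observer (t = 0) when v = 0.
red = (0.0, 1.0)    # (t, x)
green = (0.0, 2.0)

for v in (-0.5, 0.0, 0.5):
    t_red = boosted_time(*red, v)
    t_green = boosted_time(*green, v)
    print(f"v = {v:+.1f}: t'_red = {t_red:+.3f}, t'_green = {t_green:+.3f}")
```

For positive v both times are negative, with Green earlier than Red; for negative v the order reverses.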

Now, notice that in the above GIFs we never see events that are in causal contact leave causal contact, or vice versa. This holds true in general. While things certainly do get weirder when considering events outside your light cone, it is still the case that all observers will agree on what events are in causal contact with one another. And just like before, the temporal ordering of events in causal contact does not depend on your frame of reference. In other words, basketballs are always tossed before they go through the net, even outside your light cone.

The same holds when considering interactions between a pair of events that straddle either side of your light cone:

Straddling No Cause.gif

Straddling With Cause

If A is in B’s light cone from one frame of reference, then A is in B’s light cone from all frames of reference. And if A is out of B’s light cone in one frame of reference, then it is out of B’s light cone in all frames of reference. Once again, we see that special relativity preserves as absolute our bedrock intuitions about causality, even when many of our intuitions about time’s objectivity fall away.
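A quick numerical sketch of this claim: the quantity (c Δt)² - (Δx)² between two events is unchanged by a boost, so whether one event sits inside the other’s light cone cannot depend on the frame. The events and function names below are hypothetical, and units have c = 1.

```python
import math

def boost_event(event, v, c=1.0):
    """Lorentz-boost an event (t, x) into a frame moving at velocity v."""
    t, x = event
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return (gamma * (t - v * x / c**2), gamma * (x - v * t))

def in_causal_contact(a, b, c=1.0, tol=1e-9):
    """True if b lies inside or on the light cone of a."""
    dt = b[0] - a[0]
    dx = b[1] - a[1]
    return (c * dt) ** 2 - dx ** 2 >= -tol

# Hypothetical events (t, x), chosen only for illustration.
A = (0.0, 0.0)
B = (2.0, 1.0)   # inside A's future light cone
D = (1.0, 3.0)   # outside A's light cone

for v in (-0.9, -0.5, 0.0, 0.5, 0.9):
    a, b, d = (boost_event(e, v) for e in (A, B, D))
    # Columns: v, A-B in contact?, A-D in contact?, B still after A?
    print(v, in_causal_contact(a, b), in_causal_contact(a, d), b[0] > a[0])
```

In every frame tried, B stays in causal contact with A and stays later than A, while D stays out of contact (even though its time ordering relative to A is free to flip).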

Now, all of the implications of special relativity that I’ve discussed so far have been related to time and causality. But there’s also some strange stuff that happens with space. For instance, let’s consider a series of events corresponding to an object sitting at rest some distance away from you. On our diagram this looks like the following:

Screen Shot 2018-12-08 at 11.12.10 PM.png

What does this look like when we are moving towards the object? Obviously the object should now be getting closer to us, so we expect the red line to tilt inwards towards the x = 0 point. Here’s what we see at 80% of the speed of light:

Screen Shot 2018-12-08 at 11.14.01 PM.png

As we expected, the object now rushes towards us from our frame of reference, and quickly passes us by and moves off to the left. But notice the spatial distortion in the image! At the present moment (t = 0), the object is significantly closer than it was previously. (You can see this by starting from the center point and looking to the right to see how much distance you cover before intersecting with the object. This is the distance to the object at t = 0.)

This is extremely unusual! Remember, the moving frame of reference is at the exact same spatial position at t = 0 as the still frame of reference. So whether I am moving towards an object or standing still appears to change how far away the object presently is!

This is the famous phenomenon of length contraction. If we imagine placing two objects at different distances from the origin, each at rest with respect to the v = 0 frame, then moving towards them would result in both of them getting closer to us as well as each other, and thus shrinking! Evidently when we move, the universe shrinks!
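The size of the effect can be sketched in Python. The function name and the rest distances are hypothetical, and units have c = 1. The idea is to find the event on the object’s worldline that has t’ = 0 in the moving frame and read off its x’.

```python
import math

def distance_at_present(rest_distance, v, c=1.0):
    """Position x' of an object at the moving observer's present moment (t' = 0).

    The object sits at rest at x = rest_distance in the v = 0 frame.
    """
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    # The event on the object's worldline with t' = 0 has t = v * x / c**2.
    t = v * rest_distance / c**2
    return gamma * (rest_distance - v * t)

near = distance_at_present(1.0, 0.8)
far = distance_at_present(2.0, 0.8)
print(near, far, far - near)
```

At 80% of the speed of light every distance is scaled by 1/γ, which is about 0.6, so both objects are closer to you and the gap between them shrinks by the same factor.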

Contraction

One last effect we can see in the diagram appears to be a little at odds with what I’ve just said. This is that the observed distance between yourself and the object increases as you move towards it (and as the actual distance shrinks). Why? Well, what you observe is dictated by the beams of light that make it to your eye. So at the moment t = 0, what you are observing is everything along the two diagonals in the bottom half of the images. And in the second image, where you are moving towards the object, the place where the object and diagonal intersect is much further away than it is in the first image! Evidently, moving towards an object makes it appear further away, even though in reality it is getting closer to you!

This holds as a general principle. The reason? When you observe an object, you are really observing it as it was some time in the past (however much time it took for light to reach your eye). And when you move towards an object, that past moment you are observing falls further into the past. (This is sort of the flip-side of time dilation.) Since you are moving towards the object, looking further into the past means looking at the object when it was further away from you. And so the object ends up appearing more distant from you than before!
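Here is a minimal sketch of that calculation, with a hypothetical function name and units where c = 1. It takes the event where the object’s worldline crosses your past light cone and boosts it into the moving frame.

```python
import math

def apparent_distance(rest_distance, v, c=1.0):
    """Position x' of the emission event whose light reaches the origin at t = 0.

    The object sits at rest at x = rest_distance (> 0) in the v = 0 frame,
    so the light you see now left it at t = -rest_distance / c.
    """
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    t_emit = -rest_distance / c
    return gamma * (rest_distance - v * t_emit)

print(apparent_distance(1.0, 0.0), apparent_distance(1.0, 0.8))
```

For an object one unit away, moving towards it at 80% of the speed of light puts the emission event roughly three units away, even though the object’s present distance has contracted to about 0.6 units.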

There’s a bunch more weird and fascinating effects that you can spot in these types of visualizations, but I’ll stop there for now.

Visualizing Special Relativity

I’ve been thinking a lot about special relativity recently, and wrote up a fun program for visualizing some of its stranger implications. Before going on to these visualizations, I want to recommend the Youtube channel MinutePhysics, which made a fantastic primer on the subject. I’ll link the first few of these here, as they might help with understanding the rest of the post. I highly recommend the entire series, even if you’re already pretty familiar with the subject.

Now, on to the pretty images! I’m still trying to determine whether it’s possible to embed applets in my posts, so that you can play with the program for yourself. Until I figure that out, GIFs will have to suffice.

lots of particles

Let me explain what’s going on in the image.

First of all, the vertical direction is time (up is the future, down is the past), and the horizontal direction is space (which is 1D for simplicity). What we’re looking at is the universe as described by an observer at a particular point in space and time. The point that this observer is at is right smack-dab in the center of the diagram, where the two black diagonal lines meet. These lines represent the observer’s light cone: the paths through spacetime that would be taken by beams of light emitted in either direction. And finally, the multicolored dots scattered in the upper quadrant represent other spacetime events in the observer’s future.

Now, what is being varied is the velocity of the observer. Again, keep in mind that the observer is not actually moving through time in this visualization. What is being shown is the way that other events would be arranged spatially and temporally if the observer had different velocities.

Take a second to reflect on how you would expect this diagram to look classically. Obviously the temporal positions of events would not depend upon your velocity. What about the spatial positions of events? Well, if you move to the right, events in your future and to the right of you should be nearer to you than they would be had you not been in motion. And similarly, events in your future left should be further to the left. We can easily visualize this by plugging in the classical Galilean transformation:

Classical Transformation.gif

Just as we expected, time positions stay constant and spatial positions shift according to your velocity! Positive velocity (moving to the right) moves future events to the left, and negative velocity moves them to the right. Now, technically this image is wrong. I’ve kept the light paths constant, but even these would shift under the classical transformation. In reality we’d get something like this:

Classical Corrected.gif
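For reference, the classical transformation takes only a couple of lines. This is a sketch rather than the code behind the GIFs; the function name and sample events are made up, and c is set to 1.

```python
def galilean(t, x, v):
    """Classical transformation: time is untouched, position shifts by v * t."""
    return t, x - v * t

v = 0.5
c = 1.0

# A hypothetical event to the future right of the observer.
event = (2.0, 3.0)
# The event one time unit along a rightward-moving light ray.
light = (1.0, c * 1.0)

t_e, x_e = galilean(*event, v)
t_l, x_l = galilean(*light, v)
classical_light_speed = x_l / t_l

print((t_e, x_e), classical_light_speed)
```

The event keeps its time and moves left, and the light ray’s speed drops from c to c - v, which is exactly the shift in the light paths shown in the corrected image.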

Of course, the empirical falsity of this prediction that the speed of light should vary according to your own velocity is what drove Einstein to formulate special relativity. Here’s what happens with just a few particles when we vary the velocity:

RGB Transform

What I love about this is how you can see so many effects in one short gif. First of all, the speed of light stays constant. That’s a good sign! A constant speed of light is pretty much the whole point of special relativity. Secondly, and incredibly bizarrely, the temporal positions of objects depend on your velocity!! Objects to your future right don’t just get further away spatially when you move away from them, they also get further away temporally!

Another thing that you can see in this visualization is the relativity of simultaneity. When the velocity is zero, Red and Blue are at the same moment of time. But if our velocity is greater than zero, Red falls behind Blue in temporal order. And if we travel at a negative velocity (to the left), then we would observe Red as occurring after Blue in time. In fact, you can find a velocity that makes any two of these three points simultaneous!
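The velocity that makes two events simultaneous can be worked out directly: setting Δt’ = γ(Δt - v Δx/c²) to zero gives v = c² Δt/Δx, which is below the speed of light only when the events are outside each other’s light cones. Here is a sketch with hypothetical event coordinates and function names, in units where c = 1.

```python
import math

def boosted_time(event, v, c=1.0):
    """Time coordinate of an event (t, x) in a frame moving at velocity v."""
    t, x = event
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (t - v * x / c**2)

def simultaneity_velocity(a, b, c=1.0):
    """Velocity of the frame in which events a and b happen at the same time.

    Only exists when the events are outside each other's light cones.
    """
    dt = b[0] - a[0]
    dx = b[1] - a[1]
    if abs(dx) <= c * abs(dt):
        raise ValueError("events are in causal contact; no such frame exists")
    return c**2 * dt / dx

# Hypothetical events (t, x); Red and Blue are simultaneous at v = 0.
red = (1.0, -1.0)
green = (1.5, 0.5)
blue = (1.0, 2.0)

for a, b in ((red, blue), (red, green), (green, blue)):
    v = simultaneity_velocity(a, b)
    print(v, boosted_time(a, v), boosted_time(b, v))
```

Each printed line shows a velocity followed by two equal times.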

This leads to the next observation we can make: The temporal order of events is relative! The orderings of events that you can observe include Red-Green-Blue, Green-Red-Blue, Green-Blue-Red, and Blue-Green-Red. See if you can spot them all!

This is probably the most bonkers consequence of special relativity. In general, we cannot say without ambiguity that Event A occurred before or after Event B. The notion of an objective temporal ordering of events simply must be discarded if we are to hold onto the observation of a constant speed of light.

Are there any constraints on the possible temporal orderings of events? Or does special relativity commit us to having to say that from some valid frames of reference, the basketball going through the net preceded the throwing of the ball? Well, notice that above we didn’t get all possible orders… in particular we didn’t have Red-Blue-Green or Blue-Red-Green. It turns out that in general, there are some constraints we can place on temporal orderings.

Just for fun, we can add in the future light cones of each of the three events:

RGB with Light Cones.gif

Two things to notice: First, all three events are outside each other’s light cones. And second, no event ever crosses over into another event’s light cone. This makes some intuitive sense, and gives us a constant that will hold true in all reference frames: Events that are outside each other’s light cones from one perspective are outside each other’s light cones from all perspectives. Same thing for events that are inside each other’s light cones.

Conceptually, events being inside each other’s light cones corresponds to them being in causal contact. So another way we can say this is that all observers will agree on what the possible causal relationships in the universe are. (For the purposes of this post, I’m completely disregarding the craziness that comes up when we consider quantum entanglement and “spooky action at a distance.”)

Now, is it ever possible for events in causal contact to switch temporal order upon a change in reference frame? Or, in other words, could effects precede their causes? Let’s look at a diagram in which one event is contained inside the light cone of another:

RGB Causal

Looking at this visualization, it becomes quite obvious that this is just not possible! Blue is fully contained inside the future light cone of Red, and no matter what frame of reference we choose, it cannot escape this. Even though we haven’t formally proved it, I think that the visualization gives the beginnings of an intuition about why this is so. Let’s postulate this as another absolute truth: If Event A is contained within the light cone of Event B, all observers will agree on the temporal order of the two events. Or, in plainer language, there can be no controversy over whether a cause precedes its effects.

I’ll leave you with some pretty visualizations of hundreds of colorful events transforming as you change reference frames:

Pretty Transforms LQ

And finally, let’s trace out the set of possible space-time locations of each event.

Hyperbolas

Screen Shot 2018-12-06 at 3.22.43 PM.png

Try to guess what geometric shape these paths are! (They’re not parabolas.) Hint.

 

Fractals and Epicycles

There is no bilaterally-symmetrical, nor eccentrically-periodic curve used in any branch of astrophysics or observational astronomy which could not be smoothly plotted as the resultant motion of a point turning within a constellation of epicycles, finite in number, revolving around a fixed deferent.

Norwood Russell Hanson, “The Mathematical Power of Epicyclical Astronomy”

 

A friend recently showed me this image…

hilbert_epicycle.gif

…and thus I was drawn into the world of epicycles and fractals.

Epicycles were first used by the Greeks to reconcile observational data of the motions of the planets with the theory that all bodies orbit the Earth in perfect circles. It was found that epicycles allowed astronomers to retain their belief in perfectly circular orbits, as well as the centrality of Earth. The cost of this, however, was a system with many adjustable parameters (as many parameters as there were epicycles).

There’s a somewhat common trope about adding on endless epicycles to a theory, the idea being that by being overly flexible and accommodating of data you lose epistemic credibility. This happens to fit perfectly with my most recent posts on model selection and overfitting! The epicycle view of the solar system is one that is able to explain virtually any observational data. (There’s a pretty cool reason for this that has to do with the properties of Fourier series, but I won’t go into it.) The cost of this is a massive model with many parameters. The heliocentric model of the solar system, coupled with the Newtonian theory of gravity, turns out to be able to match all the same data with far fewer adjustable parameters. So by all of the model selection criteria we went over, it makes sense to switch over from one to the other.

(Of course, it is not the case that we should have been able to tell a priori that an epicycle model of the planets’ motions was a bad idea. “Every planet orbits Earth on at most one epicycle”, for instance, is a perfectly reasonable scientific hypothesis… it just so happened that it didn’t fit the data. And adding epicycles to improve the fit to data is also not bad scientific practice, so long as you aren’t ignoring other equally good models with fewer parameters.)

Okay, enough blabbing. On to the pretty pictures! I was fascinated by the Hilbert curve drawn above, so I decided to write up a program of my own that generates custom fractal images from epicycles. Here are some gifs I created for your enjoyment:

Negative doubling of angular velocity

(Each arm rotates in the opposite direction of the previous arm, and at twice its angular velocity. The length of each arm is half that of the previous.)

negative_doubling
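The curve traced by the last arm can be sketched as a sum of rotating complex numbers. This is not the Processing.py code behind the animations, just a minimal stand-in. The function names are hypothetical, and I am assuming each arm’s angular velocity is measured relative to a fixed frame rather than relative to the previous arm.

```python
import cmath
import math

def epicycle_tip(theta, n_arms=20, ratio=-2.0, shrink=0.5):
    """Position of the last arm's tip, as a complex number.

    Arm k has length shrink**k and angular velocity ratio**k (relative to a
    fixed frame, which is an assumption). With ratio = -2 each arm spins the
    opposite way from the previous one at twice the speed, and with
    shrink = 0.5 it is half as long.
    """
    return sum((shrink ** k) * cmath.exp(1j * (ratio ** k) * theta)
               for k in range(n_arms))

def trace(n_points=1000, **kwargs):
    """Sample the tip position over one full turn of the first arm."""
    return [epicycle_tip(2 * math.pi * i / n_points, **kwargs)
            for i in range(n_points)]

points = trace(n_points=2000)
print(len(points), points[0])
```

Changing ratio to 3.0, -3.0, or 4.0 gives the same construction with trebled, negatively trebled, or quadrupled angular velocities. Plotting the real part against the imaginary part of each point draws the curve.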

Trebling of angular velocity

trebling.gif

Negative trebling

negative_trebling

Here’s a still frame of the final product for N = 20 epicycles:

Screen Shot 2018-11-27 at 7.23.55 AM

Quadrupling

epicycles_frequency_quadrupling.gif

ω_n ~ (n+1)·2^n

(or, the Fractal Frog)

(n+1)*2^n.gif

ω_n ~ n, r_n ~ 1/n

radius ~ 1:n, frequency ~ n.gif

ω_n ~ n, constant r_n

singularity

ω_n ~ 2^n, r_n ~ 1/n^2

pincers

And here’s a still frame of N = 20:

high res pincers

(All animations were built using Processing.py, which I highly recommend for quick and easy construction of visualizations.)

Clarifying self-defeating beliefs

In a previous post, I mentioned self-defeating beliefs as a category that I am confused about. I wrote:

How should we reason about self defeating beliefs?

The classic self-defeating belief is “This statement is a lie.” If you believe it, then you are compelled to disbelieve it, eliminating the need to believe it in the first place. Broadly speaking, self-defeating beliefs are those that undermine the justifications for belief in them.

Here’s an example that might actually apply in the real world: Black holes glow. The process of emission is known as Hawking radiation. In principle, any configuration of particles with a mass less than the black hole can be emitted from it. Larger configurations are less likely to be emitted, but even configurations such as a human brain have a non-zero probability of being emitted. Henceforth, we will call such configurations black hole brains.

Now, imagine discovering some cosmological evidence that the era in which life can naturally arise on planets circling stars is finite, and that after this era there will be an infinite stretch of time during which all that exists are black holes and their radiation. In such a universe, the expected number of black hole brains produced is infinite (a tiny finite probability multiplied by an infinite stretch of time), while the expected number of “ordinary” brains produced is finite (assuming a finite spatial extent as well).

What this means is that discovering this cosmological evidence should give you an extremely strong boost in credence that you are a black hole brain. (Simply because most brains in your exact situation are black hole brains.) But most black hole brains have completely unreliable beliefs about their environment! They are produced by a stochastic process which cares nothing for producing brains with reliable beliefs. So if you believe that you are a black hole brain, then you should suddenly doubt all of your experiences and beliefs. In particular, you have no reason to think that the cosmological evidence you received was veridical at all!

I don’t know how to deal with this. It seems perfectly possible to find evidence for a scenario that suggests that we are black hole brains (I’d say that we have already found such evidence, multiple times). But then it seems we have no way to rationally respond to this evidence! In fact, if we do a naive application of Bayes’ theorem here, we find the probability of receiving any evidence in support of black hole brains to be 0!

So we have a few options. First, we could rule out any possible skeptical scenarios like black hole brains, as well as anything that could provide any amount of evidence for them (no matter how tiny). Or we could accept the possibility of such scenarios but face paralysis upon actually encountering evidence for them! Both of these seem clearly wrong, but I don’t know what else to do.

A friend (whose blog Compassionate Equilibria you should definitely check out) left a comment in response, saying:

I think I feel somewhat less confused about self-defeating beliefs (at least when considering the black hole brain scenario maybe I would feel more confused about other cases).

It seems like the problem might be when you say “imagine discovering some cosmological evidence that the era in which life can naturally arise on planets circling stars is finite, and that after this era there will be an infinite stretch of time during which all that exists are black holes and their radiation.” Presumably, whatever experience you had that you are interpreting as this cosmological evidence is an experience that you would actually be very unlikely to have given that you exist in that universe and as a result shouldn’t be interpreted as evidence for existing in such a universe. Instead you would have to think about in what kind of universe would you be most likely to have those experiences that naively seemed to indicate living in a universe with an infinity of black hole brains.

This could be a very difficult question to answer but not totally intractable. This also doesn’t seem to rule out starting with a high prior in being a black hole brain and it seems like you might even be able to get evidence for being a black hole brain (although I’m not sure what this would be; maybe having some crazy jumble of incoherent experiences while suddenly dying?).

I think this is a really good point that clears up a lot of my confusion on the topic. My response ended up being quite long, so I’ve decided to make it its own post.

 

*** My response starts here ***

 

The key point that I was stuck on before reading this comment was the notion that this argument puts a strong a priori constraint on the types of experiences we can expect to have. This is because P(E) is near zero when E strongly implies a theory and that theory undermines E.

Your point, which seems right, is: It’s not that it’s impossible or near impossible to observe certain things that appear to strongly suggest a cosmology with an infinity of black hole brains. It’s that we can observe these things, and they aren’t actually evidence for these cosmologies (for just the reasons you laid out).

That is, there just aren’t observations that provide evidence for radical skeptical scenarios. Observations that appear to provide such evidence prove not to do so upon closer examination. It’s about the fact that the belief that you are a black hole brain is by construction unmotivateable: this is what it means to say P(E) ~ 0. (More precisely, the types of observations that actually provide evidence for black hole brains are those that are not undermined by the belief in black hole brains. Your “crazy jumble of incoherent experiences” might be a good example of this. And importantly, basically any scientific evidence of the sort that we think could adjudicate between different cosmological theories will be undermined.)

One more thing as I digest this: Previously I had been really disturbed by the idea that I’d heard mentioned by Sean Carroll and others that one criterion for a feasible cosmology is that it doesn’t end up making it highly likely that we are black hole brains. This seemed like a bizarrely strong a priori constraint on the types of theories we allow ourselves to consider. But this actually makes a lot of sense if conceived of not as an a priori constraint but as a combination of two things: (1) updating on the strong experiential evidence that we are not black hole brains (the extremely structured and self-consistent nature of our experiences) and (2) noticing that these theories are very difficult to motivate, as most pieces of evidence that intuitively seem to support them actually don’t upon closer examination.

So (1) the condition that P(E) is near zero is not necessarily a constraint on your possible experiences, and (2) it makes sense to treat cosmologies that imply that we are black hole brains as empirically unsound and nearly unmotivateable.

Now, I’m almost all the way there, but still have a few remaining hesitations.

One thing is that things get more confusing when you break an argument for black hole brains down into its component parts and try to figure out where exactly you went wrong. Like, say you already have a whole lot of evidence that after a finite length of time, the universe will be black holes forever, but don’t yet know about Hawking radiation. So far everything is fine. But now scientists observe Hawking radiation. From this they conclude that black holes radiate, though they don’t have a theory of the stochastic nature of the process that entails that it can in principle produce brains. They then notice that Hawking radiation is actually predicted by combining aspects of QM and GR, and see that this entails that black holes can produce brains. Now they have all the pieces that together imply that they are black hole brains, but at which step did they go wrong? And what should they conclude now? They appear to have developed a mountain of solid evidence that, when put together (and combined with some anthropic reasoning), straightforwardly implies that they are black hole brains. But this can’t be the case, since this would undermine the evidence they started with.

We can frame this as a multilemma. The general reasoning process that leads to the conclusion that we are black hole brains might look like:

  1. We observe nature.
  2. We generate laws of physics from these observations.
  3. We predict from the laws of physics that there is a greater abundance of black hole brains than normal brains.
  4. We infer from (3) that we are black hole brains (via anthropic reasoning).

Either this process fails at some point, or we should believe that we are black hole brains. Our multilemma (five propositions, at least one of which must be accepted) is thus:

  1. Our observations of nature were invalid.
  2. Our observations were valid, but our inference of laws of physics from them was invalid.
  3. Our inference of laws of physics from our observations was valid, but our inference from these laws of there being a greater abundance of black hole brains than normal brains was invalid.
  4. Our inference from the laws of there being a greater abundance of black hole brains than normal brains was valid, but the anthropic step was invalid.
  5. We are black hole brains.

Clearly we want to deny (5). I also would want to deny (3) and (4) – I’m imagining them to be fairly straightforward deductive steps. (1) is just some form of skepticism about our access to nature, which I also want to deny. The best choice, it looks like, is (2): our inductive inference of laws of physics from observations of nature is flawed in some way. But even this is a hard bullet to bite. It’s not sufficient to just say that other laws of physics might equally well or better explain the data. What is required is to say that in fact our observations don’t really provide compelling evidence for QM, GR, and so on.

So the end result is that I pretty much want to deny every possible way the process could have failed, while also denying the conclusion. But we have to deny something! This is clearly not okay!

Summing up: The remaining disturbing thing to me is that it seems totally possible to accidentally run into a situation where your best theories of physics inevitably imply (by a process of reasoning each step of which you accept is valid) that you are a black hole brain, and I’m not sure what to do next at that point.

More on quantum entanglement and irreducibility

A few posts ago, I talked about how quantum mechanics entails the existence of irreducible states – states of particles that in principle cannot be described as the product of their individual components. The classic example of such an entangled state is the two qubit state

$$\frac{|00\rangle + |11\rangle}{\sqrt{2}}$$

This state describes a system which is in an equal-probability superposition of both particles being |0⟩ and both particles being |1⟩. As it turns out, this state cannot be expressed as the product of two single-qubit states.
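The irreducibility claim can be checked with a few lines of code. This is a minimal sketch, not a library routine: the helper name `is_product_state` is made up for illustration. It relies on a standard fact: a two-qubit state with amplitudes (a00, a01, a10, a11) factors into two single-qubit states exactly when a00·a11 − a01·a10 = 0.

```python
import math

def is_product_state(amps, tol=1e-12):
    """amps = (a00, a01, a10, a11), the amplitudes of |00>, |01>, |10>, |11>.

    The state factors into two single-qubit states exactly when
    a00*a11 - a01*a10 is zero.
    """
    a00, a01, a10, a11 = amps
    return abs(a00 * a11 - a01 * a10) < tol

s = 1 / math.sqrt(2)
bell = (s, 0, 0, s)               # (|00> + |11>)/sqrt(2)
plus_plus = (0.5, 0.5, 0.5, 0.5)  # |+>|+>, an ordinary product state

print(is_product_state(bell))       # False
print(is_product_state(plus_plus))  # True
```

For the entangled state the quantity is 1/2 rather than 0, so no pair of single-qubit states multiplies out to it.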

A friend of mine asked me a question about this that was good enough to deserve its own post in response. Start by imagining that Alice and Bob each have a coin. They each put their quarter inside a small box with heads facing up. Now they close their respective boxes, and shake them up in the exact same way. This is important! (as well as unrealistic) We suppose that whatever happens to the coin in Alice’s box, also happens to the coin in Bob’s box.

Now we have two boxes, each of which contains a coin, and these coins are guaranteed to be facing the same way. We just don’t know what way they are facing.

Alice and Bob pick up their boxes, being very careful to not disturb the states of their respective coins, and travel to opposite ends of the galaxy. The Milky Way is 100,000 light years across, so any communication between the two now would take a minimum of 100,000 years. But if Alice now opens her box, she instantly knows the state of Bob’s coin!

So while Alice and Bob cannot send messages about the state of their boxes any faster than 100,000 years, they can instantly receive information about each other’s boxes by just observing their own! Is this a contradiction?

No, of course not. While Alice does learn something about Bob’s box, this is not because of any message passed between the two. It is the result of the fact that in the past the configurations of their coins were carefully designed to be identical. So what seemed on its face to be special and interesting turns out to be no paradox at all.

Finally, we get to the question my friend asked. How is this any different from the case of entangled particles in quantum mechanics??

Both systems would be found to be in the states |00⟩ and |11⟩ with equal probability (where |0⟩ is heads and |1⟩ is tails). And both have the property that learning the state of one instantly tells you the state of the other. Indeed, the coins-in-boxes system also has the property of irreducibility that we talked about before! Try as we might, we cannot coherently treat the system of both coins as the product of two independent coins, as doing so will ignore the statistical dependence between the two coins.

(Which, by the way, is exactly the sort of statistical dependence that justifies timeless decision theory and makes it a necessary update to decision theory.)
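The statistical dependence between the two coins can be made concrete. The sketch below (variable and function names are illustrative) writes down the joint distribution of the two coins and compares it with the product of the two marginal distributions, which is what treating the coins as independent would amount to.

```python
from fractions import Fraction

# Joint distribution over (Alice's coin, Bob's coin): the coins always match.
joint = {
    ("H", "H"): Fraction(1, 2),
    ("T", "T"): Fraction(1, 2),
    ("H", "T"): Fraction(0),
    ("T", "H"): Fraction(0),
}

def marginal(joint, index):
    """Distribution of one coin alone (index 0 = Alice, 1 = Bob)."""
    out = {}
    for outcome, p in joint.items():
        out[outcome[index]] = out.get(outcome[index], Fraction(0)) + p
    return out

alice = marginal(joint, 0)
bob = marginal(joint, 1)

# What the joint distribution would be if the coins were independent.
product = {(a, b): alice[a] * bob[b] for a in alice for b in bob}

print(product[("H", "H")])  # 1/4
print(joint == product)     # False
```

Each coin alone is 50/50, but the independent model assigns probability 1/4 to HT and TH, outcomes that never occur.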

I love this question. The premise of the question is that we can construct a classical system that behaves in just the same supposedly weird ways that quantum systems behave, and thus make sense of all this mystery. And answering it requires that we get to the root of why quantum mechanics is a fundamentally different description of reality than anything classical.

So! I’ll describe the two primary disanalogies between entangled particles and “entangled” coins.

Epistemic Uncertainty vs Fundamental Indeterminacy

First disanalogy. With the coins, either they are both heads or they are both tails. There is an actual fact in the world about which of these two is true, and the probabilities we reference when we talk about the chance of HH or TT represent epistemic uncertainty. There is a true determinate state of the coins, and probability only arises as a way to deal with our imperfect knowledge.

On the other hand, according to the mainstream interpretation of quantum mechanics, the state of the two particles is fundamentally indeterminate. There isn’t a true fact out there waiting to be discovered about whether the state is |00⟩ or |11⟩. The actual state of the system is this unusual thing called a superposition of |00⟩ and |11⟩. When we observe it to be |00⟩, the state has now actually changed from the superposition to the determinate state.

We can phrase this in terms of counterfactuals: If when we look at the coins, we see that they are HH, then we know that they were HH all along. In particular, we know that if we had observed them a moment later or earlier, we would have gotten HH with 100% certainty. Given that we actually observed HH, the probability that we would have observed HH is 100%.

But if we observe the state of the particles to be |00⟩, this does not mean that had we observed it a moment before, we would be guaranteed to get the same answer. Given that we actually observed |00⟩, the probability that we would have observed |00⟩ is still 50%.

(A project for some enterprising reader: see what the truths of these counterfactuals imply for an interpretation of quantum mechanics in terms of Pearl-style causal diagrams. Is it even possible to do?)

Predictive differences

The second difference between the two cases is a straightforward experimental difference. Suppose that Alice and Bob identically prepare thousands of coins as we described before, and also identically prepare thousands of entangled particles. They ensure that the coins are treated exactly the same way, so that they are guaranteed to all be in the same state, and similarly for the entangled pairs.

If they now just observe all of their entangled pairs and coins, they will get similar results – roughly half of the coins will be HH and roughly half of the entangled pairs will be |00⟩. But there are other experiments they could run on the entangled pairs that would give different answers than the coin model would predict.

The conclusion of this is that even if you tried to model the entangled pair as a simple probability distribution similar to the coins, you will get the wrong answer in some experiments. I described what these experiments could be in this earlier post – essentially they involve applying an operation that takes qubits in and out of superposition.
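Here is a minimal sketch of one such experiment. The specific choice, applying a Hadamard gate to both qubits before measuring, is an assumption made for illustration and may not be the experiment the earlier post describes. The code compares the true entangled state against a coin-like model in which each pair is secretly either |00⟩ or |11⟩ with 50% probability.

```python
import math

s = 1 / math.sqrt(2)
H = [[s, s], [s, -s]]  # Hadamard gate (illustrative choice of operation)

def kron(A, B):
    """Kronecker (tensor) product of two matrices given as nested lists."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

def apply(M, v):
    """Matrix-vector product."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def probs(v):
    """Measurement probabilities from amplitudes."""
    return [abs(x) ** 2 for x in v]

HH = kron(H, H)  # Hadamard on both qubits

# Entangled pair: the single state (|00> + |11>)/sqrt(2).
entangled_probs = probs(apply(HH, [s, 0, 0, s]))

# Coin-like model: 50% chance the pair is |00>, 50% chance it is |11>.
p00 = probs(apply(HH, [1, 0, 0, 0]))
p11 = probs(apply(HH, [0, 0, 0, 1]))
mixture_probs = [0.5 * x + 0.5 * y for x, y in zip(p00, p11)]

# Outcome order: 00, 01, 10, 11
print([round(p, 3) for p in entangled_probs])  # [0.5, 0.0, 0.0, 0.5]
print([round(p, 3) for p in mixture_probs])    # [0.25, 0.25, 0.25, 0.25]
```

The entangled pair still yields perfectly matching outcomes after the gates, while the coin-like model predicts all four outcomes equally often.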

So we have both a theoretical argument and a practical argument for the difference between these two cases. The key take-away is the following:

According to quantum mechanics an entangled pair is in a state that is fundamentally indeterminate. When we describe it with probabilities, we are not saying “This probabilistic description is an account of my imperfect knowledge of the state of the system”. We’re saying that nature herself is undecided on what we will observe when we look at the state. (Side note: there is actually a way to describe epistemic uncertainty in quantum mechanics. It is called the density matrix, and is completely different from the description of superpositions.)

In addition, the most fundamental and accurate probability description for the state of the two particles is one that cannot be described as the product of two independent particles. This is not the case with the coins! The most fundamental and accurate probability description for the state of the two coins is either 100% HH or 100% TT (whichever turns out to be the case). What this means is that in the quantum case, not only is the state indeterminate, but the two particles are fundamentally interdependent – entangled. There is no independent description of the individual components of the system, there is only the system as a whole.

Communication through entanglement

Is it possible to use quantum entanglement to communicate faster than light?

Here’s a suggestion for how we might achieve just such a thing. Two people each possess one qubit of an entangled pair in the following state:

$$\frac{|00\rangle + |11\rangle}{\sqrt{2}}$$

The owner of the second qubit then decides whether or not to apply some quantum gate U to their qubit. Immediately following this, the owner of the first qubit measures their qubit.

If the application of U to the second qubit changes the amplitude distribution over the first qubit, then the measurement can be used to communicate a message between the two people instantaneously! Why? Well, initially 0 and 1 are expected with equal probability. But if applying U makes these probabilities unequal, then the observation of a 0 or 1 carries evidence as to whether or not U was applied. (In an extreme case, applying U could make it guaranteed that the qubit would be observed as |0⟩, in which case observation of |1⟩ ensures that U was not applied.)

In this way, somebody could send information across by encoding it in a string of decisions about whether or not to apply U to a shared entangled pair.

It might seem a little strange that doing something to your qubit over here could affect the state of their qubit over there. But this is quantum mechanics, and quantum mechanics is very, very strange. Remember that the two qubits are entangled with one another. We are already guaranteed that what happens to one can affect the state of the other instantaneously – after all, if we measure the second qubit and find it in the state |0⟩, the first qubit’s state is instantaneously “collapsed” into the state |0⟩ as well. (This fact alone cannot be used to communicate messages, because before measuring the second qubit, its owner has no control over which of the two states it will end up in.)

So whether or not our scheme will work cannot be ruled out a priori. We must work out the math for ourselves to see if applying U to qubit 2 can successfully warp the probability distribution of qubit 1, thus sending information between the two instantaneously.

First, we’ll describe our single qubit gate U as a matrix.

$$U = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$

a, b, c, and d are complex numbers. Can they have any possible values?

No. U must preserve the normalization of states it acts on. In other words, for U to represent a physically possible transformation of a qubit, it cannot transform physically possible states into physically impossible states.

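What precise constraints does this entail? At minimum, U must send each of the basis states |0⟩ and |1⟩ to a normalized state, which gives the following two conditions. (Full normalization preservation also requires the two columns of U to be orthogonal, but these two conditions are all we will need.)

$$|a|^2 + |c|^2 = 1, \qquad |b|^2 + |d|^2 = 1$$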
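To make the normalization requirement concrete, here is a small sketch that applies two matrices to a normalized state and compares the lengths of the results. The example matrices (a Hadamard gate and a shear matrix) and the test state are illustrative choices, not taken from the post.

```python
import math

def apply(U, v):
    """Matrix-vector product for a 2x2 matrix given as nested lists."""
    return [sum(u * x for u, x in zip(row, v)) for row in U]

def norm(v):
    """Length of a complex vector."""
    return math.sqrt(sum(abs(x) ** 2 for x in v))

s = 1 / math.sqrt(2)
hadamard = [[s, s], [s, -s]]  # a valid single-qubit gate
shear = [[1, 1], [0, 1]]      # not a valid gate: it stretches some states

state = [0.6, 0.8j]           # normalized: 0.36 + 0.64 = 1

print(round(norm(apply(hadamard, state)), 12))  # 1.0
print(norm(apply(shear, state)) > 1)            # True
```

The shear matrix turns a physically possible state into one whose probabilities would sum to more than 1, which is exactly what a quantum gate is not allowed to do.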

Alright. Now, we have our general description of a single-qubit gate. Of course, the qubit that U is operating on is a part of an entangled pair. It is an irreducible component of a two-qubit system. So we can’t actually describe it as just a single qubit.

Instead, we need to describe the state of both qubits, which, I’ll remind you, looks like:

$$\frac{|00\rangle + |11\rangle}{\sqrt{2}} = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ 0 \\ 0 \\ 1 \end{pmatrix}$$

A 2×2 matrix can’t operate on a vector with four components. What we need is a two-qubit quantum gate that corresponds to applying U to qubit 2 while leaving qubit 1 alone. “Leaving a qubit alone” is equivalent to applying the identity gate I to it, which just leaves the state unchanged.

$$I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$

So what we really want is the 4×4 matrix that corresponds to applying U to qubit 2 and I to qubit 1. It turns out that we can generate this matrix by simply taking the tensor product of U with I:

$$I \otimes U = \begin{pmatrix} a & b & 0 & 0 \\ c & d & 0 & 0 \\ 0 & 0 & a & b \\ 0 & 0 & c & d \end{pmatrix}$$
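The layout of this 4×4 matrix is easy to see by computing a tensor (Kronecker) product by hand. The sketch below assumes the usual ordering convention, with qubit 1 as the first tensor factor and the basis ordered |00⟩, |01⟩, |10⟩, |11⟩. The entries 1, 2, 3, 4 are stand-ins for a, b, c, d chosen only to show where each entry lands; they do not form a valid gate.

```python
def kron(A, B):
    """Kronecker (tensor) product of two matrices given as nested lists."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

identity = [[1, 0], [0, 1]]
U = [[1, 2], [3, 4]]  # placeholders for a, b, c, d (not a real gate)

# Identity on qubit 1, U on qubit 2.
two_qubit_gate = kron(identity, U)

for row in two_qubit_gate:
    print(row)
# [1, 2, 0, 0]
# [3, 4, 0, 0]
# [0, 0, 1, 2]
# [0, 0, 3, 4]
```

The result is block diagonal: one copy of U acts on qubit 2 when qubit 1 is 0, and an identical copy acts when qubit 1 is 1.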

Alright, now we’re ready to see what happens when we apply this gate to our entangled state!

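$$(I \otimes U)\,\frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ 0 \\ 0 \\ 1 \end{pmatrix} = \frac{1}{\sqrt{2}}\begin{pmatrix} a \\ c \\ b \\ d \end{pmatrix} = \frac{a|00\rangle + c|01\rangle + b|10\rangle + d|11\rangle}{\sqrt{2}}$$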

This state sure looks different than the state we started with! But is it different enough to have carried some information between the qubits? Let’s now look at the probabilities for measuring qubit 1 in the states 0 and 1:

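$$P(\text{qubit 1} = 0) = \left|\frac{a}{\sqrt{2}}\right|^2 + \left|\frac{c}{\sqrt{2}}\right|^2 = \frac{|a|^2 + |c|^2}{2}$$

$$P(\text{qubit 1} = 1) = \left|\frac{b}{\sqrt{2}}\right|^2 + \left|\frac{d}{\sqrt{2}}\right|^2 = \frac{|b|^2 + |d|^2}{2}$$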

Remember the constraints on the possible values of a, b, c, and d we started with?

$$|a|^2 + |c|^2 = 1, \qquad |b|^2 + |d|^2 = 1$$

Which leads us to the final answer!

$$P(\text{qubit 1} = 0) = P(\text{qubit 1} = 1) = \frac{1}{2}$$

Sadly, it looks like our method won’t work to produce faster-than-light communication. No matter what gate we apply to the second qubit, it has no effect on the observed probabilities of the first. And therefore, no information can be sent by applying U.
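This conclusion can be spot-checked numerically. The sketch below builds single-qubit unitaries from three angles (one standard parametrization; the function names `gate` and `qubit1_probs` are illustrative), applies each to qubit 2 of the entangled pair, and prints the resulting probabilities for qubit 1. It assumes U has rows (a, b) and (c, d) and that the basis is ordered |00⟩, |01⟩, |10⟩, |11⟩.

```python
import cmath
import math
import random

def gate(theta, phi, lam):
    """A single-qubit unitary parametrized by three angles."""
    return [
        [math.cos(theta), -cmath.exp(1j * lam) * math.sin(theta)],
        [cmath.exp(1j * phi) * math.sin(theta),
         cmath.exp(1j * (phi + lam)) * math.cos(theta)],
    ]

def qubit1_probs(U):
    """Apply U to qubit 2 of (|00> + |11>)/sqrt(2).

    Returns (P(qubit 1 = 0), P(qubit 1 = 1)).
    """
    (a, b), (c, d) = U
    # Amplitudes of |00>, |01>, |10>, |11> after the gate acts on qubit 2.
    amps = [x / math.sqrt(2) for x in (a, c, b, d)]
    p0 = abs(amps[0]) ** 2 + abs(amps[1]) ** 2
    p1 = abs(amps[2]) ** 2 + abs(amps[3]) ** 2
    return p0, p1

random.seed(0)
results = [
    qubit1_probs(gate(*(random.uniform(0, 2 * math.pi) for _ in range(3))))
    for _ in range(5)
]
for p0, p1 in results:
    print(round(p0, 12), round(p1, 12))  # 0.5 0.5 every time
```

Whatever angles are drawn, the owner of qubit 1 sees a fair coin flip, so the measurement statistics carry no trace of which gate was applied.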

Of course, this does not rule out all possible ways to attempt to utilize entanglement to communicate faster-than-light. But it does provide a powerful demonstration of the way in which such attempts are defeated by the laws of quantum mechanics.