A sequence prediction puzzle

What comes next?

Screen Shot 2019-05-27 at 9.39.32 PM

Buddhabrot, Mandelbrot’s older sibling

Many people have heard of Mandelbrot’s famous set. But far fewer have heard of its older sibling: the Buddhabrot. I find this object even more beautiful than the Mandelbrot set, and want to give it its proper appreciation here.

To start out with: What is the Mandelbrot set? It is defined as the set of complex values c that stay finite under arbitrary many repeated applications of the function fc(z) = z2 + c (where the first iteration is applied to z = 0).

So, for example, to check if the number 2 is in the Mandelbrot set, we just look at what happens when we plug in 0 to the function f2 and repeat:

0 becomes 02 + 2 = 2
2 becomes 22 + 2 = 6
6 becomes 62 + 2 = 38
38 becomes 382 + 2 = something large, and on and on.

You can probably predict that as we keep iterating, we’re going to eventually run off to infinity. So 2 is not in the set.

On the other hand, see what happens when we plug in -1:

0 becomes 02 + (-1) = -1
-1 becomes (-1)2 + (-1) = 0
0 becomes -1
-1 becomes 0
… and so on to infinity

With -1, we just bounce around back and forth between 0 and -1, so we clearly never diverge to infinity.

Now we can draw the elements of this set pretty easily, by simply shading in the numbers on the complex plane that are in the set. If you do so, you get the following visualization:

overallmandelbrot

Cool! But what about all those colorful visualizations you’ve probably seen? (maybe even on this blog!)

Well, whether you’re in the Mandelbrot set or not is just a binary property. So by itself the Mandelbrot set doesn’t have a rich enough structure to account for all those pretty colors. What’s being visualized in those pictures is not just whether an element is in the Mandelbrot set, but also, if it’s not in the set (i.e. if the 0 diverges to infinity upon repeated applications of fc), how quickly it leaves the set!

In other words, the colors are a representation of how not in the set numbers are. There are some complex numbers that exist near the edge of the Mandelbrot that hang around the set for many many iterations before finally blowing up and running off to infinity. And these numbers will be colored differently from, say, the number “2”, which right away starts blowing up.

And that’s how you end up with pictures like the following:

Screen Shot 2019-05-22 at 3.37.11 PM.png

Now, what if instead of visualizing how in the set various complex numbers are, we instead look at what complex numbers are most visited on average when running through Mandelbrot iterations? Well, that’s how we get the Buddhabrot!

Specifically, here’s a procedure we could run:

  1. Choose a random complex number c.
  2. Define z = 0
  3. Update: z = fc(z)
  4. Give one unit of credit to whatever complex number z is now.
  5. Repeat 3 and 4 for some fixed number of iterations.
  6. Start over back at 1 with a new complex number.

As you run this algorithm, you can visualize the results by giving complex numbers with more credits brighter colors. And as you do this, a curious figure begins to appear on the (rotated) complex plane:

Screen Shot 2019-05-22 at 4.23.06 PM.png

 

 

That’s right, Buddha is hanging out on the complex plane, hiding in the structure of the Mandelbrot set!

Hopping Midpoints

Put down three points on a piece of paper. Choose one of them as your “starting point”. Now, randomly choose one of the three points and hop from your starting point, halfway over to the chosen point. Mark down where you’ve landed. Then repeat: randomly choose one of the three starting points, and move halfway from your newly marked point to this new chosen point. Mark where you land. And on, and on, to infinity.

What pattern will arise? Watch and see!

Controls:
E to increase points/second.
Q to decrease points/second.
Click and drag the red points to move them around.
Pressing a number key will make a polygon with that number of sides.
[pjs4wp]
float N = 3;
float[] X = new float(N);
float[] Y = new float(N);
float radius = 600/2 – 20;
float i = 0;
while (N > i)
{
X[i] = radius * cos(2*PI*i/N – PI/2 + 2*PI/N);
Y[i] = radius * sin(2*PI*i/N – PI/2 + 2*PI/N);
i += 1;
}
float xNow = X[0];
float yNow = Y[0];
float speed = 1;
int selected = -1;
void setup()
{
size(600,600);
frameRate(10);
background(0);
}
void draw()
{
fill(255);
stroke(255);
text((str)(speed*10) + ” points / second\nE to speed up\nQ to slow down\nClick and drag red points!”,15,25);
translate(width/2, height/2);
float i = 0;
while(i < speed) { point(xNow, yNow); index = (int)(random(N)); xNow = (xNow + X[index])/2; yNow = (yNow + Y[index])/2; i += 1; } stroke(color(255,0,0)); fill(color(255,0,0)); float j = 0; while (N > j)
{
ellipse(X[j],Y[j],10,10);
j += 1;
}
}
void keyReleased()
{
bool reset = 0;
if (key == ‘e’) speed *= 10;
else if (key == ‘q’ && speed > 1) speed /= 10;
else if (key == ‘2’) N = 2;
else if (key == ‘3’) N = 3;
else if (key == ‘4’) N = 4;
else if (key == ‘5’) N = 5;
else if (key == ‘6’) N = 6;
else if (key == ‘7’) N = 7;
else if (key == ‘8’) N = 8;
else if (key == ‘9’) N = 9;
else reset = 1;
if (reset == 0)
{
background(0);
float i = 0;
while (N > i)
{
X[i] = radius * cos(2*PI*i/N – PI/2);
Y[i] = radius * sin(2*PI*i/N – PI/2);
i += 1;
}
xNow = X[0];
yNow = Y[0];
}
}
void mousePressed()
{
if (selected == -1)
{
float i = 0;
while (N > i)
{
if ((X[i] + 5 >= mouseX – width/2) && (mouseX – width/2 >= X[i] – 5))
if ((Y[i] + 5 >= mouseY – height/2) && (mouseY – height/2 >= Y[i] – 5))
selected = i;
i += 1;
}
}
}
void mouseReleased()
{
if (selected != -1)
{
X[selected] = mouseX – width/2;
Y[selected] = mouseY – height/2;
selected = -1;
xNow = X[0];
yNow = Y[0];
background(0);
}
}
[/pjs4wp]

Exploring Julia Sets

Here’s a natural follow-up to my last post on the Mandelbrot set – an interactive Julia set explorer!

The Julia set corresponding to a particular point c = x + iy in the complex plane is defined as the set of complex numbers z that stay finite upon arbitrary iterations of the following function: fc(z) = z2 + c. The Mandelbrot set, by comparison, is defined as the set of complex numbers c such that the value obtained by starting with 0 and iterating the function fc arbitrarily many times converges.

What’s remarkable is now beautiful and complex the patterns that arise from this simple equation are. Take a look for yourself: just hover over a point to see its corresponding Julia set!

[pjs4wp]
float xmin = -1.8;
float xmax = 1.8;
float ymin = -1.2;
float ymax = 1.2;
float resolution = 30;
void setup()
{
size(600,400);
background(0);
stroke(255);
}
void draw()
{
background(0);
float cx = xmin + (xmax-xmin)*(float)(mouseX)/(float)(width);
float cy = ymin + (ymax-ymin)*(float)(mouseY)/(float)(height);
drawJuliaSet(cx, cy);
stroke(255);
fill(255);
line(-xmin*width/(xmax-xmin),0,-xmin*width/(xmax-xmin),height);
line(0,-ymin*height/(ymax-ymin),width,-ymin*height/(ymax-ymin));
}
void keyPressed()
{
if (key == ‘e’) resolution += 10;
if (key == ‘q’) resolution -= 10;
if (key == ‘ ‘) resolution = 30;
}
void drawJuliaSet(float cx, float cy)
{
text(“(” + (str)((int)(cx*100)/100.) + “,” + (str)(-(int)(cy*100)/100.) + “)”, 20, 20);
float i = 0;
float j = 0;
while (height > j)
{
x = xmin + (xmax-xmin)*(float)i/width;
y = ymin + (ymax-ymin)*(float)j/height;
float step = 0;
while (step < resolution) { float xNew = x*x - y*y + cx; y = 2*x*y + cy; x = xNew; if (x*x + y*y > 4) break;
step += 1;
}
stroke(color(0,255,0));
fill(color(0,255,0));
if (4 > x*x + y*y) point(i,j);
i += 1;
if (i == width)
{
i = 0;
j += 1;
}
}
stroke(0);
fill(color(255,0,0));
ellipse((cx-xmin)*width/(xmax-xmin),(cy-ymin)*height/(ymax-ymin),10,10);
}
[/pjs4wp]
Resolution is preset at a value good for seeing lots of details and loading at a reasonable speed, but should you want to change it, controls are ‘E’ to increase it and ‘Q’ to decrease it. To reset to default, press ‘SPACE’.

Mandelbrot Exploration

I finally figured out how to embed programs that I write onto these blog posts! I’m really excited about the way this will expand my ability to describe and explain all sorts of topics in the future. I’ll inaugurate this new ability with an old classic: a program that allows you to explore the Mandelbrot set. I’ve preset it with a pretty good resolution limit that allows for nice viewing as well as not too slow load times, but you can adjust it as you please! Enjoy!

Controls:


UP or W to zoom in

DOWN or S to zoom out

CLICK to center view on the clicked point

E to double resolution (mashing this will slow things down a LOT, so use sparingly)

Q to halve resolution

SPACE to reset everything

[pjs4wp]
double xmin = -2.25;
double xmax = .75;
double ymin = -1.1;
double ymax = 1.1;
double resolution = 200;
double pathLength = 1000;
void setup()
{
size(600,400);
background(0);
colorMode(HSB, 255, 255, 255);
drawSet();
}
void draw()
{
double cx = xmin + (xmax – xmin)*mouseX/width;
double cy = ymin + (ymax – ymin)*mouseY/height;
double x = 0;
double y = 0;
double path = 0;
stroke(255);
while (path < pathLength) { point(x,y); double yNew = 2*x*y + cy; x = x*x - y*y + cx; y = yNew; path += 1; } } void drawSet() { background(0); double xpos = 0; double ypos = 0; while(ypos < 400) { double cx = xmin + (xmax - xmin)*xpos/width; double cy = ymin + (ymax - ymin)*ypos/height; double x = 0; double y = 0; double step = 0; while(step < resolution) { double yNew = 2*x*y + cy; x = x*x - y*y + cx; y = yNew; step += 1; if(x*x + y*y > 4)
{
break;
}
}
if(4 > x*x + y*y)
{
stroke(0);
}
else
{
stroke(color(255*step/resolution,255,255));
}
point(xpos,ypos);
xpos += 1;
if (xpos == 600)
{
xpos = 0;
ypos += 1;
}
}
}
void mouseReleased()
{
double dx = xmax – xmin;
double dy = ymax – ymin;
double cx = xmin + dx*mouseX/width;
double cy = ymin + dy*mouseY/height;
xmin = cx – dx/2;
xmax = cx + dx/2;
ymin = cy – dy/2;
ymax = cy + dy/2;
drawSet();
}
void keyPressed()
{
double dx = xmax – xmin;
double dy = ymax – ymin;
if (keyCode == UP || key == ‘w’)
{
xmin += .25*dx;
xmax -= .25*dx;
ymin += .25*dy;
ymax -= .25*dy;
drawSet();
}
if (keyCode == DOWN || key == ‘s’)
{
xmin -= .5*dx;
xmax += .5*dx;
ymin -= .5*dy;
ymax += .5*dy;
drawSet();
}
if (key == ‘ ‘)
{
xmin = -2.25;
xmax = .75;
ymin = -1.1;
ymax = 1.1;
resolution = 200;
drawSet();
}
if (key == ‘e’)
{
resolution *= 2;
drawSet();
}
if (key == ‘q’)
{
resolution /= 2;
drawSet();
}
}
[/pjs4wp]

Try zooming in on a region near the edge of the set where there are black regions and increasing the resolution. “Resolution” here really means “the number of Mandelbrot iterations run to test if the point diverges or converges”, and the black points are those that were judged to be convergent. So some of these points might be found to be no longer convergent with increasing resolution! Which is why you might notice black regions suddenly filling up with color if you increase the resolution. (Of course, some regions are truly convergent, like the central chunk of the Mandelbrot set around (0,0), so such points’ colors will not depend on the resolution.)

If you’re confused about what this image is, I recommend checking out Numberphile’s video on the Mandelbrot set. (Actually, I recommend it even if you aren’t confused.)

Hopping Midpoints and Mathematical Snowflakes

Let me start this off by saying that if you’re reading this blog and haven’t ever checked out the Youtube channel Numberphile, you need to go there right away and start bingeing their videos. That’s what I’ve been doing for the last few days, and it’s given me tons of cool new puzzles to consider.

Here’s one:

Naturally, after watching this video I wanted to try this out for myself. Here you see the pattern arising beautifully from the randomness:

Serpinski

I urge you to think hard about why the Sierpinski triangle would arise from something as simple as randomly hopping between midpoints. It’s very non-obvious, and although I have a few ideas, I’m still missing a clear intuition.

I also made some visualizations for other shapes. I’ll show some of them, but encourage you to make predictions about what pattern you’d expect to see before scrolling down to see the actual result.

First:

Square

Instead of three points arranged as above, we will start out with four points arranged in a perfect square. Then, as before, we’ll jump from our starting point halfway to one of these four, and will continue this procedure ad infinitum.

What pattern will arise? Do you think that we’ll have “missing regions” where no points can land, like with the triangle?

Scroll down to see the answer…

(…)

(…)

(…)

Square

Okay! So it looks like the whole square gets filled out, with no missing regions. This was pretty surprising to me; given that three points gave rise to a intricate fractal pattern, why wouldn’t four points do the same? What’s special about “3” .

Well, perhaps things will be different if we tweak the positions of the corners slightly? Will any quadrilateral have the same behavior of filling out all the points, or will the blank regions re-arise? Again, make a prediction!

Quadrilaterals

Let’s see:

4 TemplesCross4 Double Temples4 Rolos

Okay, now we see that apparently the square was actually a very special case! Pretty much any quadrilateral we can construct will give us a nested infinity of blank regions, as long as at least one angle is not equal to 90º. Again, this is fascinating and puzzling to me. Why do 90º angles invariably cause the whole region to fill out? I’m not sure.

Let’s move on to a pentagon! Do you think that a regular pentagon will behave more like a triangle or a square?

Pentagon

Take a look…

Pentagon

And naturally, the next question is what about a hexagon?

Hexagon

Hexagon

Notice the difference between the hexagon and all the previous ones! Rather than having small areas of points that are never reached, it appears that suddenly we get lines! Again, I encourage you to try to think about why this might be (what’s so special about 6?) and leave a comment if you have any ideas.

Now, I because curious about what other types of patterns we can generate with simple rules like these. I wondered what would happen if instead of simply jumping to the average of the current point and a randomly chosen point, we built a pattern with some “memory”. For instance, what if we didn’t just look at the current point and the randomly chosen point, but also at the last chosen point? We could then take the middle of the triangle formed by these three points as our new point.

It turns out that the patterns that arise from this are even more beautiful than the previous ones! (In my opinion, of course)

Take a look:

Triangle

Threes Triangle

Square

Threes Square

Pentagon

Threes Pentagon

Hexagon

Threes Hexagon

Heptagon

Threes Heptagon

I’ll stop here, but this is a great example of how beautiful and surprising math can be. I would have never guessed that such intricate fractal patterns would arise from such simple random rules.

A probability puzzle

probpuzzle.jpg

To be totally clear: the question is not assuming that there is ONLY one student whose neighbors both flipped heads, just that there is AT LEAST one such student. You can imagine that the teacher first asks for all students whose neighbors both flipped heads to step forward, then randomly selected one of the students that had stepped forward.

Now, take a minute to think about this before reading on…

It seemed initially obvious to me that the teacher was correct. There are exactly as many possible worlds in which the three students are HTH as there worlds in which they are HHH, right? Knowing how your neighbors’ coins landed shouldn’t give you any information about how your own coin landed, and to think otherwise seems akin to the Gambler’s fallacy.

But in fact, the teacher is wrong! It is in fact more likely that the student flipped tails than heads! Why? Let’s simplify the problem.

Suppose there are just three students standing in a circle (/triangle). There are eight possible ways that their coins might have landed, namely:

HHH
HHT
HTH
HTT
THH
THT
TTH
TTT

Now, the teacher asks all those students whose neighbors both have “H” to step forward, and AT LEAST ONE steps forward. What does this tell us about the possible world we’re in? Well, it rules out all of the worlds in which no student could be surrounded by both ‘H’, namely… TTT, TTH, THT, and HTT. We’re left with the following…

HHH
HHT
HTH
THH

One thing to notice is that we’re left with mostly worlds with lots of heads. The expected total of heads is 2.25, while the expected total of tails is just 0.75. So maybe we should expect that the student is actually more likely to have heads than tails!

But this is wrong. What we want to see is what proportion of those surrounded by heads are heads in each possible world.

HHH: 3/3 have H (100%)
HHT: 0/1 have H (0%)
HTH: 0/1 have H (0%)
THH: 0/1 have H (0%)

Since each of these worlds is equally likely, what we end up with is a 25% chance of 100% heads, and a 75% chance of 0% heads. In other words, our credence in the student having heads should be just 25%!

Now, what about for N students? I wrote a program that does a brute-force calculation of the final answer for any N, and here’s what you get:

N

cr(heads)

~

3

1/4

0.25

4

3/7

0.4286

5

4/9

0.4444

6

13/32

0.4063

7

1213/2970

0.4084

8

6479/15260

0.4209

9

10763/25284

0.4246

10

998993/2329740

0.4257

11

24461/56580

0.4323

12

11567641/26580015

0.4352

13

1122812/2564595

0.4378

14

20767139/47153106

0.4404

15

114861079/259324065

0.4430

16

2557308958/5743282545

0.4453

17

70667521/157922688

0.4475

These numbers are not very pretty, though they appear to be gradually converging (I’d guess to 50%).

Can anybody see any patterns here? Or some simple intuitive way to arrive at these numbers?

 

Solving the common knowledge puzzle

Read this post and give it a try yourself before reading on! Spoilers ahead.

I’ve written up the way that I solved the puzzle, step by step:

1. A looks at B and C and says “I don’t know what my number is.”

We assess what information this gives us by considering in which scenario A would have known what their number was, and ruling it out.

Well, for starters, if A saw two different numbers, then A would consider there to be two logically possible worlds, one in which they are the sum of the two numbers, and another in which they are one of the two that are added together. But if A saw the SAME two numbers, they A would know that they must be the sum of those two (since zero is not a possible value for themselves). What this means is that the fact that A doesn’t know their number tells us with certainty that B ≠ C! Furthermore, since B and C will go through this same line of reasoning themselves, they will also know that B ≠ C. And since all of them know that B and C will go through the same line of reasoning, it becomes common knowledge that B ≠ C.

Good! So after A’s statement, we have added one piece of common knowledge, namely that B ≠ C.

2. B thinks a moment and then says “Me neither.”

Ok, one thing this tells us is that A ≠ C, for the exact same reason as before.

But it also tells us more than this, because B knows that B ≠ C and still doesn’t know. So we just need to think of the scenario in which knowing that B ≠ C (and knowing the values of A and C, of course) would tell B the value that they have. Try to figure it out for yourselves!

The answer is, the scenario is that A = 2C! Imagine that B sees that A is 10 and C is 5. This tells them that they are either 5 or 15. But they know they can’t be 5, because C is five and B ≠ C. So they must be 15! In other words, since they would know their value if A equaled 2C and they don’t know their value, this tells us that A ≠ 2C!

So now we have two more pieces of common knowledge: A ≠ C, and A ≠ 2C. Putting this together with what we knew before, we have a total of three pieces of information:

B ≠ C
A ≠ C
A ≠ 2C

3. C thinks a moment and then says “Me neither.”

By exact analogy with the previous arguments, this tells us that A ≠ B, as well as that A ≠ 2B and B ≠ 2A. (We saw previously that B could conclude from B ≠ C that A ≠ 2C. By the same arguments, C can conclude from C ≠ A that B ≠ 2A. And from C ≠ B, C can conclude that A ≠ 2B.)

There’s one more piece of information that we haven’t taken into account, which is that A ≠ 2C. In which situation does the fact that A ≠ 2C tell C their value? Well, if A = 10 and B = 15, then C is either 5 or 20. But C can’t be 5, because then A would be twice C. So C could conclude that they are 20. Since C doesn’t conclude this, we know that 3A ≠ 2B.

Putting it all together, we know the following:

B ≠ C
A ≠ C
A ≠ 2C
A ≠ 2B
B ≠ 2A
3A ≠ 2B

4. A thinks, and says: “Now I know! My number is 25.”

The question we need to ask ourselves is what the values of B and C must be in order that the above six conditions to allow A to figure out their own value. Pause to think about this before reading on…

 

 

 

Let’s work through how A processes each piece of information:

A could figure out their own value by seeing B = C. But we already know that this isn’t the case.

Since A knows that A ≠ C, A could figure out their value by seeing B = 2C. So that’s a possibility… Except that if B = 2C, then A = B + 2C = 3C. But 25 is not divisible by 3. So this can’t be what they saw.

Since A knows that A ≠ B, A could figure out their value by seeing C = 2B. But again, this doesn’t work, since it would imply that A was divisible by 3, which it is not.

Since A knows that A ≠ 2C, A could figure out their value by seeing B = 3C (e.g. B = 15, C = 5). They would rule out themselves being one component of the sum, and conclude that they are 4C. But 25 is not divisible by 4. So this is not the case.

Since A knows that A ≠ 2B, A could figure out their value by seeing C = 3B (e.g. B = 5, C = 15). By the same reasoning as before, this cannot be the case.

Since B ≠ 2A, A could figure out their value by seeing 3B = 2C (e.g. B = 10, C = 15). They would know that they cannot be just one component of the sum, so they would conclude that they must be B + C, or 2.5 B. Now, is there an integer B such that 25 = 2.5 B? You betcha! B = 10, and C = 15!

We can stop here, since we’ve found a logically consistent world in which A figures out that their own value is 25. Since there can only be one such world (as the problem statement implies that this information is enough to solve the puzzle), we know that this must be what they saw. So we’ve found the answer! (A,B,C) = (25,10,15). But if you’re curious, I’d suggest you go through the rest of the cases and show that no other values of B and C would be consistent with them knowing that their own value is 25.

One thing that’s interesting here is the big role that the number 25 played in this. The fact that 25 was not divisible by 3 but was divisible by 2.5, was crucial. For the same puzzle but a different value that 25, we would have come to a totally different answer!

My challenge to anybody that’s made it this far: Consider the set of all integers that A could have truthfully declared that they knew themselves to be. For some such integers, it won’t be the case that A’s declaration is sufficient for us to conclude what B and C are. Which integers are these?

A common knowledge puzzle

Common knowledge puzzles are my favorite. Here’s one I just came across. I challenge you to try to figure it out in less than 5 minutes. 🙂

Three perfect logicians with positive (non-zero) integers taped to their foreheads, A, B, and C, sit in a triangle. Each doesn’t know their own number but can see the numbers for the other two. It is common knowledge amongst all three that one of the numbers is the sum of the other two, but it is not known which is which.

A looks at B and C, and says “I don’t know what my number is.”

B thinks a moment and then says “Me neither.”

C thinks a moment and then says “Me neither.”

A thinks, then says “Now I know! My number is 25.”

What are the numbers on B and C’s foreheads?

(Solution here)

Kant’s attempt to save metaphysics and causality from Hume

TL;DR

  • Hume sort of wrecked metaphysics. This inspired Kant to try and save it.
  • Hume thought that terms were only meaningful insofar as they were derived from experience.
  • We never actually experience necessary connections between events, we just see correlations. So Hume thought that the idea of causality as necessary connection is empty and confused, and that all our idea of causality really amounts to is correlation.
  • Kant didn’t like this. He wanted to PROTECT causality. But how??
  • Kant said that metaphysical knowledge was both a priori and substantive, and justified this by describing these things called pure intuitions and pure concepts.
  • Intuitions are representations of things (like sense perceptions). Pure intuitions are the necessary preconditions for us to represent things at all.
  • Concepts are classifications of representations (like “red”). Pure concepts are the necessary preconditions underlying all classifications of representations.
  • There are two pure intuitions (space and time) and twelve pure concepts (one of which is causality).
  • We get substantive a priori knowledge by referring to pure intuitions (mathematics) or pure concepts (laws of nature, metaphysics).
  • Yay! We saved metaphysics!

 

(Okay, now on to the actual essay. This was not originally written for this blog, which is why it’s significantly more formal than my usual fare.)

 

***

 

David Hume’s Enquiry Into Human Understanding stands out as a profound and original challenge to the validity of metaphysical knowledge. Part of the historical legacy of this work is its effect on Kant, who describes Hume as being responsible for [interrupting] my dogmatic slumber and [giving] my investigations in the field of speculative philosophy a completely different direction.” Despite the great inspiration that Kant took from Hume’s writing, their thinking on many matters is diametrically opposed. A prime example of this is their views on causality.

Hume’s take on causation is famously unintuitive. He gives a deflationary account of the concept, arguing that the traditional conception lacks a solid epistemic foundation and must be replaced by mere correlation. To understand this conclusion, we need to back up and consider the goal and methodology of the Enquiry.

He starts with an appeal to the importance of careful and accurate reasoning in all areas of human life, and especially in philosophy. In a beautiful bit of prose, he warns against the danger of being overwhelmed by popular superstition and religious prejudice when casting one’s mind towards the especially difficult and abstruse questions of metaphysics.

But this obscurity in the profound and abstract philosophy is objected to, not only as painful and fatiguing, but as the inevitable source of uncertainty and error. Here indeed lies the most just and most plausible objection against a considerable part of metaphysics, that they are not properly a science, but arise either from the fruitless efforts of human vanity, which would penetrate into subjects utterly inaccessible to the understanding, or from the craft of popular superstitions, which, being unable to defend themselves on fair ground, raise these entangling brambles to cover and protect their weakness. Chased from the open country, these robbers fly into the forest, and lie in wait to break in upon every unguarded avenue of the mind, and overwhelm it with religious fears and prejudices. The stoutest antagonist, if he remit his watch a moment, is oppressed. And many, through cowardice and folly, open the gates to the enemies, and willingly receive them with reverence and submission, as their legal sovereigns.

In less poetic terms, Hume’s worry about metaphysics is that its difficulty and abstruseness makes its practitioners vulnerable to flawed reasoning. Even worse, the difficulty serves to make the subject all the more tempting for “each adventurous genius[, who] will still leap at the arduous prize and find himself stimulated, rather than discouraged by the failures of his predecessors, while he hopes that the glory of achieving so hard an adventure is reserved for him alone.”

Thus, says Hume, the only solution is “to free learning at once from these abstruse questions [by inquiring] seriously into the nature of human understanding and [showing], from an exact analysis of its powers and capacity, that it is by no means fitted for such remote and abstruse questions.”

Here we get the first major divergence between Kant and Hume. Kant doesn’t share Hume’s eagerness to banish metaphysics. His Prolegomena To Any Future Metaphysics and Critique of Pure Reason are attempts to find it a safe haven from Hume’s attacks. However, while Kant might not be similarly constituted to Hume in this way, he does take Hume’s methodology very seriously. He states in the preface to the Prolegomena that “since the origin of metaphysics as far as history reaches, nothing has ever happened which could have been more decisive to its fate than the attack made upon it by David Hume.” Many of the principles which Hume derives, Kant agrees with wholeheartedly, making the task of shielding metaphysics even harder for him.

With that understanding of Hume’s methodology in mind, let’s look at how he argues for his view of causality. We’ll start with a distinction that is central to Hume’s philosophy: that between ideas and impressions. The difference between the memory of a sensation and the sensation itself is a good example. While the memory may mimic or copy the sensation, it can never reach its full force and vivacity. In general, Hume suggests that our experiences fall into two distinct categories, separated by a qualitative gap in liveliness. The more lively category he calls impressions, which includes sensory perceptions like the smell of a rose or the taste of wine, as well as internal experiences like the feeling of love or anger. The less lively category he refers to as thoughts or ideas. These include memories of impressions as well as imagined scenes, concepts, and abstract thoughts. 

With this distinction in hand, Hume proposes his first limit on the human mind. He claims that no matter how creative or original you are, all of your thoughts are the product of “compounding, transposing, augmenting, or diminishing the materials afforded us by the senses and experiences.” This is the copy principle: all ideas are copies of impressions, or compositions of simpler ideas that are in turn copies of impressions.

Hume turns this observation of the nature of our mind into a powerful criterion of meaning. “When we entertain any suspicion that a philosophical term is employed without any meaning or idea (as is but too frequent), we need but enquire, From what impression is that supposed idea derived? And if it be impossible to assign any, this will serve to confirm our suspicion.

This criterion turns out to be just the tool Hume needs in order to establish his conclusion. He examines the traditional conception of causation as a necessary connection between events, searches for the impressions that might correspond to this idea, and, failing to find anything satisfactory, declares that “we have no idea of connection or power at all and that these words are absolutely without any meaning when employed in either philosophical reasonings or common life.” His primary argument here is that all of our observations are of mere correlation, and that we can never actually observe a necessary connection.

Interestingly, at this point he refrains from recommending that we throw out the term causation. Instead he proposes a redefinition of the term, suggesting a more subtle interpretation of his criterion of meaning. Rather than eliminating the concept altogether upon discovering it to have no satisfactory basis in experience, he reconceives it in terms of the impressions from which it is actually formed. In particular, he argues that our idea of causation is really based on “the connection which we feel in the mind, this customary transition of the imagination from one object to its usual attendant.”

Here Hume is saying that humans have a rationally unjustifiable habit of thought where, when we repeatedly observe one type of event followed by another, we begin to call the first a cause and the second its effect, and we expect that future instances of the cause will be followed by future instances of the effect. Causation, then, is just this constant conjunction between events, and our mind’s habit of projecting the conjunction into the future. We can summarize all of this in a few lines:

Hume’s denial of the traditional concept of causation

  1. Ideas are always either copies of impressions or composites of simpler ideas that are copies of impressions.
  2. The traditional conception of causation is neither of these.
  3. So we have no idea of the traditional conception of causation.

Hume’s reconceptualization of causation

  1. An idea is the idea of the impression that it is a copy of.
  2. The idea of causation is copied from the impression of constant conjunction.
  3. So the idea of causation is just the idea of constant conjunction.

There we have Hume’s line of reasoning, which provoked Kant to examine the foundations of metaphysics anew. Kant wanted to resist Hume’s dismissal of the traditional conception of causation, while accepting that our sense perceptions reveal no necessary connections to us. Thus his strategy was to deny the Copy Principle and give an account of how we can have substantive knowledge that is not ultimately traceable to impressions. He does this by introducing the analytic/synthetic distinction and the notion of a priori synthetic knowledge.

Kant’s original definition of analytic judgments is that they “express nothing in the predicate but what has already been actually thought in the concept of the subject.” This suggests that the truth value of an analytic judgment is determined by purely the meanings of the concepts in use. A standard example of this is “All bachelors are unmarried.” The truth of this statement follows immediately just by understanding what it means, as the concept of bachelor already contains the predicate unmarried.  Synthetic judgments, on the other hand, are not fixed in truth value by merely the meanings of the concepts in use. These judgments amplify our knowledge and bring us to genuinely new conclusions about our concepts. An example: “The President is ornery.” This certainly doesn’t follow by definition; you’d have to go out and watch the news to realize its truth.

We can now put the challenge to metaphysics slightly differently. Metaphysics purports to be discovering truths that are both necessary (and therefore a priori) as well as substantive (adding to our concepts and thus synthetic). But this category of synthetic a priori judgments seems a bit mysterious. Evidently, the truth values of such judgments can be determined without referring to experience, but can’t be determined by merely the meanings of the relevant concepts. So apparently something further is required besides the meanings of concepts in order to make a synthetic a priori judgment. What is this thing?

Kant’s answer is that the further requirement is pure intuition and pure concepts. These terms need explanation.

Pure Intuitions

For Kant, an intuition is a direct, immediate representation of an object. An obvious example of this is sense perception; looking at a cup gives you a direct and immediate representation of an object, namely, the cup. But pure intuitions must be independent of experience, or else judgments based on them would not be a priori. In other words, the only type of intuition that could possibly be a priori is one that is present in all possible perceptions, so that its existence is not contingent upon what perceptions are being had. Kant claims that this is only possible if pure intuitions represent the necessary preconditions for the possibility of perception.

What are these necessary preconditions? Kant famously claimed that the only two are space and time. This implies that all of our perceptions have spatiotemporal features, and indeed that perception is only possible in virtue of the existence of space and time. It also implies, according to Kant, that space and time don’t exist outside of our minds!  Consider that pure intuitions exist equally in all possible perceptions and thus are independent of the actual properties of external objects. This independence suggests that rather than being objective features of the external world, space and time are structural features of our minds that frame all our experiences.

This is why Kant’s philosophy is a species of idealism. Space and time get turned into features of the mind, and correspondingly appearances in space and time become internal as well. Kant forcefully argues that this view does not make space and time into illusions, saying that without his doctrine “it would be absolutely impossible to determine whether the intuitions of space and time, which we borrow from no experience, but which still lie in our representation a priori, are not mere phantasms of our brain.”

The pure intuitions of space and time play an important role in Kant’s philosophy of mathematics: they serve to justify the synthetic a priori status of geometry and arithmetic. When we judge that the sum of the interior angles of a triangle is 180º, for example, we do so not purely by examining the concepts triangle, sum, and angle. We also need to consult the pure intuition of space! And similarly, our affirmations of arithmetic truths rely upon the pure intuition of time for their validity.

Pure Concepts

Pure intuition is only one part of the story. We don’t just perceive the world, we also think about our perceptions. In Kant’s words, “Thoughts without content are empty; intuitions without concepts are blind. […] The understanding cannot intuit anything, and the senses cannot think anything. Only from their union can cognition arise.” As pure intuitions are to perceptions, pure concepts are to thought. Pure concepts are necessary for our empirical judgments, and without them we could not make sense of perception. It is this category in which causality falls.

Kant’s argument for this is that causality is a necessary condition for the judgment that events occur in a temporal order. He starts by observing that we don’t directly perceive time. For instance, we never have a perception of one event being before another, we just perceive one and, separately, the other. So to conclude that the first preceded the second requires something beyond perception, that is, a concept connecting them.

He argues that this connection must be necessary: “For this objective relation to be cognized as determinate, the relation between the two states must be thought as being such that it determines as necessary which of the states must be placed before and which after.” And as we’ve seen, the only way to get a necessary connection between perceptions is through a pure concept. The required pure concept is the relation of cause and effect: “the cause is what determines the effect in time, and determines it as the consequence.” So starting from the observation that we judge events to occur in a temporal order, Kant concludes that we must have a pure concept of cause and effect.

What about particular causal rules, like that striking a match produces a flame? Such rules are not derived solely from experience, but also from the pure concept of causality, on which their existence depends. It is the presence of the pure concept that allows the inference of these particular rules from experience, even though they postulate a necessary connection.

Now we can see how different Kant and Hume’s conceptions of causality are. While Hume thought that the traditional concept of causality as a necessary connection was unrescuable and extraneous to our perceptions, Kant sees it as a bedrock principle of experience that is necessary for us to be able to make sense of our perceptions at all. Kant rejects Hume’s definition of cause in terms of constant conjunction on the grounds that it “cannot be reconciled with the scientific a priori cognitions that we actually have.”

Despite this great gulf between the two philosophers’ conceptions of causality, there are some similarities. As we saw above, Kant agrees wholeheartedly with Hume that perception alone is insufficient for concluding that there is a necessary connection between events. He also agrees that a purely analytic approach is insufficient. Since Kant sees pure intuitions and pure concepts as features of the mind, not the external world, both philosophers deny that causation is an objective relationship between things in themselves (as opposed to perceptions of things). Of course, Kant would deny that this makes causality an illusion, just as he denied that space and time are made illusory by his philosophy.

Of course, it’s impossible to know to what extent the two philosophers would have actually agreed, had Hume been able to read Kant’s responses to his works. Would he have been convinced that synthetic a priori judgments really exist? If so, would he accept Kant’s pure intuitions and pure concepts? I suspect that at the crux of their disagreement would be Kant’s claim that math is synthetic a priori. While Hume never explicitly proclaims math’s analyticity (he didn’t have the term, after all), it seems more in line with his views on algebra and arithmetic as purely concerning the way that ideas relate to one another. It is also more in line with the axiomatic approach to mathematics familiar to Hume, in which one defines a set of axioms from which all truths about the mathematical concepts involved necessarily follow.

If Hume did maintain math’s analyticity, then Kant’s arguments about the importance of synthetic a priori knowledge would probably hold much less sway for him, and would largely amount to an appeal to the validity of metaphysical knowledge, which Hume already doubted. Hume also would likely want to resist Kant’s idealism; in Section XII of the Enquiry he mocks philosophers that doubt the connection between the objects of our senses and external objects, saying that if you “deprive matter of all its intelligible qualities, both primary and secondary, you in a manner annihilate it and leave only a certain unknown, inexplicable something as the cause of our perceptions – a notion so imperfect that no skeptic will think it worthwhile to contend against it.”