The problem with philosophy

(Epistemic status: I have a high credence that I’m going to disagree with large parts of this in the future, but it all seems right to me at present. I know that’s non-Bayesian, but it’s still true.)

Philosophy is great. Some of the clearest thinkers and most rational people I know come out of philosophy, and many of my biggest worldview-changing moments have come directly from philosophers. So why is it that so many scientists seem to feel contempt towards philosophers and condescension towards their intellectual domain? I can actually occasionally relate to the irritation, and I think I understand where some of it comes from.

Every so often, a domain of thought within philosophy breaks off from the rest of philosophy and enters the sciences. Usually when this occurs, the subfield (which had previously been stagnant and unsuccessful in its attempts to make progress) is swiftly revolutionized and most of the previous problems in the field are promptly solved.

Unfortunately, the philosophers who were previously working in the field often remain unaware of the change, or ignore it, and end up wasting a lot of time and looking pretty silly. Sometimes they even explicitly challenge the scientists at the forefront of the revolution, as Henri Bergson did with Einstein after he came out with his pesky new theory of time that swept away much of the past work of philosophers in one fell swoop.

Next you get a generation of philosophy students who are taught a bunch of obsolete theories, and who are later blindsided when they encounter scientists informing them that the problems they’re working on were solved decades ago. And by this point the scientists have left the philosophers so far in the dust that the typical philosophy student is incapable of understanding the answers to their questions without learning a whole new area of math or something. Thus usually the philosophers just keep on their merry way, asking each other increasingly abstruse questions and working harder and harder to justify their own intellectual efforts. Meanwhile scientists move further and further beyond them, occasionally dropping in to laugh at their colleagues who are stuck back in the Middle Ages.

Part of why this happens is structural. Philosophy is the womb inside which develops the seeds of great revolutions of knowledge. It is where ideas germinate and turn from vague intuitions and hotbeds of conceptual confusion into precisely answerable questions. And once these questions are answerable, the scientists and mathematicians sweep in and make short work of them, finishing the job that philosophy started.

I think that one area in which this has happened is causality.

Statisticians now know how to model causal relationships, how to distinguish them from mere regularities, how to deal with common causes and causal pre-emption, how to assess counterfactuals and assign precise probabilities to these statements, and how to compare different causal models and determine which is most likely to be true.

(By the way, guess where I came to be aware of all of this? It wasn’t in the metaphysics class in which we spent over a month discussing the philosophy of causation. No, it was a statistician friend of mine who showed me a book by Judea Pearl and encouraged me to get up to date with modern methods of causal modeling.)

Causality as a subject has firmly and fully left the domain of philosophy. We now have a fully fleshed out framework of causal reasoning that is capable of answering all of the ancient philosophical questions and more. This is not to say that there is no more work to be done on understanding causality… just that this work is not going to be done by philosophers. It is going to be done by statisticians, computer scientists, and physicists.

Another area besides causality where I think this has happened is epistemology. Modern advances in epistemology are not coming out of the philosophy departments. They’re coming from machine learning institutes and artificial intelligence researchers, who are working on turning the question of “how do we optimally come to justified beliefs in a posteriori matters?” into precise, codeable algorithms.

I’m thinking about doing a series of posts called “X for philosophers”, in which I take an area of inquiry that has historically been the domain of philosophy, and explain how modern scientific methods have solved or are solving the central questions in this area.

For instance, here’s a brief guide to how to translate all the standard types of causal statements philosophers have debated for centuries into simple algebra problems:

Causal model

An ordered triple of exogenous variables, endogenous variables, and structural equations for each endogenous variable

Causal diagram

A directed acyclic graph representing a causal model, whose nodes represent the endogenous variables and whose edges represent the structural equations

Causal relationship

A directed edge in a causal diagram

Causal intervention

A mutilated causal diagram in which the edges between the intervened node and all its parent nodes are removed

Probability of A if B

P(A | B)

Probability of A if we intervene on B

P(A | do(B)) = P(A_B)

Probability that A would have happened, had B happened

P(A_B | ¬B)

Probability that B is a necessary cause of A

P(¬A_¬B | A, B)

Probability that B is a sufficient cause of A

P(A_B | ¬A, ¬B)

Right there is the guide to understanding the nature of causal relationships, and assessing the precise probabilities of causal conditional statements, counterfactual statements, and statements of necessary and sufficient causation.
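To see how these definitions cash out computationally, here is a minimal sketch in Python. The two-variable model (a hidden common cause U of both A and B) is my own invented toy example, not one from Pearl’s book:

```python
from itertools import product

# Exogenous variables and their distributions (toy numbers, chosen for illustration)
P_U  = {0: 0.5, 1: 0.5}   # hidden common cause of both B and A
P_UA = {0: 0.9, 1: 0.1}   # independent noise term for A

# Structural equations: B := U, and A := U or U_A (note: A does NOT depend on B)
def f_B(u):         return u
def f_A(b, u, ua):  return int(u or ua)

def joint(do_B=None):
    """Distribution over (A, B), found by enumerating the exogenous settings.
    Passing do_B mutilates the model: B's structural equation is replaced
    by the constant do_B, cutting the edge from U into B."""
    dist = {}
    for u, ua in product(P_U, P_UA):
        b = f_B(u) if do_B is None else do_B
        a = f_A(b, u, ua)
        dist[(a, b)] = dist.get((a, b), 0.0) + P_U[u] * P_UA[ua]
    return dist

def prob(dist, a=None, b=None):
    return sum(p for (av, bv), p in dist.items()
               if (a is None or av == a) and (b is None or bv == b))

obs = joint()
p_given_b = prob(obs, a=1, b=1) / prob(obs, b=1)  # P(A=1 | B=1) = 1.0
p_do_b    = prob(joint(do_B=1), a=1)              # P(A=1 | do(B=1)) = 0.55
```

Because A and B share the common cause U but B does not cause A, conditioning and intervening come apart: observing B = 1 makes A = 1 certain, while forcing B = 1 leaves A’s distribution untouched. That is the regularity-versus-causation distinction in a few lines of arithmetic.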

To most philosophy students and professors, what I’ve written above probably looks like chicken-scratch. But understanding it is crucially important if their causal thinking is not to become obsolete.

There’s an unhealthy tendency amongst some philosophers to, when presented with such chicken-scratch, dismiss it as not being philosophical enough and then go back to reading David Lewis’s arguments for the existence of possible worlds. This, I think, is a large part of why scientists dismiss philosophers as outdated and intellectually behind the times. And it’s hard not to agree with them when you’ve seen both the crystal-clear beauty of formal causal modeling and the debates over things like how to evaluate the actual “distance” between possible worlds.

Artificial intelligence researcher extraordinaire Stuart Russell has said that he knew immediately upon reading Pearl’s book on causal modeling that it was going to change the world. Philosophy professors should either teach graph theory and Bayesian networks, or they should not make a pretense of teaching causality at all.

Galileo and the Schelling point improbability principle

An alternative history interaction between Galileo and his famous statistician friend


In the year 1609, when Galileo Galilei finished the construction of his majestic artificial eye, the first place he turned his gaze was the glowing crescent moon. He reveled in the crevices and mountains he saw, knowing that he was the first man alive to see such a sight, and his mind expanded as he saw the folly of the science of his day and wondered what else we might be wrong about.

For days he was glued to his telescope, gazing at the Heavens. He saw the planets become colorful expressive spheres and reveal tiny orbiting companions, and observed the distant supernova which Kepler had seen blinking into existence only five years prior. He discovered that Venus had phases like the Moon, that some apparently single stars revealed themselves to be binaries when magnified, and that there were dense star clusters scattered through the sky. All this he recorded in frantic enthusiastic writing, putting out sentences filled with novel discoveries nearly every time he turned his telescope in a new direction. The universe had opened itself up to him, revealing all its secrets to be uncovered by his ravenous intellect.

It took him two weeks to pull himself away from his study room for long enough to notify his friend Bertolfo Eamadin of his breakthrough. Eamadin was a renowned scholar, having pioneered at age 15 his mathematical theory of uncertainty and created the science of probability. Galileo often sought him out to discuss puzzles of chance and randomness, and this time was no exception. He had noticed a remarkable confluence of three stars that were in perfect alignment, and needed the counsel of his friend to sort out his thoughts.

Eamadin arrived at the home of Galileo half-dressed and disheveled, obviously having leapt from his bed and rushed over immediately upon receiving Galileo’s correspondence. He practically shoved Galileo out from his viewing seat and took his place, eyes glued with fascination on the sky.

Galileo allowed his friend to observe unmolested for a half-hour, listening with growing impatience to the ‘oohs’ and ‘aahs’ being emitted as the telescope swung wildly from one part of the sky to another. Finally, he interrupted.

Galileo: “Look, friend, at the pattern I have called you here to discuss.”

Galileo swiveled the telescope carefully to the position he had marked out earlier.

Eamadin: “Yes, I see it, just as you said. The three stars form a seemingly perfect line, each of the two outer ones equidistant from the central star.”

Galileo: “Now tell me, Eamadin, what are the chances of observing such a coincidence? One in a million? A billion?”

Eamadin frowned and shook his head. “It’s certainly a beautiful pattern, Galileo, but I don’t see what good a statistician like myself can do for you. What is there to be explained? With so many stars in the sky, of course you would chance upon some patterns that look pretty.”

Galileo: “Perhaps it seems only an attractive configuration of stars spewed randomly across the sky. I thought the same myself. But the symmetry seemed too perfect. I decided to carefully measure the central angle, as well as the angular distance subtended by the paths from each outer star to the central one. Look.”

Galileo pulled out a sheet of paper that had been densely scribbled upon. “My calculations revealed the central angle to be precisely 180.000°, with an error of ± .003°. And similarly, I found the difference in the two angular distances to be .000°, with a margin of error of ± .002°.”

Eamadin: “Let me look at your notes.”

Galileo handed over the sheets to Eamadin. “I checked over my calculations a dozen times before writing you. I found the angular distances by approaching and retreating from this thin paper, which I placed between the three stars and me. I found the distance at which the thin paper just happened to cover both stars on one extreme simultaneously, and did the same for the two stars on the other extreme. The distance was precisely the same, leaving measurement error only for the thickness of the paper, my distance from it, and the resolution of my vision.”

Eamadin: “I see, I see. Yes, what you have found is a startlingly clear pattern. A coincidence of distances and angles this precise is quite unlikely to be the result of any natural phenomenon… ”

Galileo: “Exactly what I thought at first! But then I thought about the vast quantity of stars in the sky, and the vast number of ways of arranging them into groups of three, and wondered if perhaps in fact such coincidences might be expected. I tried to apply your method of uncertainty to the problem, and came to the conclusion that the chance of such a pattern having occurred through random chance is one in a thousand million! I must confess, however, that at several points in the calculation I found myself confronted with doubt about how to progress and wished for your counsel.”

Eamadin stared at Galileo’s notes, then pulled out a pad of his own and began scribbling intensely. Eventually, he spoke. “Yes, your calculations are correct. The chance of such a pattern having occurred to within the degree of measurement error you have specified by random forces is 10⁻⁹.”

Galileo: “Aha! Remarkable. So what does this mean? What strange forces have conspired to place the stars in such a pattern? And, most significantly, why?”

Eamadin: “Hold it there, Galileo. It is not reasonable to jump from the knowledge that the chance of an event is remarkably small to the conclusion that it demands a novel explanation.”

Galileo: “How so?”

Eamadin: “I’ll show you by means of a thought experiment. Suppose that we found that instead of the angle being 180.000° with an experimental error of .003°, it was 180.001° with the same error. The probability of this outcome would be the same as the outcome we found – one in a thousand million.”

Galileo: “That can’t be right. Surely it’s less likely to find a perfectly straight line than a merely nearly perfectly straight line.”

Eamadin: “While that is true, it is also true that the exact calculation you did for 180.000° ± .003° would apply for 180.001° ± .003°. And indeed, it is less likely to find the stars at this precise angle, than it is to find the stars merely near this angle. We must compare like with like, and when we do so we find that 180.000° is no more likely than any other angle!”

Galileo: “I see your reasoning, Eamadin, but you are missing something of importance. Surely there is something objectively more significant about finding an exactly straight line than about a nearly straight line, even if they have the same probability. Not all equiprobable events should be considered to be equally important. Think, for instance, of a sequence of twenty coin tosses. While it’s true that the outcome HHTHTTTTHTHHHTHHHTTH has the same probability as the outcome HHHHHHHHHHHHHHHHHHHH, the second is clearly more remarkable than the first.”

Eamadin: “But what is significance if disentangled from probability? I insist that the concept of significance only makes sense in the context of my theory of uncertainty. Significant results are those that either have a low probability or have a low conditional probability given a set of plausible hypotheses. It is this second class that we may utilize in analyzing your coin tossing example, Galileo. The two strings of tosses you mention are only significant to different degrees in that the second more naturally lends itself to a set of hypotheses in which the coin is heavily biased towards heads. In judging the second to be a more significant result than the first, you are really just saying that you use a natural hypothesis class in which probability judgments are only dependent on the ratios of heads and tails, not the particular sequence of heads and tails. Now, my question for you is: since 180.000° is just as likely as 180.001°, what set of hypotheses are you considering in which the first is much less likely than the second?”
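Eamadin’s point about the coin sequences can be checked numerically. Under the hypothesis class he describes, where the coin has an unknown bias with (say) a uniform prior on it, the marginal likelihood of a particular sequence with h heads and t tails works out to h! t! / (h + t + 1)!. A short sketch (my own illustration, not part of the dialogue):

```python
from math import factorial

def marginal_likelihood(heads, tails):
    """P(a specific sequence), averaged over a uniform prior on the coin's bias:
    the integral of p^h * (1-p)^t dp from 0 to 1, which equals h! t! / (h+t+1)!"""
    return factorial(heads) * factorial(tails) / factorial(heads + tails + 1)

fair      = 0.5 ** 20                   # any particular 20-toss sequence, fair coin
all_heads = marginal_likelihood(20, 0)  # HHHH...H under the biased-coin class: 1/21
mixed     = marginal_likelihood(11, 9)  # HHTHTTTTHTHHHTHHHTTH under the same class
```

Under the fair-coin hypothesis both sequences have identical probability (about 9.5 × 10⁻⁷), but relative to the biased-coin hypothesis class the all-heads string is over a hundred thousand times more likely than the particular mixed string, which is exactly the asymmetry Eamadin identifies as “significance.”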

Galileo: “I must confess, I have difficulty answering your question. For while there is a simple sense in which the number of heads and tails is a product of a coin’s bias, it is less clear what would be the analogous ‘bias’ in angles and distances between stars that should make straight lines and equal distances less likely than any others. I must say, Eamadin, that in calling you here, I find myself even more confused than when I began!”

Eamadin: “I apologize, my friend. But now let me attempt to disentangle this mess and provide a guiding light towards a solution to your problem.”

Galileo: “Please.”

Eamadin: “Perhaps we may find some objective sense in which a straight line or the equality of two quantities is a simpler mathematical pattern than a nearly straight line or two nearly equal quantities. But even if so, this will only be a help to us insofar as we have a presumption in favor of less simple patterns inhering in Nature.”

Galileo: “This is no help at all! For surely the principle of Ockham should push us towards favoring more simple patterns.”

Eamadin: “Precisely. So if we are not to look for an objective basis for the improbability of simple and elegant patterns, then we must look towards the subjective. Here we may find our answer. Suppose I were to scribble down on a sheet of paper a series of symbols and shapes, hidden from your view. Now imagine that I hand the images to you, and you go off to some unexplored land. You explore the region and draw up cartographic depictions of the land, having never seen my images. It would be quite a remarkable surprise were you to find upon looking at my images that they precisely matched your maps of the land.”

Galileo: “Indeed it would be. It would also quickly lend itself to a number of possible explanations. Firstly, it may be that you were previously aware of the layout of the land, and drew your pictures intentionally to capture the layout of the land – that is, that the layout directly caused the resemblance in your depictions. Secondly, it could be that there was a common cause between the resemblance and the layout; perhaps, for instance, the patterns that most naturally come to the mind are those that resemble common geographic features. And thirdly, included only for completion, it could be that your images somehow caused the land to have the geographic features that it did.”

Eamadin: “Exactly! You catch on quickly. Now, this case of the curious coincidence of depiction and reality is exactly analogous to your problem of the straight line in the sky. The straight lines and equal distances are just like patterns on the slips of paper I handed to you. For whatever reason, we come pre-loaded with a set of sensitivities to certain visual patterns. And what’s remarkable about your observation of the three stars is that a feature of the natural world happens to precisely align with these patterns, where we would expect no such coincidence to occur!”

Galileo: “Yes, yes, I see. You are saying that the improbability doesn’t come from any objective unusual-ness of straight lines or equal distances. Instead, the improbability comes from the fact that the patterns in reality just happen to be the same as the patterns in my head!”

Eamadin: “Precisely. Now we can break down the suitable explanations, just as you did with my cartographic example. The first explanation is that the patterns in your mind were caused by the patterns in the sky. That is, for some reason the fact that these stars were aligned in this particular way caused you to be psychologically sensitive to straight lines and equal quantities.”

Galileo: “We may discard this explanation immediately, for such sensitivities are too universal and primitive to be the result of a configuration of stars that has only just now made itself apparent to me.”

Eamadin: “Agreed. Next we have a common cause explanation. For instance, perhaps our mind is naturally sensitive to visual patterns like straight lines because such patterns tend to commonly arise in Nature. This natural sensitivity is what feels to us on the inside as simplicity. In this case, you would expect it to be more likely for you to observe simple patterns than might be naively thought.”

Galileo: “We must deny this explanation as well, it seems to me. For the resemblance to a straight line goes much further than my visual resolution could even make out. The increased likelihood of observing a straight line could hardly be enough to outweigh our initial naïve calculation of the probability being 10⁻⁹. But thinking more about this line of reasoning, it strikes me that you have just provided an explanation of the apparent simplicity of the laws of Nature! We have developed to be especially sensitive to patterns that are common in Nature, we interpret such patterns as ‘simple’, and thus it is a tautology that we will observe Nature to be full of simple patterns.”

Eamadin: “Indeed, I have offered just such an explanation. But it is an unsatisfactory explanation, insofar as one is opposed to the notion of simplicity as a purely subjective feature. Most people, myself included, would strongly suggest that a straight line is inherently simpler than a curvy line.”

Galileo: “I feel the same temptation. Of course, justifying a measure of simplicity that does the job we want of it is easier said than done. Now, on to the third explanation: that my sensitivity to straight lines has caused the apparent resemblance to a straight line. There are two interpretations of this. The first is that the stars are not actually in a straight line, and I only believe this because of my predisposition towards identifying straight lines. The second is that the stars aligned in a straight line because of these predispositions. I’m sure you agree that both can be reasonably excluded.”

Eamadin: “Indeed. Although it may look like we’ve excluded all possible explanations, notice that we only considered one possible form of the common cause explanation. The other two categories of explanations seem more thoroughly ruled out; your dispositions couldn’t be caused by the star alignment given that you have only just found out about it and the star alignment couldn’t be caused by your dispositions given the physical distance.”

Galileo: “Agreed. Here is another common cause explanation: God, who crafted the patterns we see in Nature, also created humans to have similar mental features to Himself. These mental features include aesthetic preferences for simple patterns. Thus God causes both the salience of the line pattern to humans and the existence of the line pattern in Nature.”

Eamadin: “The problem with this is that it explains too much. Based solely on this argument, we would expect that when looking up at the sky, we should see it entirely populated by simple and aesthetic arrangements of stars. Instead it looks mostly random and scattershot, with a few striking exceptions like those which you have pointed out.”

Galileo: “Your point is well taken. All I can imagine now is that there must be some sort of ethereal force that links some stars together, gradually pushing them so that they end up in nearly straight lines.”

Eamadin: “Perhaps that will be the final answer in the end. Or perhaps we will discover that it is the whim of a capricious Creator with an unusual habit for placing unsolvable mysteries in our paths. I sometimes feel this way myself.”

Galileo: “I confess, I have felt the same at times. Well, Eamadin, although we have failed to find a satisfactory explanation for the moment, I feel much less confused about this matter. I must say, I find this method of reasoning by noticing similarities between features of our mind and features of the world quite intriguing. Have you a name for it?”

Eamadin: “In fact, I just thought of it on the spot! I suppose that it is quite generalizable… We come pre-loaded with a set of very salient and intuitive concepts, be they geometric, temporal, or logical. We should be surprised to find these concepts instantiated in the world, unless we know of some causal connection between the patterns in our mind and the patterns in reality. And by Eamadin’s rule of probability-updating, when we notice these similarities, we should increase our strength of belief in these possible causal connections. In the spirit of anachrony, let us refer to this as the Schelling point improbability principle!”

Galileo: “Sounds good to me! Thank you for your assistance, my friend. And now I must return to my exploration of the Cosmos.”

Akaike, epicycles, and falsifiability

I found a nice example of an application of model selection techniques in this paper.

The history of astronomy provides one of the earliest examples of the problem at hand. In Ptolemy’s geocentric astronomy, the relative motion of the earth and the sun is independently replicated within the model for each planet, thereby unnecessarily adding to the number of adjustable parameters in his system. Copernicus’s major innovation was to decompose the apparent motion of the planets into their individual motions around the sun together with a common sun-earth component, thereby reducing the number of adjustable parameters. At the end of the non-technical exposition of his programme in De Revolutionibus, Copernicus repeatedly traces the weakness of Ptolemy’s astronomy back to its failure to impose any principled constraints on the separate planetary models.

In a now famous passage, Kuhn claims that the unification or harmony of Copernicus’ system appeals to an aesthetic sense, and that alone. Many philosophers of science have resisted Kuhn’s analysis, but none has made a convincing reply. We present the maximization of estimated predictive accuracy as the rationale for accepting the Copernican model over its Ptolemaic rival. For example, if each additional epicycle is characterized by 4 adjustable parameters, then the likelihood of the best basic Ptolemaic model, with just twelve circles, would have to be e²⁰ (or more than 485 million) times the likelihood of its Copernican counterpart with just seven circles for the evidence to favour the Ptolemaic proposal. Yet it is generally agreed that these basic models had about the same degree of fit with the data known at the time. The advantage of the Copernican model can hardly be characterized as merely aesthetic; it is observation, not a prioristic preference, that drives our choice of theory in this instance.

How to Tell when Simpler, More Unified, or Less Ad Hoc Theories will Provide More Accurate Predictions

Looking into this a little, I found on Wikipedia that apparently more and more complicated epicycle models were developed after Ptolemy.

As a measure of complexity, the number of circles is given as 80 for Ptolemy, versus a mere 34 for Copernicus. The highest number appeared in the Encyclopædia Britannica on Astronomy during the 1960s, in a discussion of King Alfonso X of Castile’s interest in astronomy during the 13th century. (Alfonso is credited with commissioning the Alfonsine Tables.)

By this time each planet had been provided with from 40 to 60 epicycles to represent after a fashion its complex movement among the stars. Amazed at the difficulty of the project, Alfonso is credited with the remark that had he been present at the Creation he might have given excellent advice.

40 epicycles per planet, with five known planets in Ptolemy’s time, and four adjustable parameters per epicycle, gives 800 additional parameters.

Since AIC scores are given by (# of parameters) – (log of likelihood of evidence), up to an overall factor of 2 that doesn’t affect comparisons, we can write:

AIC_Copernicus = k_Copernicus – L_Copernicus
AIC_epicycles = (k_Copernicus + 800) – L_epicycles

Since lower AIC is better, AIC favors the epicycle model only if L_epicycles – L_Copernicus > 800, i.e. only if the likelihood ratio exceeds e⁸⁰⁰.

For these two models to perform equally well according to AIC, the strength of the evidence for epicycles would have to be at least e⁸⁰⁰ times the strength of the evidence for Copernicus. This corresponds roughly to a 2 with 347 zeroes after it. This is a much clearer argument for the superiority of heliocentrism over geocentrism than a vague appeal to the latter having a lower prior than the former.
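The arithmetic here is easy to reproduce. A quick sketch in Python (the epicycle and parameter counts are the rough estimates from above):

```python
import math

extra_params = 40 * 5 * 4  # ~40 epicycles per planet, 5 planets, 4 parameters each

# With the simplified score AIC = k - log L, the epicycle model can only win
# if its log-likelihood advantage exceeds its parameter penalty of 800.
# Expressed in base 10, that likelihood ratio e^800 has about 347 digits:
digits  = extra_params / math.log(10)  # ~347.4
leading = 10 ** (digits % 1)           # ~2.7, so e^800 is roughly 2.7 * 10^347
```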

I like this as a nice simple example of how AIC can be practically applied. It’s also interesting to see that the type of reasoning formalized by AIC is fairly intuitive, and that even scholars in the 1500s treated excessive model flexibility, in the form of abundant adjustable parameters, as an epistemic failing.

Another example given in the same paper is Newton’s notion of admitting only as many causes as are necessary to explain the data. This is nicely formalized in terms of AIC using causal diagrams; if a model of a variable references more causes of that variable, then that model involves more adjustable parameters. In addition, adding causal dependencies to a causal model adds parameters to the description of the system as a whole.

One way to think about all this is that AIC and other model selection techniques provide a protection against unfalsifiability. A theory with too many tweakable parameters can be adjusted to fit a very wide range of data points, and therefore is harder to find evidence against.

I recall a discussion between two physicists somewhere about whether Newton’s famous equation F = ma counts as an unfalsifiable theory. The idea is just that for basically any interaction between particles, you could find some function F that makes the equation true. This has the effect of making the statement fairly vacuous, and carrying little content.

What does AIC have to say about this? The family of functions represented by F = ma is:

{ F = ma : F is any function of the coordinates of the system }

How many parameters does this model have? Well, the ‘tweakable parameter’ lives inside an infinite-dimensional space of functions, suggesting that the number of parameters is infinite! If this is right, then the overfitting penalty on Newton’s second law is infinitely large and should outweigh any amount of evidence that could support it. This is actually not too crazy; if a model can accommodate any data set, then the model is infinitely weak.

One possible response is that the equation F = ma is meant to be a definitional statement, rather than a claim about the laws of physics. This seems wrong to me for several reasons, the most important of which is that it is not the case that any set of laws of physics can be framed in terms of Newton’s equation.

Case in point: quantum mechanics. Try as you might, you won’t be able to express quantum happenings as the result of forces causing accelerations according to F = ma. This suggests that F = ma is at least somewhat of a contingent statement, one that is meant to model aspects of reality rather than simply define terms.

Value beyond ethics

There is a certain type of value in our existence that transcends ethical value. It is beautifully captured in this quote from Richard Feynman:

It is a great adventure to contemplate the universe, beyond man, to contemplate what it would be like without man, as it was in a great part of its long history and as it is in a great majority of places. When this objective view is finally attained, and the mystery and majesty of matter are fully appreciated, to then turn the objective eye back on man viewed as matter, to view life as part of this universal mystery of greatest depth, is to sense an experience which is very rare, and very exciting. It usually ends in laughter and a delight in the futility of trying to understand what this atom in the universe is, this thing—atoms with curiosity—that looks at itself and wonders why it wonders.

Well, these scientific views end in awe and mystery, lost at the edge in uncertainty, but they appear to be so deep and so impressive that the theory that it is all arranged as a stage for God to watch man’s struggle for good and evil seems inadequate.

The Meaning Of It All

Carl Sagan beautifully expressed the same sentiment.

We are the local embodiment of a Cosmos grown to self-awareness. We have begun to contemplate our origins: starstuff pondering the stars; organized assemblages of ten billion billion billion atoms considering the evolution of atoms; tracing the long journey by which, here at least, consciousness arose. Our loyalties are to the species and the planet. We speak for Earth. Our obligation to survive is owed not just to ourselves but also to that Cosmos, ancient and vast, from which we spring.


The ideas expressed in these quotes feel a thousand times deeper and more profound than anything offered in ethics. Trolley problems seem trivial by comparison. If somebody argued that the universe would be better off without us on the basis of, say, a utilitarian calculation of net happiness, I would feel like there is an entire dimension of value that they are completely missing out on. This type of value, a type of raw aesthetic sense of the profound strangeness and beauty of reality, is tremendously subtle and easily slips out of grasp, but is crucially important. My blog header serves as a reminder: We are atoms contemplating atoms.

Two-factor markets for sex

A two-factor market is essentially just a market in which two distinct parties must seek each other out to make trades. Say I want to make a new website-hosting platform to compete against WordPress. Well, just making the platform isn’t enough. Even if I create a platform that is objectively better than WordPress for both readers and creators, neither side will spontaneously start using the platform unless they think that the other side will as well.

A content creator has little incentive to move over from WordPress to my platform, because there are no readers there. And readers have little incentive to check out my platform, because there are no content creators using it. In other words, there exists a signaling equilibrium around WordPress as the place for finding and creating online content. Bloggers come to WordPress because they know that it is a good place to find lots of readers, and readers come to WordPress because they know it is a good place to find lots of blogs.

This is a natural result of a two-factor market, and can result in some unfortunate suboptimalities. For instance, I’ve already suggested that an objectively better website-hosting platform might never become widely utilized, because of the nature of this equilibrium. A company like WordPress can exploit this by not investing as heavily in the quality of their product as they would have if the market was perfectly competitive.
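This lock-in is easy to see in a toy model. Here is a minimal sketch (all numbers and dynamics are hypothetical, chosen only for illustration): each side’s willingness to adopt the new platform depends mostly on how much of the other side is already there, plus a small intrinsic quality edge for the new platform.

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def adoption(x0, quality_edge=0.5, network_strength=6.0, steps=200):
    """Iterate the share x of users on the new platform. Each period,
    adoption depends on where everyone else currently is (the network
    effect) plus the new platform's intrinsic quality edge."""
    x = x0
    for _ in range(steps):
        x = sigmoid(network_strength * (2 * x - 1) + quality_edge)
    return x

low = adoption(0.05)   # starts nearly empty: stays stuck near zero
high = adoption(0.60)  # starts with critical mass: takes over the market
```

Despite the positive quality edge, the objectively better platform only wins if it somehow starts with a critical mass of users – exactly the equilibrium described above.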

Sexual selection looks like it has some of these features. If female birds on average favor a certain streak of red on the head of the males of the species, then we should expect that both this streak of red and the favoring of this streak will increase over time. Once streak-of-red has become a dominant sexually-selected-for trait, it is much harder for streak-of-green to gain prominence in the population. For this to happen, it requires not just a male with a streak of green, but a female that finds this attractive; i.e. the market for sex is a two-factor market. In the end, this trait will only gain prominence if it can beat out the existing red-streak equilibrium.

This two-factor market is coupled to a feedback loop that can further entrench the resulting equilibria: the products of the “exchanges” in this market are more red-streaked birds and more red-streak-favoring birds. This would be as if Craigslist exchanges spawned new human buyers and sellers who then flocked to Craigslist. In general, males in a species are attracted to females with certain specific traits, and females seek out males with certain traits. This results in equilibria in a sexually dimorphic population where both sexes have distinctive stable traits that they find attractive in each other.

In addition, this equilibrium is made more stable by the feedback nature of the market – the fact that the children resulting from the pairing of individuals with given traits are more likely to have those traits. Since the population is stuck in this stable equilibrium, it may prove resistant to change, even when that change would be a net gain in average fitness for the individuals in that population. So, for instance, if there exists a strong enough equilibrium around courtship practices in a certain species of bird, then these courtship practices may exist long past the point where there is any resemblance between the practice and any credible signal of evolutionary fitness.
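The runaway feedback can be sketched with a toy difference-equation model (the parameters here are entirely made up): the trait spreads in proportion to how common the preference is, and the preference hitchhikes along through its genetic correlation with the trait – the classic Fisherian-runaway picture.

```python
def runaway(trait=0.1, pref=0.1, selection=0.5, linkage=0.8, generations=200):
    """trait: frequency of the red-streak gene; pref: frequency of the
    gene for preferring red streaks. Each generation the trait gains in
    proportion to how widespread the preference is (logistic growth),
    and the preference is dragged along via linkage with the trait."""
    for _ in range(generations):
        gain = selection * trait * (1 - trait) * pref
        trait += gain
        pref = min(1.0, pref + linkage * gain)
    return trait, pref

t_end, p_end = runaway()
# starting from 10% each, trait and preference sweep toward fixation together
```

Note that with `pref=0.0` nothing happens at all: the trait needs the preference and the preference needs the trait, which is the two-factor structure of the market.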

Some possible examples of this might be the enormous antlers of Irish elk and the majestic tails of peacocks. What sort of evolutionary explanation could justify such opulence and apparent squandering of metabolic resources? Costly signaling is a standard explanation, the idea being that the enormous apparent waste of resources is a way of providing a credible signal to mates of their survival fitness. It’s like saying “if I’m able to waste all of these resources and still be doing fine, then you know that I’m more fit than somebody that’s doing just as well without wasting resources.” Think about an expert chess player playing you without one of his knights, and still managing to beat you, versus an expert chess player that beats you without a handicap. If an organism is sufficiently high-fitness, then handicapping itself can be beneficial as a way of signaling its high fitness over other high fitness individuals.

Even in this explanation, the precise details of how the elk or peacock spend their excess resources are irrelevant. Why is the elk’s energy going to producing enormous antlers, as opposed to any other burdensome bodily structure? The right answer to this may be that there is no real answer – it’s just the result of the type of runaway feedback cycle I’ve described above. What’s surprising and interesting to me is the idea that explanations like costly signaling don’t seem to be needed to explain sexual selection of seemingly arbitrary and wasteful traits; if the argument above is correct, then this would be predicted to happen all on its own.

Against falsifiability

What if time suddenly stopped everywhere for 5 seconds?

Your first instinct might be to laugh at the question and write it off as meaningless, given that such questions are by their nature unfalsifiable. I think this is a mistaken impulse, and that we can in general have justified beliefs about such questions. Doing so requires moving beyond outdated philosophies of science, and exploring the nature of evidence and probability. Let me present two thought experiments.

The Cyclic Universe

Imagine that the universe evolves forward in time in such a way that at one time t1 its state is exactly identical to an earlier state at time t0. I mean exactly identical – the wave function of the universe at time t1 is identical in every detail to the wave function at time t0.

By construction, we have two states of the universe that cannot be distinguished in any way whatsoever – no observation or measurement that you could make of the one will distinguish it from the other. And yet we still want to say that they are different from one another, in that one was earlier than the other.

But then we are allowing the universe to have a quantity (the ‘time-position’ of events) that is completely undetectable and makes no measurable difference in the universe. This should certainly make anybody that’s read a little Popper uneasy, and should call into question the notion that a question is meaningless if it refers to unfalsifiable events. But let’s leave this there for the moment and consider a stronger reason to take such questions seriously.

The Freezing Rooms

The point of this next thought experiment will be that we can be justified in our beliefs about unobservable and undetectable events. It’s a little subtler, but here we go.

Let’s imagine a bizarre building containing three rooms with an unusual property: each room seems to completely freeze at regular intervals. And by freeze I mean a complete cessation of change in every part of the room, as if time has halted within.

Let’s further imagine that you are inside the building and can freely pass from one room to the other. From your observations, you conclude that Room 1 freezes every other day, Room 2 every fourth day, and Room 3 every third day. You also notice that when you are in any of the rooms, the other two rooms occasionally seem to suddenly “jump forward” in time by a day, exactly when you expect that your room would be frozen.

[Diagram: Rooms 1, 2, and 3 of the building]

So you construct this model of how these bizarre rooms work, and suddenly you come to a frightening conclusion – once every twelve days, all three rooms will be frozen at the same time! So no matter what room you are in, there will be a full day that passes without anybody noticing it in the building, and with no observable consequences in any of the rooms.

Sure, you can just step outside the building and observe it for yourself. But let’s expand our thought experiment: instead of a building with three rooms, let’s imagine that the entire universe is partitioned into three regions of space, in which the same strange temporal features exist. You can go from one region of the universe to another, allowing you to construct an equivalent model of how things work. And you will come to a justified belief that there are periods of time in which absolutely NOTHING is changing in the universe, and yet time is still passing.

Let’s just go a tiny bit further with this line of thought – imagine that the other two rooms are suddenly destroyed (or, in the extended case, that the other two regions of space become causally disconnected). Now the beings in the remaining region truly have no ability to do the experiments that allowed them to conclude that time occasionally freezes in their own universe – and yet they are still justified in this belief. They are justified in the same way that somebody who watched a beam of light head out past the cosmological event horizon is justified in continuing to believe in the existence of the beam of light, even though it is entirely impossible to ‘catch up’ to the light and do an experiment that verifies that no, it hasn’t gone out of existence.

This thought experiment demonstrates that questions that refer to empirically indistinguishable states of the universe can be meaningful. This is a case that is not easy for Popperian falsifiability or old logical positivists to handle, but can be analyzed through the lens of modern epistemology.

Compare the following two theories of the time patterns of the building, where the brackets indicate a repeating pattern (✓ = the room is running, ✗ = the room is frozen):

Theory 1
Room 1: [ ✓, ✗ ]
Room 2: [ ✓, ✓, ✓, ✗ ]
Room 3: [ ✓, ✓, ✗ ]

Theory 2
Room 1: [ ✓, ✗, ✓, ✗, ✓, ✗, ✓, ✗, ✓, ✗, ✓ ]
Room 2: [ ✓, ✓, ✓, ✗, ✓, ✓, ✓, ✗, ✓, ✓, ✓ ]
Room 3: [ ✓, ✓, ✗, ✓, ✓, ✗, ✓, ✓, ✗, ✓, ✓ ]

Notice that these two theories make all the same predictions about what everybody in each room will observe. But Theory 2 denies the existence of the total freeze every 12 days, while Theory 1 accepts it.

Notice also that Theory 2 requires a much more complicated description to describe the pattern that it postulates. In Theory 1, you only need 9 bits to specify the pattern, and the days of total freeze are entailed as natural consequences of the pattern.

In Theory 2, you need 33 bits to be able to match the predictions of Theory 1 while also removing the total freeze!
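Both claims – that the two theories agree on every possible observation, and that they differ in description length – can be checked by writing the patterns out. A small sketch, encoding ✓/✗ as booleans:

```python
from itertools import cycle, islice

# Theory 1: short repeating patterns (True = running, False = frozen)
theory1 = {
    "Room 1": [True, False],
    "Room 2": [True, True, True, False],
    "Room 3": [True, True, False],
}

def expand(pattern, days):
    return list(islice(cycle(pattern), days))

# One full 12-day cycle of Theory 1; on day 12 every room is frozen
cycle1 = {room: expand(p, 12) for room, p in theory1.items()}
assert all(not days[11] for days in cycle1.values())

# Theory 2 is that same cycle with the all-frozen day deleted: an 11-day
# repeating pattern. Since nobody can ever observe the all-frozen day,
# the two theories make identical predictions for every observer.
theory2 = {room: days[:11] for room, days in cycle1.items()}

bits_1 = sum(len(p) for p in theory1.values())  # 2 + 4 + 3 = 9 bits
bits_2 = sum(len(p) for p in theory2.values())  # 11 * 3 = 33 bits
```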

Since observational evidence does not distinguish between these theories, this difference in complexity must be accounted for in the prior probabilities for Theory 1 and Theory 2, and would give us a rational reason to prefer Theory 1, even given the impossibility of falsification of Theory 2. This preference wouldn’t go away even in the limit of infinite evidence, and could in fact become stronger.

For instance, suppose that the odds we assign to the two theories are inversely proportional to the amount of information required to specify each theory, so that Theory 1 is favored 33:9. In addition, suppose that all other theories of the universe that are empirically distinguishable from Theory 1 and Theory 2 start with a total prior of 50%. If in the limit of infinite evidence we find that all other theories have been empirically ruled out, then we’ll see:

P(Theory 1) = 39.29%
P(Theory 2) = 10.71%
P(All else) = 50%

Infinite evidence limit
P(Theory 1) = 78.57%
P(Theory 2) = 21.43%
P(All else) = 0%

The initial epistemic tax levied on Theory 2 due to its complexity has functionally doubled: the gap between the two theories’ probabilities has grown from about 29 to about 57 percentage points, even though no observation could ever distinguish them. Notice how careful probabilistic thinking handles philosophical subtleties that are too much for obsolete frameworks of philosophy of science based on the concept of falsifiability. The powers of Bayesian reasoning are on full display here.
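The numbers above follow from a two-line calculation. A sketch, under the stated assumption that the odds assigned to the two theories are inversely proportional to their description lengths:

```python
bits_1, bits_2 = 9, 33

# The two theories split 50% of the prior in odds of 33 : 9
# (the simpler theory gets the larger share)
p1 = 0.5 * bits_2 / (bits_1 + bits_2)
p2 = 0.5 * bits_1 / (bits_1 + bits_2)

# Infinite-evidence limit: every empirically distinguishable rival theory
# is ruled out. Conditioning renormalizes the two survivors while
# preserving their ratio.
p1_limit = p1 / (p1 + p2)
p2_limit = p2 / (p1 + p2)

print(round(100 * p1, 2), round(100 * p2, 2))              # 39.29 10.71
print(round(100 * p1_limit, 2), round(100 * p2_limit, 2))  # 78.57 21.43
```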

The spiritual and the scientific

There’s an Isaac Asimov quote that I love. It goes:

When people thought the Earth was flat, they were wrong. When people thought the Earth was spherical, they were wrong. But if you think that thinking the Earth is spherical is just as wrong as thinking the Earth is flat, then your view is wronger than both of them put together.

I was recently reminded of this because I’m at an ashram this week, and in one of the talks, a swami brought up his beef with science.

He talked about how science is just another form of faith, and that therefore our intuition is a perfectly valid guide to understanding the universe. After all, all of our past scientific theories have turned out to be wrong, so we should expect that our current theories will also turn out to be wrong.

Thus the Asimov.


For various reasons, I’m often in spiritual places surrounded by spiritual people. These are the types of people that say “I believe in all religions” and go to yoga retreats and read books about sacred healing and ancient wisdom. When I’m at these places, people sometimes find out that I’m a physics student who is interested in things like Science and Rationality. The types of responses I get are interesting.

Usually the people I talk to are enthusiastic and eager to talk about the most recent scientific discoveries they’ve heard of. They’re also quick to point out that Science can’t tell us everything, and after all there are the virtues of faith to be considered. Other times I feel a subtle shift in attitude. This might be paranoia, but it’s as if I’ve been registered as somebody belonging to the Other Team.

And after all, important swamis declare that science is just another form of faith, and spiritual people nod knowingly. And the Deepak Chopras of the world declare with relish that science cannot tell us objective truths, and that scientists are arrogant and dogmatic.

This is all very weird to me. Science is our best systematized attempt to understand the world we live in and to unearth the general principles that guide this world. Great scientists are guided by a fascination with the order of the universe and wonder at its comprehensibility. At their root they want to understand, in Einstein’s words, the mind of God.

And the spiritual tell me that “spiritual” means something like “interested in pondering the nature of reality at a deep level and appreciating the awe-inspiring and profound aspects of existence.”

If this is how I should understand these terms, then spirituality and science are two things that should definitely definitely not be enemies. In fact, if “spiritual” meant what the spiritual claim it means, then the best spiritual seekers should be the same people as the best scientists.

Look at this quote from Carl Sagan:

Science is not only compatible with spirituality; it is a profound source of spirituality. When we recognize our place in an immensity of light years and in the passage of ages, when we grasp the intricacy, beauty and subtlety of life, then that soaring feeling, that sense of elation and humility combined, is surely spiritual.

And from Neil deGrasse Tyson:

It’s quite literally true that we are star dust, in the highest exalted way one can use that phrase. I bask in the majesty of the cosmos.

Not only are we in the universe, the universe is in us. I don’t know of any deeper spiritual feeling than what that brings upon me.

Are these not expressions of an utmost appreciation for the spiritual, as defined above? Why don’t the spiritual embrace Neil deGrasse Tyson and his scientific colleagues with open arms as fellow earnest truth-seekers, and marvel at the beauty of the universe together? I mean, just look at the man – he’s practically overflowing with the type of joy and curiosity that the spiritual should love!

The spiritual will tell me: “Yes, some of the greatest scientists are very spiritual. Look at Einstein! He said that science without religion is lame, and that all serious scientists recognize a Spirit in the laws of nature! Science at its best can be and should be a deeply spiritual enterprise. But unfortunately, a lot of scientists out there are just too close-minded. This is why the spiritual can sometimes sound anti-science, because the scientists of the world dogmatically reject our reasonable beliefs, like that the spiritually enlightened can read minds and make objects levitate, or that the stars are sending us secret messages about our romantic prospects and whether we should change jobs, or that playing cards thrown randomly onto the ground can accurately tell us our future!”

Yes, scientists can be dogmatic, because scientists are humans. But it strikes me that part of the reason the spiritual claim that scientists are especially dogmatic has something to do with the fact that scientists have repeatedly studied and disproved common spiritual beliefs and practices. More importantly, many of these beliefs are in direct conflict with the known laws of nature. As the saying goes: keep your mind open – but not so open that your brains fall out.

The spiritual: “But science too often tries to go too far and dismiss those things which it doesn’t understand!”

What, like the possible physical effects that the stars could have on the paths that our lives take? Or like the effects of diluting a chemical compound until not a single molecule remains on the potency of the final product as a medical instrument? Or the ways that the lines on your palm form, that really really have nothing to do with how rich you’ll be or how many kids you’ll end up having?

No, this won’t do. Science does not understand everything. There are plenty of mysteries out there, and we love that there are. They give scientists employment! But scientists are certainly not in the business of blindly dismissing those things that they actually do not understand.

Besides, are scientists really all that dogmatic? Look at the history of the scientific worldview. Consensus theories have been revised and replaced again and again as we make the long march towards understanding reality. Some of the strongest scientific consensuses are only a few decades old! Scientists are constantly updating and refurbishing their view of reality as the evidence changes.

Perfectly? No! But I’d hazard a guess that they do so better than the average person. Why? For one thing, they have a career incentive to do so. A scientist that sticks to the old phlogiston-theory of combustion can’t get published, and a scientist that discovers damning evidence of the falsity of an important consensus gets tenure, pay raises, and respect from their colleagues. The incentive structure of science is set up to reward those that can avoid becoming stuck in dogmatic patterns of belief.


Physicist and philosopher Tim Maudlin has described a characteristic feature of truth-seeking enterprises: they tend to be uniform across space and to vary across time. Ask a biologist in Bengal what they think about the structure of DNA, and you’ll get pretty much the same answer as from a biologist at Oxford. And when new evidence comes in, the beliefs of scientists shift fairly uniformly.

Ask a spiritual seeker in India what they think about Shamanic healing, and you’ll likely get a different answer from a spiritual seeker in the UK.

Yes, science has problems and is definitely not perfect. But we’re not comparing it to an ideal perfected version of science conducted by perfect Bayesian epistemologists with infinite computing power, we’re comparing it to humanity’s status quo. With rampant climate change denial, young Earth creationism, disbelief in evolution and anti-vaccination conspiracies, it’d be hard to convince me that scientists are much worse than the average Joe at avoiding patterns of dogmatic thought.

I just don’t buy that the high epistemic standards and regard for truth held by the spiritual are the reason that they dismiss science. I’ve met too many spiritual people eager to have their charts read by astrologers or obtain homeopathic sugar pills or communicate with invisible spirits. And I don’t buy that scientists are not actually honest truth-seekers trying to understand the world.

Which is why I think that the word spiritual doesn’t actually mean what the spiritual claim it means. I’m not being a linguistic prescriptivist here; I’m saying that the definition that spiritual people provide of spirituality is the motte, and the bailey is something else, something that is apparently hostile to science and friendly to all sorts of pseudoscientific ideas.

The bailey is where the fertile and valuable ideological land is, and the motte is the easily defensible position that spiritual people can retreat to when their beliefs are questioned. The bailey is not actually fundamentally about the urge to understand nature. It’s not actually about the same type of wonder and joy that a scientist gets when they understand some important piece of how the world works. Based off of many of the interactions I’ve had with self-identified spiritual people, I would define it as something like “belief in the existence of some phenomenon for which there is no evidence, or evidence against, like Reiki, crystal healing, tarot cards, etc.”


Looking at what I’ve written so far, it sounds like I see nothing but conflict between spirituality and science. This is not so. I have focused on the aspects of spirituality that do come into conflict with science, mostly because I think that these play a large role in the anti-scientific attitudes among the spiritual. The spiritual are quite friendly towards science when it supports their beliefs.

And it often does! There are spiritual practices that science has found to be genuinely beneficial, more than predicted by placebo effects, and beneficial in many of the ways that the spiritual claim them to be. Meditation and yoga come to mind. Mindfulness practices also have an impressive evidence base. And things like a belief in a higher power and spiritual experiences can be genuinely uplifting and transformative.

I’ve talked about spiritual people as if they were all the same, harboring irrational beliefs and anti-scientific attitudes. But plenty of spiritual people I meet are genuinely appreciative of the sciences, and want their world-view to be as fully supported by the scientific evidence as possible. Some are even scientists themselves!

And anti-science attitudes are not at all ubiquitous across spiritual traditions. Buddhism is often praised for its friendliness towards the sciences, and its scientific approach to belief formation. The Dalai Lama says things like:

If scientific analysis were conclusively to demonstrate certain claims in Buddhism to be false, then we must accept the findings of science and abandon those claims.

I don’t know enough about the Dalai Lama’s personal epistemic habits to be confident that this is more than nice-sounding words. How does he think that this attitude affects Buddhist views on karma and reincarnation, for instance?

It is much easier to proclaim a science-friendly attitude than it is to actually accept the tough implications of such an attitude on beliefs central to one’s ideology. But attitudes like this seem like the right way forward in reconciling the actual meaning of spirituality with the meaning that the spiritual seem to want it to have.

Comprehensibility of the Complex

(Some speculative rambling about stuff I’ve been thinking about recently.)

There’s a fallacy that I have committed hundreds of times, and that I have only really recently internalized as a fallacy. Perhaps it is not a fallacy, but a confused pattern of thought. In any case, I’ll call it “the incomprehensibility of the complex.”

Here’s the context in which I would make the mistake:

Somebody brings up some political or economic question, say “Should we have left Iraq?” or “Should we raise the minimum wage?”

This sparks a fierce debate. Somebody says that removing the troops left the region defenseless against takeover by extremist groups, or that extra wages given to workers go back into the economy and stimulate it. Another objects that our troops were ultimately the source of the instability, or cites the broken-window fallacy.

And I would think: “The world is crazily complicated. Physicists can barely understand complex atoms. Now scale that complexity up to interactions between hundreds of millions of humans, each one a system of a hundred trillion trillion atoms. This should put into perspective the proper degree of epistemic humility we should hold when discussing the minimum wage.”

Basically: If we can’t understand atoms, then we sure as hell can’t understand economic systems or international relations.

Observing that this is a bad argument is not too profound or interesting.

What’s interesting to me is the fact that this is a bad argument. That is, the fact that we can scale up the complexity of the system we are studying by a factor of 10^30, squint our eyes, and then get to work at creating fantastically simple and accurate models of the system. This is absolutely insane, and tells us something about the type of universe that we live in.


Recently I watched a lecture on Marginal Revolution University about gun buyback programs and slave redemption policies. The gist of it is this:

Starting in 1993, some humanitarian groups got it into their heads that they could save Sudanese slaves by buying them from their owners and then freeing them. This maybe sounds like a good idea, until you learn about supply and demand curves.

In truth, what the slave redeemers ended up doing was increasing demand for slaves, resulting in new slaves being captured and tens of thousands of dollars ending up in the hands of slave-owners. Fresh revenue funded weapons purchases, further enabling slave traders to raid villages and capture new slaves.

(By the way, some charity groups still do this)

A similar thing can happen with gun buyback programs. These programs involve buying guns in large quantities from gun owners in order to melt them down, the thought being that this will get the guns off the streets. The effect of this?

Well, the gun producers thank their new customers for the money and start manufacturing more guns to supply their larger customer base. In some cases violent crime rates jumped, and a study measuring whether these programs actually decrease violent crime rates overall found no statistically significant effect.

Now, I’m ashamed to say that these programs actually initially seemed like fine ideas to me. This is really a statement of my failure to have internalized how supply and demand curves work. In my defense, this is not always a totally horrible policy idea. When demand is much more elastic than supply, the price of the good will jump and many of the original buyers will be priced out of the market. In other words, if the producers have a harder time scaling up their operations than the consumers have buying less of the good, then the world will actually end up freer of slaves/guns.

But that is not how these markets actually work. Demand for guns is in fact less elastic than supply of guns, so the gun nuts are barely affected and the ungun-nuts are handing over free money to the gun manufacturers.
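The elasticity argument can be sketched with a linear supply-and-demand model (the numbers here are invented for illustration): a buyback is just an outward shift in demand, and when supply is elastic, the main effect is that producers make more of the good.

```python
def equilibrium(a, b, c, d):
    """Linear market: demand Qd = a - b*P, supply Qs = c + d*P.
    Returns the market-clearing price and quantity."""
    price = (a - c) / (b + d)
    return price, a - b * price

# Elastic supply (d large relative to b): producers scale up easily
p0, q0 = equilibrium(a=100, b=1, c=0, d=10)   # before the buyback
p1, q1 = equilibrium(a=130, b=1, c=0, d=10)   # buyback adds 30 units of demand

# The quantity supplied rises by most of the added demand: manufacturers
# simply make more guns, and most of the buyback money flows to them.
```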

[Figure: Gun Buybacks – supply and demand diagram]

And one more example from Marginal Revolution. Sorry, but we’re on the topic of unintuitive basic econ and it’s just too good to leave out.

In 1990 the United States passed a policy that applied a tax on luxury goods like yachts. The idea, it seems, was, “The federal budget deficit is too high, and if we tax the rich on their fancy luxury goods, we can reduce the deficit without really hurting anybody.” Sounds good, yes?

But what actually happened was that as the price of yachts increased, rich people bought less, and thousands of laborers in the yacht industry lost their jobs. When all was said and done, the government ended up paying more in increased unemployment benefits than they gained in tax revenue from the policy! The government quickly wised up and repealed the tax a few years after it was put in place.

How to understand this? Easy! Draw a graph of supply and demand. Which curve is steeper? Well, rich people can fairly easily just spend their money differently if yacht prices increase. They care far less about forgoing a yacht than the workers who built it care about the wages they survive on.

So the yacht-buyers will more easily leave the market than the yacht-producers, which means the demand for yachts is more elastic than the supply, which means that the producers are hurt more by the tax.
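The same two-line model handles the tax (again with invented numbers): a per-unit tax drives a wedge between the price buyers pay and the price sellers receive, and the less elastic side of the market eats most of the wedge.

```python
def taxed_equilibrium(a, b, c, d, tax=0.0):
    """Demand Qd = a - b*Pc (buyers' price); supply Qs = c + d*Pp, where
    sellers receive Pp = Pc - tax. Returns (buyer price, seller price, Q)."""
    pc = (a - c + d * tax) / (b + d)
    return pc, pc - tax, a - b * pc

# Yacht-like market: demand much more elastic than supply (b >> d)
pc0, pp0, q0 = taxed_equilibrium(a=100, b=8, c=10, d=1)
pc1, pp1, q1 = taxed_equilibrium(a=100, b=8, c=10, d=1, tax=9)

buyer_burden = pc1 - pc0    # buyers' price barely rises
seller_burden = pp0 - pp1   # sellers absorb almost the whole tax
```

With these numbers the sellers bear 8/9 of a 9-unit tax, and the quantity sold drops from 20 to 12 – the luxury-tax story in miniature.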

[Figure: Luxury Tax – supply and demand diagram]

The point is, the model works! It makes weird-sounding and unintuitive predictions, and it turns out to be right. Literally just draw two lines and assess their relative slopes, and you can understand why a tax will sometimes burn consumers and other times burn producers. (You can also do better than the US government in 1990 apparently, but maybe this shouldn’t be surprising)

A simple model of our economy as a bunch of supply and demand curves with varying elasticities has enormous explanatory power. This is a breathtakingly simple model of a breathtakingly complex system. And it tells us something important about the world that it works at all.

Okay, enough fun with econ. All of this was just to say that I feel thoroughly rebutted in my old view that things like interactions of humans are too complex to be understood by anybody. So we have our mystery: how does simplicity arise out of complexity?

Here’s my attempt at an answer: simplicity arises when the universe is playing an optimization game with a simple target.

If every few seconds God scanned the universe, erased the least macroscopically circular shapes, and duplicated the rest, then you would quickly expect the universe to consist of only circles. More to the point, it would quickly become possible to accurately model the universe as a bunch of circles of various sizes at various locations.

The clearest real world example of something like this is natural selection. Natural selection is a process that is optimizing biological systems for a simple target – reproductive fitness. It kills off variation and only lets those few forms that are able to reproduce successfully survive into the next generation.

In this sense, natural selection prunes down the complexity of the world, replacing the incomprehensible with the comprehensible. What was initially a high-entropy system, describable only at the level of fundamental physics, becomes a low-entropy system, describable by a few simple biological principles. Instead of having to describe the organism in full glorious detail at the level of quarks and electrons, we just need to explain how it won the optimization game of natural selection.

Gravity gives us another example of an optimization game our universe plays. Once you get enough mass in one place, gravity will crush it inward towards the center of mass, gradually inching diverse macroscopic shapes towards sphericity.


Which is why every large object you’ll see in the sky looks perfectly spherical. Any large objects that started off clunky and non-spherical were ruthlessly optimized into sphericity. (Actually they are oblate spheroids, but that’s because technically the optimization game they’re playing is gravity + angular momentum)

So why do supply and demand curves do a great job at predicting interactions between massive numbers of humans? The implied answer is that humans are the result of an optimization game that has made our behaviors simply describable in terms of supply and demand curves.

What exactly does this mean? Perhaps a trait that enhances reproductive fitness in organisms like us is the cognitive skill to make tradeoffs between different desires, and this gives rise to some type of universal comparison metric between very different goods. Now we can sensibly say things like “I want ice cream less than I want to enjoy a beautiful sunset. Except orange custard chocolate chip ice cream. I’d trade off the sunset for orange custard chocolate chip ice cream any day.”

Then somebody comes along with a bright idea called ‘money’, and suddenly we have a great generalization about human behavior: “Everybody wants more money.” From this, some basic notions like a downward-sloping demand curve, an upward-sloping supply curve, and a push towards equilibrium follow quite nicely. And we have a crazily simple high-level explanation of the crazily complex phenomenon of human interaction.