The more I think about timeless decision theory, the more it seems obviously correct to me.
The key idea is that sometimes there is a certain type of non-causal logical dependency (called a subjunctive dependence) between agents that must be taken into account by those agents in order to make rational decisions. The class of cases in which subjunctive dependences become relevant involves agents in environments that contain other agents trying to predict their actions, and also environments that contain other agents that are similar to them.
Here’s my favorite motivating thought experiment for TDT: Imagine that you encounter a perfect clone of yourself. You have lived identical lives, and are structurally identical in every way. Now you are placed in a prisoner’s dilemma together. Should you cooperate or defect?
A non-TDTist might see no good reason to cooperate – after all, defecting dominates cooperation as a strategy, and your decision doesn’t affect your clone’s decision. If the two of you share no common cause explanation for your similarity, then this conclusion is even stronger – both evidential and causal decision theory would defect. So both you and your clone defect and you both walk away unhappy.
TDT is just the admission that there is an additional non-causal dependence between your decision and your clone’s decision that must be taken into account. This dependence comes from the fact that you and your clone have a shared input-output structure. That is, no matter what you end up doing, you know that your clone must do the same thing, because your clone is operating identically to you.
In a deterministic world, it is logically impossible that you choose to do X and your clone does Y. The initial conditions are the same, so the final conditions must be the same. So you end up cooperating, as does your clone, and everybody walks away happy.
With an imperfect clone, it is no longer logically impossible, but there still exists a subjunctive dependence between your actions and your clone’s.
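Here is a minimal sketch of the clone dilemma in Python. The payoff numbers and the single `match_prob` parameter (the chance that your clone ends up making the same move you do) are my own illustrative assumptions, and treating the dependence as one probability is a simplification rather than a full formalization of TDT. With `match_prob = 1.0` you get the perfect-clone case; lower values stand in for an imperfect clone.

```python
# Payoffs to "you", indexed by (your move, clone's move).
# Illustrative prisoner's dilemma values, not taken from any source.
PAYOFF = {
    ("C", "C"): 3,
    ("C", "D"): 0,
    ("D", "C"): 5,
    ("D", "D"): 1,
}


def defect_dominates() -> bool:
    """Dominance reasoning: hold the clone's move fixed and compare."""
    return all(PAYOFF[("D", other)] > PAYOFF[("C", other)] for other in ("C", "D"))


def expected_payoff(my_move: str, match_prob: float) -> float:
    """Expected payoff if the clone mirrors my move with probability match_prob."""
    other_move = "D" if my_move == "C" else "C"
    return (match_prob * PAYOFF[(my_move, my_move)]
            + (1 - match_prob) * PAYOFF[(my_move, other_move)])


def best_move(match_prob: float) -> str:
    """Move with the higher expected payoff once the dependence is counted."""
    return max(("C", "D"), key=lambda move: expected_payoff(move, match_prob))


print(defect_dominates())   # dominance reasoning favors defecting
print(best_move(1.0))       # perfect clone
print(best_move(0.9))       # imperfect but strongly dependent clone
print(best_move(0.5))       # no dependence beyond chance
```

With these particular payoffs, cooperating wins whenever `match_prob` is above 5/7, so the dependence does not need to be perfect to change the answer.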
This is a natural and necessary modification to decision theory. We take into account not only the causal effects of our actions, but the evidential effects of our actions. Even if your action does not causally affect a given outcome, it might still make it more or less likely, and subjunctive dependence is one of the ways that this can happen.
TDTists interacting with each other would get along really nicely. They wouldn’t fall victim to coordination problems, because they wouldn’t see their decisions as isolated and disconnected from the decisions of the others. They wouldn’t undercut each other in bargaining problems in which one side gets to make the deals and the other can only accept or reject.
In general, they would behave in a lot of ways that are standardly depicted as irrational (like one-boxing in Newcomb’s problem and cooperating in the prisoner’s dilemma), and end up much better off as a result. Such a society seems potentially much nicer and subject to fewer of the common failure modes of standard decision theory.
In particular, in a society in which it is common knowledge that everybody is a perfect TDTist, there can be strong subjunctive dependencies between the actions of causally disconnected agents. If a TDTist is considering whether or not to vote for their preferred candidate, they aren’t comparing outcomes that differ by a single vote. They are considering outcomes that differ by the size of the entire class of individuals that would be reasoning similarly to them.
In simple enough cases, this could mean that your decision about whether to vote is really a decision about whether millions of people will vote. This may sound weird, but it follows from the exact same type of reasoning as in the clone prisoner’s dilemma.
Imagine that the society consisted entirely of 10 million exact clones of you, each deciding whether or not to vote. In such a world, each individual’s choice is perfectly subjunctively dependent upon every other individual’s choice. If one of them decides not to vote, then all of them decide not to vote.
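To make the voting arithmetic concrete, here is a rough sketch. Everything in it is a made-up toy model: I assume the margin among all other voters is uniformly distributed over a range, so that moving `votes_moved` votes flips the result with probability roughly `votes_moved / (2 * margin_range)`, and the `BENEFIT` and `COST` numbers are arbitrary.

```python
def flip_probability(votes_moved: int, margin_range: int) -> float:
    """Chance that moving this many votes changes the winner, assuming the
    margin among everyone else is uniform over [-margin_range, margin_range]."""
    return min(1.0, votes_moved / (2 * margin_range))


def net_value_of_voting(votes_moved: int, margin_range: int,
                        benefit: float, cost: float) -> float:
    """Expected gain from deciding to vote, minus your own cost of voting."""
    return flip_probability(votes_moved, margin_range) * benefit - cost


# Hypothetical numbers, chosen only for illustration.
MARGIN_RANGE = 5_000_000   # uncertainty about how everyone else splits
BENEFIT = 10_000.0         # value to you of your candidate winning
COST = 1.0                 # hassle of going to the polls

# Treating your decision as moving one vote.
lone_voter = net_value_of_voting(1, MARGIN_RANGE, BENEFIT, COST)

# Treating your decision as moving everyone who reasons like you.
similar_reasoners = net_value_of_voting(1_000_000, MARGIN_RANGE, BENEFIT, COST)

print(lone_voter < 0)          # voting looks like a waste of time
print(similar_reasoners > 0)   # voting looks clearly worthwhile
```

Nothing about the election changed between the two calls. The only difference is how many votes you take your decision to be a decision about.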
In a more general case, perfect clones of you don’t exist in your environment. But in any given context, there is still a large class of individuals that reason similarly to you as a result of a similar input-output process.
For example, all humans are very similar in certain ways. If I notice that my blood is red, and I had previously never heard about or seen the blood color of anybody else, then I should now strongly update on the redness of the blood of other humans. This is obviously not because my blood being red causes others to have red blood. It is also not because of a common cause – in principle, any such cause could be screened off, and we would expect the same dependence to exist solely in virtue of the similarity of structure. We would expect red blood in an alien whose evolutionary history has been entirely causally separated from ours but who by a wild coincidence has the same DNA structure as humans.
Our similarities in structure can be less salient to us when we think about our minds and the way we make decisions, but they still are there. If you notice that you have a strong inclination to decide to take action X, then this actually does serve as evidence that a large class of other people will take action X. The size of this class and the strength of this evidence depend on which particular X is being analyzed.
Ethics and TDT
It is natural to wonder: what sort of ethical systems naturally arise out of TDT?
We can turn a decision theory into an ethical framework by choosing a utility function that encodes the values associated with that ethical framework. The utility function for hedonic utilitarianism assigns utility according to the total well-being in the universe. The utility function for egoism assigns utility only to your own happiness, apathetic to the well-being of others.
Virtue ethics and deontological ethics are harder to encode. We could do the first by assigning utility to virtuous character traits and disutility to vices. The second could potentially be achieved by assigning negative infinities to violations of the moral rules.
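As a rough illustration of what "encoding an ethical framework as a utility function" could look like, here is a toy sketch. The `World` class, its fields, and the choice to score a rule-abiding world as 0 for the deontologist are all my own assumptions, made only so there is something concrete to look at.

```python
import math
from dataclasses import dataclass


@dataclass
class World:
    """Hypothetical toy description of an outcome."""
    wellbeing: dict            # person -> well-being score
    rule_violations: int = 0   # number of moral rules broken


def hedonic_utilitarian(world: World) -> float:
    """Total well-being in the universe."""
    return sum(world.wellbeing.values())


def egoist(world: World, me: str) -> float:
    """Only your own well-being counts."""
    return world.wellbeing[me]


def deontologist(world: World) -> float:
    """Negative infinity for any violation of the moral rules.
    Scoring a violation-free world as 0.0 is an arbitrary assumption."""
    return -math.inf if world.rule_violations > 0 else 0.0


world = World(wellbeing={"me": 5.0, "you": -2.0}, rule_violations=1)
print(hedonic_utilitarian(world))
print(egoist(world, "me"))
print(deontologist(world))
```

The decision theory stays the same in every case. Only the function being maximized changes.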
Let’s brush aside the fact that some of these assignments are less feasible than others. Pretend that your favorite ethical system has a nice neat way of being formalized as a utility function. Now, the distinctive feature of TDT-based ethics is that when we are trying to decide on the most ethical course of action, TDT says that we must imagine that our decision would also be taken by anybody else that is sufficiently similar to us in a sufficiently similar context.
In other words, in contemplating what the right action to take is, you imagine a world in which these actions are universalized! This sounds very Kantian. One of his more famous descriptions of the categorical imperative was:
Act only according to that maxim whereby you can at the same time will that it should become a universal law.
This could be a tagline for ethics in a world of TDTists! The maxim for your action resembles the notion of similarity in motivation and situational context that generates subjunctive dependence, and the categorical imperative is the demand that you must take into account this subjunctive dependence if you are to reason consistently.
But actually, I think that the resemblance between Kantian ethical reasoning and timeless decision theory begins to fade away when you look closer. I’ll list three main points of difference:
- Consistency vs expected utility
- Maxim vs subjunctive dependence
- Differences in application
1. Consistency vs expected utility
Kantian universalizability is not about expected utility, it is about consistency. The categorical imperative forbids acts that, when universalized, become self-undermining. If an act is consistently universalizable, then it is not a violation of the categorical imperative, even if it ends up with everybody in horrible misery.
Timeless decision theory looks at a world in which everybody acts according to the same maxim that you are acting under, and then asks whether this world looks nice or not. “Looks nice” refers to your utility function, not any notion of consistency or non-self-underminingness (not a word, I know).
So this is the first major difference: A timeless ethical theorist cares ultimately about optimizing their moral values, not about making sure that their values are consistently applicable in the limit of universal instantiation. This puts TDT-based ethics closer to a rule-based consequentialism than to Kantian ethics, although this comparison is also flawed.
Is this a bug or a feature of TDT?
I’m tempted to say it’s a feature. My favorite example of why Kantian consistency is not a desirable meta-ethical principle is that if everybody were to give to charity, then all the problems that could be solved by giving to charity would be solved, and the opportunity to give to charity would disappear. So the act of giving to charity becomes self-undermining upon universalization.
To which I think the right response is: “So what?”
If a world in which everybody gives to charity is a world in which there are no more problems to be solved by charity-giving, then that sounds pretty great to me. If this consistency requirement prevents you from solving the problems you set out to solve, then it seems like a pretty useless requirement for ethical reasoning.
If your values can be encoded into an expected utility function, then the goal of your ethics should be to maximize that function. The antecedent of this conditional could be reasonably disputed, but I think the conditional as a whole is fairly unobjectionable.
2. Maxim vs subjunctive dependence
One of the most common retorts to Kant’s formulation of the categorical imperative rests on the ambiguity of the term ‘maxim’.
For Kant, your maxim is supposed to be the motivating principle behind your action. It can be thought of as a general rule that determines the contexts in which you would take this action.
If your action is to donate money, then your maxim might be to give 10% of your income to charity every year. If your action is to lie to your boss about why you missed work, then your maxim might be to be dishonest whenever doing otherwise will damage your career prospects.
Now, the maxim is the thing that is universalized, not the action itself. So you don’t suppose that everybody suddenly starts lying to their boss. Instead, you imagine that anybody in a situation where being honest would hurt their career prospects begins lying.
In this situation, Kant would argue that if nobody was honest in these situations, then their bosses would just assume dishonesty, in which case, the employees would never even get the chance to lie in the first place. This is self-undermining; hence, forbidden!
I actually like this line of reasoning a lot. Scott Alexander describes it as similar to the following rule:
Don’t do things that undermine the possibility to offer positive-sum bargains.
Coordination problems arise because individuals decide to defect from optimal equilibriums. If these defectors were reasoning from the Kantian principle of universalizability, they would realize that if everybody behaved similarly then the ability to defect might be undermined.
But the problem largely lies in how one specifies the maxim. For example, compare the following two maxims:
Maxim 1: Lie to your boss whenever being honest would hurt your career opportunities.
Maxim 2: Lie to your boss about why you missed work whenever the real reason is that you went on an all-night bar-hopping marathon with your friends Jackie and Khloe and then stayed up all night watching Breaking Bad highlight clips on your Apple TV.
If Maxim 2 is the true motivating principle of your action, then it seems a lot less obvious that the action is a violation of the categorical imperative. If only people in this precisely specified context lied to their bosses, then bosses would overall probably not become less trusting of their employees (unless your boss knows an unusual amount about your personal life). So the maxim is not self-undermining under universalization, and is therefore not forbidden.
Under Maxim 1, lying is forbidden, and under Maxim 2, it is not. But what is the true maxim? There’s no clear answer to this question. Any given action can be truthfully described as arising from numerous different motivational schema, and in general these choices will result in a variety of different moral guidelines.
In TDT, the analog to the concept of a maxim is subjunctive dependence, and this can be defined fairly precisely, without ambiguity. Subjunctive dependence between agents in a given context is just the degree of evidence you get about the actions of an agent given information about the actions of the other agents in that context.
More precisely, it is the degree of non-causal dependence between the actions of agents. It essentially arises from the fact that in a lawful physical universe, similar initial conditions will result in similar final conditions. This can be worded as similarity in initial conditions, in input-output structure, in computational structure, or in logical structure, but the basic idea is the same.
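One way to put a number on "degree of evidence" is to ask how much learning my own action shifts the probability of the other agent's action. The sketch below does this for a joint distribution over (my action, other's action). The function name and the example distributions are hypothetical, and note the limitation: this measures statistical dependence in general, so it only tracks subjunctive dependence if causal links and common causes have already been ruled out.

```python
def evidence_shift(joint: dict, action: str) -> float:
    """P(other takes `action` | I take `action`) minus P(other takes `action`).

    `joint` maps (my_action, other_action) pairs to probabilities.
    """
    p_me = sum(p for (mine, _), p in joint.items() if mine == action)
    p_other = sum(p for (_, theirs), p in joint.items() if theirs == action)
    p_both = joint.get((action, action), 0.0)
    return p_both / p_me - p_other


# Perfect clone: our actions always match.
perfect_clone = {("X", "X"): 0.5, ("Y", "Y"): 0.5}

# Unrelated agent: my action tells me nothing about theirs.
stranger = {("X", "X"): 0.25, ("X", "Y"): 0.25,
            ("Y", "X"): 0.25, ("Y", "Y"): 0.25}

print(evidence_shift(perfect_clone, "X"))
print(evidence_shift(stranger, "X"))
```

For the perfect clone, learning that I take X moves the probability that the clone takes X from 0.5 to 1. For the stranger it moves nothing.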
Not only is this precisely defined, it is a real dependence. You don’t have to imagine a fictional universe in which your action makes it more likely that others will act similarly; the claim of TDT is that this is actually the case!
In this sense, TDT is rooted in a simple acknowledgement of dependencies that really do exist and that can be precisely defined, while Kant’s categorical imperative relies on the ambiguous notion of a maxim, as well as a seemingly arbitrary hypothetical consideration. One might be tempted to ask: “Who cares what would happen if hypothetically everybody else acted according to a similar maxim? What we should care about is what will actually happen in the real world; we shouldn’t be basing our decisions off of absurd hypothetical worlds!”
3. Differences in application
These two theoretical reasons are fairly convincing to me that Kantianism and TDT ethics are only superficially similar, and are theoretically quite different. But there still remains a question of how much the actual “outputs” of the two frameworks converge. Do they just end up giving similar ethical advice?
I don’t think so. First, consider the issue I touched on previously. I said that defectors that paid attention to the categorical imperative would rethink their decision, because it is not universalizable. But this is not in general true.
If defectors are always free to defect, regardless of how many others defect as well, then defecting will still be universalizable! It is only in special cases that Kantians will not defect, like when a mob boss will come in and necessitate cooperation if enough people defect, or where universal defection depletes an expendable resource that would otherwise be renewable.
Coordination problems in which defecting automatically becomes impossible at a certain point are the easiest kind of coordination problem. It’s much harder to get individuals to coordinate if there is no mob boss to step in and set everybody right. These are the cases where Kantianism fails, and TDT succeeds.
TDTists with shared goals for whom cooperation would be more effective for achieving these goals would always cooperate, even if “each would individually be better off” if they defected. (I put scare quotes because you only come to this conclusion by ignoring important dependencies in the problem).
The key difference here comes down again to #1: timeless decision theorists maximize expected utility, not Kantian consistency.
In addition, TDTists don’t necessarily have Kantian hangups about using people as means to an end: if doing so ends up producing a higher expected utility than not, then they’ll go for it without hesitation.
A TDTist that can save two people’s lives by causing a little harm to one person would probably do it if their utility function was relatively impartial and placed a positive value on life. A Kantian would forbid this act.
(Why? Well, Kant thought that this principle of treating people as ends in themselves rather than means to an end was equivalent to the universalizability principle, and as far as I know, pretty much nobody was convinced by his argument for why this was the case. As such, a lot of Kantian ethics looks like it doesn’t actually follow from the universalizability principle.)
An application that might be similar for Kantian ethics and TDT ethics is the treatment of dishonesty and deception. Kant famously forbade any lying of any kind, regardless of the consequences, on the basis that universal lying would undermine the trust that is necessary to make lying a possibility.
One can imagine a similar case made for honesty in TDT ethics. In a society of TDTists, a decision to lie is a decision to produce a society that is overall less honest and less trusting. In situations where the individual benefits of dishonesty are zero-sum, only the negative effects of dishonesty are amplified. This could plausibly make dishonesty on the whole a net negative policy.