Why infinite utility is like the Spanish inquisition

Very often in philosophical discussions involving decision theory or ethics, someone will attempt to show that some action has infinite expected utility and try to draw some conclusions from this proposition. Or at least confuse the hell out of everyone. The most common culprit is (some versions of) Pascal’s wager, but the St. Petersburg paradox is another. What I want to do is convince you to be extremely skeptical of any conclusion that depends on something or other having infinite expected utility. To do this, we’re going to have to learn some of the nitty-gritty math of decision theory – yes, this is partially just an excuse to talk about decision theory. There won’t be much math, but when there is, the details will be important. (I wrote that last sentence before I wrote the rest, and let’s just say I was a bit optimistic.)

What the hell is utility anyway?

First, we have to start at the beginning – what’s utility? Why do we expect it? What? Ok, forget about utility for now; first we’re going to talk about preferences. Mathematically, we need two things to have preferences: a set of things to have preferences over, and something that tells us which of those things are preferred to others. The first is called the choice set, which we’ll use $X$ to represent. The second is a preference relation, which we’ll use $\succ$ or $\succeq$ to represent. Suppose that $(X,\succeq)$ represents Bob’s preferences. If $x_1 \in X$ and $x_2 \in X$, i.e. they are options in the choice set, then $x_1 \succ x_2$ means that Bob prefers $x_1$ to $x_2$. On the other hand, $x_1 \succeq x_2$ means that Bob is either indifferent between $x_1$ and $x_2$ – he wouldn’t care which of these two options you gave him – or he prefers $x_1$ to $x_2$. So $\succ$ is a bit like $>$ and $\succeq$ is a bit like $\geq$, but they don’t mean the same thing. For example, our choice set could be $X=\{0,1,2,3,\dots\}$ where $x\in X$ is the number of times Bob gets hit in the head by a hammer – clearly $5 > 4$, but unless we think Bob has strange preferences, $4 \succ 5$. To simplify notation a bit, we also have the symbols $\sim$, $\prec$, and $\preceq$, which mean pretty much exactly what you think they mean, e.g. $x_1 \preceq x_2$ means that $x_2 \succeq x_1$, i.e. Bob either prefers $x_2$ to $x_1$ or is indifferent between them, and $x_1 \sim x_2$ means that both $x_1 \succeq x_2$ and $x_2 \succeq x_1$, i.e. that Bob is indifferent between $x_1$ and $x_2$.
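
If you like seeing this stuff as code, here’s a minimal Python sketch of the hammer example. The function names are mine, made up for illustration – the point is just that Bob’s $\succ$ and the numerical $>$ run in opposite directions here:

```python
# The hammer example: the choice set is {0, 1, 2, ...} hammer blows,
# and fewer blows are preferred.

def weakly_prefers(x1: int, x2: int) -> bool:
    """Bob's version of >=: x1 is at least as good as x2."""
    return x1 <= x2  # fewer hits to the head are (weakly) better

def strictly_prefers(x1: int, x2: int) -> bool:
    """Bob's version of >: x1 >= x2 holds but x2 >= x1 does not."""
    return weakly_prefers(x1, x2) and not weakly_prefers(x2, x1)

def indifferent(x1: int, x2: int) -> bool:
    """Bob's ~: each option is weakly preferred to the other."""
    return weakly_prefers(x1, x2) and weakly_prefers(x2, x1)

print(5 > 4)                   # True as numbers...
print(strictly_prefers(4, 5))  # ...but 4 is strictly preferred to 5 by Bob
```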

That’s not all; we also need some constraints on the relation $\succeq$ in order for the preference relation to be “rational” – we need $\succeq$ to be complete and transitive. The relation $\succeq$ is complete if it ranks any two options in the choice set, i.e. if $x_1$ and $x_2$ are possible options, then either $x_1 \succeq x_2$, $x_2 \succeq x_1$, or both. This constraint basically says that there’s always an answer to the question “which one do you want, Bob?” Bob doesn’t have to know the answer to this question with 100% certainty, but we often assume that he does. Bob is a pretty smart guy. I bet he even went to college. The transitivity constraint basically says that Bob’s preferences are consistent. Formally, it says that if $x_1 \succeq x_2$ and $x_2 \succeq x_3$ then $x_1 \succeq x_3$. This constraint is there to prevent preference circles – e.g. where Bob would rather eat an apple than a banana, he’d rather eat a banana than an orange, and he’d rather eat an orange than an apple. So not only is Bob a pretty smart guy, but he’s also not insane. So he’s not the Unabomber. That’s a good thing, right?
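
Here’s a sketch of how you could brute-force check these two axioms on a small finite choice set – again, all the names here are my own:

```python
from itertools import product

def is_complete(X, weakly_prefers):
    # Every pair of options must be ranked at least one way.
    return all(weakly_prefers(a, b) or weakly_prefers(b, a)
               for a, b in product(X, repeat=2))

def is_transitive(X, weakly_prefers):
    # a >= b and b >= c must imply a >= c.
    return all(not (weakly_prefers(a, b) and weakly_prefers(b, c))
               or weakly_prefers(a, c)
               for a, b, c in product(X, repeat=3))

# The preference circle from above: apple beats banana beats orange beats apple.
beats = {("apple", "banana"), ("banana", "orange"), ("orange", "apple")}
circle = lambda a, b: a == b or (a, b) in beats

X = ["apple", "banana", "orange"]
print(is_complete(X, circle))    # True
print(is_transitive(X, circle))  # False -- Bob would be insane
```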

You may be wondering at this point whether these preferences can capture some notion of how much Bob prefers bananas to oranges. The answer is that they can’t. This isn’t to say that there isn’t a precise answer to this question; we’ve just abstracted away from those details. We’ll eventually be able to capture some of this notion of “how much more” something is preferred once we talk about uncertainty, but for now the theory has limits.

No, really, I thought you were going to tell me about utility

Now that we have the mathematical structure, we can talk about Bob’s preferences. We can throw a bunch of possible choices at him, and he can rank them for us. “An apple is better than an orange, which is at least as good as a banana, which is better than a grape…” But working directly with preferences is a big pain in the butt – not for Bob, but for us, talking about Bob. The problem is that preferences have this strange, unique structure that isn’t always easy to analyze. The solution is to come up with a way to represent Bob’s preferences using a mathematical structure that is easy to play around with. This is what a utility function is – a convenient representation of preferences, but not the preferences themselves. Mathematically, a utility function is a function (duh) $U:X\to \Re$ that takes an option from the choice set and assigns a number to it. But not just any old number – the numbers have to agree with the underlying preference relation, i.e. if $x_1\succeq x_2$ then $U(x_1) \geq U(x_2)$. So instead of having to look at the preference relation, we can just look at this list of numbers associated with each option in the choice set. If the utility function is “nice”, e.g. continuous and differentiable, then we can do all sorts of fun mathy things with it.
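
To see the representation condition in code, here’s the hammer example again – a toy sketch, with my own made-up names. Any function that decreases in the number of hits agrees with the ordering:

```python
# A utility representation of the hammer preferences: U(x) = -x works,
# since all a utility function has to do is agree with the ordering.

weakly_prefers = lambda x1, x2: x1 <= x2  # fewer hammer blows are better

def U(x: int) -> float:
    return -x  # any strictly decreasing function of x would do

# Spot-check the representation condition U(x1) >= U(x2)  <=>  x1 is weakly preferred:
assert all((U(a) >= U(b)) == weakly_prefers(a, b)
           for a in range(10) for b in range(10))
```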

Not every preference relation can be represented by a utility function though. I won’t go into detail, but if the preference relation is continuous in some sense that I’m not willing to define right now, a utility function exists. The classic example of a preference relation which doesn’t admit a utility function representation is lexicographic preferences. Suppose the choice set consists of the quantity of apples that Bob gets to consume and the quantity of bananas he gets to consume, so a typical element would be something like $(x,y)$ where $x$ is the quantity of apples and $y$ is the quantity of bananas. We’ll assume that $(x_1, y_1)\succeq (x_2,y_2)$ any time $x_1 > x_2$ and also when $x_1 = x_2$ and $y_1 \geq y_2$. So basically, if you offer Bob the choice between two fruit baskets filled with apples and bananas, the first thing he does is check the amount of apples in each basket. If one basket has more apples than the other, he takes that one; if they’re tied, he takes the basket with more bananas. Bananas are nothing more than tiebreakers. (A technicality: for the impossibility to actually bite, the quantities have to vary continuously – think pounds of apples rather than whole apples – since over a countable choice set a sufficiently clever utility function can still be cooked up.) We can’t represent this preference relation by assigning a single number to each option without somehow losing information. Two numbers would work, but one is not enough.
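
If you want to play with this, Python’s tuple comparison happens to be lexicographic, so the relation itself is easy to write down – what’s impossible is squeezing it into one real number per basket, not two:

```python
# Lexicographic preferences over (apples, bananas) baskets: apples first,
# bananas only break ties. Tuple comparison is exactly this order.

def lex_weakly_prefers(basket1, basket2):
    return basket1 >= basket2  # compare apples first, bananas break ties

print(lex_weakly_prefers((3, 0), (2, 100)))  # True: apples trump any bananas
print(lex_weakly_prefers((2, 5), (2, 4)))    # True: apples tie, bananas decide
```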

A second thing to worry about is that when a utility function exists for some preference relation, it isn’t unique. Suppose the choice set is just two things – a banana and an apple, and Bob prefers the apple to the banana. We could assign $U(apple) = 1$ and $U(banana) = 0$, or $U(apple) = 1,000,000,000,000$ and $U(banana) = -\pi/2$, or anything else we wanted, as long as $U(apple) > U(banana)$. The numbers are meaningless apart from their order. In general, if we have some utility function $U$ and another strictly increasing function $f$, then $f(U(x))$ is also a valid utility function, e.g. $\log(U(x))$ (when $U$ is positive) or $e^{U(x)}$ or $aU(x) + b$ when $a>0$ and $b$ is any number. This reinforces the idea that a utility function is a representation of preferences and not the preferences themselves – in any given situation we may pick a different utility function merely because it’s convenient to work with.
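
A quick sketch of this invariance – apply any strictly increasing transformation you like and the ranking of options never changes (the toy utility function and options below are made up):

```python
import math

# Any strictly increasing transformation of a utility function represents
# the same preferences, because only the ordering of the numbers matters.

U = lambda x: x  # a toy utility function over amounts of money, say
transforms = [lambda u: 3 * u + 7,    # positive affine: aU + b, a > 0
              lambda u: math.exp(u),  # e^U
              lambda u: u ** 3]       # strictly increasing on the positives

options = [0.5, 1.0, 2.0, 5.0]
for f in transforms:
    assert sorted(options, key=lambda x: f(U(x))) == sorted(options, key=U)
```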

I still don’t know how you expect utility

OK, to talk about expected utility, we have to take another detour. Up until now, we’ve assumed that Bob can directly choose outcomes. This makes sense when he’s picking between fruit baskets, but maybe not so much when Bob has to decide whether to call or fold his hand in a game of Texas hold ’em. Yeah, Bob is a pretty cool guy. He plays poker all the time. We can easily modify our existing framework to deal with Bob’s gambling habit. Now we have some outcome space $O$ that denotes all of the possible things that can happen – i.e. does he win the hand or lose it? How much does he win or lose? But Bob can’t choose items directly from $O$ – he can’t choose how much he wins. Instead his options are whether to bet (and how much), call, or fold. Each option induces a probability distribution over $O$, over the amount of money he wins (or loses… and let’s be honest, Bob loses a lot), and he has to pick from among these options. So the solution is to treat Bob’s choice set as the set of all probability distributions over $O$, i.e. $X = \mathcal{L}(O)$. The script L stands for “lottery” and just indicates that we’re looking at the probabilities of the different elements (or subsets) of $O$ occurring. Now $x_1 \succ x_2$ means that the probability distribution over $O$ that $x_1$ represents is preferred to the distribution $x_2$ represents. E.g. if we’re talking about a coin flip where Bob wins $\$1$ if it lands heads and loses $\$1$ if it lands tails, $X=\mathcal{L}(O)$ represents the set of all possible probabilities of the coin landing on heads. So if $x_1 = .5$ and $x_2 = .4$, $x_1 \succ x_2$ since, we presume, Bob likes more money.
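
In code, a lottery is just a probability distribution over $O$ – for the coin flip, a single number pins it down. A toy sketch, names my own:

```python
# Lotteries over O = {"heads", "tails"}: each choice is a probability
# distribution, here determined by a single number P(heads).

def lottery(p_heads):
    return {"heads": p_heads, "tails": 1.0 - p_heads}

x1, x2 = lottery(0.5), lottery(0.4)
# Bob wins $1 on heads and loses $1 on tails, so more mass on heads is better:
print(x1["heads"] > x2["heads"])  # True, i.e. x1 is strictly preferred to x2
```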

So if we suppose Bob has a preference relation $\succeq$ on the choice set $\mathcal{L}(O)$, we’re done, right? Choice under uncertainty, got it. Well, not so fast. We’ve sneakily assumed that Bob only cares about the final probabilities of the outcomes in $O$ occurring and not how they’re constructed. For example, suppose Bob is choosing whether to bet $\$1$ on the outcome of a coin flip, but he doesn’t know the probability that the coin comes up heads. We can represent this ignorance with a probability distribution over the probability that the coin lands heads – e.g. Bob may think that there’s a 50% chance that $P(HEADS) \geq .5$. What we’ve basically assumed is that we can average the coin’s probability over the distribution of coin-flip probabilities and just look at the resulting marginal distribution over $O$. In other words, we can reduce $\mathcal{L}(\mathcal{L}(\dots\mathcal{L}(\mathcal{L}(O))\dots))$ to just $\mathcal{L}(O)$. This makes sense from the perspective of normative decision theory, but may not accurately describe human behavior, so how strong this assumption is depends on what you’re using decision theory for.
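
Here’s the reduction assumption as arithmetic – a toy mixture of two possible coins collapsing to one marginal probability of heads (the numbers are illustrative):

```python
# Reducing a compound lottery to a simple one: Bob thinks the coin is
# either 0.3-heads or 0.7-heads, each with probability 0.5. The reduction
# assumption says only the resulting marginal P(heads) matters.

beliefs = [(0.5, 0.3), (0.5, 0.7)]  # (weight on this coin, its P(heads))
p_heads = sum(w * p for w, p in beliefs)
print(p_heads)  # 0.5 -- Bob treats this exactly like a single fair coin
```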

OK, OK, so we can define preferences over lotteries/gambles/probability distributions over the outcome set as long as we acknowledge some technicalities, and if the preferences are nice we get a utility function defined on these lotteries. Great, but we still aren’t expecting anything. To get to expected utility, we need to acknowledge a huge computational problem we just introduced. Previously, our utility function took a single element from $X$, which could easily be just one number (i.e. consumption) or a couple of numbers (apples and oranges). Now that $X=\mathcal{L}(O)$, the utility function needs a bunch of numbers. Suppose $O$ contains $n$ elements; then our utility function depends on $n-1$ numbers, because it takes into account a probability for each possible outcome (and the probabilities sum to one, so one of them is redundant). If $O$ is an infinite set, e.g. the possible amounts of money you can win in a bet, or the number of possible flipped coins landing heads in an infinite number of flips, then the utility function depends on an infinite number of numbers – suddenly it’s really hard to work with again.

A solution to this problem is to define another type of utility function, called a von Neumann-Morgenstern or VNM utility function. Under some much stronger conditions on the preference relation $\succeq$ over $X=\mathcal{L}(O)$, we can create a VNM utility function $u:O\to \Re$ where $U(x)=E_x[u(o)]$ for any probability distribution $x\in X=\mathcal{L}(O)$. OK, let’s unpack this. The VNM utility function $u$ does the same basic thing as utility functions always do, but this time to the outcome space instead of the choice space – it assigns numbers to outcomes. What do these numbers mean? Well, if $x_1$ and $x_2$ are two probability distributions over the outcome space, then when $x_1 \succeq x_2$, $E_{x_1}[u(o)] \geq E_{x_2}[u(o)]$. Ok, what the hell does $E_{x_1}[u(o)]$ mean? It means that you take the VNM utility function $u(o)$ and average it using the probability distribution $x_1$. So, for example, if there are two possibilities, heads and tails, and $u(heads) = 1$ while $u(tails) = 0$, then if $x_1 = p = P(heads)$, $E_{x_1}[u(o)] = p\times 1 + (1-p)\times 0 = p$. We call this the expected utility given the probability distribution $x_1$. This object is often (if you do it right) pretty easy to work with.
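
As a sketch, here’s that exact coin-flip calculation – a lottery as a dictionary of probabilities, and expected utility as the probability-weighted average of $u$:

```python
# Expected utility: average the VNM utility u over a lottery's
# probabilities. Lotteries are dicts mapping outcomes to probabilities.

def expected_utility(lottery, u):
    return sum(p * u(o) for o, p in lottery.items())

u = lambda o: 1.0 if o == "heads" else 0.0  # u(heads) = 1, u(tails) = 0
x1 = {"heads": 0.5, "tails": 0.5}
print(expected_utility(x1, u))  # 0.5 -- i.e. E_{x1}[u(o)] = p
```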

Like the original utility function, the VNM utility function isn’t unique either, in the sense that you can always find another function that represents the preferences equally well. If $u(o)$ is a VNM utility function representing the preference relation $\succeq$, then so is $a\times u(o) + b$ for $a > 0$ and $b$ any real number. This also helps motivate why VNM utility helps us capture some of the “how much more” intuition about preferences. If $u(o_1)$ is much larger than $u(o_2)$ but only slightly larger than $u(o_3)$, this relative difference of differences is preserved when we pick a different VNM utility function, so we’re able to coherently talk about Bob liking $o_1$ way more than $o_2$ but only a little bit more than $o_3$ without going outside of the math.
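
A quick sketch of both claims – the ranking of lotteries and the ratios of utility differences both survive a positive affine transformation (all the numbers below are made up):

```python
# Positive affine transformations a*u + b preserve the expected-utility
# ranking of lotteries, and also ratios of utility differences.

def expected_utility(lottery, u):
    return sum(p * u(o) for o, p in lottery.items())

u = lambda o: {"o1": 10.0, "o2": 0.0, "o3": 9.0}[o]  # o1 >> o2, o1 barely > o3
v = lambda o: 2.5 * u(o) - 3.0                       # a > 0, b arbitrary

x1 = {"o1": 0.5, "o3": 0.5}
x2 = {"o2": 0.5, "o3": 0.5}
assert (expected_utility(x1, u) > expected_utility(x2, u)) == \
       (expected_utility(x1, v) > expected_utility(x2, v))

# The "difference of differences" survives too:
ratio_u = (u("o1") - u("o2")) / (u("o1") - u("o3"))
ratio_v = (v("o1") - v("o2")) / (v("o1") - v("o3"))
assert abs(ratio_u - ratio_v) < 1e-12
```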

So, wait, infinite utility is like the Spanish inquisition?

Yep. Nobody expects infinite utility. Infinite utility is simply not amongst our weaponry – ahem – it’s simply not a possibility in the math we’ve gone through so far. Bob’s utility function over lotteries on the space of possible outcomes is $U:\mathcal{L}(O)\to \Re$, and his VNM utility function is $u:O\to \Re$ such that if $x\in \mathcal{L}(O)$ then $U(x) = E_x[u(o)]$. There’s no room for infinity here – Bob’s utility function assigns a number to each lottery on $O$. Infinity is not a number, QED, right? Well, no, there’s more to this story worth mentioning. The requirement that $U$ only give real numbers to lotteries ends up imposing a pretty strong restriction on the VNM utility function $u$ – it has to be bounded. In other words, there has to be some number $M$ such that $|u(o)|\leq M$ for all $o\in O$. You can relax this restriction if you only allow some of the lotteries in $\mathcal{L}(O)$, but fundamentally, whichever restriction you choose is there to prevent an infinite expected utility.
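
To see why boundedness is exactly the restriction doing the work, here’s a St. Petersburg-style lottery (the one from the intro): a payoff of $2^n$ with probability $2^{-n}$. With unbounded $u$ the partial sums of the expected utility grow forever; cap $u$ at some $M$ (the 100 below is arbitrary) and they settle down:

```python
# St. Petersburg-style lottery: pay off 2**n with probability 2**-n.
# Unbounded u(w) = w diverges; a bounded u converges.

def partial_expected_utility(u, terms):
    return sum(2.0 ** -n * u(2.0 ** n) for n in range(1, terms + 1))

print(partial_expected_utility(lambda w: w, 50))            # 50.0, one more per term
print(partial_expected_utility(lambda w: min(w, 100), 50))  # ~7.56, settled down
```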

On the other hand, there’s no reason we can’t go back to the beginning and allow our utility function to assign each element of the choice set a real number or positive/negative infinity, i.e. $U:X\to \Re\cup\{-\infty,\infty\}$. Now we can allow infinite expected utility if we like. But note what we’ve done – in order for our utility function to faithfully represent Bob’s preferences, $U(x)=E_x[u(o)]=\infty$ means that $x$ is literally (literally literally) the best probability distribution over $O$ according to Bob’s preferences, and if there are two such $x$’s, they are tied for the best – the representation can no longer distinguish between them, and we can’t rule out such ties. The same thing holds for $-\infty$, except this time $x$ is (tied for) the worst. You can do this if you want, but you have to be careful with the infinities and not overinterpret them. Remember, they’re supposed to be representing preferences. If at any time you see infinite expected utility, what’s more likely? That you took all the proper precautions so that this infinity is faithfully representing the preferences of our good friend Bob? Or that you just waved your hands and turned Bob into a theist without his consent?

