If you want a twice-differentiable function f(x) from the positive reals to the reals that has the property that for all positive x and y, f(xy) = f(x) + f(y), then this function must take the form f(x) = k log(x) for some real constant k. (The domain has to be the positive reals: if f(0) were defined, setting y = 0 would give f(0) = f(x) + f(0), forcing f to be identically zero.)
A proof of this just popped into my head in the shower. (As always with shower-proofs, it was slightly wrong, but I worked it out and got it right after coming out).
I haven’t seen it anywhere before, and it’s a lot simpler than previous proofs that I’ve encountered.
f(xy) = f(x) + f(y)
differentiate w.r.t. x…
f'(xy) y = f'(x)
differentiate w.r.t. y…
f''(xy) xy + f'(xy) = 0
rearrange, and rename xy to z…
f''(z) = -f'(z)/z
solve for f'(z) with standard 1st order DE techniques…
df'/f' = -dz/z
log(f') = -log(z) + constant
f' = constant/z
integrate to get f…
f(z) = k log(z) + C for some constants k and C
Finally, setting x = y = 1 in the original equation gives f(1) = 2 f(1), so f(1) = 0, which forces C = 0…
f(z) = k log(z) for some constant k
And that’s the whole proof!
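If you want to double-check the calculus by machine, here's a quick numerical spot-check in Python (my own sketch, not part of the proof; the constant k = 2.5, the sample grid, and the step size h are arbitrary choices). It verifies that f(z) = k log(z) satisfies both the functional equation and the intermediate ODE f''(z) = -f'(z)/z:

```python
import math

k = 2.5  # an arbitrary constant for the check

def f(x):
    return k * math.log(x)

# f(xy) = f(x) + f(y) on a small grid of positive inputs
for x in (0.3, 1.0, 4.2):
    for y in (0.5, 2.0, 7.7):
        assert math.isclose(f(x * y), f(x) + f(y))

# f''(z) = -f'(z)/z, checked with central finite differences at z = 3
z, h = 3.0, 1e-4
f1 = (f(z + h) - f(z - h)) / (2 * h)           # ~ f'(z)
f2 = (f(z + h) - 2 * f(z) + f(z - h)) / h**2   # ~ f''(z)
assert math.isclose(f2, -f1 / z, rel_tol=1e-4)
print("checks pass")
```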
As for why this is interesting to me… the equation f(xy) = f(x) + f(y) is very easy to arrive at when constructing functions with desirable features. In words, it means that you want the function's outputs to be additive when its inputs are multiplied.
One example of this, which I’ve written about before, is formally quantifying our intuitive notion of surprise. We formalize surprise by asking the question: How surprised should you be if you observe an event that you thought had a probability P? In other words, we treat surprise as a function that takes in a probability and returns a scalar value.
We can lay down a few intuitive desiderata for our formalization of surprise, and one such desideratum is that for independent events E and F, our surprise at them both happening should just be the sum of the surprise at each one individually. In other words, we want surprise to be additive for independent events.
But if E and F are independent, then the joint probability P(E, F) is just the product of the individual probabilities: P(E, F) = P(E) P(F). In other words, we want our outputs to be additive, when our inputs are multiplicative!
This automatically gives us that our surprise function must have the form k log(x). To spell it out explicitly…
Desideratum: Surprise(P(E, F)) = Surprise(P(E)) + Surprise(P(F))
But P(E,F) = P(E) P(F), so…
Surprise(P(E) P(F)) = Surprise(P(E)) + Surprise(P(F))
Renaming P(E) to x and P(F) to y…
Surprise(xy) = Surprise(x) + Surprise(y)
Thus, by the above proof…
Surprise(x) = k log(x) for some constant k
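To make the additivity concrete: any negative k gives a sensible surprise function (rarer events score higher), and the particular choice k = -1/log(2), i.e. Surprise(p) = -log2(p), measures surprise in bits. That choice of units is my own illustration, not something forced by the argument:

```python
import math

def surprise(p):
    # Surprise in bits: k*log(p) with k = -1/log(2), i.e. -log2(p).
    # k is negative so that less probable events are *more* surprising.
    return -math.log2(p)

# Independent events E and F with P(E) = 1/2 and P(F) = 1/4,
# so P(E, F) = P(E) * P(F) = 1/8.
p_e, p_f = 0.5, 0.25
assert math.isclose(surprise(p_e * p_f), surprise(p_e) + surprise(p_f))
print(surprise(p_e), surprise(p_f), surprise(p_e * p_f))  # 1.0 2.0 3.0
```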
That’s a pretty strong constraint for some fairly weak inputs!
That’s basically why I find this interesting: it’s a strong constraint that comes out of an intuitively weak condition.