A principal hires an agent to acquire information about an unknown state $\theta \in \Theta$ and take a decision $d \in D$. The agent chooses an experiment—a joint distribution $p(d, \theta)$ over decisions and states—subject to a cost function $c(p)$ and a capacity constraint.
The contract $b(d, \theta)$ specifies payments from principal to agent. Both parties are risk-neutral. The question is: what form does the optimal contract take, and how does it depend on the cost function?
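The objects in this model are all finite-dimensional, so they are easy to write down concretely. The following sketch (names and numbers are illustrative, not from the source) builds an experiment as a joint distribution $p(d, \theta)$ from a prior and a conditional decision rule, and checks that it is a valid joint whose state-marginal recovers the prior:

```python
import numpy as np

# States and decisions are finite; an experiment is a joint distribution p(d, theta).
prior = np.array([0.5, 0.5])  # pi(theta) over two states

# Conditional decision rule p(d | theta): rows = states, cols = decisions.
p_d_given_theta = np.array([
    [0.8, 0.2],   # in state 0, decision 0 is likely
    [0.3, 0.7],   # in state 1, decision 1 is likely
])

# Joint experiment p(d, theta) = pi(theta) * p(d | theta); shape (|Theta|, |D|).
p_joint = prior[:, None] * p_d_given_theta

assert np.isclose(p_joint.sum(), 1.0)           # a valid joint distribution
assert np.allclose(p_joint.sum(axis=1), prior)  # theta-marginal recovers the prior
```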
For a general cost function, every Pareto optimal contract can be written as:
$$b(d, \theta) = \alpha^* y(d, \theta) - \beta(\theta) - \gamma(d, \theta)$$

where $\alpha^*$ is a constant share of the outcome $y(d, \theta)$, $\beta(\theta)$ is a state-contingent transfer, and $\gamma(d, \theta)$ is a distortion term.
The distortion term $\gamma$ captures how complementarities in the cost of acquiring different signals across different states shape incentives. Under a general cost function, $\gamma$ depends on both the decision $d$ and the state $\theta$.
Suppose the agent's cost function is expected reduction in Shannon entropy:
$$c(p) = H_S(\pi) - \sum_{d \in D} p(d)\, H_S(p(\cdot | d))$$

where $H_S(q) = -\sum_{\theta} q(\theta) \log q(\theta)$ is Shannon entropy and $\pi$ is the prior.
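This cost is the mutual information between the state and the decision. A short sketch of how it would be computed from a joint distribution (function names are illustrative):

```python
import numpy as np

def shannon_entropy(q):
    """H_S(q) = -sum_theta q(theta) log q(theta), with the convention 0 log 0 = 0."""
    q = np.asarray(q, dtype=float)
    nz = q > 0
    return -np.sum(q[nz] * np.log(q[nz]))

def info_cost(p_joint):
    """Expected entropy reduction: H_S(pi) - sum_d p(d) H_S(p(.|d)).

    p_joint[theta, d] is the joint distribution over states and decisions.
    """
    prior = p_joint.sum(axis=1)   # pi(theta)
    p_d = p_joint.sum(axis=0)     # p(d)
    cost = shannon_entropy(prior)
    for d in range(p_joint.shape[1]):
        if p_d[d] > 0:
            posterior = p_joint[:, d] / p_d[d]   # p(theta | d)
            cost -= p_d[d] * shannon_entropy(posterior)
    return cost

# Example: a fully revealing experiment with a uniform prior over two states
# costs exactly log 2 nats; an uninformative (independent) one costs zero.
fully_revealing = np.array([[0.5, 0.0],
                            [0.0, 0.5]])
assert np.isclose(info_cost(fully_revealing), np.log(2))
assert np.isclose(info_cost(np.outer([0.5, 0.5], [0.6, 0.4])), 0.0)
```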
Then the distortion term simplifies dramatically. It no longer depends on the state:
$$\gamma(d, \theta) \;\longrightarrow\; \hat{\gamma}(d) = \sum_{\bar\theta} \frac{\lambda[d, \bar\theta]}{p(d)}$$

where the $\lambda[d, \bar\theta]$ are the multipliers on the liability constraints. The optimal contract becomes:

$$b(d, \theta) = \alpha^* y(d, \theta) - \beta(\theta) - \hat{\gamma}(d)$$
The decision-dependent transfer $\hat{\gamma}(d)$ punishes decisions likely to make the agent's liability limits bind and rewards decisions likely to make the principal's liability limits bind.
This is the key result. Start from the general distortion term $\gamma(d, \theta)$. For it to collapse to a decision-dependent transfer $\hat{\gamma}(d)$ (independent of $\theta$), the information cost matrix must take the form:
$$k(\theta, \theta', p(\cdot|d)) = p(\theta|d)\, g(\theta', p(\cdot|d)) + \mathbf{1}_{\{\theta' = \theta\}}\, h(\theta, p(\cdot|d))$$

In other words: Shannon entropy is not just a convenient cost function for information acquisition. It is the only cost function that produces the clean contract form above. Any other cost function introduces state-dependent distortions that complicate the optimal contract.
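One can check numerically that the Shannon cost fits this form. Assuming the "information cost matrix" $k$ corresponds to the Hessian of the per-posterior cost piece $-H_S(q) = \sum_\theta q(\theta) \log q(\theta)$ (an interpretation not spelled out in the text above), that Hessian is diagonal with entries $1/q(\theta)$, i.e. the required form with $g \equiv 0$ and $h(\theta, q) = 1/q(\theta)$:

```python
import numpy as np

def neg_entropy(q):
    """-H_S(q) = sum_theta q(theta) log q(theta), the per-posterior Shannon cost piece."""
    q = np.asarray(q, dtype=float)
    return np.sum(q * np.log(q))

def hessian(f, q, eps=1e-5):
    """Central finite-difference Hessian of f at q (coordinates varied freely)."""
    n = len(q)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            qpp = q.copy(); qpp[i] += eps; qpp[j] += eps
            qpm = q.copy(); qpm[i] += eps; qpm[j] -= eps
            qmp = q.copy(); qmp[i] -= eps; qmp[j] += eps
            qmm = q.copy(); qmm[i] -= eps; qmm[j] -= eps
            H[i, j] = (f(qpp) - f(qpm) - f(qmp) + f(qmm)) / (4 * eps**2)
    return H

q = np.array([0.2, 0.3, 0.5])  # an arbitrary interior posterior
H = hessian(neg_entropy, q)

# Diagonal with entries 1/q(theta): matches the form with g == 0, h(theta, q) = 1/q(theta).
assert np.allclose(H, np.diag(1.0 / q), atol=1e-4)
```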
Shannon entropy is ubiquitous in information theory, statistical mechanics, and models of rational inattention (Sims, 2003). But its use is often justified by tractability or tradition. This result provides a structural justification: Shannon entropy is the unique cost function under which optimal incentive contracts take an intuitive, implementable form.
The result connects information theory to mechanism design: the mathematical structure that makes Shannon entropy special in coding theory (it is the unique measure satisfying certain axioms) is the same structure that makes it special in contract design.