Policy compression: a framework for resource-bounded decision making with insights into psychiatric disease
Reinforcement learning models typically assume that biological agents make decisions to maximize reward. However, humans and other animals frequently behave suboptimally, suggesting that they may be optimizing a different objective function. The framework of policy compression may offer one explanation: it assumes that agents seek not only to maximize reward but also to minimize cognitive cost, which is formalized as policy complexity (the mutual information between actions and states of the environment). I will first show how policy compression explains undermatching, a ubiquitous behavioral suboptimality observed in generalized matching tasks. I will then attempt to extend policy compression towards explaining psychopathology. Taken together, policy compression and other frameworks of resource-rational decision making may provide insight into psychiatric disease.
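
To make the definition of policy complexity concrete, the sketch below computes the mutual information between states and actions for a small tabular policy. It is an illustration only, not the speaker's implementation: the function name `policy_complexity`, the two-state example, and the choice of bits as the unit are all assumptions introduced here.

```python
import math


def policy_complexity(state_probs, policy):
    """Mutual information I(S; A) in bits (hypothetical helper).

    state_probs[s] is P(s); policy[s][a] is P(a | s).
    """
    n_actions = len(policy[0])
    # Marginal action distribution: P(a) = sum_s P(s) * P(a | s)
    marginal = [
        sum(p_s * row[a] for p_s, row in zip(state_probs, policy))
        for a in range(n_actions)
    ]
    total = 0.0
    for p_s, row in zip(state_probs, policy):
        for a, p_a_given_s in enumerate(row):
            # Zero-probability terms contribute nothing to the sum.
            if p_s > 0 and p_a_given_s > 0:
                total += p_s * p_a_given_s * math.log2(p_a_given_s / marginal[a])
    return total


# Assumed toy example: two equally likely states, two actions.
state_probs = [0.5, 0.5]
deterministic = [[1.0, 0.0], [0.0, 1.0]]      # a different action in each state
state_independent = [[0.5, 0.5], [0.5, 0.5]]  # same action distribution everywhere

print(policy_complexity(state_probs, deterministic))      # 1.0
print(policy_complexity(state_probs, state_independent))  # 0.0
```

In this toy case, the policy that tailors its action to each state has a complexity of 1 bit, whereas the policy that ignores the state has a complexity of 0 bits. Under the framework described above, an agent with limited cognitive resources would be pushed towards the lower-complexity end of this range, at some cost in reward.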