Structure learning as a mechanism for overharvesting
In patch-leaving problems, foragers must decide between engaging with a currently available, but depleting, patch of resources or foregoing it to search for another, potentially better patch. Overharvesting, or staying in the patch longer than optimally prescribed, is widely observed in these problems. Most previous explanations for this phenomenon focus on how foragers’ mis-estimations of the environment could produce overharvesting. They suggest that if the forager correctly learned the environment’s quality, then it would behave according to the Marginal Value Theorem (MVT). However, this proposal rests on the assumption that the forager has full knowledge of the environment’s structure. Rarely does this occur in the real world. Instead, foragers must learn the structure of their environment. Here, we model foragers as pairing an optimal decision rule with an optimal learning procedure that allows for the possibility of heterogeneously structured (i.e., multimodal) reward distributions. We then show that this model can appear to produce overharvesting, as measured by the common optimality criterion, when applied to the usual tasks, which employ homogeneous reward distributions. This model accounts for behavior in a previous serial stay/leave task, and it generates novel predictions regarding sequential effects that agree with participant behavior. Taken together, these results are consistent with overharvesting reflecting optimality with respect to a different set of conditions than MVT assumes, and they suggest that MVT’s definition of optimality may need to be adjusted to account for behavior in more naturalistic contexts.
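For readers unfamiliar with the benchmark invoked above, the MVT leave rule can be sketched in a few lines. This is an illustrative toy, not the paper’s model: it assumes an exponentially depleting patch (parameters `r0` and `decay` are hypothetical) and a known long-run average reward rate for the environment, and it prescribes leaving once the patch’s instantaneous reward drops below that average rate.

```python
def mvt_leave_time(r0, decay, avg_rate):
    """Return the first harvest index at which the patch's instantaneous
    reward (r0 * decay**t) falls below the environment's average reward
    rate -- the Marginal Value Theorem leaving threshold."""
    t = 0
    while r0 * decay ** t >= avg_rate:
        t += 1
    return t

# A forager that stays beyond this index "overharvests" by the MVT criterion.
```

Under this criterion, a forager who stays past the returned index is scored as overharvesting; the abstract’s argument is that such behavior can nonetheless be optimal for a learner that entertains multimodal reward structures.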