An integrated measure of model complexity
If two models account for data equally well, it is widely accepted that we should select the simpler one. One way to formalize this principle is through measures of model complexity that quantify the range of outcomes a model predicts. According to Roberts and Pashler (2000), however, this is only part of the story. They emphasize that a simple model is one that is falsifiable because it makes surprising predictions, which requires measuring how likely it is that data could have been observed that the model does not predict. We propose a new measure that incorporates both criteria, based on Kullback-Leibler (KL) divergence. Our measure involves the models' prior predictive distributions, which correspond to the ranges of predictions they make, and a data prior, which corresponds to the range of possible observable outcomes in an experiment designed to evaluate the models. We propose that model A is simpler than model B if the KL divergence from the prior predictive distribution of model A to the data prior is greater than that of model B. This measure formalizes the idea that a model is simpler if it makes more surprising predictions and is therefore more falsifiable. To demonstrate the new measure, we present a worked example involving competing models of the widely studied memory process of free recall. The example uses a data prior based on the empirical regularity of the serial position curve. We show how the data prior captures aspects of model complexity that are missed by measuring only the range of predictions a model makes, and how it influences which model is chosen.
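As a sketch, the criterion can be stated formally; the notation here is assumed rather than taken from the abstract, which also does not fix the orientation of the divergence (we read "from the prior predictive to the data prior" as placing the prior predictive in the first argument). Writing \pi_A and \pi_B for the prior predictive distributions of models A and B, and \pi_d for the data prior, model A is simpler than model B when

\[ D_{\mathrm{KL}}(\pi_A \,\|\, \pi_d) \;>\; D_{\mathrm{KL}}(\pi_B \,\|\, \pi_d), \qquad \text{where} \quad D_{\mathrm{KL}}(p \,\|\, q) \;=\; \sum_x p(x) \log \frac{p(x)}{q(x)}. \]

For a discrete outcome space (for example, binned serial position curves), the comparison is straightforward to compute. The sketch below uses hypothetical placeholder distributions, not values from the paper:

import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """Discrete KL divergence D_KL(p || q) in nats."""
    p = np.asarray(p, dtype=float); p = p / p.sum()
    q = np.asarray(q, dtype=float); q = q / q.sum()
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

# Hypothetical prior predictive distributions over four binned outcomes,
# and a data prior over the same bins (placeholder values).
prior_pred_A = [0.70, 0.20, 0.05, 0.05]   # concentrated: surprising predictions
prior_pred_B = [0.30, 0.30, 0.20, 0.20]   # diffuse: unsurprising predictions
data_prior   = [0.10, 0.30, 0.40, 0.20]

# Under the criterion above, the larger divergence marks the simpler,
# more falsifiable model.
print(kl_divergence(prior_pred_A, data_prior))  # model A
print(kl_divergence(prior_pred_B, data_prior))  # model B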