Defining one's boundaries: Exploration of the decision threshold parameter space during a reward maximisation task.
Many models of decision-making assume an accumulation-to-threshold mechanism, whereby an individual pre-selects a single decision threshold that balances the speed and accuracy of their responses. However, when the goal is to maximise an outcome variable (such as reward rate), the decision-maker is likely to keep adjusting the initially adopted threshold until satisfactory performance is reached. The standard assumption of stationarity yields a threshold estimate that reflects performance averaged over the whole task, which may not be representative of the participant’s strategy at any one time. We analysed data from an expanded judgment task in which the goal was to maximise reward rate, and estimated the height and slope of the decision threshold in a sliding window of trials, as well as over all trials. A participant's overall best-fitting threshold parameters were often not representative of the thresholds estimated in smaller windows of trials at any point during the experiment. This is largely because the majority of participants explored the threshold parameter space throughout the task, rather than settling on a specific threshold early on. Importantly, this exploration was not driven by the reward rate that a particular threshold yielded. In fact, the exploration often resulted in a lower average reward rate late in the experiment, relative to early trial windows. As such, participants failed to approach the threshold parameters that were optimal with respect to the task, i.e. those that would maximise reward rate. Our findings indicate that participants sample various distinct decision thresholds during a reward optimisation task, rather than adopting a single threshold as is frequently assumed by models of decision-making. These results also raise the question of whether such exploration is random, or whether it is modulated by a different variable (other than reward rate) that decision-makers prioritise instead.
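To make the sliding-window idea concrete, the following is a minimal sketch rather than the estimation procedure used in the study. It assumes a linear threshold, so that the absolute evidence at the moment of decision equals height + slope × number of samples observed, and it substitutes an ordinary least-squares line fit for whatever fitting method the authors applied. The function name `sliding_threshold_fits`, the window and step sizes, and the synthetic data are all hypothetical.

```python
import numpy as np

def sliding_threshold_fits(n_samples, evidence, window=50, step=50):
    """Fit |evidence| = height + slope * n_samples within each window of trials.

    Least squares is an illustrative simplification, not the study's method.
    Returns a list of (first_trial_index, height, slope) tuples.
    """
    n_samples = np.asarray(n_samples, dtype=float)
    evidence = np.abs(np.asarray(evidence, dtype=float))
    fits = []
    for start in range(0, len(n_samples) - window + 1, step):
        t = n_samples[start:start + window]
        e = evidence[start:start + window]
        slope, height = np.polyfit(t, e, 1)  # degree-1 fit: [slope, intercept]
        fits.append((start, height, slope))
    return fits

# Synthetic, noise-free data (hypothetical): the simulated participant
# switches threshold halfway through the session.
rng = np.random.default_rng(0)
t = rng.integers(1, 20, size=200)           # samples observed before deciding
trial = np.arange(200)
e = np.where(trial < 100, 5.0 - 0.1 * t,    # early: height 5, collapsing slope
             3.0 + 0.0 * t)                 # late: height 3, flat threshold

fits = sliding_threshold_fits(t, e)                  # per-window estimates
overall_slope, overall_height = np.polyfit(t, e, 1)  # single fit to all trials

for start, height, slope in fits:
    print(f"trials {start}-{start + 49}: height={height:.2f}, slope={slope:.2f}")
print(f"all trials: height={overall_height:.2f}, slope={overall_slope:.2f}")
```

In this toy example the per-window fits recover the two distinct thresholds, whereas the single fit over all trials returns a compromise that matches neither, which is the sense in which a stationary estimate can misrepresent the strategy used at any one time.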