Prof. Joe Houpt
Performing a complex task provides opportunities to switch between multiple subtasks: the individual decides between remaining on the current task and selecting an alternative, with large variation in behavior across individuals. Information foraging theory states that people adapt their strategies to maximize the amount of information gained per unit time, leading them either to exploit existing information or to switch tasks and explore to gather more. Evidence accumulation models provide additional insight into the cognitive mechanisms of decision making; for task-switching decisions, the speed of information processing describes the attractiveness of the alternative task, while the response threshold describes the inhibition against leaving the ongoing task. This research uses two versions of the linear ballistic accumulator (LBA) to investigate task-switching decisions. We apply an existing multi-attribute, multi-alternative LBA, based on cumulative prospect theory, by assuming that exploitation corresponds to being risk averse and exploration to being risk seeking, to define attribute-level subjective values and weighting functions that capture an individual’s pre-decision task preference. We develop an alternative version, using expected gain from information processing theory, that explains subjective value and attention weight at the alternative level to determine drift rates for the LBA model. We compare the two versions of the model, using data previously collected during a complex task, to determine whether individuals base their task preference on individual task attributes or on the overall gain provided by the task. We also compare response threshold results to the switch-avoidance tendency identified in a well-known task-switching model.
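The LBA race described above can be sketched in a few lines. This is a minimal, illustrative simulation of a two-accumulator (stay vs. switch) race, not the authors' fitted model: the parameter values, and the mapping of accumulator 0 to the more attractive task, are assumptions for demonstration.

```python
import numpy as np

def simulate_lba(v_means, A=0.5, b=1.0, s=0.25, t0=0.2, rng=None):
    """One trial of a linear ballistic accumulator race.

    v_means : mean drift rate per accumulator (attractiveness of each task);
    A       : upper bound of the uniform start-point distribution;
    b       : response threshold (inhibition against leaving the ongoing task);
    s       : between-trial drift standard deviation;
    t0      : non-decision time. All values here are illustrative.
    """
    if rng is None:
        rng = np.random.default_rng()
    starts = rng.uniform(0.0, A, size=len(v_means))
    drifts = rng.normal(v_means, s)
    # accumulators whose sampled drift is negative never reach threshold
    times = np.where(drifts > 0, (b - starts) / drifts, np.inf)
    winner = int(np.argmin(times))
    return winner, t0 + times[winner]

rng = np.random.default_rng(1)
# accumulator 0 = "switch to the more attractive alternative" (higher drift)
choices = [simulate_lba([1.2, 0.8], rng=rng)[0] for _ in range(2000)]
```

The accumulator with the higher mean drift wins more often, so manipulating drift versus threshold separates task attractiveness from switch inhibition, which is the core of the comparison in the abstract.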
Dr. Erin McCormick
Dr. Leslie Blaha
Prof. Cleotilde (Coty) Gonzalez
In the optimal stopping problem, a decision maker aims to select the option that maximizes reward in a sequence, under the condition that they must select it at the time of presentation. Past literature suggests that people use a series of thresholds to make decisions (Lee, 2006), and researchers have developed a hierarchical Bayesian model, Bias-From-Optimal (BFO), to characterize these thresholds (Guan et al., 2015, 2020). BFO relies on optimal thresholds and the idea that people’s thresholds are characterized by how far they are from optimal and how this bias increases or decreases throughout the sequence. In this work, we challenge the assumption that people use thresholds to make decisions. We develop a cognitive model based on Instance-Based Learning (IBL) Theory (Gonzalez et al., 2003) to demonstrate an inductive process by which individual thresholds are derived, without assuming that people use thresholds or relying on optimal thresholds. The IBL model makes decisions by considering the current value and the distance of its position from the end of the sequence, and learns through feedback from past decisions. Using this model, we simulate the choices of 56 individuals and compare these simulations with empirical data provided by Guan et al. (2020). Our results demonstrate that the IBL model replicates human behavior and generates the BFO model’s thresholds, without assuming any thresholds. Overall, our approach improves upon previous methods by producing cognitively plausible choices that resemble those of humans. The IBL model can therefore be used to predict human risk tendencies in sequential choice tasks.
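The instance-based mechanism can be sketched as follows. This is a simplified illustration of IBL's core ideas (activation-based memory, blended values, and learning from feedback) applied to a short sequential accept/reject task; the decay, noise, and default-utility values, the situation encoding, and the counterfactual feedback for rejections are assumptions of this sketch, not the authors' model specification.

```python
import math, random

class IBL:
    """Minimal instance-based learning chooser (illustrative parameters)."""

    def __init__(self, decay=0.5, noise=0.25, default=60.0):
        self.decay, self.noise, self.default = decay, noise, default
        self.instances = {}  # (situation, action) -> [(time_of_use, outcome)]
        self.t = 0

    def choose(self, situation, actions):
        self.t += 1
        return max(actions, key=lambda a: self._blend(situation, a))

    def _blend(self, situation, action):
        inst = self.instances.get((situation, action), [])
        if not inst:
            # unseen options get a noisy optimistic default, driving exploration
            return self.default + random.gauss(0, self.noise)
        # base-level activation: recent instances are retrieved more easily
        acts = [math.log((self.t - u) ** -self.decay) +
                random.gauss(0, self.noise) for u, _ in inst]
        ps = [math.exp(a) for a in acts]
        z = sum(ps)
        # blended value: retrieval-probability-weighted mean of past outcomes
        return sum(p / z * o for p, (_, o) in zip(ps, inst))

    def feedback(self, situation, action, outcome):
        self.instances.setdefault((situation, action), []).append((self.t, outcome))

random.seed(0)
agent = IBL()
for _ in range(200):
    seq = [random.uniform(0, 100) for _ in range(5)]
    for pos, value in enumerate(seq):
        # situation: coarsely binned current value and distance from sequence end
        situation = (round(value, -1), len(seq) - pos)
        last = pos == len(seq) - 1
        act = "accept" if last else agent.choose(situation, ("accept", "reject"))
        if act == "accept":
            agent.feedback(situation, "accept", value)
            break
        # assumed feedback signal for rejecting: the best value seen afterwards
        agent.feedback(situation, "reject", max(seq[pos + 1:]))
```

No threshold appears anywhere in the agent; any threshold-like behavior is an emergent property of the stored instances, which is the abstract's central point.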
Mr. Mark Zrubka
Dr. Dora Matzke
Response inhibition is frequently measured using the stop-signal paradigm, where responses must be withheld when a “stop” signal appears. This paradigm assumes that go and stop stimuli trigger competing runners. The first runner crossing a boundary wins, and determines whether a response is performed. A tension exists between two categories of models: descriptive and process models. Descriptive models define the speed of the runners, whereas process models express the latency of going (go RT) and stopping (stop-signal RT) in terms of psychological mechanisms and explain how their distributions emerge. One drawback of the process approach is that it fails to recover data-generating parameters and therefore does not qualify as a measurement model. In contrast, the descriptive BEESTS approach recovers these parameters, but the psychological interpretation of its parameters is ambiguous, which hampers the understanding of RT differences between groups or manipulations. We propose to mix a process “evidence-accumulation” account of the go runners with a descriptive approach to the stop runner. To instantiate this hybrid approach, we assumed Wald distributions for the finishing times of the go runners and, similar to BEESTS, an ex-Gaussian distribution for the stop runner. This approach results in a practically useful measurement model, with good parameter recovery by Bayesian hierarchical methods in realistic designs. By mixing racers, we garner advantages of both process and descriptive models: all parameters are interpretable in a measurement sense, parameters describing go runners are interpretable psychologically, and the stop parameters can be used to reliably and validly estimate stop-signal RTs.
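The hybrid race can be illustrated with a few lines of simulation: a Wald (inverse-Gaussian) finishing time for a single go runner and an ex-Gaussian finishing time for the stop runner, as the abstract specifies. The parameter values and the single-go-runner simplification are assumptions of this sketch, not the authors' fitted estimates.

```python
import numpy as np

def stop_trial(ssd, v_go=3.0, b=1.0, mu_stop=0.18, sigma_stop=0.03,
               tau_stop=0.06, t0=0.1, rng=None):
    """One stop-signal trial of the hybrid race (illustrative parameters).

    Go runner: shifted Wald finishing time, i.e. the first-passage time of a
    diffusion with drift v_go to threshold b (inverse Gaussian with mean
    b / v_go and shape b**2), plus non-decision time t0.
    Stop runner: ex-Gaussian finishing time starting at stop-signal delay ssd.
    """
    if rng is None:
        rng = np.random.default_rng()
    go_finish = t0 + rng.wald(b / v_go, b ** 2)
    stop_finish = ssd + rng.normal(mu_stop, sigma_stop) + rng.exponential(tau_stop)
    responded = bool(go_finish < stop_finish)  # first runner to finish wins
    return responded, go_finish if responded else None

rng = np.random.default_rng(2)
# inhibition function: responding should become more likely as SSD grows
p_resp = {ssd: np.mean([stop_trial(ssd, rng=rng)[0] for _ in range(3000)])
          for ssd in (0.05, 0.25)}
```

Because the go side is generated by an accumulation process, its parameters (drift, threshold) carry psychological meaning, while the descriptive ex-Gaussian stop side still yields stop-signal RT estimates, mirroring the division of labor the abstract argues for.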
Prof. Tobias Heed
Prof. Christoph Kayser
Reaction time (RT) series from any behavioural task show large fluctuations from trial to trial. These fluctuations are characterised by temporal trends such as positive autocorrelations between subsequent trials. In typical experimental paradigms, the trial-to-trial fluctuations are ignored, and RTs are summarised into conditional means, which are then statistically compared on the group level. However, at the level of individual participants, it often remains unknown which part of the total trial-to-trial variance is driven by the conditional manipulations. In the current study, we quantified sources of within-participant variance in RT across archival datasets. Specifically, we determined the relative contributions of experimental manipulations and sequential effects, the latter split into trial-to-trial autocorrelations and blockwise trends. We quantified the trial-to-trial variance of RT with general linear models on the individual participant data. Results from 16 datasets (N = 1474) from perceptual and cognitive control tasks show that the conditional, autocorrelative, and blockwise trend factors explained similar amounts of variance in trial-to-trial RTs. Furthermore, we examined individual differences by correlating, across participants, the amount of explained variance with performance. RT variability correlated positively with the amount of variance explained by the conditional and blockwise trend factors, but negatively with variance explained by the autocorrelative factors. Overall, experimental conditions only explained a small proportion of the total variance, and large parts of individual trial-to-trial variance remained unexplained by the investigated factors.
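The variance decomposition can be sketched with a small GLM on synthetic data: one regressor set per factor (condition, blockwise trend, lag-1 autocorrelation), with each factor's contribution read off as the R² of the corresponding sub-model. The data-generating values below are invented for illustration and the injected lag-1 dependence is only approximate; this is not the authors' analysis pipeline.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 600
cond = rng.integers(0, 2, n).astype(float)  # binary experimental condition
block = np.arange(n) / n                    # blockwise linear trend
rt = 0.45 + 0.05 * cond + 0.08 * block + 0.02 * rng.standard_normal(n)
# inject lag-1 dependence (RHS is evaluated before the in-place update,
# so this is a one-step smearing rather than an exact AR(1) process)
rt[1:] += 0.3 * (rt[:-1] - rt[:-1].mean())

def r_squared(X, y):
    """R-squared of an OLS fit with intercept; X may be 1D or 2D."""
    X = np.column_stack([np.ones(len(y)), X])
    resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return 1.0 - resid.var() / y.var()

y = rt[1:]  # drop the first trial, which has no predecessor
parts = {
    "conditional":     r_squared(cond[1:], y),
    "trend":           r_squared(block[1:], y),
    "autocorrelative": r_squared(rt[:-1], y),
    "full":            r_squared(np.column_stack([cond[1:], block[1:], rt[:-1]]), y),
}
```

Because all sub-models are nested in the full model, each factor's R² is bounded by the full model's R², and whatever the full model leaves unexplained corresponds to the residual trial-to-trial variance the abstract highlights.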
Response time models are typically applied to make predictions about when participants will react to a stimulus. However, many of the choices we make require us to proactively plan when to act: when to leave home to arrive somewhere on time, when to swing at a ball to hit it (tennis, baseball, cricket), and so on. These anticipatory responses can also be modeled as an evidence accumulation process, where we form joint expectations of both object and time. To understand these dynamic representations, participants were asked to make decisions about two anticipated events. In an initial experiment, they were asked to decide when a moving, partially occluded ball would hit a wall. In a second experiment, participants were asked to infer the ball’s position at a particular time after it became occluded behind the wall. We manipulated the speed at which the ball travels, its distance from the wall, and the time for which the ball is occluded, forcing the decision-maker to mentally represent and calculate the object’s dynamic position and motion as they form judgements about eventual locations or timing. We model response times using the extended Wald accumulator model to draw parallels between the changes in speed, distance, and occlusion duration and the corresponding changes in the model’s parameters: drift, threshold, and non-decision time. The results from both tasks suggest that response times in anticipatory decisions are right-skewed -- mostly too slow -- and the estimated parameters of the Wald model successfully predicted individual response times by condition.
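The parameter mapping suggested by the abstract (speed to drift, distance to threshold, occlusion duration to non-decision time) can be illustrated with a plain shifted Wald sampler. This is a simplified stand-in for the extended Wald accumulator model named in the abstract, and all numbers are illustrative assumptions, not fitted values.

```python
import numpy as np

def shifted_wald_rt(v, a, t0, size, rng):
    """Shifted Wald RTs: first-passage time of a single accumulator with
    drift v to threshold a (inverse Gaussian with mean a / v and shape a**2),
    plus non-decision time t0."""
    return t0 + rng.wald(a / v, a ** 2, size=size)

rng = np.random.default_rng(4)
# assumed mapping of manipulations to parameters (our reading of the abstract):
# faster ball -> higher drift v; farther wall -> higher threshold a;
# longer occlusion -> longer non-decision time t0
fast = shifted_wald_rt(v=2.5, a=1.0, t0=0.2, size=5000, rng=rng)
slow = shifted_wald_rt(v=1.5, a=1.0, t0=0.2, size=5000, rng=rng)
```

The inverse-Gaussian core makes every predicted RT distribution right-skewed (median below mean), consistent with the "mostly too slow" pattern the abstract reports.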
Ms. Manon Ironside
Dr. Sheri Johnson
How do we decide whether we should explore or exploit in uncertain environments where feedback is intermittent? In this talk, we compare two approaches to computational modeling of the cognitive process underlying such decisions, using control group data from an ongoing clinical research collaboration. Participants completed multiple blocks of the “observe or bet” task, a dynamic sequential decision-making task. To maximize reward, participants must strike a balance between betting on (but not seeing) which event will occur, versus observing events in the sequence (and forgoing gaining or losing points). Participants efficiently alternated between observing and betting, observing more at the start of a sequence and betting more towards the end. To better understand these data, we used two classes of hierarchical Bayesian models. First, we implemented nine versions of the “heuristic model” of this task, developed by Navarro, Newell, & Schulze (2016), which posits a cross-trial evidence accumulation process. Second, we implemented eight variants of a modified reinforcement learning (RL) model, which is a novel adaptation of Q-learning. Across all models, the modified RL model with counterfactual learning and a high fixed value of observing provided the best fit to the observed data. We discuss implications for modeling of this task, and for RL modeling more generally. We emphasize how this challenges a strict conceptualization of RL, as the modified RL model’s success suggests that the same computations responsible for learning from rewards might also subserve learning from outcomes that are non-extrinsically (but potentially intrinsically) rewarding.
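The winning model's two key ingredients, counterfactual learning and a fixed value of observing, can be sketched as a toy Q-learning agent. The payoffs, learning rate, event probability, and greedy action rule below are assumptions of this sketch, not the authors' specification.

```python
import random

def run_observe_or_bet(n_trials=50, alpha=0.3, v_observe=0.5, p_event=0.8,
                       rng=None):
    """Toy observe-or-bet agent (illustrative parameters).

    Betting on the correct event pays +1 and on the wrong one -1, but the
    outcome is hidden when betting; observing reveals the outcome at the
    cost of forgoing points. Observing carries a fixed value v_observe.
    """
    if rng is None:
        rng = random.Random()
    q = {"bet_A": 0.0, "bet_B": 0.0}
    history = []
    for _ in range(n_trials):
        outcome = "A" if rng.random() < p_event else "B"  # biased event stream
        best = max(q, key=q.get)
        # greedy rule: bet only once a bet looks better than observing
        action = best if q[best] > v_observe else "observe"
        history.append(action)
        if action == "observe":
            # counterfactual learning: the revealed outcome updates BOTH bet
            # values, the unchosen one from the payoff it would have earned
            for bet in q:
                reward = 1.0 if bet.endswith(outcome) else -1.0
                q[bet] += alpha * (reward - q[bet])
        # when betting, feedback is withheld, so no values are updated
    return history

h = run_observe_or_bet(rng=random.Random(7))
```

Even this crude agent reproduces the qualitative pattern in the data: it observes at the start of a sequence, and once a bet's learned value exceeds the fixed value of observing, it bets for the remainder. Note that its value updates come from observed (unrewarded) outcomes, not from payoffs, which is exactly the challenge to a strict conceptualization of RL that the abstract raises.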