A model-based examination of scale effects in student evaluations of teaching
Student evaluations of teaching are widely used to assess instructors and courses. Using a model-based approach and Bayesian methods, we examine how the direction of a scale, its labels, and its number of response options affect ratings. We conduct a within-participants experiment in which respondents evaluate instructors and lectures using different scales. We find that people tend to give positive ratings, especially when using letter scales rather than number scales, and that they use the end-points less often when a scale is presented in reverse order. Our model-based analysis allows us to infer how the features of a scale shift responses toward higher or lower ratings and how they compress scale use, making end-point responses more or less likely. The model also makes predictions about equivalent ratings across scales, which we demonstrate using real-world evaluation data. Our findings have implications for the design of rating scales and for their use in assessment.