Validation of undergraduate medical student script concordance test (SCT) scores on the clinical assessment of the acute abdomen

BMC Surg. 2016 Aug 17;16(1):57. doi: 10.1186/s12893-016-0173-y.

Abstract

Background: Health professionals often manage medical problems in critical situations under time pressure and on the basis of vague information. In recent years, dual process theory has provided a framework of cognitive processes to assist students in developing clinical reasoning skills, which are especially critical in surgery because of the high workload and elevated stress levels. However, clinical reasoning skills can be observed only indirectly, and the corresponding constructs are difficult to measure when assessing student performance. The script concordance test has been established in this field, and a number of studies suggest that it delivers a valid assessment of clinical reasoning. However, different scoring methods have been suggested, reflecting different interpretations of the underlying construct. In this work we aim to shed light on the theoretical framework of script theory and to outline script concordance testing. We constructed a script concordance test in the clinical context of "acute abdomen" and compared previously proposed scores with regard to their validity.

Methods: A test comprising 52 items in 18 clinical scenarios was developed, revised according to the guidelines and administered to 56 medical students in their 4th and 5th year at the end of a blended-learning seminar. We scored the answers using five different scoring methods (two distance methods, two aggregate methods and single best answer) and compared the scoring keys, the resulting final scores and Cronbach's α after normalization of the raw scores.
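Illustration (not part of the original abstract): Cronbach's α, the reliability coefficient named in the Methods, can be computed from a matrix of item scores as sketched below. The sketch assumes one row per examinee and one column per item, and uses the sample variance. The function name `cronbach_alpha` and the toy data are hypothetical, and the abstract does not state whether the authors computed α over single items or over clinical scenarios.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Return Cronbach's alpha for a score matrix.

    scores: list of rows, one row per examinee, one value per item.
    Assumption: sample variances (n - 1 denominator) are used throughout.
    """
    n_items = len(scores[0])
    if len(scores) < 2 or n_items < 2:
        raise ValueError("need at least two examinees and two items")

    # Variance of each item across examinees.
    item_vars = [variance([row[j] for row in scores]) for j in range(n_items)]
    # Variance of the examinees' total scores.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")

    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical toy data: 4 examinees, 3 items with partial credits.
toy = [
    [1.0, 0.5, 1.0],
    [0.5, 0.5, 0.0],
    [0.0, 0.0, 0.5],
    [1.0, 1.0, 1.0],
]
alpha = cronbach_alpha(toy)
```

Because α depends on the number of items and on how items are grouped, values from different scoring keys are comparable only when they are computed over the same unit of analysis.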

Results: All scores except the single best answer calculation achieved acceptable reliability (≥ 0.75), as measured by Cronbach's α. Students were clearly distinguishable from the experts, whose results were set to a mean of 80 and SD of 5 by the normalization process. With the two aggregate scoring methods, the students' mean values were between 62.5 (AGGPEN) and 63.9 (AGG), equivalent to about three expert SD below the experts' mean value (Cronbach's α: 0.76 (AGGPEN) and 0.75 (AGG)). With the two distance scoring methods, the students' mean values were between 62.8 (DMODE) and 66.8 (DMEAN), equivalent to about two expert SD below the experts' mean value (Cronbach's α: 0.77 (DMODE) and 0.79 (DMEAN)). In this study the single best answer (SBA) scoring key yielded the worst psychometric results (Cronbach's α: 0.68).
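Illustration (not part of the original abstract): the normalization that sets the experts to a mean of 80 and an SD of 5 can be read as a linear rescaling of raw scores against the expert panel. The sketch below assumes a z-transformation using the experts' sample SD. The function names, the toy expert scores and the choice of sample rather than population SD are assumptions, not details given by the authors.

```python
from statistics import mean, stdev


def normalize(raw_scores, expert_raw, target_mean=80.0, target_sd=5.0):
    """Rescale raw scores so the expert panel has the target mean and SD.

    Assumption: the experts' sample SD (n - 1 denominator) is the reference.
    """
    expert_mean = mean(expert_raw)
    expert_sd = stdev(expert_raw)
    return [target_mean + target_sd * (x - expert_mean) / expert_sd
            for x in raw_scores]


def expert_sds_below(score, target_mean=80.0, target_sd=5.0):
    """Express a normalized score as expert SDs below the expert mean."""
    return (target_mean - score) / target_sd


# Hypothetical raw expert scores; after rescaling, mean = 80 and SD = 5.
experts = [40.0, 42.0, 44.0, 46.0, 48.0]
experts_norm = normalize(experts, experts)

# Reported AGGPEN student mean of 62.5 on the normalized scale.
gap = expert_sds_below(62.5)  # 3.5 expert SDs below the expert mean
```

On this scale, every 5 points below 80 correspond to one expert SD, which is how the student means can be expressed in expert SD units.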

Conclusion: Assuming the psychometric properties of the script concordance test scores are valid, clinical reasoning skills can be measured reliably with different scoring keys in the SCT presented here. Psychometrically, the distance methods seem to be superior, although inherent statistical properties of the scales might play a significant role. For methodological reasons, the aggregate methods can also be used. Despite the limitations and complexity of the underlying scoring process and the calculation of reliability, we advocate the SCT because it allows a new perspective on the measurement and teaching of cognitive skills.

Keywords: Acute abdomen; Assessment; Clinical reasoning; Medical education; Scales; Script concordance test; Surgery.

MeSH terms

  • Abdomen, Acute / diagnosis*
  • Abdomen, Acute / etiology
  • Abdomen, Acute / surgery*
  • Adolescent
  • Adult
  • Aged
  • Aged, 80 and over
  • Child
  • Education, Medical, Undergraduate*
  • Educational Measurement / methods*
  • Female
  • Humans
  • Male
  • Middle Aged
  • Problem-Based Learning
  • Psychometrics
  • Reproducibility of Results
  • Thinking*
  • Young Adult