Minimizing Interrater Variability in Staging Sleep by Use of Computer-Derived Features

J Clin Sleep Med. 2016 Oct 15;12(10):1347-1356. doi: 10.5664/jcsm.6186.

Abstract

Study objectives: Inter-scorer variability in sleep staging of polysomnograms (PSGs) results primarily from difficulty in determining whether (1) an electroencephalogram pattern of wakefulness spans > 15 sec in transitional epochs, (2) spindles or K complexes are present, and (3) the duration of delta waves exceeds 6 sec in a 30-sec epoch. We hypothesized that providing digitally derived information about these variables to PSG scorers may reduce inter-scorer variability.

Methods: Fifty-six PSGs were scored (five-stage) by two experienced technologists (first manual, M1). Months later, the technologists edited their own scoring (second manual, M2). PSGs were then scored with an automatic system, and the same two technologists plus an additional experienced technologist edited the automatic scores epoch by epoch (Edited-Auto). This resulted in seven manual scores for each PSG (two M1, two M2, and three Edited-Auto). The two M2 scores were then independently modified using digitally obtained values for sleep depth and delta duration and digitally identified spindles and K complexes.

Results: Percent agreement between scorers in M2 was 78.9 ± 9.0% before modification and 96.5 ± 2.6% after. Errors of this approach were defined as a change in a manual score to a stage that was not assigned by any scorer during the seven manual scoring sessions. Total errors averaged 7.1 ± 3.7% and 6.9 ± 3.8% of epochs for scorers 1 and 2, respectively, and there was excellent agreement between the modified score and the initial manual score of each technologist.
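
The two quantities reported above, epoch-by-epoch percent agreement and the error rate of the modification step, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the stage labels (W, N1, N2, N3, R), and the toy ten-epoch hypnograms are assumptions made for illustration, and only two manual sessions are used here in place of the seven in the study.

```python
from typing import List, Sequence


def percent_agreement(a: Sequence[str], b: Sequence[str]) -> float:
    """Percent of epochs to which two scorers assigned the same stage."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("hypnograms must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def error_rate(original: Sequence[str],
               modified: Sequence[str],
               manual_sessions: List[Sequence[str]]) -> float:
    """Percent of epochs where the modification changed the manual stage
    to a stage that no scorer assigned in any manual session."""
    if len(original) != len(modified) or len(original) == 0:
        raise ValueError("hypnograms must be non-empty and of equal length")
    errors = 0
    for i, (before, after) in enumerate(zip(original, modified)):
        if after != before:
            assigned = {session[i] for session in manual_sessions}
            if after not in assigned:
                errors += 1
    return 100.0 * errors / len(original)


# Hypothetical toy data: ten epochs scored by two technologists.
scorer1 = ["W", "N1", "N2", "N2", "N3", "N3", "N2", "R", "R", "W"]
scorer2 = ["W", "N2", "N2", "N2", "N2", "N3", "N2", "R", "N1", "W"]

# Hypothetical modified version of scorer 1's hypnogram.
modified1 = ["W", "N2", "N2", "N2", "N3", "N3", "N3", "R", "R", "W"]

agreement = percent_agreement(scorer1, scorer2)
errors = error_rate(scorer1, modified1, [scorer1, scorer2])
print(f"agreement: {agreement:.1f}%")  # 7 of 10 epochs match
print(f"errors: {errors:.1f}%")        # 1 of 10 epochs counts as an error
```

In this toy example the change at the second epoch (N1 to N2) is not an error, because the other scorer had assigned N2 there. The change at the seventh epoch (N2 to N3) is an error, because no manual session assigned N3 to that epoch.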

Conclusions: Providing digitally obtained information about sleep depth, delta duration, spindles and K complexes during manual scoring can greatly reduce interrater variability in sleep staging by eliminating the guesswork in scoring epochs with equivocal features.

Keywords: PSG; automated scoring; interobserver variability; sleep stages.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Diagnosis, Computer-Assisted / methods*
  • Female
  • Humans
  • Male
  • Middle Aged
  • Observer Variation*
  • Polysomnography / methods*
  • Polysomnography / statistics & numerical data*
  • Reproducibility of Results
  • Sleep Stages / physiology*