The American Academy of Sleep Medicine Inter-scorer Reliability program: respiratory events
- PMID: 24733993
- PMCID: PMC3960390
- DOI: 10.5664/jcsm.3630
Abstract
Study objectives: The American Academy of Sleep Medicine (AASM) Inter-scorer Reliability program provides a unique opportunity to compare a large number of scorers with varied levels of experience to determine agreement in the scoring of respiratory events. The objective of this paper is to examine areas of disagreement to inform future revisions of the AASM Manual for the Scoring of Sleep and Associated Events.
Methods: The sample included 15 monthly records, 200 epochs each. The number of scorers increased steadily during the period of data collection, reaching more than 3,600 scorers by the final record. Scorers were asked to identify whether an obstructive, mixed, or central apnea; a hypopnea; or no event was seen in each of the 200 epochs. The "correct" respiratory event score was defined as the score endorsed by the most scorers. Percentage agreement with the majority score was determined for each epoch and the mean agreement determined.
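The majority-score calculation described in the Methods can be sketched as follows. This is a minimal illustration, not the program's actual analysis code: the function name `majority_agreement`, the label strings, and the sample data are all hypothetical. It assumes every epoch has at least one score, that ties are broken by whichever label was encountered first, and that "mean agreement" is the unweighted mean of the per-epoch percentages.

```python
from collections import Counter


def majority_agreement(scores_by_epoch):
    """Return per-epoch (majority label, % agreement) and the mean % agreement.

    scores_by_epoch: list of lists; each inner list holds the label that
    every scorer assigned to one epoch (hypothetical label strings).
    """
    per_epoch = []
    for labels in scores_by_epoch:
        counts = Counter(labels)
        # The "correct" score is the label endorsed by the most scorers.
        # Assumption: ties go to the first label encountered.
        majority_label, majority_count = counts.most_common(1)[0]
        percent = 100.0 * majority_count / len(labels)
        per_epoch.append((majority_label, percent))

    # Assumption: mean agreement is the unweighted mean across epochs.
    mean_percent = sum(p for _, p in per_epoch) / len(per_epoch)
    return per_epoch, mean_percent


# Illustrative data only: two epochs, scored by four and five scorers.
epochs = [
    ["none", "none", "none", "hypopnea"],
    ["obstructive", "hypopnea", "obstructive", "obstructive", "central"],
]
per_epoch, mean_percent = majority_agreement(epochs)
print(per_epoch)
print(mean_percent)
```

In this toy input, three of four scorers choose "none" for the first epoch and three of five choose "obstructive" for the second, so the per-epoch agreements are 75% and 60%.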
Results: The overall agreement for scoring of respiratory events was 93.9% (κ = 0.92). There was very high agreement on epochs without respiratory events (97.4%), and the majority score for most of the epochs (87.8%) was no event. For the 364 epochs scored as having a respiratory event, overall agreement that some type of respiratory event occurred was 88.4% (κ = 0.77). The agreement for epochs scored as obstructive apnea by the majority was 77.1% (κ = 0.71), and the most common disagreement was hypopnea rather than obstructive apnea (14.4%). The agreement for hypopnea was 65.4% (κ = 0.57), with 16.4% scoring no event and 14.8% scoring obstructive apnea. The agreement for central apnea was 52.4% (κ = 0.41). A single epoch was scored as a mixed apnea by a plurality of scorers.
Conclusions: The study demonstrated excellent agreement among a large sample of scorers for epochs with no respiratory events. Agreement for some type of event was good, but disagreements in scoring of apnea vs. hypopnea and type of apnea were common. A limitation of the analysis is that most of the records had normal breathing. A review of controversial events yielded no consistent bias that might be resolved by a change of scoring rules.
Keywords: Scoring; apnea; hypopnea; reliability; respiratory events.
