Psychopharmacol Bull. 1997;33(1):59-67.

Interrater reliability issues in multicenter trials, Part II: Statistical procedures used in Department of Veterans Affairs Cooperative Study #394

PMID: 9133752

R Edson et al. Psychopharmacol Bull. 1997.

Abstract

The primary goal of Veterans Affairs (VA) Cooperative Study (CS) #394 is to determine if vitamin E is a safe and efficacious treatment for tardive dyskinesia (TD). The study uses various instruments to assess subjects for movement disorders (Abnormal Involuntary Movement Scale [AIMS], and Barnes Akathisia Scale [BAS]), psychopathology (Anchored Brief Psychiatric Rating Scale [BPRS]), and level of functioning (Global Assessment of Functioning scale [GAF]). Since the study involves nine sites, each with its own set of raters, it is important to establish and maintain high interrater reliability (IRR) on these instruments throughout the study and to identify raters who differ significantly from the others. To make this determination, personnel at each site assessed subjects from standardized videotapes on the AIMS, BAS, and Anchored BPRS, and rated written vignettes on the GAF. We fit these data to a two-way additive model to identify nonstandardized raters (i.e., those whose average ratings were significantly lower or higher than the others, or those whose scores, after adjusting for subject and rater effects, were highly variable). The proportion of nonstandardized raters ranged from 7 percent (Anchored BPRS) to 33 percent (AIMS). The estimated intraclass correlation coefficients (ICCs) indicated moderate reliability for the AIMS, BAS, and Anchored BPRS (0.73 to 0.75) and excellent agreement for the GAF (0.90). The companion article (Part I: Tracy et al. 1997, page 53 of this issue) describes the procedures used to train the raters for this study.
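The two-way additive model and the ICC mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which ICC form, software, or flagging thresholds the authors used, so everything below is an assumption: the sketch fits score = grand mean + subject effect + rater effect + residual to a complete subjects-by-raters table and computes the two-way random-effects, single-rater ICC (often written ICC(2,1)). The function names and the example ratings are hypothetical and are not data from the study.

```python
from statistics import mean

def two_way_additive(ratings):
    """Fit score = grand + subject effect + rater effect + residual.

    ratings[i][j] is the score rater j gave subject i (complete table).
    """
    n, k = len(ratings), len(ratings[0])
    grand = mean(x for row in ratings for x in row)
    subj = [mean(row) - grand for row in ratings]
    rater = [mean(ratings[i][j] for i in range(n)) - grand for j in range(k)]
    resid = [[ratings[i][j] - grand - subj[i] - rater[j] for j in range(k)]
             for i in range(n)]
    return grand, subj, rater, resid

def rater_residual_variance(ratings):
    """Per-rater variance of residuals after removing subject and rater effects."""
    n = len(ratings)
    _, _, _, resid = two_way_additive(ratings)
    return [sum(resid[i][j] ** 2 for i in range(n)) / (n - 1)
            for j in range(len(ratings[0]))]

def icc_2_1(ratings):
    """Two-way random-effects, single-rater ICC (assumed form, not confirmed by the source)."""
    n, k = len(ratings), len(ratings[0])
    _, subj, rater, resid = two_way_additive(ratings)
    ms_subj = k * sum(s * s for s in subj) / (n - 1)       # between-subject mean square
    ms_rater = n * sum(r * r for r in rater) / (k - 1)     # between-rater mean square
    ms_err = sum(e * e for row in resid for e in row) / ((n - 1) * (k - 1))
    return (ms_subj - ms_err) / (
        ms_subj + (k - 1) * ms_err + k * (ms_rater - ms_err) / n)

# Hypothetical ratings: 4 videotaped subjects scored by 3 raters.
example = [[4, 5, 4], [2, 3, 2], [6, 7, 5], [1, 2, 1]]
_, _, rater_effects, _ = two_way_additive(example)
print("rater effects:", rater_effects)
print("residual variance per rater:", rater_residual_variance(example))
print("ICC(2,1):", icc_2_1(example))
```

In this sketch, a rater whose estimated effect is far from zero rates systematically higher or lower than the group, and a rater with a large residual variance is inconsistent even after subject and rater effects are removed. These correspond loosely to the two kinds of nonstandardized rater the abstract describes; the significance tests the authors applied are not given in the abstract and are not reproduced here.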
