Clin Rehabil. 1998 Jun;12(3):187-99.
doi: 10.1191/026921598672178340.

Reliability of assessment tools in rehabilitation: an illustration of appropriate statistical analyses

G Rankin et al. Clin Rehabil. 1998 Jun.

Abstract

Objective: To provide a practical guide to appropriate statistical analysis of a reliability study using real-time ultrasound for measuring muscle size as an example.

Design: Inter-rater and intra-rater (between-scans and between-days) reliability.

Subjects: Ten normal subjects (five male) aged 22-58 years.

Method: The cross-sectional area (CSA) of the anterior tibial muscle group was measured using real-time ultrasonography.

Main outcome measures: Intraclass correlation coefficients (ICCs) and the 95% confidence interval (CI) for the ICCs, and Bland and Altman method for assessing agreement, which includes calculation of the mean difference between measures (d), the 95% CI for d, the standard deviation of the differences (SDdiff), the 95% limits of agreement and a reliability coefficient.
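The abstract names the ICC forms but does not give their formulas. As a minimal sketch, assuming the widely used Shrout and Fleiss ANOVA-based definitions that the (1,1) and (3,1) notation suggests, the coefficients can be computed from a subjects-by-measurements table as follows. The function names and the example values are hypothetical and are not taken from the study; confidence intervals for the ICC are not covered here.

```python
from statistics import mean


def anova_mean_squares(scores):
    """Return (BMS, WMS, EMS) for a table with one row per subject
    and one column per rater (or per repeated scan)."""
    n = len(scores)        # number of subjects
    k = len(scores[0])     # measurements per subject
    grand = mean([x for row in scores for x in row])
    row_means = [mean(row) for row in scores]
    col_means = [mean([row[j] for row in scores]) for j in range(k)]

    ss_between = k * sum((m - grand) ** 2 for m in row_means)
    ss_within = sum((x - m) ** 2
                    for row, m in zip(scores, row_means) for x in row)
    ss_raters = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_within - ss_raters

    bms = ss_between / (n - 1)            # between-subjects mean square
    wms = ss_within / (n * (k - 1))       # within-subjects mean square
    ems = ss_error / ((n - 1) * (k - 1))  # residual mean square
    return bms, wms, ems


def icc_1_1(scores):
    """One-way random effects, single measurement."""
    bms, wms, _ = anova_mean_squares(scores)
    k = len(scores[0])
    return (bms - wms) / (bms + (k - 1) * wms)


def icc_3_1(scores):
    """Two-way mixed effects (fixed raters), single measurement."""
    bms, _, ems = anova_mean_squares(scores)
    k = len(scores[0])
    return (bms - ems) / (bms + (k - 1) * ems)


def icc_1_k(scores):
    """One-way random effects, mean of k measurements."""
    bms, wms, _ = anova_mean_squares(scores)
    return (bms - wms) / bms


# Hypothetical CSA values in cm² (illustration only, not study data):
# each row is one subject, each column one rater.
example = [[8.1, 8.4], [9.5, 9.2], [7.2, 7.9], [10.3, 10.1], [8.8, 9.4]]
print(round(icc_3_1(example), 2), round(icc_1_1(example), 2))
```

Choosing ICC (3,1) treats the raters as the only raters of interest, whereas ICC (1,1) makes no use of the rater or occasion structure, which is one plausible reason for reporting different forms for inter-rater and between-scans comparisons.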

Results: Inter-rater reliability was high: ICC (3,1) was 0.92, with a 95% CI of 0.72 to 0.98. There was reasonable agreement between measures on the Bland and Altman test: d was -0.63 cm², the 95% CI for d was -1.4 to 0.14 cm², the SDdiff was 1.08 cm², the 95% limits of agreement were -2.73 to 1.53 cm² and the reliability coefficient was 2.4. Between-scans repeatability was high: ICCs (1,1) were 0.94 and 0.93, with 95% CIs of 0.8 to 0.99 and 0.75 to 0.98, for days 1 and 2 respectively. Measures showed good agreement on the Bland and Altman test: d was 0.15 cm² for day 1 and -0.32 cm² for day 2; the 95% CIs for d were -0.51 to 0.81 cm² for day 1 and -0.98 to 0.34 cm² for day 2; SDdiff was 0.93 cm² for both days; the 95% limits of agreement were -1.71 to 2.01 cm² for day 1 and -2.18 to 1.54 cm² for day 2; the reliability coefficient was 1.80 for day 1 and 1.88 for day 2. The between-days ICC (1,2) was 0.92, with a 95% CI of 0.69 to 0.98. The d was -0.98 cm², the SDdiff was 1.25 cm², the 95% limits of agreement were -3.48 to 1.52 cm² and the reliability coefficient was 2.8. The 95% CI for d (-1.88 to -0.08 cm²) and the distribution graph showed a bias towards a larger measurement on day 2.
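The Bland and Altman quantities reported above are linked by simple arithmetic, which the sketch below illustrates using the between-days summary values. It rests on assumptions not stated in the abstract: that the limits of agreement are d ± 2 × SDdiff (a multiplier of 1.96 is also common), and that the CI for d uses the t distribution with n - 1 degrees of freedom, for which the tabulated critical value with 10 subjects is 2.262. The function names are hypothetical. The reliability coefficient is left out because the abstract does not define it.

```python
from math import sqrt
from statistics import mean, stdev


def paired_differences(first, second):
    """Mean difference (d) and SD of the differences (SDdiff)
    for two equally long lists of paired measurements."""
    diffs = [a - b for a, b in zip(first, second)]
    return mean(diffs), stdev(diffs)


def bland_altman_summary(d, sd_diff, n, t_crit, multiplier=2.0):
    """95% CI for d and 95% limits of agreement from summary values.

    t_crit     : two-sided 5% critical value of t with n - 1 df,
                 supplied by the caller (2.262 for n = 10).
    multiplier : assumed to be 2.0 here; 1.96 is also widely used.
    """
    se = sd_diff / sqrt(n)  # standard error of the mean difference
    ci_d = (d - t_crit * se, d + t_crit * se)
    limits = (d - multiplier * sd_diff, d + multiplier * sd_diff)
    return ci_d, limits


# Between-days values as reported: d = -0.98 cm², SDdiff = 1.25 cm², n = 10.
ci_d, limits = bland_altman_summary(-0.98, 1.25, 10, t_crit=2.262)
print([round(x, 2) for x in ci_d], [round(x, 2) for x in limits])
```

Under these assumptions the limits of agreement come out at -3.48 to 1.52 cm², matching the reported between-days values, and the CI for d lies within about 0.01 cm² of the reported -1.88 to -0.08 cm², a gap that rounding of the published d and SDdiff could explain. Because that interval excludes zero, it points to a systematic difference between days, which is the bias the abstract describes.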

Conclusions: The ICC and Bland and Altman tests are appropriate for analysis of reliability studies of similar design to that described, but neither test alone provides sufficient information and it is recommended that both are used.
