The GRACE checklist for rating the quality of observational studies of comparative effectiveness: a tale of hope and caution
- PMID: 24564810
- PMCID: PMC10437555
- DOI: 10.18553/jmcp.2014.20.3.301
Abstract
Background: While there is growing demand for information about comparative effectiveness (CE), there is substantial debate about whether and when observational studies have sufficient quality to support decision making.
Objective: To develop and test a checklist that can be used to identify observational CE studies that are sufficiently rigorous in design and execution to contribute meaningfully to the evidence base for decision support.
Methods: An 11-item checklist about data and methods (the GRACE checklist) was developed through literature review and consultation with experts from professional societies, payer groups, the private sector, and academia. Because no single gold standard exists for validation, checklist item responses were compared with 3 types of external quality ratings for published articles of observational CE or safety studies (N=88 articles): (a) Systematic Review: quality assessment from a published systematic review; (b) Single Expert Review: quality assessment made according to the solicited "expert opinion" of a senior researcher; and (c) Concordant Expert Review: quality assessments from 2 experts for which there was concordance. The articles compared treatment effectiveness and/or safety of drugs, medical devices, and medical procedures. Volunteers (N=113) from 5 continents completed 280 article assessments using the checklist. Positive and negative predictive values (PPV and NPV, respectively) of individual items were estimated to compare testers' assessments with those of experts.
Results: Taken as a whole, the scale had better NPV than PPV, for both data and methods. The most consistent predictor of quality relates to the validity of the primary outcomes measurement for the study purpose. Other consistent markers of quality relate to using concurrent comparators, minimizing the effects of bias by prudent choice of covariates, and using sensitivity analysis to test robustness of results. Concordance of expert opinion on the quality of the rated articles was 52%; most checklist items performed better.
Conclusions: The 11-item GRACE checklist provides guidance to help determine which observational studies of CE have used strong scientific methods and good data that are fit for purpose and merit consideration for decision making. The checklist contains a parsimonious set of elements that can be objectively assessed in published studies, and user testing shows that it can be successfully applied to studies of drugs, medical devices, and clinical and surgical interventions. Although no scoring is provided, study reports that rate relatively well across checklist items merit in-depth examination to understand applicability, effect size, and likelihood of residual bias. The current testing and validation efforts did not achieve clear discrimination between studies that are fit for purpose and those that are not, but we have identified a critical, though remediable, limitation in our approach. Because we did not specify a single granular decision for evaluation, or identify a single study objective in reports that included more than one, reviewers were left with too broad an assessment challenge. We believe that future efforts will be more successful if reviewers are asked to focus on a specific objective or question. Despite the challenges encountered in this testing, an agreed-upon set of assessment elements, checklists, or score cards is critical for the maturation of this field. Substantial resources will be expended on studies of real-world effectiveness, and if the rigor of these observational assessments cannot be assessed, then the impact of the studies will be suboptimal. Similarly, agreement on key elements of quality will ensure that budgets are appropriately directed toward those elements. Given the importance of this task and the lessons learned from these extensive efforts at validation and user testing, we are optimistic about the potential for improved assessments that can be used for diverse situations by people with a wide range of experience and training.
Future testing would benefit from (a) directing reviewers to address a single, granular research question, which would avoid the problems that arose when the checklist was used to evaluate multiple objectives; (b) using other types of validation test sets; and (c) employing further multivariate analysis to see whether any combination or sequence of item responses has particularly high predictive validity.
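The per-item positive and negative predictive values described in the Methods can be illustrated with a minimal sketch. It assumes each assessment is reduced to a pair of booleans: whether the tester's response to a checklist item was favorable, and whether the external rating judged the article to be of good quality. The function name `predictive_values` and the sample data are hypothetical and are not taken from the study.

```python
from typing import Iterable, Tuple


def predictive_values(pairs: Iterable[Tuple[bool, bool]]) -> Tuple[float, float]:
    """Return (PPV, NPV) for one checklist item.

    Each pair is (item_favorable, expert_good):
      item_favorable - tester's response to the item was favorable
      expert_good    - external rating judged the article good quality
    """
    tp = fp = tn = fn = 0
    for item_favorable, expert_good in pairs:
        if item_favorable and expert_good:
            tp += 1  # favorable response, good article
        elif item_favorable and not expert_good:
            fp += 1  # favorable response, poor article
        elif not item_favorable and not expert_good:
            tn += 1  # unfavorable response, poor article
        else:
            fn += 1  # unfavorable response, good article

    # PPV: share of favorable responses that match a good external rating.
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    # NPV: share of unfavorable responses that match a poor external rating.
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return ppv, npv


# Hypothetical data for illustration only: 3 TP, 1 FP, 4 TN, 2 FN.
sample = (
    [(True, True)] * 3
    + [(True, False)] * 1
    + [(False, False)] * 4
    + [(False, True)] * 2
)
ppv, npv = predictive_values(sample)
print(f"PPV={ppv:.2f}, NPV={npv:.2f}")
```

Under this framing, an item with higher NPV than PPV is better at flagging articles that experts rate poorly than at confirming articles that experts rate well.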