Ann Appl Stat. 2024 Mar;18(1):729-748.
doi: 10.1214/23-aoas1809. Epub 2024 Jan 31.

COMPOSITE SCORES FOR TRANSPLANT CENTER EVALUATION: A NEW INDIVIDUALIZED EMPIRICAL NULL METHOD

Nicholas Hartman et al. Ann Appl Stat. 2024 Mar.

Abstract

Risk-adjusted quality measures are used to evaluate healthcare providers with respect to national norms while controlling for factors beyond their control. Existing healthcare provider profiling approaches typically assume that the between-provider variation in these measures is entirely due to meaningful differences in quality of care. However, in practice, much of the between-provider variation will be due to trivial fluctuations in healthcare quality, or unobservable confounding risk factors. If these additional sources of variation are not accounted for, conventional methods will disproportionately identify larger providers as outliers, even though their departures from the national norms may not be "extreme" or clinically meaningful. Motivated by efforts to evaluate the quality of care provided by transplant centers, we develop a composite evaluation score based on a novel individualized empirical null method, which robustly accounts for overdispersion due to unobserved risk factors, models the marginal variance of standardized scores as a function of the effective sample size, and only requires the use of publicly-available center-level statistics. The evaluations of United States kidney transplant centers based on the proposed composite score are substantially different from those based on conventional methods. Simulations show that the proposed empirical null approach more accurately classifies centers in terms of quality of care, compared to existing methods.

Keywords: Empirical null; End-stage renal disease; Provider profiling; Unmeasured confounders.
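
As a rough illustration of the problem the abstract describes, the sketch below simulates centers with no true difference in quality but with overdispersion that grows with sample size, then compares flagging rates before and after a robust re-standardization. This is not the authors' individualized empirical null method: it swaps in a much simpler stratified median/MAD rescaling, and the function names, the overdispersion model, and the value of `sigma2` are all assumptions made for the example. The two-sided 1.96 cutoff mirrors the −1.96 threshold mentioned in the Figure 8 caption.

```python
import random
import statistics


def robust_null(z_scores):
    """Return a robust (center, scale) estimate: median and scaled MAD."""
    center = statistics.median(z_scores)
    mad = statistics.median([abs(z - center) for z in z_scores])
    return center, 1.4826 * mad  # 1.4826 makes the MAD consistent for a normal


def stratified_empirical_null(z_fe, sizes, n_groups=4):
    """Re-standardize Z-scores within strata of similar sample size.

    A simplified stand-in for an empirical null correction (hypothetical helper).
    """
    order = sorted(range(len(z_fe)), key=lambda i: sizes[i])
    z_en = [0.0] * len(z_fe)
    for g in range(n_groups):
        lo = g * len(order) // n_groups
        hi = (g + 1) * len(order) // n_groups
        idx = order[lo:hi]
        center, scale = robust_null([z_fe[i] for i in idx])
        for i in idx:
            z_en[i] = (z_fe[i] - center) / scale
    return z_en


def flag_rate(z_scores, threshold=1.96):
    """Proportion of centers with |Z| above the threshold."""
    return sum(abs(z) > threshold for z in z_scores) / len(z_scores)


# Simulated data: no center truly differs in quality, but the variance of the
# naive Z-score grows with sample size (assumed overdispersion model).
random.seed(0)
sizes = [random.randint(20, 2000) for _ in range(2000)]
sigma2 = 0.002  # assumed extra variance per unit of sample size
z_fe = [random.gauss(0.0, (1.0 + sigma2 * n) ** 0.5) for n in sizes]
z_en = stratified_empirical_null(z_fe, sizes)

large = [i for i, n in enumerate(sizes) if n >= 1500]
print("naive flag rate, large centers:", flag_rate([z_fe[i] for i in large]))
print("re-standardized flag rate, large centers:", flag_rate([z_en[i] for i in large]))
print("naive flag rate, all centers:", flag_rate(z_fe))
print("re-standardized flag rate, all centers:", flag_rate(z_en))
```

Under these assumptions the naive scores flag large centers far more often than the nominal 5%, while the re-standardized scores flag at a rate close to nominal in every size stratum. Stratifying by size is only a crude way to let the null variance depend on sample size; the abstract describes modeling that dependence directly.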

Figures

Fig 1.
Funnel plots for four measures, using different standardization methods. TRR: Transplant Rate Ratio, SAR: Standardized Acceptance Ratio, PSMR: Patient Standardized Mortality Ratio, SGFR: Standardized Graft Failure Ratio. Centers that fall outside of the control limit lines are considered outliers. The solid and dotted lines correspond to the control limits based on fixed-effects standardization and empirical null standardization, respectively. The dot-dashed line represents the null value of one. One center with extremely large measure values is excluded for visual clarity.
Fig 2.
Flagging probability for different quality of care effect sizes, with (a) no outliers and (b) 10% outliers. The solid, dashed, and dotted lines correspond to fixed-effects, method-of-moments, and empirical null standardization respectively. The fixed-effects approach does not account for overdispersion, the method-of-moments is an existing correction (Section 1), and the empirical null is the proposed correction.
Fig 3.
(a): Flagging probability and (b): Average estimate of $\sigma_{\alpha k}^2$ (the variance of the unobserved quantity, $\alpha_{ik}$), for different tuning parameter q values. The dashed and dotted lines correspond to method-of-moments and empirical null standardization respectively. In (b), the solid line corresponds to the true value of 0.14. The method-of-moments is an existing correction (Section 1), and the empirical null is the proposed correction.
Fig 4.
Flagging probabilities based on a simulated composite score of two measures, for four centers of interest: Center 1 with poor access and poor outcomes, Center 2 with good access and good outcomes, Center 3 with poor access and good outcomes, and Center 4 with good access and poor outcomes. The horizontal axis corresponds to the quality of care effect size with respect to the first measure (i.e., $\gamma_{i1}^{*}$ in Section 2.2). The solid, dashed, and dotted lines correspond to fixed-effects, method-of-moments, and empirical null standardization respectively. The fixed-effects approach does not account for overdispersion, the method-of-moments is an existing correction (Section 1), and the empirical null is the proposed correction.
Fig 5.
(Top Row): Descriptive analyses based on naive Z-scores, $Z_{FE}$, using fixed-effects standardization, and (Bottom Row): Descriptive analyses based on the proposed Z-scores, $Z_{EN}$, using individualized empirical null standardization. In each row, (Left): Histogram of Z-scores for the Transplant Rate Ratio measure; Solid curve: standard normal distribution. (Middle): Variance of Z-score within each group of transplant centers. (Right): Proportion of centers flagged for providing extremely poor or good care.
Fig 6.
Plots of (a) Transplant Rate Ratio, (b) naive fixed-effects Z-score ($Z_{FE}$), and (c) proposed empirical null Z-score ($Z_{EN}$), against effective sample size. The sizes of the points are proportional to the effective sample size on the horizontal axis. In all plots, the solid line corresponds to the null value of the statistics, which represents average care that is consistent with national expectations.
Fig 7.
Heat map of center performance for the 20 centers that would be flagged as poor-performers using historical methods (based on the PSMR and SGFR measures only). Lower $Z_{EN}$ values (and darker shades) correspond to worse quality of care for the corresponding measures. Centers that fall above the horizontal line are flagged as poor-performers by the proposed composite score approach. TRR: Transplant Rate Ratio, SAR: Standardized Acceptance Ratio, PSMR: Patient Standardized Mortality Ratio, SGFR: Standardized Graft Failure Ratio.
Fig 8.
National distributions of the standardized Z-scores ($Z_{EN}$) for the composite score and each measure component. Lower Z-score values correspond to worse quality of care for the corresponding measures. The Z-score for Center 18 from Figure 7 is marked with a dashed line. The dark shaded region of the top panel represents the flagging region based on a threshold of −1.96. TRR: Transplant Rate Ratio, SAR: Standardized Acceptance Ratio, PSMR: Patient Standardized Mortality Ratio, SGFR: Standardized Graft Failure Ratio.
