BMC Med Res Methodol. 2022 Sep 24;22(1):247.
doi: 10.1186/s12874-022-01720-8.

Evaluating sensitivity to classification uncertainty in latent subgroup effect analyses

Wen Wei Loh et al. BMC Med Res Methodol. 2022.

Abstract

Background: Increasing attention is being given to assessing treatment effect heterogeneity among individuals belonging to qualitatively different latent subgroups. Inference routinely proceeds by first partitioning the individuals into subgroups, then estimating the subgroup-specific average treatment effects. However, because the subgroups are only latently associated with the observed variables, the actual individual subgroup memberships are rarely known with certainty in practice and thus have to be imputed. Ignoring the uncertainty in the imputed memberships amounts to assuming that no individuals are misclassified, potentially leading to biased results and incorrect conclusions.

Methods: We propose a strategy for assessing the sensitivity of inference to classification uncertainty when using such classify-analyze approaches for subgroup effect analyses. We exploit each individual's typically nonzero predictive or posterior subgroup membership probabilities to gauge the stability of the resultant subgroup-specific average causal effect estimates over different, carefully selected subsets of the individuals. Because the membership probabilities are subject to sampling variability, we propose Monte Carlo confidence intervals that explicitly acknowledge the imprecision in the estimated subgroup memberships via perturbations using a parametric bootstrap. The proposal is widely applicable and avoids stringent causal or structural assumptions that existing bias-adjustment or bias-correction methods rely on.
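The stability check described above can be illustrated with a minimal sketch. It covers only the idea of re-estimating a subgroup effect as individuals are added in order of decreasing membership probability; the perturbed Monte Carlo confidence intervals are not sketched here. The function name `stability_path`, the `min_per_arm` parameter, and the use of a plain difference in means as the effect estimator are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def stability_path(prob_k, z, y, min_per_arm=2):
    """Effect estimates as individuals join subgroup k one at a time.

    Hypothetical helper: individuals are ordered by decreasing membership
    probability for subgroup k, and a difference in mean outcomes
    (treated minus control) is computed for every leading subset.
    A difference in means is a stand-in; any estimator could be swapped in.
    """
    prob_k, z, y = (np.asarray(a, dtype=float) for a in (prob_k, z, y))
    order = np.argsort(-prob_k, kind="stable")   # most certain members first
    z_s, y_s = z[order], y[order]

    n1 = np.cumsum(z_s)                  # treated count in each leading subset
    n0 = np.cumsum(1.0 - z_s)            # control count in each leading subset
    s1 = np.cumsum(z_s * y_s)            # outcome sum among treated
    s0 = np.cumsum((1.0 - z_s) * y_s)    # outcome sum among controls

    est = np.full(len(y_s), np.nan)      # NaN until both arms are populated
    ok = (n1 >= min_per_arm) & (n0 >= min_per_arm)
    est[ok] = s1[ok] / n1[ok] - s0[ok] / n0[ok]
    return prob_k[order], est

# Simulated inputs (assumed, not real data): membership probabilities for
# one subgroup, a binary treatment, and an outcome with a unit effect.
rng = np.random.default_rng(0)
n = 200
prob_k = rng.beta(2, 2, size=n)
z = rng.integers(0, 2, size=n)
y = 1.0 * z + rng.normal(size=n)
probs_sorted, estimates = stability_path(prob_k, z, y)
```

Plotting `estimates` against the proportion of individuals included gives a curve whose flatness (or lack of it) around the imputed subgroup size indicates how sensitive the estimate is to borderline memberships.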

Results: Using two different publicly available real-world datasets, we illustrate how the proposed strategy supplements existing latent subgroup effect analyses to shed light on the potential impact of classification uncertainty on inference. First, individuals are partitioned into latent subgroups based on their medical and health history. Then within each fixed latent subgroup, the average treatment effect is assessed using an augmented inverse propensity score weighted estimator. Finally, the proposed sensitivity analysis reveals different subgroup-specific effects that are mostly insensitive to potential misclassification.
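As a rough illustration of the second step, the sketch below computes the standard augmented inverse propensity weighted (AIPW) estimate within each imputed subgroup. It assumes the fitted propensity scores (`e_hat`) and outcome predictions under treatment and control (`m1_hat`, `m0_hat`) are already available; how the authors fit these nuisance models is not stated in the abstract, and the function names `aipw_ate` and `subgroup_aipw` are hypothetical.

```python
import numpy as np

def aipw_ate(y, z, e_hat, m1_hat, m0_hat):
    """AIPW estimate of the average treatment effect and its standard error.

    y: outcomes; z: binary treatment; e_hat: fitted propensity scores;
    m1_hat, m0_hat: fitted outcome means under treatment and control.
    The nuisance fits are assumed to be supplied by the caller.
    """
    y, z, e_hat, m1_hat, m0_hat = (
        np.asarray(a, dtype=float) for a in (y, z, e_hat, m1_hat, m0_hat)
    )
    # Per-individual contribution: outcome-model contrast plus
    # inverse-probability-weighted residual corrections.
    phi = (
        m1_hat - m0_hat
        + z * (y - m1_hat) / e_hat
        - (1.0 - z) * (y - m0_hat) / (1.0 - e_hat)
    )
    est = phi.mean()
    se = phi.std(ddof=1) / np.sqrt(len(phi))
    return est, se

def subgroup_aipw(labels, y, z, e_hat, m1_hat, m0_hat):
    """Apply aipw_ate separately within each imputed subgroup label."""
    labels = np.asarray(labels)
    results = {}
    for k in np.unique(labels):
        idx = labels == k
        parts = (np.asarray(a)[idx] for a in (y, z, e_hat, m1_hat, m0_hat))
        results[k.item()] = aipw_ate(*parts)
    return results
```

Because `labels` holds imputed memberships, rerunning `subgroup_aipw` under alternative labelings is one simple way to see how much the subgroup estimates move when borderline individuals are reassigned.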

Conclusions: Our proposed sensitivity analysis is straightforward to implement, provides both graphical and numerical summaries, and readily permits assessing the sensitivity of any machine learning-based causal effect estimator to classification uncertainty. We recommend making such sensitivity analyses more routine in latent subgroup effect analyses.

Keywords: Causal inference; Finite mixture models; Latent class analysis; Parametric bootstrap; Perturbed confidence interval; Sensitivity analysis; Subgroup average treatment effect (ATE).

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Causal diagrams depicting different causal (or structural) relationships between the indicators of the latent class C and the “external” observed variables, such as treatment (Z) and outcome (Y). In the left column (subfigures a and c), no covariates are predictors of the latent class, so $(X_1, \ldots, X_p)$ are indicators. In the right column (subfigures b and d), a subset of the covariates $(X_1, \ldots, X_q)$ are (explanatory) predictors of the latent class and its indicators $(X_{q+1}, \ldots, X_p)$. In the top row (subfigures a and b), the indicators are conditionally independent of all external variables given the latent class, as represented by the absence of arrows linking the indicators with any other observed variables. In the bottom row (subfigures c and d), the indicators are permitted to affect, or be affected by, external observed variables, as represented by the red arrows. Rectangular nodes denote observed variables, while round nodes denote latent variables.
Fig. 2
Examples of plots for gauging the (in)stability of the subgroup-specific (average) treatment effect estimates. The plots are for three different subgroups from different datasets. In the top panel, the cumulative proportion of individuals (horizontal axis) whose subgroup membership probabilities are above a certain value (vertical axis) is plotted. The vertical broken line indicates the proportion of individuals imputed to that subgroup. In the bottom panel, the treatment effect estimates are plotted as individuals are added one at a time to that subgroup. The empty circles and error bars indicate the subgroup-specific effect estimates and 95% CIs, respectively, based on the imputed subgroup memberships.
Fig. 3
Plots for assessing the stability of the class-specific (average) treatment effect estimates in the lindner data. Details on how to interpret each plot are described in Section 3.2 and in the caption of Fig. 2. The endpoints of the perturbed CI for the effect within each class are plotted as horizontal dotted lines. The upper endpoint of the CI for class 2 is much larger than that of the other class and thus omitted to improve visualization. The sample average treatment effect is plotted as a horizontal solid gray line.
Fig. 4
Plots for assessing the stability of the class-specific (average) treatment effect estimates in the RHC data. Details on how to interpret each plot are described in Section 3.2 and in the caption of Fig. 2. The endpoints of the perturbed CI for the effect within each class are plotted as horizontal dotted lines. The CI for class 3 is much wider than those of the other classes and thus omitted to improve visualization. The sample average treatment effect is plotted as a horizontal solid gray line.

