ACM Trans Intell Syst Technol. 2025 Jun;16(3):71.
doi: 10.1145/3718097. Epub 2025 Jun 10.

SubgroupTE: Advancing Treatment Effect Estimation with Subgroup Identification

Seungyeon Lee et al. ACM Trans Intell Syst Technol. 2025 Jun.

Abstract

Precise estimation of treatment effects is crucial for accurately evaluating an intervention. Deep learning models have shown promising performance in learning counterfactual representations for treatment effect estimation (TEE), but most of them treat the entire population as a homogeneous group and overlook potential subgroups whose treatment effects and characteristics differ. This limitation restricts their ability to estimate treatment effects precisely and to provide targeted treatment recommendations. In this paper, we propose a novel treatment effect estimation model, named SubgroupTE, which incorporates subgroup identification into TEE. SubgroupTE identifies heterogeneous subgroups with different responses and estimates treatment effects more precisely by considering subgroup-specific treatment effects in the estimation process. In addition, we introduce an expectation-maximization (EM)-based training process that iteratively optimizes the estimation and subgrouping networks to improve both estimation and subgroup identification. Comprehensive experiments on synthetic and semi-synthetic datasets demonstrate that SubgroupTE outperforms existing treatment effect estimation and subgrouping models. Additionally, a real-world study demonstrates that SubgroupTE can enhance targeted treatment recommendations for patients with opioid use disorder (OUD) by combining subgroup identification with treatment effect estimation.

Keywords: Deep learning; Opioid use disorder; Subgroup analysis; Treatment effect estimation.

Figures

Fig. 1.
An overview of SubgroupTE. The proposed framework consists of three modules: 1) feature representation network, 2) subgrouping network, and 3) subgroup-informed prediction network. First, the feature representation network transforms the raw data into a meaningful representation for estimating treatment effects. Second, the subgrouping network assigns a subgroup probability vector to each data sample, aiming to maximize the homogeneity of estimated treatment effects within subgroups. Finally, the subgroup-informed prediction network integrates both subgroup information and patient representation to estimate treatment effects by subgroup.
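The three modules in this caption can be pictured as a single forward pass. The sketch below is illustrative only and is not the authors' implementation: the layer sizes, the single linear layer per module, the concatenation of the subgroup probability vector with the representation, and all names (`init_params`, `forward`, `W_rep`, `W_sub`, `W_y0`, `W_y1`) are assumptions, and no training (EM-based or otherwise) is shown.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def init_params(n_features, n_hidden, n_subgroups):
    # Hypothetical parameter layout: one weight matrix per module.
    scale = 0.1
    return {
        "W_rep": rng.normal(0, scale, (n_features, n_hidden)),
        "W_sub": rng.normal(0, scale, (n_hidden, n_subgroups)),
        "W_y0": rng.normal(0, scale, (n_hidden + n_subgroups, 1)),
        "W_y1": rng.normal(0, scale, (n_hidden + n_subgroups, 1)),
    }

def forward(params, x):
    h = relu(x @ params["W_rep"])        # 1) feature representation
    p = softmax(h @ params["W_sub"])     # 2) subgroup probability vector per sample
    z = np.concatenate([h, p], axis=1)   # 3) representation plus subgroup information
    y0 = (z @ params["W_y0"]).ravel()    # predicted outcome without treatment
    y1 = (z @ params["W_y1"]).ravel()    # predicted outcome with treatment
    return p, y1 - y0                    # subgroup probabilities, estimated effect

params = init_params(n_features=5, n_hidden=8, n_subgroups=3)
x = rng.normal(size=(4, 5))              # four synthetic samples
probs, effects = forward(params, x)
```

Each row of `probs` sums to one, so it can be read as a soft subgroup assignment; taking the arg-max per row would give a hard subgroup label.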
Fig. 2.
The boxplots of the treatment effect distribution for the identified subgroups on the (a) synthetic and (b) semi-synthetic datasets. The box spans from the first quartile to the third quartile of the data, with a line indicating the median. The whiskers extend from the box to encompass the 5th to 95th percentiles.
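The box and whisker positions described in this caption correspond to five percentiles of each subgroup's treatment effects. A minimal sketch, assuming NumPy's default linear interpolation between data points (the function name `box_stats` and the toy data are hypothetical):

```python
import numpy as np

def box_stats(effects):
    """Return the five values that define a box with 5th/95th percentile whiskers."""
    lo, q1, med, q3, hi = np.percentile(effects, [5, 25, 50, 75, 95])
    return {
        "whisker_low": float(lo),   # 5th percentile
        "q1": float(q1),            # first quartile (bottom of the box)
        "median": float(med),       # line inside the box
        "q3": float(q3),            # third quartile (top of the box)
        "whisker_high": float(hi),  # 95th percentile
    }

# Toy data: the integers 0..100, so each percentile equals its own rank.
stats = box_stats(np.arange(101, dtype=float))
```

When plotting with matplotlib, passing `whis=(5, 95)` to `boxplot` draws whiskers at the same percentiles instead of the default 1.5 times the interquartile range.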
Fig. 3.
Illustration of the trends in PEHE and variance within and across subgroups during the training phase.
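For readers unfamiliar with the quantities tracked in this figure, the sketch below shows one common way to compute them. PEHE is written here in its usual root-mean-squared form; the within-subgroup and across-subgroup variance definitions (mean of per-subgroup variances, and variance of subgroup means) are assumptions, since this page does not give the paper's exact formulas. All function names are hypothetical.

```python
import numpy as np

def pehe(tau_true, tau_pred):
    """Root of the mean squared error between true and estimated individual effects."""
    return float(np.sqrt(np.mean((tau_pred - tau_true) ** 2)))

def subgroup_variances(tau_pred, labels):
    """Return (within, across) variance of estimated effects for hard subgroup labels."""
    groups = np.unique(labels)
    # Within: average of the per-subgroup variances (lower means more homogeneous subgroups).
    within = float(np.mean([np.var(tau_pred[labels == g]) for g in groups]))
    # Across: variance of the subgroup means (higher means better separated subgroups).
    across = float(np.var([np.mean(tau_pred[labels == g]) for g in groups]))
    return within, across

tau_true = np.array([1.0, 1.0, 3.0, 3.0])
tau_pred = np.array([1.0, 2.0, 3.0, 4.0])
labels = np.array([0, 0, 1, 1])

error = pehe(tau_true, tau_pred)
within, across = subgroup_variances(tau_pred, labels)
```

Under these definitions, a subgrouping improves as the within value falls and the across value rises.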
Fig. 4.
Sensitivity analysis conducted for (a) Coefficient and (b) Number of subgroups on the semi-synthetic dataset.
Fig. 5.
Illustration of the cohort selection criteria. The index date indicates the first prescription date of the drug. The baseline and follow-up periods encompass all dates before and after the index date, respectively.
Fig. 6.
The boxplots of the treatment effect distribution for the identified subgroups on the opioid dataset.
Fig. 7.
The heatmap of the relative ratios of variables for demographics and diagnosis codes among the three subgroups. These relative ratios are calculated using the formula $\pi_{k,i} / \sum_{k=1}^{K} \pi_{k,i}$, where $\pi_{k,i}$ represents the ratio of the $i$-th variable within the $k$-th subgroup. SMD: Substance-related mental disorders; NSD: Other nervous system disorders; SDB: Spondylosis, intervertebral disc disorders, or other back problems; CTB: Other connective tissue disease.
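The relative ratio in this caption divides each variable's within-subgroup ratio by the sum of that ratio over all K subgroups, so the values for one variable sum to one across subgroups. A minimal sketch with made-up numbers (the function name `relative_ratios` and the matrix `pi` are hypothetical; rows are subgroups, columns are variables):

```python
import numpy as np

def relative_ratios(pi):
    """Normalise a (K subgroups x V variables) ratio matrix column by column."""
    pi = np.asarray(pi, dtype=float)
    return pi / pi.sum(axis=0, keepdims=True)

# Illustrative values only: 3 subgroups, 2 variables.
pi = np.array([
    [0.2, 0.5],
    [0.2, 0.3],
    [0.4, 0.2],
])
rel = relative_ratios(pi)
```

Because each column is normalised separately, a rare variable and a common one become comparable in the heatmap: the values show which subgroup a variable is concentrated in, not how frequent it is overall.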

