Evaluating the Effectiveness of Personalized Medicine With Software

Adam Kapelner¹, Justin Bleich², Alina Levine¹, Zachary D Cohen³, Robert J DeRubeis³, Richard Berk²

Affiliations

¹ Department of Mathematics, Queens College, CUNY, Queens, NY, United States.
² Department of Statistics, The Wharton School of the University of Pennsylvania, Philadelphia, PA, United States.
³ Department of Psychology, University of Pennsylvania, Philadelphia, PA, United States.

PMID: 34085036
PMCID: PMC8167073
DOI: 10.3389/fdata.2021.572532

Adam Kapelner et al. Front Big Data. 2021 May 18;4:572532. doi: 10.3389/fdata.2021.572532. eCollection 2021.

Abstract

We present methodological advances in understanding the effectiveness of personalized medicine models and supply easy-to-use open-source software. Personalized medicine involves the systematic use of individual patient characteristics to determine which treatment option is most likely to result in a better average outcome for the patient. Why is personalized medicine not done more in practice? One of many reasons is that practitioners have no easy way to holistically evaluate whether their personalization procedure does better than the standard of care, a quantity we term improvement. Our software, "Personalized Treatment Evaluator" (the R package PTE), provides out-of-sample inference for improvement in many clinical scenarios. We also extend current methodology by allowing evaluation of improvement when the endpoint is binary or survival. In the software, the practitioner inputs 1) data from a single-stage randomized trial with one continuous, incidence, or survival endpoint and 2) an educated guess of a functional form of a model for the endpoint, constructed from domain knowledge. The bootstrap is then employed on data unseen during model fitting to provide confidence intervals for the improvement for the average future patient (assuming future patients are similar to the patients in the trial). One may also test against a null scenario in which the hypothesized personalization is no more useful than a standard of care. We demonstrate our method's promise on simulated data as well as on data from a randomized comparative trial investigating two treatments for depression.
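
To make the notion of improvement concrete, here is a minimal Python sketch of one way to estimate it on held-out data for a continuous endpoint where higher values are better. It is an illustration under stated assumptions, not the PTE package's code or API: the function names (`fit_arm`, `predict`, `improvement_vs_random`), the per-arm linear model, the simulated trial, and the choice to estimate the rule's value as the mean outcome among held-out patients whose randomized treatment happened to match the recommendation are all assumptions made for this example.

```python
import numpy as np

def fit_arm(x, y):
    """Least-squares fit of y ~ 1 + x within one treatment arm."""
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def predict(coef, x):
    """Predicted outcome from an intercept-plus-slope fit."""
    return coef[0] + coef[1] * x

def improvement_vs_random(x_tr, a_tr, y_tr, x_te, a_te, y_te):
    """Held-out mean outcome under the personalized rule minus the
    held-out mean outcome under random (trial-like) allocation."""
    coef0 = fit_arm(x_tr[a_tr == 0], y_tr[a_tr == 0])
    coef1 = fit_arm(x_tr[a_tr == 1], y_tr[a_tr == 1])
    # Recommend the arm with the larger predicted outcome.
    recommended = (predict(coef1, x_te) > predict(coef0, x_te)).astype(int)
    # Patients whose randomized arm matches the recommendation stand in
    # for what would happen if the rule had been followed.
    followed = a_te == recommended
    return y_te[followed].mean() - y_te.mean()

# Hypothetical simulated 1:1 randomized trial with a treatment-covariate interaction.
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
a = rng.integers(0, 2, size=n)
y = 1.0 + 2.0 * x * (2 * a - 1) + rng.normal(size=n)

half = n // 2
est = improvement_vs_random(x[:half], a[:half], y[:half],
                            x[half:], a[half:], y[half:])
print(f"estimated improvement over random allocation: {est:.2f}")
```

Because the model is fit on one half of the data and the improvement is computed on the other half, the estimate reflects performance on patients unseen during model fitting, which is the property the abstract emphasizes.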

Keywords: bootstrap; inference; personalized medicine; randomized comparative trial; statistical software; treatment regimes.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

**FIGURE 1**
A graphical illustration of (1) our proposed method for estimation, (2) our proposed method for inference on the population mean improvement of an allocation procedure, and (3) our proposed future allocation procedure (top left of the illustration). To compute the best estimate of the improvement $\hat{I}_{0}$, the RCT data goes through the K-fold cross-validation procedure of Section 3.4 (depicted in the top center). The black slices of the data frame represent the test data. To draw inference, we employ the non-parametric bootstrap procedure of Section 3.5 by sampling the RCT data with replacement and repeating the K-fold CV to produce $\hat{I}_{0}^{1}, \hat{I}_{0}^{2}, \dots, \hat{I}_{0}^{B}$ (bottom). The gray slices of the data frame represent the duplicate rows in the original data due to sampling with replacement. The confidence interval and the significance of $H_{0}: \mu_{I_{0}} \leq 0$ are computed from the bootstrap distribution (middle center). Finally, the practitioner receives $\hat{f}$, which is built with the complete RCT data (top left).
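
The pipeline in this caption can be sketched roughly as follows in Python. This is a hedged illustration rather than the procedure of Sections 3.4 and 3.5 or the PTE implementation: the function names (`cv_improvement`, `bootstrap_improvement`), the per-arm straight-line model, the pooling of out-of-fold recommendations into a single estimate, the simulated data, and the p-value computed as the share of bootstrap draws at or below zero are all assumptions chosen for the example.

```python
import numpy as np

def cv_improvement(x, a, y, k, rng):
    """K-fold cross-validated improvement over random allocation.
    Each patient's recommendation comes from a model that never saw
    that patient's fold (higher outcome assumed better)."""
    n = len(y)
    folds = np.array_split(rng.permutation(n), k)
    recommended = np.empty(n, dtype=int)
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        # One straight-line fit per arm; polyfit returns [slope, intercept].
        fit0 = np.polyfit(x[train & (a == 0)], y[train & (a == 0)], 1)
        fit1 = np.polyfit(x[train & (a == 1)], y[train & (a == 1)], 1)
        recommended[test_idx] = (
            np.polyval(fit1, x[test_idx]) > np.polyval(fit0, x[test_idx])
        )
    followed = a == recommended
    return y[followed].mean() - y.mean()

def bootstrap_improvement(x, a, y, k=5, b=200, alpha=0.05, seed=0):
    """Point estimate, percentile confidence interval, and a simple
    bootstrap p-value for the null that mean improvement is <= 0."""
    rng = np.random.default_rng(seed)
    n = len(y)
    estimate = cv_improvement(x, a, y, k, rng)
    draws = np.empty(b)
    for i in range(b):
        idx = rng.integers(0, n, size=n)  # resample rows with replacement
        draws[i] = cv_improvement(x[idx], a[idx], y[idx], k, rng)
    lo, hi = np.quantile(draws, [alpha / 2, 1 - alpha / 2])
    p_value = float(np.mean(draws <= 0))
    return estimate, (float(lo), float(hi)), p_value

# Hypothetical simulated 1:1 randomized trial.
data_rng = np.random.default_rng(1)
n = 400
x = data_rng.normal(size=n)
a = data_rng.integers(0, 2, size=n)
y = 1.0 + 2.0 * x * (2 * a - 1) + data_rng.normal(size=n)

estimate, (lo, hi), p_value = bootstrap_improvement(x, a, y)
print(f"estimate={estimate:.2f}, 95% CI=({lo:.2f}, {hi:.2f}), p={p_value:.3f}")
```

Each bootstrap replicate repeats the entire cross-validation, so the spread of the replicates reflects variability from both model fitting and evaluation. A final model for future patients would then be fit on the complete data, as the caption notes.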

**FIGURE 2**
Histograms of the bootstrap samples of the out-of-sample improvement measures for $d_{0}$ **random** (left column) and $d_{0}$ **best** (right column) for the response model of Eq. 11 for different values of n. $\hat{I}_{0}$ is illustrated with a thick black line. The $CI_{\mu_{I_{0}}, 95\%}$ computed by the percentile method is illustrated by thin black lines.

**FIGURE 3**
Histograms of the bootstrap samples of the cross-validated improvement measures for $d_{0}$ **random** (left column) and $d_{0}$ **best** (right column) for the response model of Eq. 12 for different values of n. $\hat{I}_{0}$ is illustrated with a thick black line. The $CI_{\mu_{I_{0}}, 95\%}$ computed via the percentile method is illustrated by thin black lines. The true population improvement $\mu_{I_{0}}^{*}$ given the optimal rule $d^{*}$ is illustrated with a dotted black line.

**FIGURE 4**
Histograms of the bootstrap samples of $\tilde{I}_{Rand}$, i.e., for the random $d_{0}$ business-as-usual allocation procedure. The thick black line is the best estimate of $\hat{I}_{0}$; the thin black lines are the confidence interval computed via the percentile method. More negative values are “better” because improvement is defined as lowering the HRSD composite score, which corresponds to a patient being less depressed.

Cited by

Mental health care for older adults: recent advances and new directions in clinical practice and research.
Reynolds CF 3rd, Jeste DV, Sachdev PS, Blazer DG. World Psychiatry. 2022 Oct;21(3):336-363. doi: 10.1002/wps.20996. PMID: 36073714. Free PMC article.
Applying methods for personalized medicine to the treatment of alcohol use disorder.
Kuhlemeier A, Desai Y, Tonigan A, Witkiewitz K, Jaki T, Hsiao YY, Chang C, Van Horn ML. J Consult Clin Psychol. 2021 Apr;89(4):288-300. doi: 10.1037/ccp0000634. PMID: 34014691. Free PMC article. Clinical Trial.
Mathematical modeling and mechanisms of HIV latency for personalized anti latency therapies.
Rasi G, Emili E, Conway JM, Cotugno N, Palma P. NPJ Syst Biol Appl. 2025 Jun 12;11(1):64. doi: 10.1038/s41540-025-00538-6. PMID: 40506472. Free PMC article. Review.
Measuring the Performance of Survival Models to Personalize Treatment Choices.
Efthimiou O, Hoogland J, Debray TPA, Aponte Ribero V, Knol W, Koek HL, Schwenkglenks M, Henrard S, Egger M, Rodondi N, White IR. Stat Med. 2025 Mar 30;44(7):e70050. doi: 10.1002/sim.70050. PMID: 40207416. Free PMC article.
Development of the treatment prediction model in the artificial intelligence in depression - medication enhancement study.
Benrimoh D, Armstrong C, Mehltretter J, Fratila R, Perlman K, Israel S, Kapelner A, Parikh SV, Karp JF, Heller K, Turecki G. Npj Ment Health Res. 2025 Jun 23;4(1):26. doi: 10.1038/s44184-025-00136-8. PMID: 40550942. Free PMC article.
