Clin Transl Sci. 2024 Nov;17(11):e70077.
doi: 10.1111/cts.70077.

Integrating real-world data and machine learning: A framework to assess covariate importance in real-world use of alternative intravenous dosing regimens for atezolizumab

Bianca Vora et al. Clin Transl Sci. 2024 Nov.

Abstract

The increase in the availability of real-world data (RWD), in combination with advances in machine learning (ML) methods, provides a unique opportunity for the integration of the two to explore complex clinical pharmacology questions. Here we present a recently developed RWD/ML framework that utilizes ML algorithms to understand the influence and importance of various covariates on the use of a given dose and schedule for drugs that have multiple approved dosing regimens. To demonstrate the application of this framework, we present atezolizumab as a use case on account of its three approved alternative intravenous (IV) dosing regimens. As expected, the real-world use of atezolizumab has generally been increasing since 2016 for the 1200 mg every 3 weeks regimen and since 2019 for the 1680 mg every 4 weeks regimen. Out of the ML algorithms evaluated, XGBoost performed the best, as measured by the area under the precision-recall curve, with an emphasis on the under-sampled class given the imbalance in the data. The importance of features was measured by Shapley Additive exPlanations (SHAP) values and showed metastatic breast cancer and use of protein-bound paclitaxel as the most correlated with the use of 840 mg every 2 weeks. Although patient usage data for alternative IV dosing regimens are still maturing, these analyses provide initial insights on the use of atezolizumab and set up a framework for the re-analysis of atezolizumab (at a future data cut) as well as application to other molecules with approved alternative dosing regimens.

Conflict of interest statement

All authors are employees and stockholders of Roche/Genentech, Inc.

Figures

FIGURE 1
Machine learning workflow to assess alternative dosing patterns using RWD. Selected cohort is split into training data (used for model development) and test data (used for model assessment). Based on predicted class distribution, the model is evaluated for performance. Shapley Additive exPlanations (SHAP) values are calculated to quantify feature contribution to RWD‐based outcomes. RWD, real‐world data; ML, machine learning; SES, socioeconomic status. Created in BioRender. Velasquez, E. (2024) https://biorender.com/o63s907.
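The train/test split step of this workflow can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the authors' code: the 70/30 ratio comes from the Figure 4 caption, while stratifying by class, the function name `stratified_split`, the seed, and the toy cohort sizes are all assumptions made for the example.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_fraction=0.3, seed=0):
    """Split row indices into train/test, keeping class proportions similar.

    Stratification is an assumption of this sketch; the source only states
    that the cohort is split into training and test data.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)

# Hypothetical, imbalanced toy cohort (counts are made up for illustration).
labels = ["1200 mg Q3W"] * 90 + ["840 mg Q2W"] * 10
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))  # 70 30
```

Splitting within each class keeps the minority regimen represented in both partitions, which matters when one class is much rarer than the other.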
FIGURE 2
Overview of data cleaning and cohort selection workflow. Compliance score is calculated as (number of intervals at most commonly used dosing regimen)/(total number of intervals). Data cutoff date: February 2024.
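The compliance score defined in this caption can be computed as follows. This is a sketch under stated assumptions: each dosing interval is taken to be already labeled with a regimen, and the function name `compliance_score` and the example patient are hypothetical.

```python
from collections import Counter

def compliance_score(interval_regimens):
    """Return (most common regimen, compliance score) for one patient.

    compliance = (number of intervals at the most commonly used regimen)
                 / (total number of intervals)
    """
    if not interval_regimens:
        raise ValueError("at least one dosing interval is required")
    regimen, count = Counter(interval_regimens).most_common(1)[0]
    return regimen, count / len(interval_regimens)

# Hypothetical patient: four Q3W intervals followed by one Q4W interval.
patient_intervals = ["1200 mg Q3W"] * 4 + ["1680 mg Q4W"]
regimen, score = compliance_score(patient_intervals)
print(regimen, score)  # 1200 mg Q3W 0.8
```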
FIGURE 3
Use of different atezolizumab intravenous dosing regimens over time shown (a) without 1200 mg Q3W and (b) with 1200 mg Q3W. Data from 2024 not shown in plots given early data cutoff (February 2024).
FIGURE 4
Precision–Recall (PR) curves for three binary classifiers. 840 mg Q2W is the positive class, and 1200 mg Q3W is the negative class. Data are randomized into training and test datasets with a 70/30 ratio, respectively. PR curves were generated using the test dataset, and only the PR curves for a single bootstrap are shown. XGBoost has the highest AP of the three models and is used for the rest of our assessments, with a mean [90% confidence interval] AP score of 0.9956 [0.9936, 0.9971] as compared to 0.9192 [0.9069, 0.9298] for Random Forest and 0.9228 [0.9059, 0.9381] for CatBoost. RF, Random Forest; XG, XGBoost; Cat, CatBoost; AP, average precision.
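For readers unfamiliar with the metric, average precision and a bootstrap interval around it can be sketched in plain Python. The bootstrap procedure used by the authors is not described here; this example simply resamples test-set predictions with replacement, which is an assumption, as are the function names, the toy labels, and the toy scores.

```python
import random

def average_precision(y_true, y_score):
    """AP = mean of precision at the rank of each positive (1 = positive class).

    Assumes distinct samples have distinct scores; tied scores are kept in
    input order rather than grouped into a single threshold.
    """
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("at least one positive sample is required")
    order = sorted(range(len(y_true)), key=lambda i: -y_score[i])
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            total += hits / rank
    return total / n_pos

def bootstrap_ap(y_true, y_score, n_boot=1000, seed=0):
    """Mean AP and a nearest-rank 90% interval from resampled predictions."""
    rng = random.Random(seed)
    n = len(y_true)
    scores = []
    while len(scores) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        yt = [y_true[i] for i in idx]
        if sum(yt) == 0:
            continue  # AP is undefined without positives; draw again
        scores.append(average_precision(yt, [y_score[i] for i in idx]))
    scores.sort()
    mean = sum(scores) / n_boot
    low = scores[int(0.05 * (n_boot - 1))]
    high = scores[int(0.95 * (n_boot - 1))]
    return mean, low, high

# Toy predictions: 1 = 840 mg Q2W (positive), 0 = 1200 mg Q3W (negative).
y_true = [1, 1, 0, 1, 0, 0]
y_score = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
ap = average_precision(y_true, y_score)  # (1/1 + 2/2 + 3/4) / 3 = 11/12
mean_ap, low_ap, high_ap = bootstrap_ap(y_true, y_score)
print(ap, mean_ap, low_ap, high_ap)
```

Because AP is computed only over the ranks of positive samples, it focuses on how well the rarer positive class is retrieved, which is why it is often preferred to ROC-based metrics on imbalanced data.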
FIGURE 5
SHAP value summaries for XGBoost model. (a) Beeswarm plot of features ranked highest to lowest from average magnitude of importance. Example SHAP values (portrayed using waterfall plots) for one patient belonging to the (b) negative class (1200 mg Q3W) and another patient belonging to the (c) positive class (840 mg Q2W). Waterfall plots summarize SHAP contributions, starting from the average prediction (E[f(x)], at the bottom of the plot) and ending at the individual prediction (f(x), at the top of the plot); features are listed as most to least important (top to bottom) and include their respective magnitude/directionality. f(x) closer to 0 represents an individual prediction of 840 mg Q2W, while f(x) closer to 1 represents an individual prediction of 1200 mg Q3W. Only the top 20 features are shown in (a).
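The additive property that the waterfall plots rely on, namely that the average prediction E[f(x)] plus the per-feature contributions equals the individual prediction f(x), can be checked on a toy model with exact Shapley values. This is an illustrative sketch only: it enumerates feature subsets by brute force rather than using the SHAP library's tree algorithm, and the scoring function, its coefficients, the background rows, and the patient vector are all made up.

```python
from itertools import combinations
from math import factorial

def shapley_values(f, x, background):
    """Exact Shapley values of f at x, averaging over background rows.

    Features outside a coalition are replaced by background values, so the
    empty coalition gives the average prediction and the full one gives f(x).
    """
    n = len(x)

    def value(subset):
        total = 0.0
        for b in background:
            z = [x[i] if i in subset else b[i] for i in range(n)]
            total += f(z)
        return total / len(background)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        contribution = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                s = set(s)
                contribution += weight * (value(s | {i}) - value(s))
        phi.append(contribution)
    return value(set()), phi

# Toy scoring function with made-up coefficients (not the fitted model).
def toy_model(z):
    return 0.1 + 0.5 * z[0] + 0.3 * z[1] * z[2]

background = [[0, 0, 0], [0, 1, 1], [1, 0, 1]]  # hypothetical reference rows
patient = [1, 1, 0]                             # hypothetical feature vector

base_value, contributions = shapley_values(toy_model, patient, background)
# Waterfall identity: base value + sum of contributions == f(patient)
print(base_value, contributions, toy_model(patient))
```

Brute-force enumeration scales exponentially with the number of features, so it is only practical for small demonstrations like this one.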
