Am J Epidemiol. 2018 Apr 1;187(4):871-878.

doi: 10.1093/aje/kwx317.

Data-Adaptive Estimation for Double-Robust Methods in Population-Based Cancer Epidemiology: Risk Differences for Lung Cancer Mortality by Emergency Presentation

Miguel Angel Luque-Fernandez¹, Aurélien Belot¹, Linda Valeri²,³, Giovanni Cerulli⁴, Camille Maringe¹, Bernard Rachet¹

Affiliations

¹ Faculty of Epidemiology and Population Health, Department of Non-Communicable Disease Epidemiology, Cancer Survival Group, London School of Hygiene and Tropical Medicine, London, United Kingdom.
² Laboratory for Psychiatric Biostatistics, McLean Hospital, Belmont, Massachusetts.
³ Harvard Medical School, Harvard University, Boston, Massachusetts.
⁴ National Research Council of Italy, Research Institute on Sustainable Economic Growth, Rome, Italy.

PMID: 29020131
PMCID: PMC5888939
DOI: 10.1093/aje/kwx317

Miguel Angel Luque-Fernandez et al. Am J Epidemiol. 2018.

Abstract

In this paper, we propose a structural framework for population-based cancer epidemiology and evaluate the performance of double-robust estimators for a binary exposure in cancer mortality. We conduct numerical analyses to study the bias and efficiency of these estimators. Furthermore, we compare 2 different model selection strategies based on 1) Akaike's Information Criterion and the Bayesian Information Criterion and 2) machine learning algorithms, and we illustrate double-robust estimators' performance in a real-world setting. In simulations with correctly specified models and near-positivity violations, all but the naive estimators had relatively good performance. However, the augmented inverse-probability-of-treatment weighting estimator showed the largest relative bias. Under dual model misspecification and near-positivity violations, all double-robust estimators were biased. Nevertheless, the targeted maximum likelihood estimator showed the best bias-variance trade-off, more precise estimates, and appropriate 95% confidence interval coverage, supporting the use of the data-adaptive model selection strategies based on machine learning algorithms. We applied these methods to estimate adjusted 1-year mortality risk differences in 183,426 lung cancer patients diagnosed after admittance to an emergency department versus persons with a nonemergency cancer diagnosis in England (2006-2013). The adjusted mortality risk (for patients diagnosed with lung cancer after admittance to an emergency department) was 16% higher in men and 18% higher in women, suggesting the importance of interventions targeting early detection of lung cancer signs and symptoms.
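To make the double-robust idea concrete, the following is a minimal sketch of an augmented inverse-probability-of-treatment weighting (AIPTW) estimator of a risk difference on simulated data. It is not the authors' implementation: the data-generating coefficients, the two covariates, the sample size, and the helper names (`fit_logistic`, `aiptw_risk_difference`) are all assumptions made for illustration, and both nuisance models are plain logistic regressions rather than the data-adaptive (AIC-BIC or ensemble learning) strategies compared in the paper.

```python
import numpy as np

def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, n_iter=25):
    """Logistic regression by Newton-Raphson; X must include an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = expit(X @ beta)
        grad = X.T @ (y - p)                        # score vector
        hess = X.T @ (X * (p * (1.0 - p))[:, None])  # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def aiptw_risk_difference(W, A, Y):
    """AIPTW risk difference with an influence-curve-based standard error."""
    n = len(Y)
    ones = np.ones(n)
    # Treatment model: g(W) = P(A = 1 | W)
    Xg = np.column_stack([ones, W])
    g = expit(Xg @ fit_logistic(Xg, A))
    # Outcome model: Q(A, W) = E(Y | A, W), predicted under A = 1 and A = 0
    bq = fit_logistic(np.column_stack([ones, A, W]), Y)
    Q1 = expit(np.column_stack([ones, ones, W]) @ bq)
    Q0 = expit(np.column_stack([ones, np.zeros(n), W]) @ bq)
    # Outcome-model predictions augmented by inverse-probability-weighted residuals
    d1 = A * (Y - Q1) / g + Q1
    d0 = (1 - A) * (Y - Q0) / (1.0 - g) + Q0
    ic = d1 - d0
    return ic.mean(), ic.std(ddof=1) / np.sqrt(n)

# Hypothetical simulation: two confounders, binary exposure, binary outcome
rng = np.random.default_rng(0)
n = 20_000
W = rng.normal(size=(n, 2))
A = rng.binomial(1, expit(-0.5 + 0.8 * W[:, 0] + 0.5 * W[:, 1]))
lin = -1.0 + 0.7 * W[:, 0] + 0.4 * W[:, 1]
Y = rng.binomial(1, expit(lin + 1.0 * A))

true_rd = np.mean(expit(lin + 1.0) - expit(lin))  # known only because data are simulated
naive = Y[A == 1].mean() - Y[A == 0].mean()       # unadjusted contrast
rd, se = aiptw_risk_difference(W, A, Y)
print(f"true={true_rd:.3f} naive={naive:.3f} aiptw={rd:.3f} (se={se:.3f})")
```

In this sketch both models are correctly specified, so the example only shows the mechanics of the estimator; it does not reproduce the misspecification or near-positivity scenarios studied in the simulations.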

Figures

**Figure 1.**
Directed acyclic graph for a proposed structural causal framework in population-based cancer research. Conditional exchangeability of the treatment effect or exposure (A) on 1-year cancer mortality (Y) is obtained through conditioning on a set of available covariates (Y₁,Y₀ ⊥ A|W). The minimum sufficient set, based on the backdoor criterion, is obtained through conditioning on only W₁, W₃, and W₄. The average treatment effect for the structural framework is estimated as the average risk difference between the expected effect of the treatment conditional on W among treated persons (E(Y|A = 1; W)) and the expected effect of the treatment conditional on W among the untreated (E(Y|A = 0; W)). W₁, socioeconomic status; W₂, age; W₃, cancer stage; W₄, comorbidity.
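As a reading aid, the estimand described in this caption can be written out as follows. This is a standard identification result stated under the caption's conditional exchangeability assumption plus positivity, which is assumed here; it is not a formula quoted from the paper.

$$
\mathrm{ATE} = E(Y_1) - E(Y_0) = E_W\big[\,E(Y \mid A = 1, W) - E(Y \mid A = 0, W)\,\big]
$$

Here the outer expectation averages over the distribution of the covariates W in the whole population, and W may be restricted to the minimum sufficient set (W₁, W₃, W₄).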

**Figure 2.**
Overlap of the propensity scores for correctly specified (first scenario (A)) and misspecified (second scenario (B)) models for the probabilities of treatment status P(A = 1|W) and P(A = 0|W) in 1 random sample from 1,000 Monte Carlo simulations.
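The kind of overlap shown in this figure can also be summarised numerically. The sketch below is an illustration under stated assumptions, not the authors' simulation design: it uses a single hypothetical covariate, takes the true propensity score from the simulated mechanism instead of estimating it, and uses arbitrary cutoffs of 0.05 and 0.95 to flag extreme scores.

```python
import numpy as np

def overlap_summary(g, A, lower=0.05, upper=0.95):
    """Summarise propensity-score overlap; cutoffs are illustrative, not standard."""
    # Inverse-probability-of-treatment weight for each person's observed exposure
    weights = np.where(A == 1, 1.0 / g, 1.0 / (1.0 - g))
    return {
        "min_ps": float(g.min()),
        "max_ps": float(g.max()),
        "share_extreme": float(np.mean((g < lower) | (g > upper))),
        "max_weight": float(weights.max()),
    }

rng = np.random.default_rng(1)
n = 10_000
W = rng.normal(size=n)  # single hypothetical confounder

def simulate(coef):
    """Draw exposure with P(A = 1 | W) = expit(coef * W) and summarise overlap."""
    g = 1.0 / (1.0 + np.exp(-coef * W))
    A = rng.binomial(1, g)
    return overlap_summary(g, A)

good = simulate(0.5)  # weak dependence on W: scores stay away from 0 and 1
poor = simulate(3.0)  # strong dependence on W: near-positivity violations
print("good overlap:", good)
print("poor overlap:", poor)
```

A large share of extreme scores or a very large maximum weight signals that some covariate patterns almost never receive one of the exposure levels, which is the near-positivity situation the simulations examine.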

**Figure 3.**
Sex-specific adjusted risk difference for 1-year lung cancer mortality according to different double-robust estimators among 183,426 lung cancer patients diagnosed after admittance to an emergency department versus persons with a nonemergency cancer diagnosis, England, 2006–2013. A) women; B) men. Bars, 95% confidence intervals. AIPTW, augmented inverse-probability-of-treatment weighting; BF-AIPTW, best-fit augmented inverse-probability-of-treatment weighting (data-adaptive estimation based on Akaike’s Information Criterion (AIC) and the Bayesian Information Criterion (BIC)); BF-IPTW-RA, best-fit inverse-probability-of-treatment-weighted regression adjustment (data-adaptive estimation based on AIC-BIC); IPTW-RA, inverse-probability-of-treatment-weighted regression adjustment; TMLE, targeted maximum likelihood estimation (data-adaptive estimation based on ensemble learning and k-fold cross-validation).
