Review

R Soc Open Sci. 2022 Aug 3;9(8):220638. doi: 10.1098/rsos.220638. eCollection 2022 Aug.

Causal machine learning for healthcare and precision medicine

Pedro Sanchez et al.

Abstract

Causal machine learning (CML) has experienced increasing popularity in healthcare. Beyond the inherent capability of adding domain knowledge into learning systems, CML provides a complete toolset for investigating how a system would react to an intervention (e.g. the outcome given a treatment). Quantifying the effects of interventions allows actionable decisions to be made while maintaining robustness in the presence of confounders. Here, we explore how causal inference can be incorporated into different aspects of clinical decision support systems by using recent advances in machine learning. Throughout this paper, we use Alzheimer's disease to create examples illustrating how CML can be advantageous in clinical scenarios. Furthermore, we discuss important challenges present in healthcare applications, such as processing high-dimensional and unstructured data, generalization to out-of-distribution samples and temporal relationships, which, despite great effort from the research community, remain to be solved. Finally, we review lines of research within causal representation learning, causal discovery and causal reasoning which offer the potential to address the aforementioned challenges.

Keywords: causal machine learning; causal representation learning; precision medicine.


Conflict of interest statement

We declare that we have no competing interests.

Figures

Figure 1.
CML in healthcare helps with understanding biases and formalizing reasoning about the effect of interventions. We illustrate, with a hypothetical example, that high-level features (causal representations) can be extracted from low-level data (e.g. I1 might correspond to the brain volume derived from a medical image) into a graph corresponding to the data generation process. CML can be used to discover which relationships between variables are spurious and which are causal, illustrated with dashed and solid lines, respectively. Finally, CML offers tools for reasoning about the effect of interventions (shown with the do() operator). For instance, an intervention on D1 would only affect the downstream variables in the graph, while other relationships are either not relevant (due to graph mutilation) or remain unchanged.
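As a minimal sketch of the graph mutilation idea described in this caption (assuming a toy linear structural causal model with hypothetical variables D1 -> I1 -> O1, not the authors' actual model), an intervention do(D1 = d) replaces the mechanism that generates D1 with a constant while the downstream mechanisms are left untouched:

    import numpy as np

    rng = np.random.default_rng(0)

    def simulate(n, do_d1=None):
        # Toy linear SCM with D1 -> I1 -> O1 (variable names are illustrative only).
        # Under do(D1 = d), the mechanism for D1 is replaced by a constant
        # (graph mutilation); the mechanisms for I1 and O1 remain unchanged.
        d1 = rng.normal(size=n) if do_d1 is None else np.full(n, do_d1)
        i1 = 0.8 * d1 + rng.normal(scale=0.5, size=n)
        o1 = 1.5 * i1 + rng.normal(scale=0.5, size=n)
        return d1, i1, o1

    _, _, o_obs = simulate(10_000)             # observational distribution
    _, _, o_int = simulate(10_000, do_d1=2.0)  # interventional distribution under do(D1 = 2)
    print(o_obs.mean(), o_int.mean())          # only variables downstream of D1 shift under the intervention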
Figure 2.
Causal graph (left) and illustration of how the brain changes in MR images in response to interventions on 'Age' or 'Alzheimer's disease status'. The images are axial slices of a brain MR scan. The middle image, used as a baseline, is from a patient aged 64 years who is classified as cognitively normal (CN) within the Alzheimer's Disease Neuroimaging Initiative (ADNI) database. All other images are synthesized with a conditional generative model [39]. The images with grey background are difference images obtained by subtracting the synthesized image from the baseline. The upper sequence of images is generated by fixing Alzheimer's status at CN and increasing age by 3 years. The bottom images are generated by fixing the age at 64 and progressing Alzheimer's status to MCI and AD, as discussed in the main text.
Figure 3.
We illustrate the difference between the individualized and the average treatment effect (ITE versus ATE). 'Feature' represents patient characteristics, which would be multi-dimensional in reality. 'Outcome' is some measure of response to the treatment, where a more positive value is preferable. The ITE for each patient is the difference between the actual and the counterfactual outcome. We show an example counterfactual to highlight that the ITE for some patients might differ from the average (ATE). By employing causal inference methods to estimate individualized treatment effects, we can understand which patients benefit from a certain medication and which patients do not, thus enabling us to make personalized treatment recommendations. Note that the patient data points are evenly distributed along the feature axis, which would indicate that these data come from an RCT (due to lack of selection bias). The estimation of treatment effect using observational data is subject to confounding, as patient characteristics affect both the selection of treatment and the outcome. Causal inference methods need to mitigate this.
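As a small, purely illustrative sketch of the ITE/ATE distinction in this caption (assuming, unrealistically, that both potential outcomes were known for every patient; in practice one of them is a counterfactual that must be estimated):

    import numpy as np

    # Hypothetical potential outcomes for five patients (numbers are illustrative only).
    y_treated = np.array([3.0, 2.5, 1.0, 0.5, -0.5])  # outcome if treated
    y_control = np.array([1.0, 1.5, 1.0, 1.5,  1.0])  # outcome if untreated

    ite = y_treated - y_control  # individualized treatment effect per patient
    ate = ite.mean()             # average treatment effect over the population

    print(ite)  # some patients benefit, others do not
    print(ate)  # the average can mask this heterogeneity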
Figure 4.
Reasoning about generalization of a prediction task with a causal graph. Anti-causal prediction and a spurious association that may lead to shortcut learning are illustrated.

References

    1. Litjens G, Kooi T, Bejnordi BE, Setio AAA, Ciompi F, Ghafoorian M, Van Der Laak JA, Van Ginneken B, Sánchez CI. 2017. A survey on deep learning in medical image analysis. Med. Image Anal. 42, 60-88. (10.1016/j.media.2017.07.005)
    2. Bica I, Alaa AM, Lambert C, Schaar M. 2021. From real-world patient data to individualized treatment effects using machine learning: current and future methods to address underlying challenges. Clin. Pharmacol. Ther. 109, 87-100. (10.1002/cpt.1907)
    3. Holland PW. 1986. Statistics and causal inference. J. Am. Stat. Assoc. 81, 945-960. (10.1080/01621459.1986.10478354)
    4. Pearl J. 2009. Causality. Cambridge, UK: Cambridge University Press.
    5. Imbens GW, Rubin DB. 2015. Causal inference for statistics, social, and biomedical sciences: an introduction. Cambridge, UK: Cambridge University Press.