Review
Nat Commun. 2020 Jul 22;11(1):3673.
doi: 10.1038/s41467-020-17478-w.

Causality matters in medical imaging


Daniel C Castro et al. Nat Commun. 2020.

Abstract

Causal reasoning can shed new light on the major challenges in machine learning for medical imaging: scarcity of high-quality annotated data and mismatch between the development dataset and the target environment. A causal perspective on these issues allows decisions about data collection, annotation, preprocessing, and learning strategies to be made and scrutinized more transparently, while providing a detailed categorisation of potential biases and mitigation techniques. Along with worked clinical examples, we highlight the importance of establishing the causal relationship between images and their annotations, and offer step-by-step recommendations for future studies.


Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Key challenges in machine learning for medical imaging.
a Data scarcity and b–d data mismatch. X represents images and Y, annotations (e.g. diagnosis labels). P_tr refers to the distribution of data available for training a predictive model, and P_te is the test distribution, i.e. data that will be encountered once the model is deployed. Dots represent data points with any label, while circles and crosses indicate images with different labels (e.g. cases vs. controls).
Fig. 2. Causal diagrams for medical imaging examples.
a Skin lesion classification. b Prostate tumour segmentation. Filled circular nodes represent measured variables, double circular nodes denote sample selection indicators, and squares are used for sample domain indicators. Here we additionally highlight the direction of the predictive task.
Fig. 3. Selection diagrams for dataset shift.
a–c Causal and d–f anticausal scenarios, with corresponding factorisations of the joint distribution P_D(X, Y, Z). X is the acquired image; Y, the prediction target; Z, the unobserved true anatomy; and D, the domain indicator (0: ‘train’, 1: ‘test’). An unfilled node means the variable is unmeasured.
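As a rough illustration of what such factorisations look like, the sketch below assumes a chain Z → X → Y for the causal case and Y → Z → X for the anticausal case. The caption does not spell out these chains, so they are an assumption, as is the choice of which factor carries the domain index D in the last line.

```latex
% Assumed causal chain Z -> X -> Y, with no dependence on the domain D:
P(X, Y, Z) = P(Z)\, P(X \mid Z)\, P(Y \mid X)

% Assumed anticausal chain Y -> Z -> X, with no dependence on the domain D:
P(X, Y, Z) = P(Y)\, P(Z \mid Y)\, P(X \mid Z)

% Illustrative shift: if only the first factor of the causal chain
% changes between domains, that factor alone is indexed by D:
P_D(X, Y, Z) = P_D(Z)\, P(X \mid Z)\, P(Y \mid X)
```

Under these assumptions, each kind of dataset shift corresponds to a single factor depending on D while the others stay the same in 'train' and 'test'.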
Fig. 4. Causal diagrams for different sample selection scenarios.
a Random; b image-dependent; c target-dependent; d jointly dependent. S = 1 indicates an observed sample, and plain edges represent either direction.
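The four selection scenarios in Fig. 4 can be explored with a toy simulation. The following is a minimal sketch, not the authors' method. It assumes an anticausal setting in which a binary label Y generates a one-dimensional 'image' X, and every function name, threshold, and probability below is hypothetical.

```python
import random


def simulate(n=20000, seed=0):
    """Draw (x, y) pairs: y is a binary label, x a noisy 1-D 'image' of y."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        y = 1 if rng.random() < 0.3 else 0   # assumed population prevalence
        x = y + rng.gauss(0.0, 1.0)          # assumed image-generation model
        samples.append((x, y))
    return samples


def select(samples, rule, seed=1):
    """Keep each sample (S = 1) with probability rule(x, y)."""
    rng = random.Random(seed)
    return [(x, y) for x, y in samples if rng.random() < rule(x, y)]


def prevalence(samples):
    """Fraction of samples with y == 1."""
    return sum(y for _, y in samples) / len(samples)


# Hypothetical selection mechanisms, one per scenario in the caption.
RULES = {
    "random": lambda x, y: 0.5,
    "image-dependent": lambda x, y: 0.8 if x > 0.5 else 0.2,
    "target-dependent": lambda x, y: 0.8 if y == 1 else 0.2,
    "jointly dependent": lambda x, y: 0.8 if (x > 0.5 and y == 1) else 0.2,
}

population = simulate()
results = {name: prevalence(select(population, rule))
           for name, rule in RULES.items()}

if __name__ == "__main__":
    print(f"population: {prevalence(population):.3f}")
    for name, value in results.items():
        print(f"{name}: {value:.3f}")
```

In this toy setup, random selection leaves the label prevalence among selected samples close to the population value, while the other three mechanisms distort it to varying degrees.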
Fig. 5. A ‘scaffold’ causal diagram summarising typical medical imaging workflows.
We believe most practical cases can be adapted from this generic structure by removing or adding elements. The diagram shows a variety of possible prediction targets (marked Y1–Y4): some anticausal (Y1, Y2) and others causal (Y3, Y4). ‘Annotation’ here refers to any image-derived data, such as lesion descriptions, regions of interest, spatial landmark coordinates, or segmentation maps. Note that annotators will often be aware of the patients' records and diagnoses, in which case there could be additional arrows from Y1 or Y2 towards Y4.

