Med Image Anal. 2023 May;86:102789. doi: 10.1016/j.media.2023.102789. Epub 2023 Feb 25.

SynthSeg: Segmentation of brain MRI scans of any contrast and resolution without retraining


Benjamin Billot et al. Med Image Anal. 2023 May.

Abstract

Despite advances in data augmentation and transfer learning, convolutional neural networks (CNNs) struggle to generalise to unseen domains. When segmenting brain scans, CNNs are highly sensitive to changes in resolution and contrast: even within the same MRI modality, performance can decrease across datasets. Here we introduce SynthSeg, the first segmentation CNN robust against changes in contrast and resolution. SynthSeg is trained with synthetic data sampled from a generative model conditioned on segmentations. Crucially, we adopt a domain randomisation strategy where we fully randomise the contrast and resolution of the synthetic training data. Consequently, SynthSeg can segment real scans from a wide range of target domains without retraining or fine-tuning, which enables straightforward analysis of huge amounts of heterogeneous clinical data. Because SynthSeg only requires segmentations to be trained (no images), it can learn from labels obtained by automated methods on diverse populations (e.g., ageing and diseased), thus achieving robustness to a wide range of morphological variability. We demonstrate SynthSeg on 5,000 scans of six modalities (including CT) and ten resolutions, where it exhibits unparalleled generalisation compared with supervised CNNs, state-of-the-art domain adaptation, and Bayesian segmentation. Finally, we demonstrate the generalisability of SynthSeg by applying it to cardiac MRI and CT scans.

Keywords: CNN; Contrast and resolution invariance; Domain randomisation; Segmentation.


Conflict of interest statement

Declaration of Competing Interest The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figures

Fig. 1.
(a) Representative samples of the synthetic 3D scans used to train SynthSeg for brain segmentation, and contours of the corresponding ground truth. (b) Test-time segmentations for a variety of contrasts and resolutions, on subjects spanning a wide age range, some presenting large atrophy and white matter lesions (green arrows). All segmentations are obtained with the same network, without retraining or fine-tuning.
Fig. 2.
Overview of a training step. At each mini-batch, we randomly select a 3D label map Sn from the training set and sample a pair {I, T} from the generative model. The resulting image is run through the network, and its prediction Y is used to compute the average soft Dice loss, which is backpropagated to update the weights of the network.
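The loss described in this caption can be sketched as follows. This is a minimal illustrative implementation of an average soft Dice loss over label classes, not the paper's code; the array layout (class axis first) and the smoothing constant `eps` are assumptions.

```python
import numpy as np

def soft_dice_loss(y_pred, y_true, eps=1e-6):
    """Average soft Dice loss over label classes.

    y_pred: soft class probabilities, shape (K, ...spatial...).
    y_true: one-hot ground-truth labels, same shape.
    Returns 1 - mean Dice, so 0 corresponds to perfect overlap.
    """
    axes = tuple(range(1, y_pred.ndim))  # spatial axes, kept per class
    intersection = np.sum(y_pred * y_true, axis=axes)
    denom = np.sum(y_pred, axis=axes) + np.sum(y_true, axis=axes)
    dice = (2.0 * intersection + eps) / (denom + eps)  # one score per class
    return 1.0 - dice.mean()
```

During training, this scalar would be backpropagated to update the network weights; here it is framework-agnostic, operating on plain arrays.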
Fig. 3.
Intermediate steps of the generative model: (a) we randomly select an input label map from the training set, which we (b) spatially augment in 3D. (c) A first synthetic image is obtained by sampling a GMM at HR with randomised parameters. (d) The result is then corrupted with a bias field and further intensity augmentation. (e) Slice spacing and thickness are simulated by successively blurring and downsampling at random LR. (f) The training inputs are obtained by resampling the image to HR, and removing the labels we do not wish to segment (e.g., extra-cerebral regions).
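Steps (c) and (e) of this pipeline can be sketched as follows: per-label intensities are drawn from a GMM with fully randomised parameters, then low resolution is simulated before resampling back to the HR grid. This is a crude sketch, not the paper's generative model: block averaging stands in for the blur + downsample, nearest-neighbour repetition for the resampling, and the parameter ranges are illustrative assumptions. Spatial augmentation, the bias field, and further intensity augmentation are omitted.

```python
import numpy as np

def synth_image(label_map, rng, lr_factor=3):
    """Sample a synthetic image from a 3D label map.

    label_map: integer 3D array of anatomical labels (HR grid).
    rng: a numpy random Generator.
    lr_factor: simulated slice-spacing factor (assumed isotropic here).
    """
    image = np.zeros(label_map.shape, dtype=float)
    for lab in np.unique(label_map):
        mask = label_map == lab
        mu = rng.uniform(0.0, 1.0)     # randomised GMM mean per label
        sigma = rng.uniform(0.0, 0.1)  # randomised GMM std per label
        image[mask] = rng.normal(mu, sigma, size=int(mask.sum()))
    # simulate thick slices: average over lr_factor^3 blocks (blur + downsample)
    f = lr_factor
    x, y, z = (s // f for s in image.shape)
    lr = image[:x * f, :y * f, :z * f].reshape(x, f, y, f, z, f).mean(axis=(1, 3, 5))
    # resample back to the HR grid used as network input
    return np.repeat(np.repeat(np.repeat(lr, f, 0), f, 1), f, 2)
```

Because every training image is resynthesised this way with fresh random parameters, the network never sees the same contrast or resolution twice, which is the essence of the domain randomisation strategy.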
Fig. 4.
Box plots showing Dice scores obtained by all methods for every dataset. For each box, the central mark is the median; edges are the first and third quartiles; and outliers are marked with ⧫.
Fig. 5.
Representative features of the last layer of the network for two scans of different contrast and resolution for the same subject. While the T1 baseline only produces noise outside its training domain, SynthSeg learns a consistent representation across contrasts and resolutions.
Fig. 6.
Sample segmentations from the first experiment. Major segmentation mistakes are indicated with yellow arrows. SynthSeg produces very accurate segmentations for all contrasts and resolutions. The T1 baseline makes small errors outside its training domain and cannot be applied to other modalities. While the TTA approach yields very good segmentations for T1mix, its results degrade for larger domain gaps, where it is outperformed by SIFA. Finally, SAMSEG yields coherent results for scans at 1 mm resolution, but is heavily affected by PV effects at low resolution.
Fig. 7.
Dice scores for data downsampled at 3, 5, or 7 mm in either axial, coronal, or sagittal direction (results are averaged across directions).
Fig. 8.
Examples of segmentations obtained by SynthSeg for two scans artificially downsampled at decreasing LR. SynthSeg presents an impressive generalisation ability to all resolutions, despite heavy PV effects and important loss of information at LR. However, we observe a slight decrease in accuracy for thin and convoluted structures such as the cerebral cortex (red) or the white cerebellar matter (dark yellow).
Fig. 9.
Mean Dice scores obtained for SynthSeg and ablated variants.
Fig. 10.
Dice vs. number of training label maps for SynthSeg (circles) on representative datasets. The last points are obtained by training on all available label maps (20 manual plus 1000 automated). We also report scores obtained on T1mix by the T1 baseline (triangles).
Fig. 11.
Close-up on the hippocampus for an ADNI testing subject with atrophy patterns that are not present in the manual training segmentations. Hence, training SynthSeg on these manual maps only leads to limited accuracy (red arrows). However, adding a large number of automated maps from different populations to the training set enables us to improve robustness against morphological variability (green arrow).
Fig. 12.
Representative cardiac segmentations obtained by SynthSeg on three datasets, without retraining on any of them, and without using real images during training. LASC13 only has ground truth for LA (pink).

