FastSurferVINN: Building resolution-independence into deep learning segmentation methods - A solution for HighRes brain MRI

Leonie Henschel et al. Neuroimage. 2022 May 1;251:118933.
doi: 10.1016/j.neuroimage.2022.118933. Epub 2022 Feb 3.
Abstract

Leading neuroimaging studies have pushed 3T MRI acquisition resolutions below 1.0 mm for improved structure definition and morphometry. Yet, only a few time-intensive automated image analysis pipelines have been validated for high-resolution (HiRes) settings. Efficient deep learning approaches, on the other hand, rarely support more than one fixed resolution (usually 1.0 mm). Furthermore, the lack of a standard submillimeter resolution as well as limited availability of diverse HiRes data with sufficient coverage of scanner, age, diseases, or genetic variance poses additional, unsolved challenges for training HiRes networks. Incorporating resolution-independence into deep learning-based segmentation, i.e., the ability to segment images at their native resolution across a range of different voxel sizes, promises to overcome these challenges, yet no such approach currently exists. We now fill this gap by introducing a Voxel-size Independent Neural Network (VINN) for resolution-independent segmentation tasks and present FastSurferVINN, which (i) establishes and implements resolution-independence for deep learning as the first method simultaneously supporting 0.7-1.0 mm whole brain segmentation, (ii) significantly outperforms state-of-the-art methods across resolutions, and (iii) mitigates the data imbalance problem present in HiRes datasets. Overall, internal resolution-independence mutually benefits both HiRes and 1.0 mm MRI segmentation. With our rigorously validated FastSurferVINN we distribute a rapid tool for morphometric neuroimage analysis. The VINN architecture, furthermore, represents an efficient resolution-independent segmentation method for wider application.

Keywords: Artificial intelligence; Computational neuroimaging; Deep learning; High-resolution; Structural MRI.

Figures

Fig. 1.
Resolution-independence in deep learning networks: A. Dedicated fixed-resolution convolutional neural networks (CNNs) only work on the resolution they are trained on and are limited by the availability and quality of corresponding datasets. B. One single resolution-ignorant CNN can learn to segment multiple resolutions by training on a diverse dataset. External scale augmentation (+exSA, B. left) simulates resolutions with few or no training cases by resampling the image and the reference segmentation map. Here, however, lossy interpolation and resulting artefacts, especially from nearest-neighbour interpolation of discrete label maps, may result in a loss of structural details and sub-optimal performance. C. Our voxel size independent neural network (VINN) avoids interpolation of the images and discrete labels by integrating the interpolation step into the network architecture. Further, the explicit transition from the native resolution to a normalized internal resolution facilitates an understanding of the difference between image features (MultiRes blocks with distances measured in voxels) and anatomical features (FixedRes inner blocks with normalized distances).
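The external scale augmentation shown in panel B can be sketched in a few lines. The snippet below is a minimal illustration, not the authors' implementation; the scipy-based resampling, the function name, and the default voxel sizes are assumptions. It shows why the approach is lossy: the intensity image can be resampled with a smooth spline kernel, but the discrete label map has to fall back to nearest-neighbour interpolation.

```python
import numpy as np
from scipy.ndimage import zoom

def external_scale_augmentation(image, labels, native_vox=0.7, target_vox=1.0):
    """Simulate a different acquisition resolution by resampling both the
    intensity image and its reference segmentation (Fig. 1B, +exSA).

    The image can use a cubic spline kernel, but the discrete label map must
    be resampled with nearest-neighbour interpolation (order=0), which is the
    source of the label artefacts described in the caption.
    """
    factor = native_vox / target_vox          # e.g. 0.7 mm -> 1.0 mm grid: factor 0.7
    image_rs = zoom(image, factor, order=3)   # smooth interpolation for intensities
    labels_rs = zoom(labels, factor, order=0) # nearest neighbour for discrete labels
    return image_rs, labels_rs
```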
Fig. 2.
Image resolution affects the detail of discrete segmentation label maps and derived measures such as surface models and thickness. A. The low-resolution image is less detailed and causes partial volume effects (PVEs) by accumulating signals across tissue boundaries into larger voxels, whereas B. High-resolution images and derived segmentations allow more precise region delineation and capture details, e.g. for improved shape or thickness analysis (white arrows).
Fig. 3.
Voxel size independence in FastSurferVINN. Flexible transitions between resolutions become possible by replacement of (un)pooling with our network-integrated resolution-normalization (green) after the first encoder (pre-IDB) and before the last decoder block (post-CDB). Scale transitions between the other competitive dense blocks (CDB) remain as standard MaxPool and UnPool operations. Each CDB is composed of four sequences of parametric rectified linear unit (PReLU), convolution (Conv) and batch normalization (BN). In the first two encoder blocks ((pre)-IDB), the PReLU is replaced with a BN to normalize the inputs.
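A minimal sketch of how such a network-integrated resolution-normalization could be realized in PyTorch is shown below; the module name, its interface, and the use of F.interpolate on feature maps are assumptions for illustration, not the released FastSurferVINN code. The idea is that feature maps computed at the native voxel size are rescaled once to a fixed internal voxel size, so all inner blocks operate on a normalized anatomical grid.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResolutionNormalization(nn.Module):
    """Rescale 2D feature maps from the native voxel size to a fixed internal
    voxel size, replacing the usual MaxPool transition after the first
    encoder block (pre-IDB)."""

    def __init__(self, internal_vox_size: float = 1.0, mode: str = "bilinear"):
        super().__init__()
        self.internal_vox_size = internal_vox_size
        self.mode = mode  # 'nearest', 'area', 'bicubic', or 'bilinear' (cf. Fig. 6)

    def forward(self, x: torch.Tensor, native_vox_size: float) -> torch.Tensor:
        # A 0.7 mm input is shrunk by 0.7 / 1.0; a 1.4 mm input is enlarged by 1.4 / 1.0.
        scale = native_vox_size / self.internal_vox_size
        kwargs = {"align_corners": False} if self.mode in ("bilinear", "bicubic") else {}
        return F.interpolate(x, scale_factor=scale, mode=self.mode, **kwargs)
```

In the decoder (post-CDB), the inverse factor (internal / native) would map the feature maps back to the native grid; in practice the target size would be matched exactly to the corresponding skip connection, e.g. by passing size= instead of scale_factor=.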
Fig. 4.
HiRes weight mask generation. Left: The difference between the gray matter label map (blue) and its closure (gray) produces the deep sulci and WM strand mask. Right: The difference between the original (blue) and eroded (gray) brain mask produces the outer gray matter mask.
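The two morphological operations described here translate into a few lines of standard image processing. The sketch below is a rough illustration using scipy.ndimage; the array names, the iteration counts, and the function itself are assumptions rather than the exact FastSurferVINN preprocessing.

```python
import numpy as np
from scipy.ndimage import binary_closing, binary_erosion

def hires_weight_masks(gm_labels: np.ndarray, brain_mask: np.ndarray,
                       closing_iter: int = 5, erosion_iter: int = 3):
    """Build the two HiRes weight masks of Fig. 4.

    Left panel:  closure(GM) minus GM      -> deep sulci and thin WM strands.
    Right panel: brain mask minus erosion  -> outer gray matter rim.
    Iteration counts are illustrative, not the published settings.
    """
    gm = gm_labels > 0
    bm = brain_mask.astype(bool)
    sulci_wm_mask = binary_closing(gm, iterations=closing_iter) & ~gm
    outer_gm_mask = bm & ~binary_erosion(bm, iterations=erosion_iter)
    return sulci_wm_mask, outer_gm_mask
```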
Fig. 5.
Ablative optimization of FastSurferCNN and comparison to FastSurferVINN. FastSurferCNN (light green) is optimized through a switch to 3 × 3 kernels (FastSurferCNN*, green). Addition of data augmentation (external scaling augmentation, FastSurferCNN* + exSA, dark green) improves performance further. VINN equipped with internal scaling augmentation (inSA) (FastSurferVINN, orange) outperforms all other models on both subcortical (left) and cortical (right) structures with respect to Dice Similarity Coefficient (DSC, top) and average surface distance (ASD, bottom). Further addition of external scaling augmentation negatively affects performance (VINN + inSA + exSA). Segmentation results with FastSurferVINN are significantly better compared to all other models (corrected p < 10⁻⁷).
Fig. 6.
Effect of sampling kernels on network-integrated resolution-normalization. Comparison of nearest-neighbour (NN, purple), area (light violet), bi-cubic (light yellow), and bi-linear (orange) sampling kernels with respect to the Dice Similarity Coefficient (DSC, top) and the average surface distance (ASD, bottom) for subcortical (left) and cortical (right) structures. Segmentation performance with NN is significantly worse than all other interpolation strategies (corrected p < 10⁻¹³). Area, bi-cubic and bi-linear give equivalent results.
Fig. 7.
Adaptation of the original loss function through addition of HiRes weights focusing on areas strongly affected by PVEs (HiRes Loss, right bar) significantly improves segmentation performance on the cortical structures (compared to FastSurferVINN with original loss, left bar). Addition of attention (middle bar) does not lead to a significant improvement compared to the baseline. Dice Similarity Coefficient (DSC, top) and average surface distance (ASD, bottom) are shown for subcortical (left) and cortical (right) structures. Cortical structures are significantly better segmented with the HiRes Loss (corrected p < 10⁻⁵). No significant change was detected on the subcortical structures.
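One way such HiRes weights could enter a per-pixel loss is sketched below, assuming a weighted cross-entropy as the base term; the function name, the extra-weight factor, and the combination with class-balancing weights are illustrative assumptions, not the paper's exact loss definition.

```python
import torch
import torch.nn.functional as F

def hires_weighted_cross_entropy(logits, target, base_weights, hires_mask,
                                 hires_factor: float = 2.0):
    """Per-pixel weighted cross-entropy with extra weight on the HiRes masks.

    logits:       (N, C, H, W) raw network outputs
    target:       (N, H, W)    integer label map
    base_weights: (N, H, W)    e.g. class-balancing weights per pixel
    hires_mask:   (N, H, W)    boolean union of the Fig. 4 masks

    Pixels inside the HiRes mask (deep sulci, WM strands, outer GM) receive
    (1 + hires_factor) times their base weight, focusing training on regions
    most affected by partial volume effects.
    """
    pixel_weights = base_weights * (1.0 + hires_factor * hires_mask.float())
    ce = F.cross_entropy(logits, target, reduction="none")  # (N, H, W)
    return (pixel_weights * ce).sum() / pixel_weights.sum()
```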
Fig. 8.
Improved generalization performance of FastSurferVINN across nine datasets. FastSurferVINN (orange) outperforms FastSurferCNN* + external scale augmentation ( + exSA, dark green) across subcortical (left) and cortical structures (right) with respect to Dice Similarity Coefficient (DSC, top) and average surface distance (ASD, bottom). Results are consistently better for all datasets (HCP, RS, ABIDE-II, ABIDE-I, ADNI, IXI, LA5C, OASIS1 and OASIS2).
Fig. 9.
Improved generalization performance of FastSurferVINN to resolutions not encountered during network training (here training datasets are customized, see Sections 4.3.2 and A.3). FastSurferVINN (orange) outperforms FastSurferCNN* equipped with external scale augmentation (+ exSA, dark green) with respect to Dice Similarity Coefficient (DSC, top) and average surface distance (ASD, bottom) across subcortical (left) and cortical structures (right). Results are significantly better across all resolutions (0.7 mm, 0.8 mm, and 0.9 mm, corrected p < 10⁻⁴).
Fig. 10.
Superior generalization performance of FastSurferVINN to resolutions vastly outside the training domain (1.4 mm, 1.6 mm). FastSurferVINN (orange) outperforms scale-augmentation (FastSurferCNN* + exSA, green), highlighting its extrapolation capabilities. Results are significantly better with respect to Dice Similarity Coefficient (DSC, top) and average surface distance (ASD, bottom) across subcortical (left) and cortical structures (right) (corrected p < 0.001 for 1.4 mm and p < 10⁻¹³ for 1.6 mm).
Fig. 11.
Performance of FastSurferVINN with respect to manual references. Based on the 1.0 mm scans in Mindboggle101 (left plot) FastSurferVINN (orange) outperforms external scale augmentation (+exSA, dark green) on the cortical structures (right, N = 78) with respect to Dice Similarity Coefficient (DSC, top) and average surface distance (ASD, bottom). Results on the subcortical structures (left side, N = 20) are equivalent for both approaches. Similarly, segmentation results are better for the 0.8 mm scans of the RS (right plot, N = 6) for white matter (WM), gray matter (GM), and hippocampus (Hippo).
Fig. 12.
Flexible- versus fixed-resolution networks. FastSurferVINN (orange) is comparable to, or outperforms, all fixed-resolution networks (green) (left plot: 0.8 mm from RS only, right plots: 1.0 mm) with respect to the Dice Similarity Coefficient (DSC, top) and average surface distance (ASD, bottom). On the submillimeter scans (left plots) generalization to an unseen dataset (HCPL) is significantly improved. Results are consistently better for the 1.0 mm scans (right plot). To highlight cumulative VINN and architectural optimizations, we also compare with the state-of-the-art FastSurferCNN (gray, without optimizations from Sections 4.1 and 4.2, which are already included in CNN*). We retrain this 1 mm fixed-resolution network ensuring equal training datasets and conditions.
Fig. 13.
Big-FastSurferVINN trained with approximately 20 times more 1.0 mm scans (n = 1315, yellow) than the original version (n = 120, orange) raises segmentation performance across resolutions. Dice Similarity Coefficient (DSC, top) and average surface distance (ASD, bottom) improve on the submillimeter (0.7 mm–0.9 mm) as well as 1.0 mm scans.
