Neuroimage. 2018 Apr 1:169:407-418.
doi: 10.1016/j.neuroimage.2017.12.059. Epub 2017 Dec 24.

Quantitative assessment of structural image quality

Adon F G Rosen et al. Neuroimage. 2018.

Abstract

Data quality is increasingly recognized as one of the most important confounding factors in brain imaging research. It is particularly important for studies of brain development, where age is systematically related to in-scanner motion and data quality. Prior work has demonstrated that in-scanner head motion biases estimates of structural neuroimaging measures. However, objective measures of data quality are not available for most structural brain images. Here we sought to identify quantitative measures of data quality for T1-weighted volumes, describe how these measures relate to cortical thickness, and delineate how this in turn may bias inference regarding associations with age in youth. Three highly-trained raters provided manual ratings of 1840 raw T1-weighted volumes. These images included a training set of 1065 images from the Philadelphia Neurodevelopmental Cohort (PNC), a test set of 533 images from the PNC, as well as an external test set of 242 images from adults acquired on a different scanner. Manual ratings were compared to automated quality measures provided by the Preprocessed Connectomes Project's Quality Assurance Protocol (QAP), as well as FreeSurfer's Euler number, which summarizes the topological complexity of the reconstructed cortical surface. Results revealed that the Euler number was consistently correlated with manual ratings across samples. Furthermore, the Euler number could be used to identify images scored "unusable" by human raters with a high degree of accuracy (AUC: 0.98-0.99), and outperformed proxy measures from functional timeseries acquired in the same scanning session. The Euler number was also significantly related to cortical thickness in a regionally heterogeneous pattern that was consistent across datasets and replicated prior results. Finally, data quality both inflated and obscured associations with age during adolescence.
Taken together, these results indicate that reliable measures of data quality can be automatically derived from T1-weighted volumes, and that failing to control for data quality can systematically bias the results of studies of brain maturation.

Keywords: Artifact; Development; MRI; Motion; Structural imaging; T1.


Figures

Figure 1. Training protocol for manual raters
There were 4 phases of training. Phase 1: 5 neuroimaging experts reviewed 20 PNC images selected to have various levels of artifact. These images were used to establish rating anchors, which were then used for Phase 2. Phase 2: Two experts (TDS & DRR) rated 100 images. 100% concordance was achieved through consensus. Phase 3: Three new raters were trained on the 100 images used in Phase 2, until the raters achieved 85% concordance after two rounds. Phase 4: All 3 trained raters manually rated 1,840 images across the PNC and the external test dataset (see Table 1).
Figure 2. Results of manual ratings
A-C: Frequency of average manual ratings for the training, internal testing, and external testing datasets. D-F: The pairwise weighted-κ between raters was moderate and consistent across datasets. G-I: The pairwise polychoric correlation between raters was high in all datasets.
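As an illustration of the agreement statistic in panels D-F, the following is a minimal sketch of a weighted Cohen's κ for two raters on the 0-2 scale. The caption does not state which weighting scheme was used, so the linear disagreement weights (and the optional `power` argument for quadratic weights) are assumptions, and the function name is hypothetical.

```python
from collections import Counter


def weighted_kappa(rater_a, rater_b, categories=(0, 1, 2), power=1):
    """Weighted Cohen's kappa for two raters on an ordinal scale.

    power=1 gives linear disagreement weights, power=2 quadratic ones.
    The weighting scheme is an assumption; the source does not specify it.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same, non-empty set of images")
    n = len(rater_a)
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}

    # Joint counts of rating pairs and each rater's marginal counts.
    observed = Counter((index[a], index[b]) for a, b in zip(rater_a, rater_b))
    marginal_a = Counter(index[a] for a in rater_a)
    marginal_b = Counter(index[b] for b in rater_b)

    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(k):
        for j in range(k):
            weight = (abs(i - j) / (k - 1)) ** power
            observed_disagreement += weight * observed[(i, j)] / n
            expected_disagreement += weight * (marginal_a[i] / n) * (marginal_b[j] / n)

    if expected_disagreement == 0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return 1.0 - observed_disagreement / expected_disagreement
```

Perfect agreement yields κ = 1, chance-level agreement yields κ near 0, and systematic disagreement yields negative values.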
Figure 3. Manual quality rating varies by age
Image quality improves with age during adolescence in both training (A) and internal testing samples (B) using PNC data, whereas data quality declines with aging over the adult lifespan in the external test dataset (C). In A-C, the dark line represents a linear fit; the shaded envelope represents the 95% confidence interval; reported significance values are calculated using partial Spearman's correlations after regressing out gender trends.
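One way to compute a partial Spearman correlation with a single covariate, as described in this caption, is to rank-transform all three variables and apply the first-order partial correlation formula. The sketch below assumes that approach; the study's actual implementation is not given in the source, and all function names are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for i in order[pos:end + 1]:
            ranks[i] = mean_rank
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_spearman(x, y, covariate):
    """Spearman correlation of x and y, controlling for one covariate."""
    rx, ry, rz = (average_ranks(v) for v in (x, y, covariate))
    r_xy, r_xz, r_yz = pearson(rx, ry), pearson(rx, rz), pearson(ry, rz)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
```

Here `x` would be age, `y` the average manual rating, and `covariate` a numeric coding of gender. With several covariates, the ranks would instead be residualized by multiple regression before correlating.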
Figure 4. Quantitative metrics of image quality show heterogeneous alignment with manual ratings
A: The standardized mean (± S.E.M.) for each quantitative metric is displayed by average manual rating class. B: Partial Spearman correlation coefficients between average manual quality rating and the T1-derived quantitative metrics; covariates included sex, age, and age squared. Across all datasets, Euler number showed the strongest association with manual quality ratings.
Figure 5. Inclusion model to identify unusable images
A-C: Logistic models in training (A), internal testing (B), and external testing (C) datasets were used to evaluate the ability of each quantitative measure of image quality to discriminate usable (rated 1-2) and unusable (rated 0) images. Area under the curve (AUC) was used to summarize model performance. In all datasets, the Euler number was the best-performing metric; adding further metrics to the Euler number did not improve model performance. D-F: Receiver operating characteristic (ROC) curves for the Euler number in each dataset.
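For a single predictor, the AUC of a logistic model equals the AUC of the raw predictor, because the logistic function is monotonic (provided the fitted slope has the expected sign). The sketch below therefore computes AUC directly from scores as the fraction of (unusable, usable) image pairs that are ordered correctly. It assumes that lower (more negative) Euler numbers indicate lower quality, so the score is the negated Euler number. The values are made-up toy numbers, not study data.

```python
def auc(positive_scores, negative_scores):
    """AUC as the probability that a positive case outscores a negative one.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    if not positive_scores or not negative_scores:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for p in positive_scores:
        for q in negative_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Toy Euler numbers (illustrative only, not taken from the study).
euler_unusable = [-420, -310, -250]     # images rated 0
euler_usable = [-40, -95, -260, -60]    # images rated 1-2

# Negate so that a higher score means "more likely unusable".
toy_auc = auc([-e for e in euler_unusable], [-e for e in euler_usable])
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two classes.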
Figure 6. Limits of motion from functional scans as a proxy measure of T1 volume quality
A: Mean in-scanner motion during functional sequences acquired as part of the PNC increased over the course of the scanning session. Sequences are plotted in order of acquisition after the T1 scan; time from the T1 scan is reported in minutes:seconds within each bar. B: Individuals with lower-quality T1 images had differential attrition over the course of the scanning session. Thus, individuals with lower-quality T1 images were less likely to complete the functional sequences that were subsequently acquired. Attrition scaled with quality of the T1 image. C: In participants for whom complete data were available (n=1275), motion estimated from the functional sequence did not perform as well as the Euler number in identifying unusable images (rated "0").
Figure 7. Quantitative measure of image quality is associated with cortical thickness
In usable images that were not excluded due to gross artifact, cortical thickness was significantly related to the Euler number in a regionally heterogeneous pattern. Higher data quality was associated with thicker cortex over much of the brain, but was conversely associated with thinner cortex in occipital and posterior cingulate cortex. This pattern was present across all datasets. Image displays z-scores from a mass-univariate linear regression, where regional cortical thickness was the outcome and Euler number was the predictor of interest; covariates included age, age squared, and sex. All results were corrected for multiple comparisons using the False Discovery Rate (q < 0.05).
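The caption does not say which FDR procedure was applied. A common choice is the Benjamini-Hochberg step-up procedure, sketched below under that assumption with a hypothetical function name. It takes one p-value per cortical region and returns which regions survive at level q.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans marking p-values significant under BH FDR.

    Step-up rule: find the largest rank r with p_(r) <= q * r / m, then
    reject every hypothesis whose p-value has rank <= r.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])

    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= q * rank / m:
            cutoff_rank = rank

    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[i] = True
    return rejected
```

Because the rule is step-up, a p-value that misses its own threshold can still be declared significant when a larger p-value meets a later threshold.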
Figure 8. Data quality significantly mediates observed associations with age in youth
Having found that data quality is associated with both age and cortical thickness, we evaluated whether data quality might systematically bias inference regarding brain development. To do this, a mediation analysis was performed for each cortical region (A), where we evaluated whether the Euler number mediated the apparent relationship between age and cortical thickness. At each region, Sobel z-scores were calculated as the test statistic for the mediation analysis. A positive Sobel z-score indicates that controlling for data quality revealed an increased effect of age; a negative Sobel z-score indicates that controlling for data quality diminished the association with age (B). This procedure was applied to both the training (C) and internal test set (D) from the PNC, which revealed consistent mediation effects in both samples. Data quality significantly mediated the relationship between age and cortical thickness in a bidirectional, regionally heterogeneous manner. After controlling for data quality, the apparent age effect was increased in many regions (regions in warm colors), where higher data quality was associated with thicker cortex (see Figure 7). However, in a subset of regions including the occipital and posterior cingulate cortex, controlling for data quality resulted in a diminished association with age (cool colors). Multiple comparisons were accounted for using FDR (q < 0.05).
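The Sobel statistic used in this mediation analysis can be sketched as follows. In the standard formulation, `a` is the regression coefficient of the mediator (here, the Euler number) on the predictor (age), and `b` is the coefficient of the outcome (regional cortical thickness) on the mediator, controlling for the predictor. The function names and example values are hypothetical, and the sign convention used for the figure's color scale is not reproduced here.

```python
import math


def sobel_z(a, se_a, b, se_b):
    """Sobel z-score for the indirect effect a * b.

    a, se_a: coefficient and standard error for predictor -> mediator.
    b, se_b: coefficient and standard error for mediator -> outcome,
             estimated with the predictor in the model.
    """
    return (a * b) / math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)


def two_sided_p(z):
    """Two-sided p-value for a z-score under the standard normal."""
    return math.erfc(abs(z) / math.sqrt(2))


# Illustrative coefficients only (not values from the study).
z_example = sobel_z(a=0.5, se_a=0.1, b=0.4, se_b=0.1)
p_example = two_sided_p(z_example)
```

In a mass-univariate setting, one such z-score would be computed per cortical region and the resulting p-values corrected for multiple comparisons.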

