Eur Radiol. 2025 Nov;35(11):6865-6878.
doi: 10.1007/s00330-025-11670-6. Epub 2025 May 17.

AI in motion: the impact of data augmentation strategies on mitigating MRI motion artifacts

Simon D Westfechtel et al. Eur Radiol. 2025 Nov.

Abstract

Objectives: Artifacts in clinical MRI can compromise the performance of AI models. This study evaluates how different data augmentation strategies affect an AI model's segmentation performance under variable artifact severity.

Materials and methods: We used an AI model based on the nnU-Net architecture to automatically quantify lower limb alignment using axial T2-weighted MR images. Three versions of the AI model were trained with different augmentation strategies: (1) no augmentation ("baseline"), (2) standard nnU-Net augmentations ("default"), and (3) "default" plus augmentations that emulate MR artifacts ("MRI-specific"). Model performance was tested on 600 MR image stacks (right and left; hip, knee, and ankle) from 20 healthy participants (mean age, 23 ± 3 years, 17 men), each imaged five times under standardized motion to induce artifacts. Two radiologists graded each stack's artifact severity as none, mild, moderate, or severe, and manually measured torsional angles. Segmentation quality was assessed using the Dice similarity coefficient (DSC), while torsional angles were compared between manual and automatic measurements using mean absolute deviation (MAD), intraclass correlation coefficient (ICC), and Pearson's correlation coefficient (r). Statistical analysis included parametric tests and a linear mixed-effects model.

Results: MRI-specific augmentation resulted in slightly (yet not significantly) better performance than the default strategy. Segmentation quality decreased with increasing artifact severity, a decline that was partially mitigated by default and MRI-specific augmentations (e.g., severe artifacts, proximal femur: DSC_baseline = 0.58 ± 0.22; DSC_default = 0.72 ± 0.22; DSC_MRI-specific = 0.79 ± 0.14 [p < 0.001]). These augmentations also maintained precise torsional angle measurements (e.g., severe artifacts, femoral torsion: MAD_baseline = 20.6 ± 23.5°; MAD_default = 7.0 ± 13.0°; MAD_MRI-specific = 5.7 ± 9.5° [p < 0.001]; ICC_baseline = -0.10 [p = 0.63; 95% CI: -0.61 to 0.47]; ICC_default = 0.38 [p = 0.08; 95% CI: -0.17 to 0.76]; ICC_MRI-specific = 0.86 [p < 0.001; 95% CI: 0.62 to 0.95]; r_baseline = 0.58 [p < 0.001; 95% CI: 0.44 to 0.69]; r_default = 0.68 [p < 0.001; 95% CI: 0.56 to 0.77]; r_MRI-specific = 0.86 [p < 0.001; 95% CI: 0.81 to 0.90]).

Conclusion: Motion artifacts negatively impact AI models, but general-purpose augmentations enhance robustness effectively. MRI-specific augmentations offer minimal additional benefit.

Key points: Question: Motion artifacts negatively impact the performance of diagnostic AI models for MRI, but mitigation methods remain largely unexplored. Findings: Domain-specific augmentation during training can improve the robustness and performance of a model for quantifying lower limb alignment in the presence of severe artifacts. Clinical relevance: Excellent robustness and accuracy are crucial for deploying diagnostic AI models in clinical practice. Including domain knowledge in model training can benefit clinical adoption.

Keywords: Artifacts; Artificial intelligence; Lower limbs; Magnetic resonance imaging; Torsion abnormality.
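The two simplest evaluation metrics named in the abstract, the Dice similarity coefficient (DSC) and the mean absolute deviation (MAD) between automatic and manual angle measurements, can be sketched as follows. This is a minimal illustration using NumPy, not the authors' evaluation code; the function names and the convention of returning 1.0 for two empty masks are assumptions made for this example.

```python
import numpy as np


def dice_coefficient(pred_mask, ref_mask):
    """Dice similarity coefficient between two binary masks.

    DSC = 2 * |A ∩ B| / (|A| + |B|), ranging from 0 (no overlap) to 1.
    """
    pred = np.asarray(pred_mask, dtype=bool)
    ref = np.asarray(ref_mask, dtype=bool)
    denom = pred.sum() + ref.sum()
    if denom == 0:
        # Both masks empty: treated as perfect agreement (assumed convention).
        return 1.0
    return 2.0 * np.logical_and(pred, ref).sum() / denom


def mean_absolute_deviation(auto_deg, manual_deg):
    """Mean absolute difference between paired angle measurements (degrees)."""
    auto = np.asarray(auto_deg, dtype=float)
    manual = np.asarray(manual_deg, dtype=float)
    return float(np.mean(np.abs(auto - manual)))
```

For two toy 2 × 3 masks that each contain three foreground pixels and share two of them, `dice_coefficient` returns 2 · 2 / (3 + 3) ≈ 0.67. For hypothetical angle pairs (10°, 12°) and (14°, 13°), `mean_absolute_deviation` returns 1.5°.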

Conflict of interest statement

Compliance with ethical standards. Guarantor: The scientific guarantor of this publication is Simon Westfechtel. Conflict of interest: D.T. received honoraria for lectures from Bayer, GE, and Philips and holds shares in StratifAI GmbH, Germany, and in Synagen GmbH, Germany, neither of which has supported or influenced this study. All ethical standards have been strictly adhered to. Statistics and biometry: One of the authors has significant statistical expertise. Informed consent: Written informed consent was obtained from all subjects in this study. Written informed consent was waived for all patients by the Institutional Review Board. Ethical approval: Institutional Review Board approval was obtained. Study subjects or cohorts overlap: Some study subjects or cohorts have been previously reported in Schock et al [13]. Methodology: Prospective; cross-sectional study; performed at one institution.

Figures

Fig. 1
Diagnostic AI model for quantifying lower limb torsion. The model inputs are axial MR images of the hips (a1), knees (a2, a3), and ankles (a4) (only the patient’s right side is shown). Using a convolutional neural network, the model outputs segmentation outlines of the femur (yellow; b1, b2), tibia (green; b3, b4), and fibula (blue; b4). Algorithmic post-processing then identifies anatomic landmarks based on these segmentation outlines and defines reference lines (red) according to the method by Lee et al (c1, c2) and the ellipses method (c3, c4). Femoral and tibial torsion are then quantified based on these reference lines. White circles indicate accessory geometric structures to identify the centers of the femoral head and neck (c1)
Fig. 2
Standardized motion patterns for generating motion artifact-degraded MR images for the test set. Five axial T2-weighted non-fat-saturated 2D turbo-spin echo sequences were acquired consecutively under different conditions: with participants lying as still as possible (a), performing breath-synchronized repetitive (unilateral, yet alternating) gluteal contractions and relaxations (b) and breath-synchronized repetitive dorsiflexion (c1) and plantarflexion (c2). Red arrows indicate the direction of motion
Fig. 3
Representative MR images showing motion artifact-induced image degradation. Axial T2-weighted non-fat-saturated images of the pelvis, displaying both hips in different participants, are shown. The images were evaluated for motion artifact-induced degradation and categorized as showing no (a), mild (b), moderate (c), and severe (d) degradation
Fig. 4
Schematic of neural network architecture and model pipeline for automatic bone segmentation using different augmentation strategies. During training, the original MR images (a) were either left unchanged (b, “baseline”), augmented with default nnU-Net augmentations (c, “default”), or augmented with additional MRI-specific augmentation (d, “MRI-specific”). The neural network’s topological characteristics to delineate the femoral segmentation outlines (yellow) are shown. For illustrative purposes, “default” augmentations are shown as mirroring and contrast transformations, while “MRI-specific” augmentations include additional random motion, ghosting, and spiking. The blue box on the right details the steps not visualized in the blue box on the left. Here, “nnU-Net Preprocessing w/o Augmentation” refers to the standard preprocessing steps applied by nnU-Net, such as resampling, normalization, and cropping/padding, while excluding any data augmentation
Fig. 5
Segmentation quality as a function of augmentation strategy and artifact severity. MR images with overlaid segmentation outlines of the hip are shown. The reference image is unaffected by artifacts, while the other images are affected by varying degrees of artifact severity, from mild to severe. Color coding and image overlays as in Fig. 1. Absence of reference lines (d2) indicates that no reference line could be computed. Improved segmentation and post hoc processing were observed with more extensive augmentation during training when fewer artifacts were present. For these images, the Dice Similarity Coefficients (DSC) were as follows. Reference image: 0.91 (a2), 0.96 (a3), 0.96 (a4). Image with mild artifacts: 0.58 (b2), 0.85 (b3), 0.84 (b4). Image with moderate artifacts: 0.51 (c2), 0.65 (c3), 0.82 (c4). Image with severe artifacts: 0.84 (d2), 0.89 (d3), 0.89 (d4). Manual and computed femoral torsional values (R/L [°]) were as follows. Reference image: 9.5/12.4 (a1), 12.9/12.7 (a2), 12.7/12.0 (a3), 10.6/11.8 (a4). Image with mild artifacts: 6.0/−0.3 (b1), 5.1/−0.5 (b2), 5.3/−0.4 (b3), 5.7/−0.3 (b4). Image with moderate artifacts: −0.9/−1.5 (c1), −0.1/−6.0 (c2), −0.1/−5.5 (c3), −0.9/−1.2 (c4). Image with severe artifacts: 10.3/12.2 (d1), 11.5/NA (d2), 11.5/16.0 (d3), 11.5/13.6 (d4)
Fig. 6
Examples of failed automatic torsion measurement. MR images of the proximal femur (a) and distal tibia (b) with a high grade of artifact degradation. Computed segmentation outlines and reference lines are overlaid. Color coding and image overlays as in Fig. 1. The quality of the segmentation outlines computed by the baseline model (a1 and b1) was too poor for the model to determine any reference lines. In contrast, the models enhanced with default (a2 and b2) and MRI-specific (a3 and b3) augmentation produced segmentation outlines of sufficient quality for determining reference lines. There were no instances in the test data where the model failed to determine the knee reference lines
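The Fig. 4 caption lists "spiking" among the MRI-specific augmentations. One common way to emulate a spike artifact is to corrupt a single sample in k-space (the 2D Fourier transform of the image), which produces stripe patterns across the reconstructed image. The sketch below illustrates that general idea with NumPy only. It is not the augmentation code used in the study, and the function name and the `position` and `intensity` parameters are hypothetical.

```python
import numpy as np


def add_kspace_spike(image, position=(0.6, 0.6), intensity=2.0):
    """Emulate an MR spike artifact by corrupting one k-space sample.

    position:  fractional (row, col) location in centered k-space
               (hypothetical parameter for this sketch).
    intensity: spike magnitude as a multiple of the largest k-space
               magnitude (hypothetical parameter for this sketch).
    """
    image = np.asarray(image, dtype=float)
    # Move to k-space with the zero-frequency component at the center.
    kspace = np.fft.fftshift(np.fft.fft2(image))
    rows, cols = kspace.shape
    r = int(position[0] * (rows - 1))
    c = int(position[1] * (cols - 1))
    # Add a single outlier sample; this appears as stripes in image space.
    kspace[r, c] += intensity * np.abs(kspace).max()
    # Transform back and keep the magnitude, as in a magnitude MR image.
    corrupted = np.fft.ifft2(np.fft.ifftshift(kspace))
    return np.abs(corrupted)
```

Applied during training with randomly drawn `position` and `intensity` values, a transform of this kind exposes the network to artifact-like corruptions while the segmentation labels stay unchanged. With `intensity=0.0` the function returns the input image (for non-negative inputs), which is a convenient sanity check.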

References

    1. Shi Z, He L (2010) Application of neural networks in medical image processing. In: Proceedings of the second international symposium on networking and network security. Citeseer, pp 2–4
    2. Chung CB, Pathria MN, Resnick D (2024) MRI in MSK: is it the ultimate examination? Skelet Radiol 53:1727–1735. 10.1007/s00256-024-04601-x
    3. Budrys T, Veikutis V, Lukosevicius S et al (2018) Artifacts in magnetic resonance imaging: how it can really affect diagnostic image quality and confuse clinical diagnosis? J Vibroeng 20:1202–1213. 10.21595/jve.2018.19756
    4. Singh D, Chin M, Peh W (2014) Artifacts in musculoskeletal MR imaging. Semin Musculoskelet Radiol 18:012–022. 10.1055/s-0034-1365831
    5. Rafat Zand K, Reinhold C, Haider MA et al (2007) Artifacts and pitfalls in MR imaging of the pelvis. J Magn Reson Imaging 26:480–497