Bioengineering (Basel). 2024 Oct 13;11(10):1021.
doi: 10.3390/bioengineering11101021.

Convolutional Neural Network Incorporating Multiple Attention Mechanisms for MRI Classification of Lumbar Spinal Stenosis

Juncai Lin et al. Bioengineering (Basel).

Abstract

Background: Lumbar spinal stenosis (LSS) is a common cause of low back pain, especially in the elderly, and accurate diagnosis is critical for effective treatment. However, manual diagnosis using MRI images is time-consuming and subjective, leading to a need for automated methods.

Objective: This study aims to develop a convolutional neural network (CNN)-based deep learning model integrated with multiple attention mechanisms to improve the accuracy and robustness of LSS classification via MRI images.

Methods: The proposed model is trained on a standardized MRI dataset sourced from multiple institutions, encompassing various lumbar degenerative conditions. During preprocessing, techniques such as image normalization and data augmentation are employed to enhance the model's performance. The network incorporates a Multi-Head Self-Attention Module, a Slot Attention Module, and a Channel and Spatial Attention Module, each contributing to better feature extraction and classification.
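The preprocessing steps named above can be sketched as follows. This is a minimal NumPy illustration, not the authors' pipeline: the min-max normalization, the random-flip augmentation, and the image size are all assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def normalize(img: np.ndarray) -> np.ndarray:
    """Min-max normalize a single-channel MRI slice to [0, 1]."""
    lo, hi = img.min(), img.max()
    return (img - lo) / (hi - lo + 1e-8)

def augment(img: np.ndarray) -> np.ndarray:
    """One common augmentation choice: a random horizontal flip."""
    return img[:, ::-1] if rng.random() < 0.5 else img

# A synthetic 64 x 64 slice standing in for an MRI image.
slice_ = rng.integers(0, 4096, size=(64, 64)).astype(np.float64)
out = augment(normalize(slice_))
print(out.min() >= 0.0 and out.max() <= 1.0)  # True
```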

Results: The model achieved 95.2% classification accuracy, 94.7% precision, 94.3% recall, and 94.5% F1 score on the validation set. Ablation experiments confirmed the significant impact of the attention mechanisms in improving the model's classification capabilities.
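The reported F1 score is internally consistent with the reported precision and recall, as a quick check of the standard formula (not the authors' code) confirms:

```python
# F1 is the harmonic mean of precision and recall:
# F1 = 2 * P * R / (P + R).
precision = 0.947
recall = 0.943
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 3))  # 0.945, matching the reported 94.5%
```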

Conclusion: The integration of multiple attention mechanisms enhances the model's ability to accurately classify LSS in MRI images, demonstrating its potential as a tool for automated diagnosis. This study paves the way for future research in applying attention mechanisms to the automated diagnosis of lumbar spinal stenosis and other complex spinal conditions.

Keywords: attention mechanisms; deep learning; lumbar spinal stenosis; medical image analysis.


Conflict of interest statement

The authors declare no conflicts of interest.

Figures

Figure 1
Workflow of the MRI image classification system for lumbar spinal stenosis.
Figure 2
Workflow of dataset preprocessing.
Figure 3
Overall architecture of the proposed model. The model consists of three major parts: the head, body, and tail modules. The head module includes convolutional layers (Conv), Batch Normalization (BN), and Enhanced Inception Modules (EIM) for feature extraction, followed by Max-Pooling (Max-Pool) layers to downsample the feature maps. The body module incorporates four attention mechanisms: the Channel Attention Module (CAM), Spatial Attention Module (SPAM), Multi-Head Self-Attention Module (MHSAM), and Slot Attention Module (SAM), which collectively enhance feature selection and improve classification performance. Additionally, the Convolutional Block Attention Module (CBAM) combines the Channel Attention Module and the Spatial Attention Module to refine features in both channel and spatial dimensions. The tail module applies Global Average Pooling (GAP), fully connected (FC) layers, Layer Normalization (LN), and Dropout to refine the final classification output.
Figure 4
Structure of the Enhanced Inception Module. This module consists of multiple parallel paths for extracting features at various scales. Depth Separable Convolutions (DSC) of varying kernel sizes (1 × 1, 3 × 3, 5 × 5) are applied in parallel, along with a Max-Pooling (3 × 3) operation. The results from all branches are concatenated (Filter concatenation) before being passed through a 1 × 1 convolution, followed by Batch Normalization (BN) and a ReLU activation function. This structure allows for efficient multi-scale feature extraction while reducing computational complexity through depth-wise separable convolutions.
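The computational saving that motivates depthwise separable convolutions in this module can be illustrated with a parameter count. The channel sizes below are illustrative, not taken from the paper:

```python
def conv_params(k: int, c_in: int, c_out: int) -> int:
    """Parameters of a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out

def dsc_params(k: int, c_in: int, c_out: int) -> int:
    """Depthwise separable: a k x k depthwise convolution
    followed by a 1 x 1 pointwise convolution."""
    return k * k * c_in + c_in * c_out

# Example: a 3 x 3 convolution mapping 64 -> 128 channels.
print(conv_params(3, 64, 128))  # 73728
print(dsc_params(3, 64, 128))   # 8768, roughly an 8x reduction
```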
Figure 5
Structure of the Channel Attention Module (CAM). The CAM begins by applying Global Average Pooling (GAP) and Global Max Pooling (GMP) operations to the input feature map to capture channel-wise statistics. These pooled feature maps are then processed independently through convolutional layers followed by a ReLU activation function. The outputs from both pathways are summed and passed through another convolutional layer to generate the channel attention weights, which are multiplied with the original input feature map to refine it along the channel dimension, highlighting the most informative channels.
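The channel attention computation described in the caption can be sketched in NumPy. This is a simplified stand-in, not the authors' implementation: dense matrices replace the 1 x 1 convolutions, the sigmoid squashing and the reduction ratio are conventional assumptions, and the shapes are illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(x, w1, w2):
    """GAP and GMP over the spatial dims, a shared two-layer
    transform with ReLU in between, summed and squashed into
    per-channel weights. x: (C, H, W); w1: (C, C//r); w2: (C//r, C)."""
    gap = x.mean(axis=(1, 2))                 # (C,) average statistics
    gmp = x.max(axis=(1, 2))                  # (C,) max statistics
    avg_out = np.maximum(gap @ w1, 0) @ w2    # shared MLP, avg path
    max_out = np.maximum(gmp @ w1, 0) @ w2    # shared MLP, max path
    weights = sigmoid(avg_out + max_out)      # (C,) channel weights
    return x * weights[:, None, None]         # reweight channels

rng = np.random.default_rng(1)
x = rng.standard_normal((8, 4, 4))
w1 = rng.standard_normal((8, 2)) * 0.1
w2 = rng.standard_normal((2, 8)) * 0.1
y = channel_attention(x, w1, w2)
print(y.shape)  # (8, 4, 4) -- same shape, channels re-scaled
```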
Figure 6
Structure of the Spatial Attention Module (SPAM). The SPAM module first computes the average and max-pooling across the channel dimension of the input feature map. The resulting two spatial feature maps are concatenated along the channel axis, forming a combined representation of spatial information. This concatenated feature map is then passed through a convolutional layer followed by a sigmoid activation to generate spatial attention weights. These weights are multiplied with the input feature map, focusing the model’s attention on the most relevant spatial regions, thus improving feature localization for subsequent layers.
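The spatial attention path can be sketched similarly. In this simplified NumPy version, a learned 2-weight mix `w` stands in for the convolution over the concatenated maps; that substitution and the shapes are assumptions for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def spatial_attention(x, w):
    """Mean and max across the channel dimension, combined and
    squashed into a per-pixel attention map. x: (C, H, W); w: (2,)."""
    avg_map = x.mean(axis=0)                  # (H, W)
    max_map = x.max(axis=0)                   # (H, W)
    combined = w[0] * avg_map + w[1] * max_map
    attn = sigmoid(combined)                  # (H, W), values in (0, 1)
    return x * attn[None, :, :]               # reweight spatial positions

rng = np.random.default_rng(2)
x = rng.standard_normal((8, 4, 4))
y = spatial_attention(x, np.array([0.5, 0.5]))
print(y.shape)  # (8, 4, 4)
```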
Figure 7
Structure of the MHSAM. The MHSAM employs multi-head self-attention to enhance the model’s ability to focus on different aspects of the input feature representation. The input feature map is first linearly projected into query (Q), key (K), and value (V) matrices. Each of these matrices is split into multiple heads, which allows the model to attend to information at different positions simultaneously. The scaled dot-product attention (SDPA) is computed for each head, capturing the relationships between different spatial locations in the feature map. Finally, the outputs from all heads are concatenated and transformed through a fully connected (FC) layer to generate the refined feature representation, which is passed on to subsequent layers for further processing.
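The multi-head computation in the caption (project to Q, K, V, split into heads, scaled dot-product attention per head, concatenate, final projection) can be sketched in NumPy. Token count, dimension, and head count below are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, wq, wk, wv, wo, heads):
    """Self-attention over n tokens (flattened spatial positions),
    split across `heads` heads and re-merged.
    x: (n, d); wq/wk/wv/wo: (d, d); d must be divisible by heads."""
    n, d = x.shape
    dh = d // heads
    # Project, then reshape to (heads, n, dh) so heads attend in parallel.
    q = (x @ wq).reshape(n, heads, dh).transpose(1, 0, 2)
    k = (x @ wk).reshape(n, heads, dh).transpose(1, 0, 2)
    v = (x @ wv).reshape(n, heads, dh).transpose(1, 0, 2)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(dh)  # (heads, n, n)
    out = softmax(scores) @ v                        # (heads, n, dh)
    merged = out.transpose(1, 0, 2).reshape(n, d)    # concatenate heads
    return merged @ wo                               # output projection

rng = np.random.default_rng(3)
n, d = 16, 8  # e.g. a flattened 4 x 4 feature map with 8 channels
x = rng.standard_normal((n, d))
ws = [rng.standard_normal((d, d)) * 0.1 for _ in range(4)]
y = multi_head_self_attention(x, *ws, heads=2)
print(y.shape)  # (16, 8)
```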
Figure 8
Comparison of the proposed model with other models.
Figure 9
Confusion matrix for the DenseNet201 model. The matrix illustrates the classification performance of the DenseNet201 model, with 0 denoting normal or mild cases and 1 indicating severe cases. Furthermore, darker colors represent higher accuracy for the corresponding class.
Figure 10
Confusion matrix for the proposed model. This matrix presents the classification outcomes for the proposed model, with 0 representing normal or mild cases and 1 denoting severe cases. Compared to the DenseNet201 model (Figure 9), the proposed model demonstrates improved accuracy, particularly in reducing false positives for severe cases, suggesting its potential for more reliable clinical application. Additionally, darker colors represent a higher level of accuracy in classification.
Figure 11
ROC curves for DenseNet201 and proposed model across conditions.
Figure 12
Misclassified MRI images in lumbar spinal stenosis diagnosis: severe cases incorrectly labeled as normal/mild.
