Bioengineering (Basel). 2024 Oct 13;11(10):1021.
doi: 10.3390/bioengineering11101021.

Convolutional Neural Network Incorporating Multiple Attention Mechanisms for MRI Classification of Lumbar Spinal Stenosis

Juncai Lin et al. Bioengineering (Basel).

Abstract

Background: Lumbar spinal stenosis (LSS) is a common cause of low back pain, especially in the elderly, and accurate diagnosis is critical for effective treatment. However, manual diagnosis using MRI images is time-consuming and subjective, leading to a need for automated methods.

Objective: This study aims to develop a convolutional neural network (CNN)-based deep learning model integrated with multiple attention mechanisms to improve the accuracy and robustness of LSS classification via MRI images.

Methods: The proposed model is trained on a standardized MRI dataset sourced from multiple institutions, encompassing various lumbar degenerative conditions. During preprocessing, techniques such as image normalization and data augmentation are employed to enhance the model's performance. The network incorporates a Multi-Head Self-Attention Module, a Slot Attention Module, and a Channel and Spatial Attention Module, each contributing to better feature extraction and classification.
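The preprocessing steps named above can be sketched as follows. This is a minimal NumPy illustration, not the authors' pipeline: the min-max normalization, the random-flip augmentation, and the image size are all assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def normalize(img: np.ndarray) -> np.ndarray:
    """Min-max normalize a single-channel MRI slice to [0, 1]."""
    lo, hi = img.min(), img.max()
    return (img - lo) / (hi - lo + 1e-8)

def augment(img: np.ndarray) -> np.ndarray:
    """One common augmentation choice: a random horizontal flip."""
    return img[:, ::-1] if rng.random() < 0.5 else img

# A synthetic 64 x 64 slice standing in for an MRI image.
slice_ = rng.integers(0, 4096, size=(64, 64)).astype(np.float64)
out = augment(normalize(slice_))
print(out.min() >= 0.0 and out.max() <= 1.0)  # True
```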

Results: The model achieved 95.2% classification accuracy, 94.7% precision, 94.3% recall, and 94.5% F1 score on the validation set. Ablation experiments confirmed the significant impact of the attention mechanisms in improving the model's classification capabilities.
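The reported F1 score is internally consistent with the reported precision and recall, as a quick check of the standard formula (not the authors' code) confirms:

```python
# F1 is the harmonic mean of precision and recall:
# F1 = 2 * P * R / (P + R).
precision = 0.947
recall = 0.943
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 3))  # 0.945, matching the reported 94.5%
```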

Conclusion: The integration of multiple attention mechanisms enhances the model's ability to accurately classify LSS in MRI images, demonstrating its potential as a tool for automated diagnosis. This study paves the way for future research in applying attention mechanisms to the automated diagnosis of lumbar spinal stenosis and other complex spinal conditions.

Keywords: attention mechanisms; deep learning; lumbar spinal stenosis; medical image analysis.


Conflict of interest statement

The authors declare no conflicts of interest.

Figures

Figure 1
Workflow of the MRI image classification system for lumbar spinal stenosis.
Figure 2
Workflow of dataset preprocessing.
Figure 3
Overall architecture of the proposed model. The model consists of three major parts: the head, body, and tail modules. The head module includes convolutional layers (Conv), Batch Normalization (BN), and Enhanced Inception Modules (EIM) for feature extraction, followed by Max-Pooling (Max-Pool) layers to downsample the feature maps. The body module incorporates four attention mechanisms: the Channel Attention Module (CAM), Spatial Attention Module (SPAM), Multi-Head Self-Attention Module (MHSAM), and Slot Attention Module (SAM), which collectively enhance feature selection and improve classification performance. Additionally, the Convolutional Block Attention Module (CBAM) combines the Channel Attention Module and the Spatial Attention Module to refine features in both channel and spatial dimensions. The tail module applies Global Average Pooling (GAP), fully connected (FC) layers, Layer Normalization (LN), and Dropout to refine the final classification output.
Figure 4
Structure of the Enhanced Inception Module. This module consists of multiple parallel paths for extracting features at various scales. Depth Separable Convolutions (DSC) of varying kernel sizes (1 × 1, 3 × 3, 5 × 5) are applied in parallel, along with a Max-Pooling (3 × 3) operation. The results from all branches are concatenated (Filter concatenation) before being passed through a 1 × 1 convolution, followed by Batch Normalization (BN) and a ReLU activation function. This structure allows for efficient multi-scale feature extraction while reducing computational complexity through depth-wise separable convolutions.
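The computational saving that motivates depthwise separable convolutions in this module can be illustrated with a parameter count. The channel sizes below are illustrative, not taken from the paper:

```python
def conv_params(k: int, c_in: int, c_out: int) -> int:
    """Parameters of a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out

def dsc_params(k: int, c_in: int, c_out: int) -> int:
    """Depthwise separable: a k x k depthwise convolution
    followed by a 1 x 1 pointwise convolution."""
    return k * k * c_in + c_in * c_out

# Example: a 3 x 3 convolution mapping 64 -> 128 channels.
print(conv_params(3, 64, 128))  # 73728
print(dsc_params(3, 64, 128))   # 8768, roughly an 8x reduction
```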
Figure 5
Structure of the Channel Attention Module (CAM). The CAM begins by applying Global Average Pooling (GAP) and Global Max Pooling (GMP) operations to the input feature map to capture channel-wise statistics. These pooled feature maps are then processed independently through convolutional layers followed by a ReLU activation function. The outputs from both pathways are summed and passed through another convolutional layer to generate the channel attention weights, which are multiplied with the original input feature map to refine it along the channel dimension, highlighting the most informative channels.
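The channel attention computation described in the caption can be sketched in NumPy. This is a simplified stand-in, not the authors' implementation: dense matrices replace the 1 x 1 convolutions, the sigmoid squashing and the reduction ratio are conventional assumptions, and the shapes are illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(x, w1, w2):
    """GAP and GMP over the spatial dims, a shared two-layer
    transform with ReLU in between, summed and squashed into
    per-channel weights. x: (C, H, W); w1: (C, C//r); w2: (C//r, C)."""
    gap = x.mean(axis=(1, 2))                 # (C,) average statistics
    gmp = x.max(axis=(1, 2))                  # (C,) max statistics
    avg_out = np.maximum(gap @ w1, 0) @ w2    # shared MLP, avg path
    max_out = np.maximum(gmp @ w1, 0) @ w2    # shared MLP, max path
    weights = sigmoid(avg_out + max_out)      # (C,) channel weights
    return x * weights[:, None, None]         # reweight channels

rng = np.random.default_rng(1)
x = rng.standard_normal((8, 4, 4))
w1 = rng.standard_normal((8, 2)) * 0.1
w2 = rng.standard_normal((2, 8)) * 0.1
y = channel_attention(x, w1, w2)
print(y.shape)  # (8, 4, 4) -- same shape, channels re-scaled
```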
Figure 6
Structure of the Spatial Attention Module (SPAM). The SPAM module first computes the average and max-pooling across the channel dimension of the input feature map. The resulting two spatial feature maps are concatenated along the channel axis, forming a combined representation of spatial information. This concatenated feature map is then passed through a convolutional layer followed by a sigmoid activation to generate spatial attention weights. These weights are multiplied with the input feature map, focusing the model’s attention on the most relevant spatial regions, thus improving feature localization for subsequent layers.
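The spatial attention path can be sketched similarly. In this simplified NumPy version, a learned 2-weight mix `w` stands in for the convolution over the concatenated maps; that substitution and the shapes are assumptions for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def spatial_attention(x, w):
    """Mean and max across the channel dimension, combined and
    squashed into a per-pixel attention map. x: (C, H, W); w: (2,)."""
    avg_map = x.mean(axis=0)                  # (H, W)
    max_map = x.max(axis=0)                   # (H, W)
    combined = w[0] * avg_map + w[1] * max_map
    attn = sigmoid(combined)                  # (H, W), values in (0, 1)
    return x * attn[None, :, :]               # reweight spatial positions

rng = np.random.default_rng(2)
x = rng.standard_normal((8, 4, 4))
y = spatial_attention(x, np.array([0.5, 0.5]))
print(y.shape)  # (8, 4, 4)
```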
Figure 7
Structure of the MHSAM. The MHSAM employs multi-head self-attention to enhance the model’s ability to focus on different aspects of the input feature representation. The input feature map is first linearly projected into query (Q), key (K), and value (V) matrices. Each of these matrices is split into multiple heads, which allows the model to attend to information at different positions simultaneously. The scaled dot-product attention (SDPA) is computed for each head, capturing the relationships between different spatial locations in the feature map. Finally, the outputs from all heads are concatenated and transformed through a fully connected (FC) layer to generate the refined feature representation, which is passed on to subsequent layers for further processing.
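The multi-head computation in the caption (project to Q, K, V, split into heads, scaled dot-product attention per head, concatenate, final projection) can be sketched in NumPy. Token count, dimension, and head count below are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, wq, wk, wv, wo, heads):
    """Self-attention over n tokens (flattened spatial positions),
    split across `heads` heads and re-merged.
    x: (n, d); wq/wk/wv/wo: (d, d); d must be divisible by heads."""
    n, d = x.shape
    dh = d // heads
    # Project, then reshape to (heads, n, dh) so heads attend in parallel.
    q = (x @ wq).reshape(n, heads, dh).transpose(1, 0, 2)
    k = (x @ wk).reshape(n, heads, dh).transpose(1, 0, 2)
    v = (x @ wv).reshape(n, heads, dh).transpose(1, 0, 2)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(dh)  # (heads, n, n)
    out = softmax(scores) @ v                        # (heads, n, dh)
    merged = out.transpose(1, 0, 2).reshape(n, d)    # concatenate heads
    return merged @ wo                               # output projection

rng = np.random.default_rng(3)
n, d = 16, 8  # e.g. a flattened 4 x 4 feature map with 8 channels
x = rng.standard_normal((n, d))
ws = [rng.standard_normal((d, d)) * 0.1 for _ in range(4)]
y = multi_head_self_attention(x, *ws, heads=2)
print(y.shape)  # (16, 8)
```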
Figure 8
Comparison of the proposed model with other models.
Figure 9
Confusion matrix for the DenseNet201 model. The matrix illustrates the classification performance of the DenseNet201 model, with 0 denoting normal or mild cases and 1 indicating severe cases. Furthermore, darker colors represent higher accuracy for the corresponding class.
Figure 10
Confusion matrix for the proposed model. This matrix presents the classification outcomes for the proposed model, with 0 representing normal or mild cases and 1 denoting severe cases. Compared to the DenseNet201 model (Figure 9), the proposed model demonstrates improved accuracy, particularly in reducing false positives for severe cases, suggesting its potential for more reliable clinical application. Additionally, darker colors represent a higher level of accuracy in classification.
Figure 11
ROC curves for DenseNet201 and proposed model across conditions.
Figure 12
Misclassified MRI images in lumbar spinal stenosis diagnosis: severe cases incorrectly labeled as normal/mild.
