Multicenter Study
Yonsei Med J. 2025 Aug;66(8):491-501.
doi: 10.3349/ymj.2024.0246.

Deep Learning-Based Landmark Detection Model for Multiple Foot Deformity Classification: A Dual-Center Study

Su Ji Lee et al. Yonsei Med J. 2025 Aug.

Abstract

Purpose: To introduce a heatmap-in-heatmap (HIH)-based model for automated diagnosis of foot deformities from weight-bearing foot radiographs, addressing the labor-intensive and variable nature of manual diagnosis.

Materials and methods: From January 2004 to September 2022, a dual-center retrospective study was conducted. In the first center, 1561 anterior-posterior (AP) and 1536 lateral images from 806 patients were used for model training, while 374 AP and 373 lateral images from 196 patients were allocated to the validation set. For external validation at the second center, 527 AP and 529 lateral images from 270 patients were allocated. Five deformities were diagnosed using four and three angles between the predicted landmarks in the AP and lateral images, respectively. The results were compared with those of the baseline model (FlatNet).
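
The angle-based diagnosis described above can be illustrated with a minimal sketch. It assumes each diagnostic angle is formed by two lines, each drawn through a pair of predicted landmarks. The function names, the landmark coordinates, and the 15-degree cutoff are hypothetical placeholders and are not taken from the paper, which defines its own angles and criteria.

```python
import math


def angle_between_lines(p1, p2, q1, q2):
    """Angle in degrees (0 to 180) between direction p1->p2 and direction q1->q2.

    Each argument is an (x, y) landmark coordinate in pixels.
    """
    ux, uy = p2[0] - p1[0], p2[1] - p1[1]
    vx, vy = q2[0] - q1[0], q2[1] - q1[1]
    dot = ux * vx + uy * vy
    cross = ux * vy - uy * vx
    # atan2 of |cross| and dot is numerically stabler than acos of the cosine.
    return math.degrees(math.atan2(abs(cross), dot))


def exceeds_cutoff(angle_deg, cutoff_deg):
    """Flag a measurement whose angle is above a cutoff (cutoff is an assumption)."""
    return angle_deg > cutoff_deg


# Hypothetical landmark coordinates, for illustration only.
first_axis = ((0.0, 0.0), (10.0, 0.0))
second_axis = ((0.0, 0.0), (10.0, 10.0))
angle = angle_between_lines(*first_axis, *second_axis)
flagged = exceeds_cutoff(angle, cutoff_deg=15.0)  # placeholder cutoff
```

Because the angle depends only on line directions, it is unaffected by image translation, which is one reason landmark-derived angles are convenient for this kind of classification.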

Results: The HIH model demonstrated robust performance in diagnosing multiple foot deformities. On the test set, it outperformed FlatNet in accuracy (FlatNet vs. HIH: 78.9% vs. 85.1%), sensitivity (78.9% vs. 84.1%), specificity (79.0% vs. 85.9%), positive predictive value (77.3% vs. 84.4%), and negative predictive value (80.5% vs. 85.7%). Additionally, HIH exhibited significantly lower absolute pixel and angle errors, lower normalized mean errors, a higher successful detection rate, faster training and inference speeds, and fewer parameters.
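
For readers less familiar with the five diagnostic metrics reported above, the following sketch shows their standard definitions in terms of confusion-matrix counts. The function name and the example counts are invented for illustration; they are not the study's data, and the sketch assumes every denominator is non-zero.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard binary diagnostic metrics from confusion-matrix counts.

    tp, fp, tn, fn: true positives, false positives, true negatives,
    false negatives. Assumes all denominators are non-zero.
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }


# Made-up counts, for illustration only.
example = diagnostic_metrics(tp=84, fp=14, tn=86, fn=16)
```

Sensitivity and specificity describe the model conditional on the true diagnosis, whereas the predictive values also depend on how common each deformity is in the evaluated cohort.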

Conclusion: The HIH model showed robust performance in diagnosing multiple foot deformities with high efficacy in internal and external validation. Our approach is expected to be effective for various tasks using landmarks in medical imaging.

Keywords: Artificial intelligence; deep learning; diagnostic imaging; foot deformities.

Conflict of interest statement

The authors have no potential conflicts of interest to disclose.

Figures

Fig. 1. Flowchart of study design and dataset construction. AP, anterior-posterior.
Fig. 2. Overview of radiographic landmarks and diagnostic angles. A total of 16 (A1–A16) and 11 (L1–L11) landmarks were annotated in the AP (A) and lateral (B) images, respectively. Four angles in the AP images and three angles in the lateral images were then measured. Anatomical descriptions of the landmarks are given in Supplementary Fig. 1 (only online). AP, anterior-posterior.
Fig. 3. Overall framework of our proposed approach. (A) Two models are trained to predict AP and lateral landmarks, and the five foot deformities are diagnosed using the angles between the landmarks. Criteria for non-adults are described in Supplementary Table 1 (only online). (B) Overall framework of our HIH model. The model outputs two types of heatmaps (integer and decimal heatmaps) for N landmarks. During training, two losses (regression and classification loss terms) are used. AP, anterior-posterior; HIH, heatmap-in-heatmap.
Fig. 4. Prediction results between the two models in AP and lateral images from the same patient. The ground truth and predicted points are colored blue and red, respectively. (A) In the AP image, our model predicts all landmarks better overall. (B) In the lateral image, the predictions of the two models are similar in most cases. However, the outputs of FlatNet have an outlier with a large pixel error (yellow arrow). AP, anterior-posterior.
Fig. 5. (A) Normalized mean error for the landmarks in AP and lateral view images of the validation set. (B) Normalized mean error for the landmarks in AP and lateral view images of the test set. A lower NME value indicates a smaller pixel distance between the ground truth and predicted coordinates. NME, normalized mean error; AP, anterior-posterior.
Fig. 6. Comparison of the computational efficiency between the two models. All the measurements were conducted using a single NVIDIA TITAN Xp 12GB GPU, in lateral images.
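
The normalized mean error shown in Fig. 5 and the successful detection rate mentioned in the Results can be sketched as follows. This is a minimal illustration under stated assumptions: the normalization factor `norm` and the detection radius `radius` are caller-supplied placeholders, since the abstract does not state which values the authors used, and the function names and coordinates are hypothetical.

```python
import math


def normalized_mean_error(pred, truth, norm):
    """Mean Euclidean distance between predicted and true landmarks, divided by norm.

    pred, truth: equal-length sequences of (x, y) pixel coordinates.
    norm: normalization factor in pixels (an assumption; e.g. an image dimension).
    """
    distances = [math.dist(p, t) for p, t in zip(pred, truth)]
    return sum(distances) / len(distances) / norm


def successful_detection_rate(pred, truth, radius):
    """Fraction of landmarks predicted within `radius` pixels of the ground truth."""
    hits = sum(1 for p, t in zip(pred, truth) if math.dist(p, t) <= radius)
    return hits / len(pred)


# Hypothetical coordinates, for illustration only.
predicted = [(0.0, 0.0), (3.0, 4.0)]
ground_truth = [(0.0, 0.0), (0.0, 0.0)]
nme = normalized_mean_error(predicted, ground_truth, norm=10.0)
sdr = successful_detection_rate(predicted, ground_truth, radius=2.0)
```

Dividing by a normalization factor makes errors comparable across images of different resolution, while the detection rate summarizes how often a prediction lands within a tolerance rather than how far off it is on average.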

References

    1. Yildiz K, Cetin T. Interobserver reliability in the radiological evaluation of flatfoot (pes planus) deformity: a cross-sectional study. J Foot Ankle Surg. 2022;61:1065–1070.
    2. Ryu SM, Shin K, Shin SW, Lee S, Kim N. Enhancement of evaluating flatfoot on a weight-bearing lateral radiograph of the foot with U-Net based semantic segmentation on the long axis of tarsal and metatarsal bones in an active learning manner. Comput Biol Med. 2022;145:105400.
    3. Ryu SM, Shin K, Shin SW, Lee SH, Seo SM, Cheon SU, et al. Automated landmark identification for diagnosis of the deformity using a cascade convolutional neural network (FlatNet) on weight-bearing lateral radiographs of the foot. Comput Biol Med. 2022;148:105914.
    4. Koo J, Hwang S, Han SH, Lee J, Lee HS, Park G, et al. Deep learning-based tool affects reproducibility of pes planus radiographic assessment. Sci Rep. 2022;12:12891.
    5. Ryu SM, Shin K, Shin SW, Lee SH, Seo SM, Cheon SU, et al. Automated diagnosis of flatfoot using cascaded convolutional neural network for angle measurements in weight-bearing lateral radiographs. Eur Radiol. 2023;33:4822–4832.