NPJ Digit Med. 2024 Oct 26;7(1):303. doi: 10.1038/s41746-024-01301-7.

Clinically applicable optimized periprosthetic joint infection diagnosis via AI based pathology

Ye Tao et al. NPJ Digit Med. 2024.

Abstract

Periprosthetic joint infection (PJI) is a severe complication after joint replacement surgery that demands precise diagnosis for effective treatment. We enhanced PJI diagnostic accuracy through three steps: (1) developing a self-supervised PJI model with DINO v2 to create a large dataset; (2) comparing multiple intelligent models to identify the best one; and (3) using the optimal model for visual analysis to refine diagnostic practices. The self-supervised model generated 27,724 training samples and achieved a perfect AUC of 1, indicating flawless case differentiation. EfficientNet v2-S outperformed CAMEL2 at the image level, while CAMEL2 was superior at the patient level. By using the weakly supervised PJI model to adjust diagnostic criteria, we reduced the required high-power field diagnoses per slide from five to three. These findings demonstrate AI's potential to improve the accuracy and standardization of PJI pathology and have significant implications for infectious disease diagnostics.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1
Fig. 1. The flowchart illustrating the study design.
Purple arrows indicate input, black arrows indicate output, flames represent trainable components, and locks denote testing-only components. a Data processing: WSI datasets were segmented into 600 × 600-pixel patches and divided into sets for DINO v2 training, testing, and additional training. b Self-supervised model and augmentation: b1 Pathological images were used to train the DINO v2 model. b2 The DINO v2 backbone extracted features, and only the fully connected layer was trained. b3 Test data were held out for evaluation, with the additional data used for the self-supervised task. b4 Self-supervised model testing results. c Multi-model training: c1 Expert-reviewed data were used to train various models. c2 and c3 Each model was optimized, tested, and compared.
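As a concrete illustration of the patch-extraction step in panel a, the following is a minimal sketch of tiling a WSI into 600 × 600-pixel patches, assuming the openslide-python package; the file path and the absence of any tissue-masking filter are simplifications, not the authors' actual preprocessing code.

```python
# Minimal WSI tiling sketch (assumed: openslide-python; no tissue filtering).
import openslide

def tile_wsi(path, patch=600):
    slide = openslide.OpenSlide(path)
    width, height = slide.dimensions                    # level-0 dimensions in pixels
    for y in range(0, height - patch + 1, patch):
        for x in range(0, width - patch + 1, patch):
            region = slide.read_region((x, y), 0, (patch, patch)).convert("RGB")
            yield (x, y), region                         # PIL patch, e.g. input for DINO v2 or a classifier

# for (x, y), patch_img in tile_wsi("slide_001.svs"):   # hypothetical file name
#     patch_img.save(f"patches/{x}_{y}.png")
```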
Fig. 2
Fig. 2. The ROC curves for ResNet-50, CNN, and MobileNet v3.
a, b, and c show the ROC curves for ResNet-50, CNN, and MobileNet v3, respectively. The x-axis represents 1-specificity and the y-axis represents sensitivity, so each curve traces how sensitivity rises as 1-specificity increases. The AUC (area under the curve) is close to 1 for each model, indicating high diagnostic performance.
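As a pointer to how such curves are typically computed, below is a minimal sketch using scikit-learn's roc_curve and auc, assuming per-image PJI probabilities and ground-truth labels are available as arrays; the values shown are hypothetical, not data from the study.

```python
# Minimal ROC/AUC sketch (assumed inputs: hypothetical labels and scores).
import numpy as np
from sklearn.metrics import roc_curve, auc

y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0])                     # ground-truth PJI labels (hypothetical)
y_score = np.array([0.1, 0.3, 0.8, 0.7, 0.9, 0.2, 0.6, 0.4])    # model probabilities (hypothetical)

fpr, tpr, _ = roc_curve(y_true, y_score)                         # fpr = 1 - specificity, tpr = sensitivity
print("AUC:", auc(fpr, tpr))                                     # area under the ROC curve
```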
Fig. 3
Fig. 3. The performance of the PJI supervised and weakly supervised learning models.
s- refers to the corresponding test results of the PJI supervised learning model, and w- refers to the corresponding test results of the PJI weakly supervised learning model. The red line represents the ROC curve of the PJI supervised learning model, and the blue line represents the ROC curve of the PJI weakly supervised learning model. a Image-level comparison of sensitivity and specificity. b Patient-level comparison of sensitivity and specificity. c Image-level accuracy, recall, and F1 score of the models. d Patient-level accuracy, recall, and F1 score of the models. e Image-level ROC curves for the two models. f Patient-level ROC curves for the two models. g The degree of data dispersion at the image level. The weakly supervised model has a mean ± standard deviation of 0.03433 ± 0.02211 for the negative set and 0.2059 ± 0.05993 for the positive set; the supervised model has 0.03780 ± 0.02328 and 0.2614 ± 0.1009, respectively. h Loss curves for the PJI supervised learning model. i Loss curves for the PJI weakly supervised learning model.
Fig. 4
Fig. 4. The human–machine comparison test result.
1/2/3 correspond to the diagnostic results of Experts 1/2/3, indicated on the horizontal axis; the symbols s/w represent the diagnostic results of the PJI supervised and weakly supervised models, indicated on the vertical axis. a, b, and c show the confusion matrix results comparing PJI supervised models with Experts 1/2/3, while d, e, and f present the confusion matrix results comparing PJI weakly supervised models with Experts 1/2/3. The darker the red, the larger the number. The top-left and bottom-right squares represent areas where the experts’ diagnoses and the model’s diagnoses are the same, while the other squares represent areas where the diagnoses differ.
Fig. 5
Fig. 5. The visual differences between the supervised learning (s-model) and weakly supervised learning (w-model) models.
The three-dimensional data points formed by the w-model lie notably farther from the coordinate origin, whereas those of the s-model lie closer to it. This indicates that, on average, the w-model outperforms the s-model in accuracy, completeness, and reliability, and therefore yields superior visualization results.
Fig. 6
Fig. 6. The visual outcomes of the PJI intelligent pathological diagnosis model.
From left to right, the images represent a whole slide image, a visualization heatmap of the PJI supervised learning model, and a visualization heatmap of the PJI weakly supervised learning model. The color gradient from light to dark indicates diagnostic weight from low to high. a The tissue shows not only an aggregation of neutrophils but also a loss of its original structure, becoming more porous. b Differences in the cytoplasm and nuclear morphology of neutrophils are observed, depending on their proximity to blood vessels (indicated by the direction of the red arrow).
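For orientation, below is a minimal sketch of how a per-patch score grid can be overlaid on a slide thumbnail to produce such a heatmap, assuming matplotlib; the arrays and colormap are placeholders, not the authors' visualization pipeline.

```python
# Minimal heatmap-overlay sketch (assumed: placeholder thumbnail and patch scores).
import numpy as np
import matplotlib.pyplot as plt

thumbnail = np.random.rand(512, 512, 3)           # placeholder for the WSI thumbnail
patch_scores = np.random.rand(16, 16)             # placeholder per-patch diagnostic weights

plt.imshow(thumbnail)
plt.imshow(patch_scores, cmap="Reds", alpha=0.5,
           extent=(0, thumbnail.shape[1], thumbnail.shape[0], 0))  # stretch the score grid over the slide
plt.axis("off")
plt.colorbar(label="diagnostic weight")
plt.show()
```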
Fig. 7
Fig. 7. Architecture of PJI supervised learning model.
Using EfficientNet v2-S as the backbone, the model begins with a convolutional layer (conv 3 × 3) with a stride of 2, followed by a series of Fused-MBConv and MBConv blocks, where strides vary between 1 and 2. Some blocks include SE (Squeeze-and-Excitation) ratios. The network concludes with a conv 1 × 1 layer, followed by pooling and a fully connected layer, resulting in an output of 1280 channels. The model is implemented in TensorFlow with the Adam optimizer, and its weights are evaluated against the validation set every 100 steps.
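As a rough illustration of this architecture, the following is a minimal Keras sketch that attaches a binary classification head to the built-in EfficientNetV2-S backbone and compiles it with Adam, as described above; the input size, head, learning rate, and validation schedule are assumptions rather than the authors' exact configuration.

```python
# Minimal sketch of the supervised patch classifier (assumed: TF/Keras built-in backbone).
import tensorflow as tf

def build_pji_supervised_model(input_shape=(600, 600, 3)):
    # EfficientNetV2-S backbone: conv 3x3 stem, Fused-MBConv/MBConv blocks,
    # final conv 1x1 producing 1280 channels, then global average pooling.
    backbone = tf.keras.applications.EfficientNetV2S(
        include_top=False, weights="imagenet",
        input_shape=input_shape, pooling="avg")
    # Binary head: PJI-positive vs PJI-negative patch.
    outputs = tf.keras.layers.Dense(1, activation="sigmoid")(backbone.output)
    model = tf.keras.Model(backbone.input, outputs)
    model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
                  loss="binary_crossentropy",
                  metrics=[tf.keras.metrics.AUC()])
    return model

model = build_pji_supervised_model()
```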
Fig. 8
Fig. 8. Architecture of the PJI weakly supervised learning model.
The model has three components: cMIL, Label Enrichment, and Segmentation. cMIL performs fine-grained segmentation, Label Enrichment extends the image data, and Segmentation re-segments the image. Using CAMEL2, we transform coarse-grained labeling into a fine-grained classification task, generating pseudo-labels and applying Multi-Instance Learning (MIL) to create an instance-level dataset. The images are divided into grids that share label information with the entire image. Positive and negative samples form patch-level pairs, with images expanded to 2048 × 2048 pixels, segmented into 256 × 256 grid instances, and processed with softmax to obtain probabilities. In negative samples, instances inherit a label of 0, while in positive samples, the top K% of confident instances are selected as positive. Cross-entropy loss updates the model during backpropagation.
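The pseudo-labelling step can be sketched as follows, assuming PyTorch and a generic patch classifier; the top-K fraction, tensor shapes, and function names are illustrative and do not reproduce CAMEL2 itself.

```python
# Minimal MIL pseudo-labelling sketch (assumed: PyTorch, stand-in patch classifier).
import torch
import torch.nn.functional as F

def mil_pseudo_labels(instance_logits, bag_label, top_k_frac=0.2):
    """instance_logits: (N, 2) logits for the N grid instances of one image-level bag."""
    probs = F.softmax(instance_logits, dim=1)[:, 1]       # P(instance is PJI-positive)
    labels = torch.zeros_like(probs, dtype=torch.long)    # negative bags: all instances inherit label 0
    if bag_label == 1:
        k = max(1, int(top_k_frac * probs.numel()))
        labels[probs.topk(k).indices] = 1                 # top-K% most confident instances become positive
    return labels

def mil_step(model, instances, bag_label, optimizer):
    logits = model(instances)                             # (N, 2) instance-level logits
    with torch.no_grad():
        pseudo = mil_pseudo_labels(logits, bag_label)
    loss = F.cross_entropy(logits, pseudo)                # cross-entropy drives backpropagation
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 256 * 256, 2))  # stand-in classifier
optim = torch.optim.SGD(model.parameters(), lr=1e-3)
instances = torch.randn(64, 3, 256, 256)                  # 64 grid instances from one 2048 x 2048 image
mil_step(model, instances, bag_label=1, optimizer=optim)
```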
Fig. 9
Fig. 9. Architecture of PJI self-supervised learning model.
A teacher–student model structure with different data augmentations is used, where the teacher model's weights are updated as an exponential moving average (EMA) of the student model's weights. Both networks feature a ViT backbone, a projection head, and a temperature softmax. DINO v2 introduces patch tokens and masking, with the student network projecting masked views and the teacher network projecting unmasked views. The training objective of iBOT is defined on this setup. The DINO v2 model learns representations of unlabeled pathological sections through its self-supervised learning loss function.
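A minimal sketch of the teacher–student self-distillation idea described above, assuming PyTorch; the stand-in backbone, temperatures, centering, and EMA momentum are placeholders rather than the actual DINO v2 training code.

```python
# Minimal teacher-student self-distillation sketch (assumed: PyTorch, toy backbone).
import copy
import torch
import torch.nn.functional as F

def dino_loss(student_logits, teacher_logits, t_s=0.1, t_t=0.04, center=0.0):
    # Teacher targets: centered and sharpened with a lower temperature, no gradient.
    with torch.no_grad():
        targets = F.softmax((teacher_logits - center) / t_t, dim=-1)
    log_p = F.log_softmax(student_logits / t_s, dim=-1)
    return -(targets * log_p).sum(dim=-1).mean()           # cross-entropy between the two views

@torch.no_grad()
def ema_update(teacher, student, momentum=0.996):
    # Teacher weights are an exponential moving average of the student weights.
    for p_t, p_s in zip(teacher.parameters(), student.parameters()):
        p_t.mul_(momentum).add_(p_s, alpha=1.0 - momentum)

student = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 224 * 224, 256))  # stand-in for ViT + head
teacher = copy.deepcopy(student)
view_a, view_b = torch.randn(8, 3, 224, 224), torch.randn(8, 3, 224, 224)   # two augmented views
loss = dino_loss(student(view_a), teacher(view_b))
loss.backward()
ema_update(teacher, student)
```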
