Patterns (N Y). 2022 Sep 29;3(10):100592.
doi: 10.1016/j.patter.2022.100592. eCollection 2022 Oct 14.

RATING: Medical knowledge-guided rheumatoid arthritis assessment from multimodal ultrasound images via deep learning

Zhanping Zhou et al. Patterns (N Y).

Abstract

Multimodal ultrasound has demonstrated its power in the clinical assessment of rheumatoid arthritis (RA). However, interpreting it demands substantial experience from radiologists. In this paper, we propose a rheumatoid arthritis knowledge-guided (RATING) system, based on deep learning, that automatically scores RA activity and generates interpretable features to assist radiologists' decision-making. RATING leverages the complementary advantages of multimodal ultrasound images and addresses the limited-training-data problem with self-supervised pretraining. RATING outperforms the existing methods it was compared against, achieving an accuracy of 86.1% on a prospective test dataset and 85.0% on an external test dataset. A reader study demonstrates that the RATING system improves the average accuracy of 10 radiologists from 41.4% to 64.0%. As an assistive tool, RATING can not only indicate possible lesions and enhance diagnostic performance with multimodal ultrasound but also light the way toward human-machine collaboration in healthcare.

Keywords: deep learning; human-machine collaboration; multimodal ultrasound; rheumatoid arthritis.

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1. Build of the RATING system for RA scoring
Paired GSUS and Doppler US images were collected for the training dataset, the prospective test dataset, and the external test dataset. ROIs were then annotated and scored according to the EOSS system. During model development, the models of the RATING system were trained on the ROIs of the US images and the corresponding labels. During model inference, for each pair of GSUS and Doppler US images, the RATING system predicts the synovial hypertrophy score, the vascularity score, and the combined score, and generates heatmaps of the US images. Performance was evaluated on the prospective test dataset and the external test dataset. When the system is used as an assistance tool, the score predictions and the heatmaps of the US images are presented to the radiologist.
Figure 2. The performance of the RATING system in the classification of the combined score
(A) The RATING system achieved accuracy = 86.1% (95% CI = 82.5%–90.1%) on the prospective test dataset, higher than the ablation methods and the existing methods.
(B) The RATING system achieved accuracy = 85.0% (95% CI = 80.5%–89.1%) on the external test dataset, higher than the ablation methods and the existing methods.
Error bars indicate 95% confidence intervals.
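
The caption above reports accuracy together with a 95% confidence interval but does not say how the interval was computed. One common choice is the percentile bootstrap; the sketch below illustrates that technique on toy labels. The function names, the number of resamples, and the toy data are assumptions for illustration, not the authors' procedure.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the ground-truth labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def bootstrap_accuracy_ci(y_true, y_pred, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for accuracy.

    Resamples (label, prediction) pairs with replacement, recomputes the
    accuracy on each resample, and returns the alpha/2 and 1 - alpha/2
    percentiles of the resampled accuracies.
    """
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(sum(y_true[i] == y_pred[i] for i in idx) / n)
    stats.sort()
    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper


# Toy four-grade labels (0-3); these are NOT data from the study.
toy_true = [0, 1, 2, 3, 1, 2, 0, 3, 2, 1]
toy_pred = [0, 1, 2, 3, 1, 1, 0, 3, 2, 2]

print("accuracy:", accuracy(toy_true, toy_pred))
print("95% CI:", bootstrap_accuracy_ci(toy_true, toy_pred))
```

With only ten toy samples the interval is very wide; on test sets of realistic size it narrows accordingly.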
Figure 3. Superiority of the GS-Doppler feature fusion network and MULTITUDE
In each GSUS image, the boundary of the synovial hypertrophy area is annotated in orange. The numbers in green are ground truth scores. Green solid-line rectangles mark correct predictions, while red dashed-line rectangles mark incorrect predictions.
(A) A sample of combined score grade 2 was underestimated as grade 1 using only the GSUS image. With the aid of synovial hypervascularity information in the Doppler US image, the RATING system made the correct prediction.
(B) A sample of combined score grade 1 was underestimated as grade 0 using only the Doppler US image. With the overall morphological changes of synovial hypertrophy in the GSUS image, the RATING system made the correct prediction.
(C) A sample of combined score grade 0 was incorrectly predicted as grade 1 by a custom majority voting ensemble. MULTITUDE excluded the invalid score combination and led to the correct prediction.
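
Panel (C) describes excluding an invalid combination of the three scores. The caption does not specify how MULTITUDE does this, so the following is only a generic sketch of the idea: given per-score probability vectors, pick the most probable triple among those a validity rule allows. The function `most_probable_valid_triple`, the rule `toy_rule`, and all probabilities are hypothetical; in particular, `toy_rule` is a placeholder and not the EOSS definition.

```python
from itertools import product


def most_probable_valid_triple(p_sh, p_vasc, p_comb, is_valid):
    """Return the (hypertrophy, vascularity, combined) grades with the
    highest joint probability among triples accepted by `is_valid`.

    The joint probability is taken as the product of the three per-score
    probabilities, which treats the three predictions as independent
    (an assumption of this sketch).
    """
    best, best_score = None, -1.0
    grades = product(range(len(p_sh)), range(len(p_vasc)), range(len(p_comb)))
    for sh, vasc, comb in grades:
        if not is_valid(sh, vasc, comb):
            continue  # skip combinations the scoring rules do not allow
        score = p_sh[sh] * p_vasc[vasc] * p_comb[comb]
        if score > best_score:
            best, best_score = (sh, vasc, comb), score
    return best


def toy_rule(sh, vasc, comb):
    """Placeholder validity rule (assumption): combined == max of the others."""
    return comb == max(sh, vasc)


# Toy probabilities for grades 0-3; NOT model outputs from the study.
p_sh = [0.6, 0.3, 0.1, 0.0]
p_vasc = [0.7, 0.2, 0.1, 0.0]
p_comb = [0.45, 0.55, 0.0, 0.0]

# Taking each argmax separately gives (0, 0, 1), which toy_rule rejects.
print(most_probable_valid_triple(p_sh, p_vasc, p_comb, toy_rule))
```

In this toy case the independent per-score picks disagree with each other, and restricting the search to valid triples resolves the conflict in favor of combined grade 0, mirroring the situation the caption describes.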
Figure 4. Examples of heatmap visualization
In each GSUS image, the boundary of the synovial hypertrophy area is annotated in orange, and the boundary of the joint effusion area is annotated in red. In each heatmap overlay image, the heatmap is colorized in yellow and overlaid on the original US image.
(A) A sample whose synovial hypertrophy score is 1, vascularity score is 0, and combined score is 1. The joint effusion areas are highlighted in the heatmaps of both GSUS and Doppler US images.
(B) A sample whose synovial hypertrophy score is 1, vascularity score is 2, and combined score is 2. The synovial hypertrophy area near the bone surface is highlighted in the GSUS image and the Doppler US image, and the blood flow areas are highlighted in the Doppler US image.
(C) A sample whose synovial hypertrophy score is 2, vascularity score is 3, and combined score is 3. The synovial hypertrophy area is highlighted in the GSUS image, and the blood flow areas are highlighted in the Doppler US image.
The heatmap shows what the RATING system pays attention to, which helps human radiologists understand the predictions of RATING.
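
Overlaying a yellow heatmap on a grayscale image can be done by per-pixel alpha blending. The sketch below shows that blending step only; how the heatmap itself is computed is not covered here. The function name, the `alpha` parameter, and the pixel values are assumptions for illustration.

```python
def overlay_heatmap(gray, heat, alpha=0.5, color=(255, 255, 0)):
    """Blend a heatmap onto a grayscale image, pixel by pixel.

    gray  : rows of intensities in 0..255
    heat  : rows of heat values, clipped to 0..1, same shape as gray
    alpha : maximum opacity of the overlay color (hypothetical parameter)
    Returns rows of (R, G, B) tuples.
    """
    out = []
    for gray_row, heat_row in zip(gray, heat):
        row = []
        for g, h in zip(gray_row, heat_row):
            w = alpha * min(max(h, 0.0), 1.0)  # overlay weight for this pixel
            row.append(tuple(round((1 - w) * g + w * c) for c in color))
        out.append(row)
    return out


# Toy 1x2 image: the first pixel has no heat, the second has full heat.
print(overlay_heatmap([[100, 100]], [[0.0, 1.0]], alpha=0.4))
```

Pixels with zero heat keep their original gray value, while hot pixels shift toward yellow in proportion to `alpha`.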
Figure 5. The graphical user interface of the RATING system to assist radiologists in scoring RA
The paired GSUS and Doppler US images are shown in the first row, and the ROI of each image is illustrated by an orange rectangle. When the radiologist clicks the button to check the predictions of the RATING system, the heatmap overlay images are presented in the second row, and the predictions appear at the bottom.
Figure 6. Performance comparison of the RATING system, radiologists alone, and with the assistance of the RATING system
(A) With the assistance of the RATING system, radiologists (R1–R10) and the average reader achieved higher accuracy in the classification of combined score.
(B–D) The Youden index of radiologists’ combined score binary classification without and with the assistance of the RATING system: 0 versus 1, 2, and 3 (B); 0 and 1 versus 2 and 3 (C); and 0, 1, and 2 versus 3 (D).
Error bars indicate 95% confidence intervals.
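
The Youden index in panels (B–D) is defined as sensitivity + specificity − 1, computed after collapsing the four-grade score into a binary label at a cut point. The sketch below shows that computation for the three cut points named in the caption. The function name and the toy scores are assumptions, not study data.

```python
def youden_index(y_true, y_pred, threshold):
    """Youden index J = sensitivity + specificity - 1.

    A score counts as positive when it is >= threshold, so threshold=1
    corresponds to "0 versus 1, 2, and 3", threshold=2 to "0 and 1 versus
    2 and 3", and threshold=3 to "0, 1, and 2 versus 3".
    """
    tp = fn = tn = fp = 0
    for t, p in zip(y_true, y_pred):
        actual, predicted = t >= threshold, p >= threshold
        if actual and predicted:
            tp += 1
        elif actual:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present at this threshold")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1


# Toy combined scores (grades 0-3); NOT reader-study data.
toy_true = [0, 0, 1, 1, 2, 2, 3, 3]
toy_pred = [0, 1, 1, 0, 2, 3, 3, 2]

for cut in (1, 2, 3):
    print(cut, youden_index(toy_true, toy_pred, cut))
```

J ranges from −1 to 1, with 0 meaning the classification is no better than chance and 1 meaning perfect separation at that cut point.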
Figure 7. Typical examples of incorrect predictions
In each GSUS image, the boundary of the synovial hypertrophy area is annotated in orange, and the boundary of the joint effusion area is annotated in red. In each heatmap overlay image, the heatmap is colorized in yellow and overlaid on the original US image. The numbers in green are ground truth scores. Green solid-line rectangles mark correct predictions, while red dashed-line rectangles mark incorrect predictions.
(A) The sample of grade 0 was incorrectly predicted as grade 1. The model correctly identified the mild synovial hypertrophy in both GSUS and Doppler US images, but overestimated it and predicted the combined score as 1.
(B) The model underestimated the mild synovial hypertrophy and incorrectly predicted the sample of grade 1 as grade 0.
(C) The sample of grade 1 was incorrectly predicted as grade 2. Although there is obvious synovial hypertrophy and effusion, they do not exceed the joint line across the left and right bones, illustrated by the green dashed line.
(D) The sample of grade 2 was incorrectly predicted as grade 3. Expert radiologists determined the synovial hypertrophy score to be 2 rather than 3 because the surface of the left synovial hypertrophy area is only slightly convex rather than obviously convex, placing it on the borderline between grades 2 and 3. The green dashed line illustrates the synovial hypertrophy surface line, which is approximately horizontal.
