Circ Cardiovasc Imaging. 2021 May;14(5):e011951.
doi: 10.1161/CIRCIMAGING.120.011951. Epub 2021 May 17.

Automated Left Ventricular Dimension Assessment Using Artificial Intelligence Developed and Validated by a UK-Wide Collaborative

James P Howard et al. Circ Cardiovasc Imaging. 2021 May.

Abstract

Background: Artificial intelligence (AI) for echocardiography requires training and validation to standards expected of humans. We developed an online platform and established the Unity Collaborative to build a dataset of expertise from 17 hospitals for training, validation, and standardization of such techniques.

Methods: The training dataset consisted of 2056 individual frames drawn at random from 1265 parasternal long-axis video-loops of patients undergoing clinical echocardiography in 2015 to 2016. Nine experts labeled these images using our online platform. From this, we trained a convolutional neural network to identify keypoints. Subsequently, 13 experts labeled a validation dataset of the end-systolic and end-diastolic frame from 100 new video-loops, twice each. The 26-opinion consensus was used as the reference standard. The primary outcome was precision SD, the SD of the differences between AI measurement and expert consensus.
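The primary outcome defined above, precision SD, is the standard deviation of the per-image differences between the AI measurement and the expert consensus. A minimal sketch of that computation follows; the function name and the sample measurements are illustrative, not from the study.

```python
import statistics

def precision_sd(ai_mm, consensus_mm):
    """SD of per-image differences (AI minus expert consensus), in mm.

    Illustrative implementation of the abstract's "precision SD";
    the study's exact computation may differ (e.g., bias handling).
    """
    diffs = [a - c for a, c in zip(ai_mm, consensus_mm)]
    return statistics.stdev(diffs)  # sample standard deviation

# Made-up LVIDd measurements (mm) for five validation frames:
ai = [48.1, 52.4, 39.8, 61.0, 45.5]
consensus = [47.0, 55.0, 41.2, 58.9, 44.1]
print(round(precision_sd(ai, consensus), 2))
```

A smaller value indicates that the AI's measurements cluster more tightly around the consensus reference, which is how the 3.5 mm (AI) versus 4.4 mm (individual experts) comparison in the Results should be read.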

Results: In the validation dataset, the AI's precision SD for left ventricular internal dimension was 3.5 mm. For context, the precision SD of individual expert measurements against the expert consensus was 4.4 mm. The intraclass correlation coefficient (ICC) between AI and expert consensus was 0.926 (95% CI, 0.904-0.944), compared with 0.817 (95% CI, 0.778-0.954) between individual experts and expert consensus. For interventricular septum thickness, precision SD was 1.8 mm for AI (ICC, 0.809; 95% CI, 0.729-0.967) versus 2.0 mm for individuals (ICC, 0.641; 95% CI, 0.568-0.716). For posterior wall thickness, precision SD was 1.4 mm for AI (ICC, 0.535; 95% CI, 0.379-0.661) versus 2.2 mm for individuals (ICC, 0.366; 95% CI, 0.288-0.462). We present all images and annotations; these highlight challenging cases, including poor image quality and tapered ventricles.
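The Results compare agreement using intraclass correlation coefficients. The abstract does not state which ICC variant was used, so the following is only an illustrative sketch of one common form, the one-way random-effects ICC(1,1); the function and data are hypothetical.

```python
def icc_1_1(ratings):
    """One-way random-effects ICC(1,1).

    ratings: list of subjects, each a list of k ratings for that subject.
    Illustrative only; the study may have used a different ICC form.
    """
    n = len(ratings)
    k = len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    means = [sum(row) / k for row in ratings]
    # Between-subject mean square
    msb = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    # Within-subject mean square
    msw = sum((x - m) ** 2 for row, m in zip(ratings, means)
              for x in row) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)

# Perfect agreement between two raters yields ICC = 1.0:
print(icc_1_1([[1, 1], [2, 2], [3, 3]]))
```

Values near 1 indicate that variation between subjects dominates variation between raters, which is why the AI's higher ICCs against consensus (e.g., 0.926 versus 0.817 for LV internal dimension) indicate closer agreement with the reference standard.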

Conclusions: Experts at multiple institutions successfully cooperated to build a collaborative AI. This performed as well as individual experts. Future echocardiographic AI research should use a consensus of experts as a reference. Our collaborative welcomes new partners who share our commitment to publish all methods, code, annotations, and results openly.

Keywords: consensus; echocardiography; hospital; left ventricle; machine learning.


Figures

Figure 1.
The Unity interface. The Unity interface (www.unityimaging.net) provides an easy-to-use web-based interface to annotate medical images. The system is divided into a labeled area (blue square) and an information area showing the user's statistics compared with those of other users (red square). The 4 keypoints used in this study are highlighted as circles, with their names and associated target icons marking their exact locations. Keypoints on echocardiograms can be labeled using either a touch screen interface or a mouse. The system also allows regions of interest and curves to be annotated (not shown).
Figure 2.
System pipeline. A neural network was trained on the training set of 1894 images: 1515 of these were used to directly train the network, while 379 were used for progress monitoring. Finally, we assessed the performance of the network on a new dataset of 200 successive echocardiograms, labeled by 13 experts.
Figure 3.
Artificial intelligence (AI) performance in the context of individual expert measurements (diastolic left ventricular internal diameter [LVIDd]). A, Measurements by the AI (red dots) in the context of the individual expert measurements (gray dots) for all 100 validation images, arranged in order of increasing ventricular dimension (defined by expert consensus). B, Cumulative distribution of deviations from expert consensus, for the AI in red and the individual experts in gray. The lower panels show deviation from the expert consensus for the AI (C) and the experts (D), with each panel showing the 95% limits of agreement (horizontal lines).
Figure 4.
Positions chosen by artificial intelligence (AI) and individual experts for the keypoints of LV dimension, plotted in relation to the expert consensus. The top right panel shows, for each of the 100 diastolic images in the validation dataset, the LV dimension keypoint locations chosen by the AI (coloured dots), in relation to the expert consensus keypoint locations (black line), after reorienting and rescaling so that the expert consensus LV dimension line is vertical and length 1 unit. This shows the error in the AI’s placement of keypoints is largely longitudinal along the ventricle (horizontal on the plot). Dot colors range from green (cases with the smallest variation between experts) to red (largest variation). The remaining plots display the corresponding information for individual experts (E1 to E13) and for systole (lower row). Corresponding plots are shown in the Appendix for the septum (Figure IV in the Data Supplement) and posterior wall (Figure V in the Data Supplement).
Figure 5.
Nine examples of artificial intelligence (AI) measurements of left ventricular (LV) dimension, drawn from 200 frames, showing the range of AI performance, with expert consensus as the reference standard. Top, the 3 cases with the smallest AI error; middle, the median cases when ranked by size of AI error, that is, typical performance; bottom, the 3 cases with the largest AI error. In each panel, AI measurements are in red, and the 2×13=26 expert measurements are in gray.

