2022 Apr 22;3(2):153-168.
doi: 10.1093/ehjdh/ztac004. eCollection 2022 Jun.

Harnessing feature extraction capacities from a pre-trained convolutional neural network (VGG-16) for the unsupervised distinction of aortic outflow velocity profiles in patients with severe aortic stenosis

Mark Lachmann et al. Eur Heart J Digit Health.

Abstract

Aims: On the hypothesis that aortic outflow velocity profiles contain more valuable information about aortic valve obstruction and left ventricular contractility than the human eye can capture, features of the complex geometry of Doppler tracings from patients with severe aortic stenosis (AS) were extracted by a convolutional neural network (CNN).

Methods and results: After pre-training a CNN (VGG-16) on a large data set (ImageNet data set; 14 million images belonging to 1000 classes), the convolutional part was employed to transform Doppler tracings to 1D arrays. Among 366 eligible patients [age: 79.8 ± 6.77 years; 146 (39.9%) women] with pre-procedural echocardiography and right heart catheterization prior to transcatheter aortic valve replacement (TAVR), good quality Doppler tracings from 101 patients were analysed. The convolutional part of the pre-trained VGG-16 model in conjunction with principal component analysis and k-means clustering distinguished two shapes of aortic outflow velocity profiles. Kaplan-Meier analysis revealed that mortality in patients from Cluster 2 (n = 40, 39.6%) was significantly increased [hazard ratio (HR) for 2-year mortality: 3; 95% confidence interval (CI): 1-8.9]. Apart from reduced cardiac output and mean aortic valve gradient, patients from Cluster 2 were also characterized by signs of pulmonary hypertension, impaired right ventricular function, and right atrial enlargement. After training an extreme gradient boosting algorithm on these 101 patients, validation on the remaining 265 patients confirmed that patients assigned to Cluster 2 show increased mortality (HR for 2-year mortality: 2.6; 95% CI: 1.4-5.1, P-value: 0.004).
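The clustering step described above can be illustrated with a minimal sketch. It assumes the Doppler tracings have already been reduced to a few principal-component scores; the six 2D points below are made-up toy values, not study data, and the function is a plain Lloyd's-algorithm k-means written for illustration rather than the authors' own code.

```python
import math

def kmeans(points, k=2, max_iter=50):
    """Plain Lloyd's algorithm. Returns (labels, centroids)."""
    # Assumption for reproducibility: the first k points seed the centroids.
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are final
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical principal-component scores for six tracings (toy values).
pc_scores = [(-2.1, 0.3), (1.9, -0.2), (-1.8, -0.4),
             (2.2, 0.1), (-2.0, 0.0), (2.0, 0.4)]
labels, centroids = kmeans(pc_scores, k=2)
print(labels)
print(centroids)
```

On real feature arrays the initialisation would normally be randomised and repeated several times, since k-means can settle in a local optimum.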

Conclusion: Transfer learning enables sophisticated pattern recognition even in clinical data sets of limited size. Importantly, it is the left ventricular compensation capacity in the face of increased afterload, and not so much the actual obstruction of the aortic valve, that determines fate after TAVR.
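The validation result reported in the abstract (HR 2.6; 95% CI 1.4-5.1; P = 0.004) can be sanity-checked with a few lines of arithmetic. The sketch below assumes a Wald-type interval that is symmetric on the log scale and a normal approximation for the P-value; the abstract does not state which test produced the reported P-value, so this is a consistency check only, not a reconstruction of the authors' analysis.

```python
import math

def wald_from_ci(hr, ci_low, ci_high, z_crit=1.959964):
    """Back-calculate SE(log HR), the z statistic and a two-sided P-value
    from a hazard ratio and its 95% confidence interval."""
    # On the log scale the interval is log(HR) +/- z_crit * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(hr) / se
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

se, z, p = wald_from_ci(2.6, 1.4, 5.1)
print(f"SE(log HR) = {se:.3f}, z = {z:.2f}, P = {p:.4f}")
```

Under these assumptions the back-calculated P-value rounds to the reported 0.004, so the three published figures are mutually consistent.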

Keywords: Aortic outflow velocity profile; Convolutional neural network; Severe aortic stenosis; Transcatheter aortic valve replacement; Transfer learning.

Figures

Graphical Abstract
Figure 1
General information about the study population from recruitment to follow-up. (A) Flowchart of patient recruitment, used to select the 101 patients with the best-quality Doppler tracings. Notably, 56 out of 366 patients (15.3%) had no Doppler tracings available as raw data. (B) Kaplan–Meier survival plot testing for differences in survival between derivation and validation cohorts. RHC, right heart catheterization; TAVR, transcatheter aortic valve replacement.
Figure 2
Pre-processing of aortic outflow velocity profiles from patients with severe AS as data input for the convolutional neural network. (A) Schematic of the image pre-processing pipeline. One representative aortic outflow velocity profile per patient was extracted from records. Cropping of the region of interest, i.e. systole, was done manually. Since the original echocardiographic images were recorded at different scales, but the data input had to be homogeneous, Doppler tracings were further manually scaled to uniform time and velocity axes (see also Supplementary material online, Figure S1 for a standard operating procedure explaining additional details on creating the desired normalized profiles). Neither image normalization nor histogram equalization was applied during pre-processing. Re-sizing to 224 × 224 pixels, the default input size of the VGG-16 model, was performed by the image processing R code after loading the folder with the 101 scaled Doppler tracings. (B) Representative Doppler tracings that were excluded due to insufficient contrast, or due to labelling within the aortic outflow velocity profile.
Figure 3
A convolutional neural network followed by PCA and unsupervised k-means clustering provides the proof-of-principle that two subgroups of patients with severe AS can be distinguished according to the aortic outflow velocity profile. (A) VGG-16 network architecture (schematic). The VGG-16 network can be split into two parts: 13 convolutional layers constitute the first part, through which each image is passed for feature extraction. The convolutional layers are followed by three fully connected layers for classification, and the last layer uses a softmax activation function for final class prediction. Since aortic outflow velocity profiles were not an established class within the ImageNet data set, the classification part of VGG-16 was omitted after pre-training, and hence only the model’s feature extraction capacity was exploited in order to transform aortic outflow velocity profiles to 1D arrays (flatten layer), which were subsequently used for unsupervised clustering. (B) PCA of 1D arrays from 101 aortic outflow velocity profiles. (C) Scatter plot including 95% confidence ellipse in order to illustrate cardiac output and mean aortic valve gradient in accordance with cluster assignment. (D) Kaplan–Meier survival analysis in accordance with cluster assignment. (E) Bee swarm plots for comparison of baseline echocardiographic and haemodynamic data. (F) Representative aortic outflow velocity profiles in accordance with cluster assignment. AVA, aortic valve area; AVGmean, mean aortic valve gradient; LA area, left atrial area; LVEDD, left ventricular end-diastolic diameter; mPAP, mean pulmonary artery pressure; RA area, right atrial area; ReLU, Rectified Linear Unit; TAPSE, tricuspid annular plane systolic excursion.
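To make the "image to 1D array" step concrete, the following sketch traces the feature-map shape through the convolutional part of a standard VGG-16 for a 224 × 224 RGB input. It assumes the usual published configuration (3 × 3 convolutions with "same" padding, one 2 × 2 max-pooling layer per block) and that flattening happens after the last pooling layer; whether the authors flattened at exactly this point is an assumption.

```python
# Standard VGG-16 convolutional configuration, one tuple per block:
# (number of 3x3 convolutional layers, output channels).
# Each block is followed by a 2x2 max-pooling layer with stride 2.
VGG16_BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]

def conv_output_shape(height=224, width=224):
    """Trace the feature-map shape through the convolutional part."""
    channels = 3  # RGB input
    for _n_convs, out_channels in VGG16_BLOCKS:
        # 'Same'-padded 3x3 convolutions keep height and width unchanged
        # and only change the number of channels.
        channels = out_channels
        # 2x2 max pooling with stride 2 halves both spatial dimensions.
        height, width = height // 2, width // 2
    return height, width, channels

n_conv_layers = sum(n for n, _ in VGG16_BLOCKS)
h, w, c = conv_output_shape()
flat_length = h * w * c
print(n_conv_layers, (h, w, c), flat_length)  # prints: 13 (7, 7, 512) 25088
```

Under these assumptions each tracing becomes a vector of 25088 values, which explains why a dimensionality reduction such as PCA precedes clustering in a cohort of 101 patients.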
Figure 4
Conventional dichotomization of the study population in accordance with elevation in mean aortic valve gradient. (A) Scatter plot illustrating cardiac output and mean aortic valve gradient after dichotomization in accordance with elevation in mean aortic valve gradient. (B) Kaplan–Meier survival analysis in accordance with elevation in mean aortic valve gradient. AVGmean, mean aortic valve gradient.
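For readers unfamiliar with the survival curves referred to in these captions, here is a minimal sketch of the Kaplan–Meier product-limit estimator. The follow-up times and event flags are invented toy values, not study data, and the function name is hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of survival.

    times:  follow-up time per patient
    events: 1 if the patient died at that time, 0 if censored
    Returns [(time, survival)] at each distinct death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            # Multiply by the conditional probability of surviving time t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        at_risk -= leaving
    return curve

# Toy follow-up in months (hypothetical values).
times = [3, 6, 6, 9, 12, 15, 20, 24]
events = [1, 0, 1, 1, 0, 1, 0, 0]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(t, round(s, 3))
```

Comparing two such curves, as in panel (B), additionally requires a test such as the log-rank test or a Cox model, which this sketch does not cover.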
Figure 5
A flowchart illustrating the application of SMOTE to create a balanced data set for training of the extreme gradient boosting algorithm. CNN, convolutional neural network; SMOTE, synthetic minority over-sampling technique; XGB algorithm, extreme gradient boosting algorithm.
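The idea behind SMOTE can be sketched in a few lines: each synthetic sample is placed at a random position on the line segment between a minority-class sample and one of its nearest minority-class neighbours. The function below is a simplified illustration with invented two-feature toy data; the name `smote`, the choice of `k` and the seed are assumptions, and the study's actual over-sampling settings are not given in this text.

```python
import math
import random

def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples by interpolation.

    Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority-class neighbours of the chosen sample.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(p, base),
        )[:k]
        neighbour = rng.choice(neighbours)
        # Random position on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic

# Toy minority class with two hypothetical features per patient.
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.5), (1.2, 2.2)]
synthetic = smote(minority, n_new=6)
balanced_minority = minority + synthetic
print(len(balanced_minority))
```

Because every synthetic point is an interpolation, it always stays within the range spanned by the observed minority samples, which is why over-sampled data should be used for training only and not for evaluation.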
Figure 6
An extreme gradient boosting algorithm makes it possible to assign patients to pre-defined clusters on the basis of a comprehensive set of functional and structural parameters of cardiac and pulmonary circulatory conditions. (A) Bee swarm plots for comparison of key characteristics between clusters after over-sampling (SMOTE). (B) Confusion matrix (test set). (C) Shedding light on the black box of extreme gradient boosting algorithm-mediated cluster assignment by calculating SHAP (SHapley Additive exPlanations) values for its input variables. The y-axis represents the input variables in descending order of global feature importance, whilst the x-axis indicates the adjustment to the predicted cluster. Moreover, each dot in this sina plot represents an observation, i.e. a patient from the derivation cohort, and the gradient colour denotes the value of the respective input variable. Therefore, if the dots on one side of the central line are increasingly yellow or purple, that suggests that increasing values or decreasing values, respectively, move the predicted cluster in the respective direction (left: Cluster 1; right: Cluster 2). For instance, higher values of AVGmean (purple dots) are associated with assignment to Cluster 1. (D) Kaplan–Meier survival analysis in accordance with extreme gradient boosting algorithm-mediated cluster assignment (validation cohort). (E) Comparison of clusters as defined by the CNN in conjunction with PCA and k-means clustering (derivation cohort; red) or as determined by the trained extreme gradient boosting algorithm (validation cohort; blue). The central line in each box plot denotes the median value, while the box contains all values ranging between the 25th and 75th percentiles of the data set. The black whiskers mark the 5th and 95th percentiles, and values falling beyond these upper and lower bounds are considered outliers, plotted as black dots.
AVA, aortic valve area; AVGmean, mean aortic valve gradient; LA area, left atrial area; LVEDD, left ventricular end-diastolic diameter; mPAP, mean pulmonary artery pressure; mPCWP, mean postcapillary wedge pressure; PVR, pulmonary vascular resistance; RA area, right atrial area; RA pressure, right atrial pressure; RV pressuremean, mean right ventricular pressure; TAPSE, tricuspid annular plane systolic excursion.
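The SHAP values in panel (C) are based on Shapley values, which attribute a model's output to its inputs by averaging each input's marginal contribution over all orderings. The sketch below computes exact Shapley values for a tiny hand-written scoring function; the function, its coefficients and the patient values are all hypothetical and chosen only so that the signs match the directions described in the caption and abstract. It is not the trained extreme gradient boosting model, and SHAP libraries use faster tree-specific algorithms.

```python
from itertools import permutations

def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features not yet 'revealed' keep their baseline value.
    """
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        previous = model(current)
        for i in order:
            current[i] = x[i]  # reveal feature i
            value = model(current)
            phi[i] += value - previous  # marginal contribution of i
            previous = value
    return [total / len(orders) for total in phi]

def toy_score(features):
    """Hypothetical score: higher means closer to Cluster 2."""
    avg_mean, mpap, tapse = features
    # Made-up coefficients, for illustration only.
    return -0.02 * avg_mean + 0.03 * mpap - 0.04 * tapse

patient = [30.0, 35.0, 15.0]   # AVGmean, mPAP, TAPSE (toy values)
baseline = [45.0, 25.0, 20.0]  # hypothetical cohort averages
phi = shapley_values(toy_score, patient, baseline)
print([round(v, 3) for v in phi])
```

The contributions sum to the difference between the patient's score and the baseline score, which is the additivity property that gives SHAP its name.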
