EMBO Mol Med. 2023 Dec 7;15(12):e17745.
doi: 10.15252/emmm.202317745. Epub 2023 Oct 16.

Proteome profiling of early gestational plasma reveals novel biomarkers of congenital heart disease

Ya-Nan Yin et al. EMBO Mol Med.

Abstract

Prenatal diagnosis of congenital heart disease (CHD) relies primarily on fetal echocardiography conducted at mid-gestational age, the sensitivity of which varies among centers and practitioners. An objective method for early diagnosis is needed. Here, we conducted a case-control study recruiting 103 pregnant women with healthy offspring and 104 cases with CHD offspring, including VSD (42/104), ASD (20/104), and other CHD phenotypes. Plasma was collected during the first trimester and proteomic analysis was performed. Principal component analysis revealed considerable differences between the controls and the CHDs. Among the significantly altered proteins, 25 upregulated proteins in CHDs were enriched in amino acid metabolism, extracellular matrix receptor, and actin skeleton regulation, whereas 49 downregulated proteins were enriched in carbohydrate metabolism, cardiac muscle contraction, and cardiomyopathy. The machine learning model reached an area under the curve of 0.964 and was highly accurate in recognizing CHDs. This study provides a highly valuable proteomics resource to better recognize the cause of CHD and has developed a reliable objective method for the early recognition of CHD, facilitating early intervention and better prognosis.

Keywords: congenital heart disease; plasma; proteomics; gestational.

Conflict of interest statement

The authors declare that they have no conflict of interest.

Figures

Figure 1. Research overview and proteomic characterization of maternal plasma
  1. Overview of the investigated groups and a schematic diagram of the proteomic analysis. Maternal blood samples were collected at 10–12 weeks of gestation. The total number of participants in each group and the basic procedures for proteomic analysis are described.

  2. Proteins were quantified using a 1% false discovery rate (FDR) cutoff. Values are reported as mean ± standard deviation (SD). n = 71 (control in group 1), 67 (CHD offspring in group 1), 32 (control in group 2), and 37 (CHD offspring in group 2) as biological replicates.

  3. Cumulative number of proteins identified (the left panel shows the results for control and the right panel shows the results for CHD) in group 1. The number of proteins in the dataset (Y‐axis) was plotted against the number of samples (X‐axis).

  4. Cumulative number of proteins identified (the left panel shows the results for the control and the right panel shows the results for CHD) in group 2. The number of proteins in the dataset (Y‐axis) was plotted against the number of samples (X‐axis).

  5. Data completeness curve. The number of proteins in the dataset (Y‐axis) is plotted against the minimum number of samples in which the proteins were quantified (X‐axis). Arrows indicate data completeness values of 50, 75, and 100%.

  6. The protein abundance distributions in the CHD group (red) and the control group (blue) are plotted. The top 10 most abundant proteins are indicated in the box.

Source data are available online for this figure.
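
The cumulative-identification and data-completeness curves described in this caption can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names and the toy protein sets are assumptions made for the example, and each sample is reduced to the set of proteins quantified in it.

```python
from typing import Dict, List, Set


def cumulative_identifications(samples: List[Set[str]]) -> List[int]:
    """Number of distinct proteins seen after adding each sample in order."""
    seen: Set[str] = set()
    curve: List[int] = []
    for proteins in samples:
        seen |= proteins
        curve.append(len(seen))
    return curve


def completeness_curve(samples: List[Set[str]]) -> List[int]:
    """Entry k-1 is the number of proteins quantified in at least k samples."""
    counts: Dict[str, int] = {}
    for proteins in samples:
        for protein in proteins:
            counts[protein] = counts.get(protein, 0) + 1
    return [
        sum(1 for c in counts.values() if c >= k)
        for k in range(1, len(samples) + 1)
    ]


# Hypothetical toy data: four samples, four proteins in total.
toy_samples = [
    {"ALB", "APOA1", "TF"},
    {"ALB", "APOA1"},
    {"ALB", "TF", "MYL9"},
    {"ALB"},
]

cumulative = cumulative_identifications(toy_samples)
completeness = completeness_curve(toy_samples)
print(cumulative)    # [3, 3, 4, 4]
print(completeness)  # [4, 3, 1, 1]
```

In the toy data, the last entry of the completeness curve (1) is the number of proteins quantified in 100% of samples, which corresponds to the rightmost arrow in a plot of this kind.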
Figure EV1. Reproducibility of plasma data
  1. Pearson's correlation coefficients for replicate proteome profiling of 20 plasma samples (10 CHD and 10 healthy control samples).

  2. Reproducibility of the fraction of total (FOT) of six proteins in 207 samples. FOT was defined as the iBAQ of a protein divided by the total iBAQ of all identified proteins within a sample.
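
The FOT definition above amounts to a per-sample normalization. A minimal sketch, assuming each sample is given as a mapping from protein name to iBAQ value (the function name and the numbers are hypothetical, chosen only for illustration):

```python
from typing import Dict


def fraction_of_total(ibaq: Dict[str, float]) -> Dict[str, float]:
    """FOT of each protein: its iBAQ divided by the summed iBAQ of the sample."""
    total = sum(ibaq.values())
    if total <= 0:
        raise ValueError("total iBAQ of a sample must be positive")
    return {protein: value / total for protein, value in ibaq.items()}


# Hypothetical iBAQ values for one sample.
sample_ibaq = {"ALB": 600.0, "APOA1": 300.0, "MYL9": 100.0}
fot = fraction_of_total(sample_ibaq)
print(fot)  # FOT values within a sample sum to 1
```

Because FOT values within a sample sum to 1, they are comparable across samples with different total signal, which is what makes a reproducibility comparison across 207 samples meaningful.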

Figure 2. Differences in CHD versus control plasma proteome in the two groups
  1. Principal component analysis (PCA) of proteins in plasma samples from group 1. Control participants are represented in red and CHD in blue.

  2. Relationship between fold‐change values for CHD/control samples and statistical significance for group 1. Red indicates upregulated proteins, blue indicates downregulated proteins, and proteins above the gray dotted line are statistically significant (P < 0.05).

  3. PCA of plasma proteins from group 2. Control participants are represented in red, and CHD in blue.

  4. Relationship between fold‐change values for CHD/control samples and statistical significance for group 2. Red indicates upregulated proteins, blue indicates downregulated proteins, and proteins above the gray dotted line are statistically significant (P < 0.05).

  5. Gene ontology (GO) annotations with upregulated or downregulated proteins (P < 0.05).

  6. Protein interaction analysis of the enriched pathways in the two groups shown in item 5 (panel E). Red indicates interactions between upregulated proteins, whereas blue indicates interactions between downregulated proteins.

Source data are available online for this figure.
Figure EV2. Protein levels in serum from pregnant women with CHD or normal offspring
Protein levels of G6PD, MYL9, and SHMT1 in serum from both control and case groups were detected by western blotting. CBB, Coomassie brilliant blue.
Figure 3. Plasma proteome alterations in the two groups
  1. Expression levels of previously reported heart disease‐related proteins. ***P < 0.001; **P < 0.01; *P < 0.05; P‐values from unpaired t‐test are shown. Lines indicate mean and SD. n = 71 (control in group 1), 67 (CHD offspring in group 1), 32 (control in group 2), and 37 (CHD offspring in group 2) as biological replicates.

  2. Venn diagram of up‐ and downregulated proteins.

  3. Expression levels and pathway enrichment of identified proteins in both groups. The heatmaps indicate the expression levels, frequencies, and P‐values for 25 upregulated proteins in the two groups (upper panel) and 49 downregulated proteins in the two groups (lower panel).

  4. Pathway patterns of differentially expressed proteins in the two groups.

Source data are available online for this figure.
Figure 4. Relationship between CHD plasma proteome and clinical phenotypes and indicators
  1. Weighted gene co‐expression network analysis (WGCNA) of 207 plasma samples showed that 12 pathological features of CHD could be integrated into three clusters according to the correlations between module proteins. The strengths of the positive (red) and negative (blue) correlations are illustrated in the two‐color heatmap. Pearson correlation coefficients and P‐values were calculated with the WGCNA package. ***P < 0.001; **P < 0.01; *P < 0.05.

  2. Gene ontology (GO) analysis of proteins in each cluster.

  3. Protein–protein interaction analysis of three major pathological features.

  4. Analysis of changes in different CHD protein networks and their correlation with clinical indicators.

  5. Expression levels of blood lipids in the controls and CHDs. ***P < 0.001; **P < 0.01; *P < 0.05; P‐values from unpaired t‐test are shown. Whiskers mark minimum and maximum values. n = 103 (Control), 10 (Cluster 1), 23 (Cluster 2), and 71 (Cluster 3) as biological replicates.

  6. Correlations of protein expression with clinical indicators (blood lipids). ***P < 0.001; *P < 0.05; P‐values from unpaired t‐test are shown. Lines indicate mean and SD. n = 103 (Control), 10 (Cluster 1), 23 (Cluster 2), and 71 (Cluster 3) as biological replicates.

  7. Heatmap of cluster types of specific immune cells in the controls and CHDs.

  8. Scatter plot illustrating the xCell scores for specific cell types in the control and CHD groups. ***P < 0.001; **P < 0.01; *P < 0.05; P‐values from unpaired t‐test are shown. Lines indicate mean and SD. n = 103 (Control), 10 (Cluster 1), 23 (Cluster 2), and 71 (Cluster 3) as biological replicates.

  9. Blood lipids and the potential pathogenesis of CHD.

Source data are available online for this figure.
Figure 5. Exploiting machine learning for the development of biomarker combinations to predict CHD
  1. A–C

    (A) The receiver operating characteristic (ROC) curve for the training dataset of group 1, (B) confusion matrix for the combination biomarkers in the training dataset, and (C) principal component analysis (PCA) plot for the prediction of CHD and control outcomes.

  2. D–F

    (D) The ROC curve for the test dataset of group 1, (E) confusion matrix for the combination biomarkers in the test dataset, and (F) PCA plot for the prediction of CHD and control outcomes.

  3. G–I

    (G) The ROC curve for the external validation set in group 2, (H) confusion matrix for the combination biomarkers in the external validation set, and (I) PCA plot for the prediction of CHD and control outcomes.
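
The two evaluation summaries named in these panels, the area under the ROC curve and the confusion matrix, can be computed from predicted scores as sketched below. The record does not state which model or software the authors used, so this is only a generic illustration: the function names, the 0.5 threshold, and the toy labels and scores are assumptions. The AUC is computed through its rank interpretation, the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half.

```python
from typing import List, Tuple


def roc_auc(labels: List[int], scores: List[float]) -> float:
    """AUC as the fraction of (case, control) pairs ranked correctly."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(cases) * len(controls))


def confusion_matrix(
    labels: List[int], scores: List[float], threshold: float = 0.5
) -> Tuple[int, int, int, int]:
    """Return (tn, fp, fn, tp) when score >= threshold predicts a case."""
    tn = fp = fn = tp = 0
    for y, s in zip(labels, scores):
        predicted = 1 if s >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


# Hypothetical toy data: 1 = CHD case, 0 = control.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]

auc = roc_auc(toy_labels, toy_scores)
matrix = confusion_matrix(toy_labels, toy_scores)
print(round(auc, 3))  # 0.889
print(matrix)         # (2, 1, 1, 2)
```

The pairwise loop is quadratic in the number of samples, which is acceptable for cohorts of a few hundred participants; a sort-based rank computation would be the usual choice for larger data.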

Figure EV3. CHD diagnostic performance of nine candidate biomarkers
  1. The receiver operating characteristic (ROC) curve of protein calpain‐5 (CAPN5) in the training set, test set, and validation set.

  2. The ROC curve of protein enolase‐phosphatase E1 (ENOPH1) in the training set, test set, and validation set.

  3. The ROC curve of protein histone H2A type 1‐C (H2AC6) in the training set, test set, and validation set.

  4. The ROC curve of protein heat shock protein HSP 90‐alpha (HSP90AA1) in the training set, test set, and validation set.

  5. The ROC curve of protein importin subunit beta‐1 (KPNB1) in the training set, test set, and validation set.

  6. The ROC curve of protein malate dehydrogenase (MDH2) in the training set, test set, and validation set.

  7. The ROC curve of protein myosin regulatory light polypeptide 9 (MYL9) in the training set, test set, and validation set.

  8. The ROC curve of protein radixin (RDX) in the training set, test set, and validation set.

  9. The ROC curve of protein deoxynucleoside triphosphate triphosphohydrolase 1 (SAMHD1) in the training set, test set, and validation set.
