Nat Cardiovasc Res. 2025 Oct;4(10):1409-1423.
doi: 10.1038/s44161-025-00721-2. Epub 2025 Sep 26.

Quantitative proteomics of formalin-fixed, paraffin-embedded cardiac specimens uncovers protein signatures of specialized regions and patient groups

Jonathan S Achter et al. Nat Cardiovasc Res. 2025 Oct.

Abstract

Proteomic technologies have advanced our understanding of disease mechanisms, patient stratification and targeted therapies. However, applying cardiac proteomics in translational research requires overcoming the barrier of tissue accessibility. Formalin-fixed, paraffin-embedded (FFPE) heart tissue, widely preserved in pathology collections, remains a largely untapped resource. Here we demonstrate that proteomic profiles are well preserved in FFPE human heart specimens and compatible with high-resolution, quantitative analysis. Quantifying approximately 4,000 proteins per sample, we show this approach effectively distinguishes disease states and subanatomical regions, revealing distinct underlying protein signatures. Specifically, the human sinoatrial node exhibited enrichment of collagen VI and G protein-coupled receptor signaling. Myocardial biopsies from patients with arrhythmogenic cardiomyopathy were characterized by fibrosis and metabolic/cytoskeletal derangements, clearly separating them from donor heart biopsies. This study establishes FFPE heart tissue as a robust resource for cardiac proteomics, enabling retrospective molecular profiling at scale and unlocking archived specimens for disease discovery and precision cardiology.

Conflict of interest statement

Competing interests: The authors declare no competing interests.

Figures

Fig. 1. Comprehensive proteomic profiling of cardiac FFPE tissue specimens.
Upper panel: Overview of the study approach, in which diagnostic cardiac biopsies (1) preserved as FFPE blocks in pathology archives (2) are analyzed by high-resolution mass spectrometry to quantify approximately 4,000 proteins per sample (3). Lower panel: Two example applications of this approach. Left: comparison of protein profiles between patient cases and controls to identify disease-defining patterns of protein remodeling. Right: generation of proteomic maps for distinct sub-anatomical cardiac regions to reveal spatially resolved molecular signatures.
Fig. 2. Comparative proteomic analysis of FFPE and FF cardiac tissue.
a, Schematic representation of the study design. Paired FFPE and FF tissue samples were collected from the myocardium of five human hearts for proteomic evaluation (N = 5). b, Overview of experimental conditions evaluated using a TMT-multiplexed strategy, resulting in 15 samples. c, Number of proteins quantified across FFPE, FF and reference (FF ref.) workflows. d, Hierarchical clustering dendrogram of protein expression profiles. e, Pearson correlation heat map depicting pairwise correlation coefficients. The inset scatter plot illustrates representative sample correlation. f, PCA of the 15 samples based on their protein expression profiles. PC1 and PC2 account for 46.5% and 15% of the variance, respectively. Ellipses indicate 90% confidence intervals for each group. g, Variance decomposition of all quantified proteins, showing contributions of fixation, individual differences, workflow and residual variability. Box plots depict medians (center lines), interquartile ranges (IQR; boxes) and values within 1.5× the IQR (whiskers); points outside this range are plotted. h, Workflow consistency analysis of FFPE samples (N = 15; three sections per five FFPE blocks): (i) schematic of sectioning and (ii) discriminant analysis of resulting proteome profiles, color coded per individual. Squares mark the centroids of replicate sections. Parts of panel a adapted from Servier Medical Art (https://smart.servier.com/) under a Creative Commons license CC BY 4.0. Source data
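The pairwise Pearson correlation shown in panel e can be illustrated with a minimal sketch. The sample names and log2 intensity values below are hypothetical and are not taken from the study; the authors' actual software is not stated in this record.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # Note: a constant profile (zero variance) would divide by zero here.
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(samples):
    """Return {(name_a, name_b): r} for every ordered pair of samples."""
    names = list(samples)
    return {(a, b): pearson(samples[a], samples[b]) for a in names for b in names}


# Hypothetical log2 protein intensities (same protein order in every sample).
samples = {
    "FFPE_1": [20.1, 18.4, 22.3, 19.0],
    "FF_1": [20.4, 18.1, 22.0, 19.3],
    "FF_ref_1": [20.0, 18.6, 22.5, 18.8],
}
matrix = correlation_matrix(samples)
```

Each entry of `matrix` corresponds to one cell of a correlation heat map; the diagonal is 1 by construction.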
Fig. 3. Evaluation of DIA-MS acquisition strategies using in silico and DDA-based libraries for proteomic profiling of FFPE heart tissue.
a, Schematic workflow: (i) Tissue sampling and preparation from FFPE cardiac specimens of five individuals (N = 5). For each FFPE block, four consecutive sections were analyzed as technical replicates. (ii) DIA-MS analysis using two gradient lengths (45 min and 101 min), with comparison of data processing using a DDA-based spectral library and an in silico predicted library. b, The total number of precursors and protein groups stratified by gradient length and library. c, PCA illustrating clustering of samples for each library type and gradient length. PC1 explains ~45–47% of the variability and PC2 accounts for ~16–18%. d, Hierarchical clustering dendrogram of quantified proteomes highlights consistent grouping by individual. e, Box plots showing the CV of all protein intensities across technical replicates from five individuals, for each condition. Box plots depict medians (center lines), IQR (boxes) and values within 1.5× the IQR (whiskers). Parts of panel a adapted from Servier Medical Art (https://smart.servier.com/) under a Creative Commons license CC BY 4.0. Source data
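The per-protein coefficient of variation (CV) summarized in panel e can be sketched as follows. This assumes the common definition, sample standard deviation divided by the mean of linear-scale intensities, expressed in percent; the caption does not state which variant the authors used, and the protein names and values are hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("CV is undefined for a zero mean")
    return statistics.stdev(values) / mean * 100.0


# Hypothetical linear-scale intensities from four technical replicates
# (consecutive sections of one FFPE block).
replicates = {
    "PROT_A": [100.0, 110.0, 90.0, 100.0],
    "PROT_B": [50.0, 50.0, 50.0, 50.0],
    "PROT_C": [200.0, 260.0, 180.0, 160.0],
}

cv_per_protein = {
    name: coefficient_of_variation(vals) for name, vals in replicates.items()
}
# The median CV is the center line of a box plot like the one described above.
median_cv = statistics.median(cv_per_protein.values())
```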
Fig. 4. Proteomic profiling of the human SAN.
a, (i) Overview of donor numbers (N = 4) and regions included in the study. (ii) Annotated H&E-stained image highlighting the SAN region (blue outline) used to guide tissue dissection. Magnified views show the pacemaker region (purple) and non-pacemaker region (green). ENDO, endocardium; EPI, epicardium. (iii) Proteomics data were generated from the SAN and RA regions on the timsTOF HT. b, Number of unique proteins quantified for each individual sample, with a horizontal line indicating the mean across samples. c, Clustered heatmap of pairwise Pearson correlation coefficients across samples from both regions. d, PCA of quantitative proteomics data. Ellipses represent 90% confidence regions for each group, and dashed lines connect samples originating from the same donor. e, Relative enrichment of cell-type markers among proteins with higher abundance in either the SAN or RA region. Enrichment scores were derived using GSEA. P values were adjusted for multiple testing using the Benjamini–Hochberg method. f, Functional enrichment of cellular components showing a clustered network based on term similarity of core enriched genes. g, Volcano plot representation of differentially expressed proteins between SAN and RA, assessed using empirical Bayes-moderated t-statistics with standard errors adjusted for within-donor correlation. Two-sided P values were Benjamini–Hochberg corrected. Dashed lines indicate significant up-regulation (log2FC > 1 and adjusted P < 0.05) or down-regulation (log2FC < −1 and adjusted P < 0.05). h, Protein–protein interaction network for selected pathways overrepresented among proteins with higher abundance in the SAN region. Adjusted P value codes: *P = 0.05; **P = 0.01; ***P = 0.001; exact P values are provided in the source data. Parts of panel a adapted from Servier Medical Art (https://smart.servier.com/) under a Creative Commons license CC BY 4.0. Source data
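The Benjamini–Hochberg correction and the volcano-plot thresholds named in panel g can be sketched as below. This is only an illustration of the adjustment and cutoff logic: it does not compute the empirical Bayes-moderated t-statistics, and the function names and input values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest so that adjusted values
    # stay monotone: adj(rank) = min(adj(rank + 1), p * m / rank).
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify(log2fc, adj_p, fc_cutoff=1.0, alpha=0.05):
    """Label a protein 'up', 'down' or 'ns' using the caption's thresholds."""
    if adj_p < alpha and log2fc > fc_cutoff:
        return "up"
    if adj_p < alpha and log2fc < -fc_cutoff:
        return "down"
    return "ns"


# Hypothetical two-sided P values and log2 fold changes (SAN versus RA).
raw_p = [0.01, 0.04, 0.03, 0.005]
log2fc = [1.8, -1.4, 0.6, 2.5]
adj_p = benjamini_hochberg(raw_p)
labels = [classify(fc, p) for fc, p in zip(log2fc, adj_p)]
```

Passing `fc_cutoff=0.5` gives the looser fold-change threshold that the Fig. 6 caption reports for the ACM versus donor comparison.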
Fig. 5. Expanded proteome coverage of the human SAN proteome by off-line high pH fractionation.
a, Schematic workflow illustrating the reprocessing of SAN samples from four individuals. Peptides from individual preparations were pooled for off-line high-pH fractionation before dda-PASEF acquisition. b, Cumulative number of modified peptides (top) and proteins (bottom) identified across 16 fractions, with individual bars representing per-fraction identifications. c, Comparison of total unique protein identifications between the fractionated approach and single-shot DIA measurements of the four individual SAN preparations. d, Overlap analysis of proteins identified by both methods. e, Protein intensity distribution from the fractionated dataset, highlighting proteins also found in single-injection DIA and membrane-annotated proteins. f, Ranked protein abundance from the deep SAN proteome, with uniquely identified ion channels/receptors highlighted by a black outline. Source data
Fig. 6. Proteomic profiling of 20 human myocardial biopsies from patients with ACM and donor hearts.
a, (i) Overview of the patient cohort included in the study. (ii) Deep proteomics data were generated from patients with ACM (N = 10) and donor hearts (N = 10) on the timsTOF HT in dia-PASEF mode. b, Number of unique proteins quantified from each individual biopsy. c, Dimensionality reduction by PCA. Left: Representation of samples along the first two PCs; 90% confidence intervals for ACM and donor groups are depicted as ellipses. Arrows point at the centroid of the respective clusters. Right: Box plots showing separation of samples along PC2. Boxes indicate the IQR, center lines mark medians, and whiskers extend to 1.5× the IQR. d, Protein loadings along PC1 and PC2. Proteins aligning with the direction of the vectors separating ACM and donor groups are highlighted. e, Bar graph representation of cell-type marker enrichment results. NESs are shown; P values were obtained by permutation testing and adjusted for multiple testing using the Benjamini–Hochberg method. f, Volcano plot representation of differentially expressed proteins between ACM and donor tissues, assessed using empirical Bayes-moderated t-statistics. Two-sided P values were Benjamini–Hochberg adjusted. Dashed lines indicate significant up-regulation (log2FC > 0.5 and Benjamini–Hochberg-adjusted P < 0.05) or down-regulation (log2FC < −0.5 and Benjamini–Hochberg-adjusted P < 0.05). g, Heat map of z-scored protein intensities for differentially expressed proteins. The alluvial plot links upregulated (higher abundance in ACM) and downregulated (lower abundance in ACM) proteins to their likely cell-type origins. h, Quantification of fibro-fatty myocardial replacement in ACM hearts. (i) Schematic illustrating the stereological point counting method used to quantify tissue composition from H&E-stained sections by categorizing each grid point as fibrosis, adipose tissue or myocardium. (ii) Ternary plot showing the relative tissue compositions in biopsies from ACM (purple) and donor (green) hearts. A zoom-in panel highlights the donor cluster for improved visibility. Dashed line indicates the 75% confidence interval. Adjusted P value codes: *P = 0.05; **P = 0.01; exact P values are provided in the source data. Parts of panel a adapted from Servier Medical Art (https://smart.servier.com/) under a Creative Commons license CC BY 4.0. Source data
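The stereological point counting in panel h reduces to tallying grid-point labels and converting them to fractions, which are the three coordinates of a ternary plot. A minimal sketch, assuming each grid point carries exactly one of the three labels named in the caption; the label strings, function name and counts are hypothetical.

```python
from collections import Counter

CATEGORIES = ("fibrosis", "adipose", "myocardium")


def tissue_fractions(grid_labels):
    """Convert per-grid-point labels into fractions that sum to 1."""
    counts = Counter(grid_labels)
    unknown = set(counts) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no grid points were scored")
    # Counter returns 0 for categories that never occurred.
    return {category: counts[category] / total for category in CATEGORIES}


# Hypothetical scoring of a 10-point grid laid over one H&E-stained section.
labels = ["myocardium"] * 6 + ["fibrosis"] * 3 + ["adipose"] * 1
fractions = tissue_fractions(labels)
```

Because the three fractions always sum to 1, each biopsy maps to a single point inside the ternary plot.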
Extended Data Fig. 1. TMT-based proteome profiling of FFPE and FF cardiac tissue.
(a) Schematic of the workflow used for proteome analysis of FFPE and FF cardiac tissues. Proteins were extracted and digested using the FFPE-optimized and reference workflows, followed by TMT labeling and peptide separation by high-pH reversed-phase liquid chromatography (RP-LC) off-line fractionation. Examples of MS1 precursor and MS2 fragment spectra illustrate the detection and quantification of reporter ions. (b) Peptide identifications in fractions. Left: Distribution of identified peptides across all fractions. Right: Histogram showing the number of fractions in which peptides were detected. (c) Cumulative protein abundance. Proteins are ranked by summed TMT reporter ion intensities, with protein counts indicated for each quartile (Q1–Q4). (d) Boxplot representation of protein intensities across all multiplexed samples, showing the median (center line), interquartile range (box), and whiskers extending to 1.5× the IQR. Outliers are shown as individual points. (e) Intersection with a reference dataset. Left: UpSet plot depicting the overlap of proteins quantified by at least two peptide spectrum matches (PSMs) in this study with those reported by Buczak et al. Right: Scatter plot of PSM counts per protein, comparing results from this study with those from Buczak et al. Spearman correlation coefficient (R; two-sided p-value). Source data
Extended Data Fig. 2. Workflow-dependent proteomic variation in FFPE and reference cardiac samples.
(a) Clustering of positively enriched gene sets along PC1, based on term similarity, reveals distinct functional themes. (b) Enrichment plots for gene sets within the plasma membrane cluster. (c) PCA loadings. Top: Proteins core-enriched in the plasma membrane cluster are highlighted in the PC1-PC2 plot. Bottom: Representative boxplots showing abundance of CHRM2 (left) and ITGB1 (right). Boxplots are based on N=5 biological replicates per group and display the median (center line), interquartile range (box), and whiskers extending to 1.5× the IQR; measurements are shown as individual points. (d) UpSet plot showing the intersection between proteins identified in this study and those in the cardiac surfaceome database. (e) PC1 loadings of proteins present in the surfaceome database versus all other proteins. Each boxplot displays the median (center line), interquartile range (box), and whiskers extending to 1.5× the interquartile range (IQR). Outliers beyond the whiskers are shown as individual points. Statistical significance was assessed using a two-sided Wilcoxon rank-sum test (P = 1.39 × 10^−17). Source data
Extended Data Fig. 3. Workflow and analysis of dda- and dia-PASEF raw data.
(a) i: Base peak chromatograms for the 101 min and 45 min gradients. ii: Acquisition scheme overlaid onto a scatterplot illustrating the m/z-ion mobility dimension. (b) Schematic workflow for DDA library generation. Peptides pooled from all patients were fractionated at high pH and subjected to dda-PASEF measurements to generate the experimental spectral library. (c) Schematic representation of dia-PASEF data analysis. i: Databases included the experimental spectral library (DDA library) and a predicted spectral library generated from the sequence database. ii: DIA data were analyzed using both libraries with match-between-runs (MBR) enabled for quantification. (d) UpSet plot showing protein identifications across the two gradient durations and spectral library combinations. (e) Density plots of log2-transformed protein intensities. Source data
Extended Data Fig. 4. Comparison of proteome profiles from FFPE and matched fresh frozen heart tissue by label-free dia-PASEF.
(a) Principal component analysis of protein intensities from five individuals, with 3–4 replicates per individual for FFPE and 3 replicates for FF samples. (b) Bar plots showing the proportion of variance explained by the top 15 principal components. (c) Violin plots showing the distribution of coefficients of variation (CV) across replicates for each individual. Embedded boxplots display the median (center line), interquartile range (box), and whiskers extending to 1.5× the IQR. The red diamond indicates the mean. Source data
Extended Data Fig. 5. Quality control of proteomes from human sinoatrial node (SAN) and right atrium (RA) regions.
(a) Overlap of proteins identified in the SAN and RA region. (b) Density plots of log2-transformed protein intensities before (left) and after (right) imputation of missing values. (c) Bar plot showing the fraction of proteins affected by imputation, stratified by region. (d) Variance partition plot showing the relative contributions of interindividual differences and regional differences to total proteomic variance. Boxplots indicate the median (center line), interquartile range (box), and whiskers extending to 1.5× the interquartile range (IQR). (e) Dendrogram of cell types based on average RNA expression profiles derived from single-nucleus RNA sequencing (snRNA-seq) data. (f) Hierarchical cluster heatmap of z-scored average expression levels of cell-type marker genes across clusters. (g) Representative Gene Set Enrichment Analysis (GSEA) plot showing enrichment of atrial cardiomyocyte markers toward the end of the ranked list (RA). Source data
Extended Data Fig. 6. Validation of ECM and GPCR-related signatures in the human SAN.
(a) Immunohistochemical validation of COL6 enrichment in the SAN. (i) Representative image of COL6 staining in FFPE heart section, with the SAN region outlined. Zoom-in panels on SAN and adjacent RA regions. (ii) Quantification of COL6 staining intensity (N=4 biological replicates) confirms significantly elevated expression in the SAN (two-sided paired t-test, p = 0.039). (b) Transcript abundance of COL6 family genes in the SAN versus RA assessed using a published spatial transcriptomics dataset. Boxplots display the median (center line), interquartile range (box), and whiskers extending to 1.5× the interquartile range (IQR). Source data
Extended Data Fig. 7. Proteomic quality control and covariate adjustment in arrhythmogenic cardiomyopathy (ACM) and donor heart samples.
(a) Histogram of log2-transformed protein intensities across 10 ACM and 10 donor samples. (b) Clustered heatmap of pairwise Pearson correlation coefficients between all samples. (c) Cell type marker detection. Upper panel: Hierarchical cluster heatmap of z-scored average RNA expression per cell type. Lower panel: Hierarchical cluster heatmap of z-scored average expression levels of cell-type marker genes across clusters. (d) Effect of covariate adjustment on differential protein abundance comparisons between ACM and donor samples. Scatter plots compare t-statistics from unadjusted models (x-axis) to those from models adjusted for sex (left) or age (right) (y-axis). Pearson correlation coefficients (R) quantify the consistency between models. P-values are two-sided. Source data
Extended Data Fig. 8. Reproducibility of differential protein abundance in ACM cases versus an expanded donor cohort.
Overview of study design: Proteomic profiling of an expanded set of donor hearts (n = 38) was performed to evaluate the reproducibility of findings from the initial ACM versus donor comparison (Fig. 6). (a) Number of uniquely identified proteins across the 38 donor samples. (b) Correlation between the number of identified proteins per sample and the storage time of the corresponding FFPE blocks. A linear regression line is shown, with the shaded area representing the 95% confidence interval. The Pearson correlation coefficient (R) and its associated two-sided p-value are displayed. (c) Comparison of log2 fold changes between the discovery cohort (ACM vs. 10 donors; y-axis) and the reproduction cohort (ACM vs. 48 donors; x-axis). Selected consistently upregulated proteins are annotated. The Pearson correlation coefficient (P = 4.3 × 10^−31, two-sided) is displayed. Source data
