J Proteome Res. 2021 May 7;20(5):2983-3001.
doi: 10.1021/acs.jproteome.1c00243. Epub 2021 Apr 15.

Proteogenomic Workflow Reveals Molecular Phenotypes Related to Breast Cancer Mammographic Appearance

Tommaso De Marchi et al. J Proteome Res.

Abstract

Proteogenomic approaches have enabled the generation of novel information levels when compared to single omics studies although burdened by extensive experimental efforts. Here, we improved a data-independent acquisition mass spectrometry proteogenomic workflow to reveal distinct molecular features related to mammographic appearances in breast cancer. Our results reveal splicing processes detectable at the protein level and highlight quantitation and pathway complementarity between RNA and protein data. Furthermore, we confirm previously detected enrichments of molecular pathways associated with estrogen receptor-dependent activity and provide novel evidence of epithelial-to-mesenchymal activity in mammography-detected spiculated tumors. Several transcript-protein pairs displayed radically different abundances depending on the overall clinical properties of the tumor. These results demonstrate that there are differentially regulated protein networks in clinically relevant tumor subgroups, which in turn alter both cancer biology and the abundance of biomarker candidates and drug targets.

Keywords: breast cancer; data-independent acquisition; proteogenomics; proteomics; transcriptomics.

Conflict of interest statement

The authors declare no competing financial interest.

Figures

Figure 1
Experimental workflow of this study. A total of 21 samples derived from a larger cohort (set 1, N = 172, see Experimental Procedures) and a second set of 24 tumors from a larger study (set 2, N = 109, see Experimental Procedures) were employed (A). Panel (B) shows examples of nonspiculated and spiculated tumor masses. Panel (C) displays the overlap between the molecular (ER status) and appearance features evaluated in this study, for which no association was found (set 1: Fisher exact p-value = 0.665, set 2: Fisher exact p-value = 0.283, (D)). Tumor specimens were processed as whole tissue lysates (WTL, MS-only analysis) and ALLPREP flow-throughs (FT, RNA-seq and MS analyses). Panel (E) displays the experimental workflow of our RNA and MS (DDA and DIA) analyses: tumor tissues were cut into slices and processed by ALLPREP. RNA and protein fractions were extracted and processed from ALLPREP sample preparation for downstream RNA-sequencing and DDA/DIA MS, respectively. Tissue slices were prepared only for downstream MS (DDA/DIA). Samples for DDA were fractionated using strong anion exchange columns (SAX, six fractions) to enable higher proteome coverage. DDA data was submitted (i) to MaxQuant processing to derive protein abundances and (ii) to the MakeGTL workflow to generate a spectral library for downstream DIA search. RNA-seq data was processed using the standard DESeq2 workflow (see Experimental Procedures). Abbreviations: DDA: data-dependent acquisition, DIA: data-independent acquisition, ER: estrogen receptor, FDR: false discovery rate, FT: flow-through, MS: mass spectrometry, RT: retention time, WTL: whole tissue lysate.
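The association check between ER status and mammographic appearance reported in this caption is a Fisher exact test on a 2x2 contingency table. The following is a minimal sketch of a two-sided version of that test, not the authors' code: the function name `fisher_exact_two_sided` and the example counts are hypothetical, and the per-group counts behind the published p-values are not given in this text.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows could be ER-positive / ER-negative and columns
    spiculated / nonspiculated (an assumed layout for illustration).
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    # The small relative tolerance guards against floating-point ties.
    p = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_obs * (1 + 1e-9)
    )
    return min(1.0, p)


if __name__ == "__main__":
    # Hypothetical counts, NOT taken from the study.
    print(fisher_exact_two_sided(6, 5, 4, 6))
```

A large p-value from such a test, as reported for both sample sets, indicates no detectable association between the two features at the given sample size.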
Figure 2
Overall comparison between transcriptomic and proteomic data layers. Panel (A) displays the dynamic range (presented as relative abundance over total signal) of transcript and protein intensities of matching identifications in our RNA (green), DDA (red), and DIA (blue) MS data (examples of transcript–protein pairs displaying similar abundances across data layers are labeled). Distributions of Spearman correlations between matching transcript and protein (DDA: top, DIA: bottom) abundances are displayed in panel (B) (gray: nonsignificant, light blue: significant), while examples of consistent positive and negative correlation between protein levels (DDA and DIA) and RNA abundance are depicted in panel (C). Panels (D) and (E) display the distribution of transcript–protein correlations for significant (q-value < 0.15, see Experimental Procedures for details) GOBP pathways out of our DDA and DIA MS analyses, respectively. Color gradient is representative of the low (pink) and high (dark red) median transcript–protein correlation for each GOBP term. Acronyms: DDA: data-dependent acquisition, DIA: data-independent acquisition, ER: estrogen receptor, GOBP: gene ontology biological process.
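The per-gene transcript–protein comparison in panel (B) rests on Spearman correlation, which is the Pearson correlation of the rank-transformed abundances. The sketch below illustrates that computation using only the standard library; the helper names (`average_ranks`, `pearson`, `spearman`) and the abundance values are hypothetical and do not come from the study.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError for constant input."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical abundances of one gene across five tumors.
    rna = [5.1, 7.3, 6.0, 8.8, 4.2]
    protein = [20.5, 23.1, 22.0, 22.4, 19.9]
    print(spearman(rna, protein))
```

Repeating this for every matched transcript–protein pair gives a distribution of coefficients like the one the caption describes; a significance call per pair would additionally need a p-value and multiple-testing correction, which this sketch omits.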
Figure 3
Comparison between transcriptomic and proteomic data in the context of the estrogen receptor and appearance statuses. Panels (A) and (B) display the scaled Log2Ratios of all transcript–protein pairs for the ER status (A) and appearance ((B); DDA: left, DIA: right). Significant differential expression at the RNA level is marked by larger, filled dots; concordance and discordance between RNA and protein layers are shown in green and purple, respectively. The most significant genes (top 5% quantile) are labeled. GSEA analyses were performed on all data layers (RNA, DDA, and DIA) for ER and spiculation statuses using the Hallmark database. Pathways are ranked based on the RNA-level enrichment score. Panel (C) displays the overlap of GSEA analyses for the ER status, while panel (D) shows the results of analysis of appearance features (i.e., spiculation vs no spiculation). Significant pathways in each data layer (RNA: green, DDA: red, DIA: blue) are marked in full color, while transparent ones did not pass the false discovery rate (FDR < 0.25) cutoff. Positive scores mark enrichment in ER-positive and spiculated tumors, respectively, while negative scores define enrichments in ER-negative and nonspiculated samples. Acronyms: DDA: data-dependent acquisition, DIA: data-independent acquisition, ER: estrogen receptor, FDR: false discovery rate, GSEA: gene set enrichment analysis.
Figure 4
Pathway-level comparison of transcript–protein pairs. The figure displays a transcript–protein-wise comparison within significant pathways from the GSEA analyses for the ER status (estrogen response early, (A)) and appearance (epithelial mesenchymal transition, (B)). Left panels display Log2Ratios of each transcript/protein (ranked by RNA expression) between ER-positive/negative and spiculated/nonspiculated tumors, while center panels display the corresponding enrichment scores in each data layer (RNA: green, DDA: red, DIA: blue). Right panels show the distribution of enrichment scores for core-enriched (red) and noncore-enriched (gray) transcripts/proteins. The background color of the left and center plots denotes enrichment in ER-positive (blue) and ER-negative (red) groups and in spiculated (orange) and nonspiculated (purple) tumor groups. Abbreviations: DDA: data-dependent acquisition, DIA: data-independent acquisition, ER: estrogen receptor, FDR: false discovery rate, GSEA: gene set enrichment analysis.
Figure 5
Evaluation of differential transcript usage and single amino acid variant detection at the proteomic level. We employed transcriptomic data information to search our DIA data for DTU (A–C) and SAAVs (D, E). For DTU analysis, we employed the BANDITs workflow to define transcript differential expression and then generated an isoform-aware spectral library with which to search our DIA MS data. Panel (A) displays detected DTU at the protein (DIA MS) level and their expression compared to transcript levels. Examples of transcript expression (left) and, when detected, the expression of the corresponding isoform-specific peptide (right) are shown in panel (B) (ER status) and (C) (appearance). The t-test p-value is shown for box plots (peptide level). For SAAV detection, nonsynonymous SNVs detected at the RNA level in breast tumors and healthy breast tissues derived from reconstruction surgery were employed to define a variant-specific library against which the DIA data was searched. Panel (D) shows in which samples (healthy breast tissue and cancer) each variant was detected (numbers in brackets represent peptide charge). Abbreviations: DIA: data-independent acquisition, DTU: differential transcript usage, MS: mass spectrometry, SAAV: single amino acid variant, SNV: single nucleotide variant.
Figure 6
Protein cluster regulation dependent on the estrogen receptor status. Co-regulated protein clusters in ER-positive (left) and ER-negative (right) tumors (see Figure S15) were extracted from the DIA data, annotated with GOBP terms, condensed, and visualized in Cytoscape (A). Edge thickness and length relate to the cluster distance (Euclidean), the node color relates to the scaled mean intensity of all proteins in each cluster, and the node size depends on the number of proteins in each cluster. Panel (B) shows the correlation to mRNA of each protein per cluster for ER-positive and ER-negative tumors. Panel (C) displays differences in correlation to RNA between ER-positive and ER-negative (i.e., ER positive–ER negative) tumor groups within showcased co-regulation clusters for FDA drug targets. Abbreviations: DIA: data-independent acquisition, ER: estrogen receptor, FDA: Food and Drug Administration, GOBP: gene ontology biological process, MS: mass spectrometry.
