Bioinformatics. 2021 Apr 19;37(2):221-228.
doi: 10.1093/bioinformatics/btaa678.

Co-phosphorylation networks reveal subtype-specific signaling modules in breast cancer

Marzieh Ayati et al. Bioinformatics.

Abstract

Motivation: Protein phosphorylation is a ubiquitous mechanism of post-translational modification that plays a central role in cellular signaling. Phosphorylation is particularly important in the context of cancer, as downregulation of tumor suppressors and upregulation of oncogenes by the dysregulation of associated kinase and phosphatase networks are shown to have key roles in tumor growth and progression. Despite recent advances that enable large-scale monitoring of protein phosphorylation, these data are not fully incorporated into such computational tasks as phenotyping and subtyping of cancers.

Results: We develop a network-based algorithm, CoPPNet, to enable unsupervised subtyping of cancers using phosphorylation data. For this purpose, we integrate prior knowledge on evolutionary, structural and functional associations of phosphosites, kinase-substrate associations and protein-protein interactions with the correlation of phosphorylation of phosphosites across different tumor samples (a.k.a. co-phosphorylation) to construct a context-specific weighted network of phosphosites. We then mine these networks to identify subnetworks with correlated phosphorylation patterns. We apply the proposed framework to two mass-spectrometry-based phosphorylation datasets for breast cancer (BC), and observe that (i) the phosphorylation patterns of the identified subnetworks are highly correlated with clinically identified subtypes, and (ii) the identified subnetworks are highly reproducible across datasets that are derived from different studies. Our results show that integration of quantitative phosphorylation data with network frameworks can provide mechanistic insights into the differences between the signaling mechanisms that drive BC subtypes. Furthermore, the reproducibility of the identified subnetworks suggests that phosphorylation can provide robust classification of disease response and markers.

Availability and implementation: CoPPNet is available at http://compbio.case.edu/coppnet/.

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

Fig. 1.
Workflow of CoPPNet. We first construct a PSFA network to represent the functional relationships among phosphosites, using generic KSA, phosphosite association and PPI data. The nodes of the PSFA network represent phosphosites and the edges represent (1) KSA, (2) phosphosites targeted by a common kinase, (3) functional associations between phosphosites and (4) physical interactions between proteins harboring the sites. For a given phosphorylation dataset collected from multiple cancer samples, we weigh the edges of the PSFA network based on the co-phosphorylation (Co-P) of pairs of sites across these samples. Then, we identify Co-P modules as subnetworks composed of heavy edges in this weighted network. Finally, we comprehensively assess the significance, reproducibility, subtype-specificity and biological relevance of the Co-P modules.
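The edge-weighting step of this workflow can be illustrated with a minimal sketch. It assumes Pearson correlation as the co-phosphorylation measure, which the text here does not specify, and it uses hypothetical site identifiers and toy phosphorylation profiles. It is not the CoPPNet implementation.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / sqrt(vx * vy)

def weigh_edges(edges, profiles):
    """Assign each network edge the correlation of its two sites across samples."""
    return {(u, v): pearson(profiles[u], profiles[v]) for u, v in edges}

# Hypothetical phosphosites with phosphorylation levels in four samples.
profiles = {
    "P1_S10": [1.0, 2.0, 3.0, 4.0],
    "P2_T5": [2.0, 4.0, 6.0, 8.0],
    "P3_Y7": [4.0, 3.0, 2.0, 1.0],
}
# Hypothetical edges of the functional association network.
edges = [("P1_S10", "P2_T5"), ("P1_S10", "P3_Y7")]
weights = weigh_edges(edges, profiles)
```

Only pairs that are already connected in the prior-knowledge network receive a weight, so the correlation is computed per edge and not for all pairs of sites.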
Fig. 2.
CoPPNet identifies highly significant and reproducible co-phosphorylation (Co-P) modules. (a) Statistical significance of identified subnetworks in two BC datasets. For each dataset, the blue curve shows the Co-P scores (y-axis) of the 10 highest-scoring subnetworks in decreasing order (rank shown on x-axis). For each rank i on the x-axis, the red (green) curve and error bar show the distribution of the scores of the i-th highest-scoring subnetworks in 100 randomized networks obtained by permuting the edge weights (edges). (b) Reproducibility of significant Co-P modules between two independent datasets, Huang et al. and Mertins et al. The size of each circle indicates the number of phosphosites in the Co-P module, and the number in the circle shows its rank among all identified subnetworks. The thickness of each edge represents the significance of the overlap between the two Co-P modules based on a hypergeometric test. (Color version of this figure is available at Bioinformatics online.)
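The overlap significance in panel (b) can be illustrated with a one-sided hypergeometric tail probability. The sketch below is a generic formulation of that test. The function name, the choice of background population and the toy numbers are assumptions for illustration and are not taken from the paper.

```python
from math import comb

def hypergeom_overlap_pvalue(population, size_a, size_b, overlap):
    """P(X >= overlap) when size_b sites are drawn without replacement
    from `population` sites, of which size_a belong to the other module."""
    upper = min(size_a, size_b)
    total = comb(population, size_b)
    tail = sum(
        comb(size_a, k) * comb(population - size_a, size_b - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

# Toy example: 10 sites measured in both datasets (assumed background),
# two modules of 3 sites each that share all 3 sites.
p_value = hypergeom_overlap_pvalue(population=10, size_a=3, size_b=3, overlap=3)
```

A smaller p-value corresponds to a more significant overlap. The background is usually limited to sites quantified in both datasets, because sites missing from one dataset cannot appear in an overlap.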
Fig. 3.
The phosphorylation sites in the top Co-P modules identified in Huang et al. via unsupervised analysis are associated with BC subtypes. Within each module, the phosphosites are sorted in increasing order of their average fold change in phosphorylation in Luminal samples (purple) relative to the common reference. The green bars represent the average fold change of phosphorylation in Basal samples. (Color version of this figure is available at Bioinformatics online.)
Fig. 4.
Kinase–substrate enrichment analysis (KSEA) on Co-P modules reveals kinases that are potentially associated with BC subtype and survival. (a) The heatmap compares two different strategies for inferring kinase activity. On the left, the phosphosites used to infer kinase activity are restricted to two significant modules identified by CoPPNet on the Huang et al. dataset (Luminal_m and Basal_m). On the right, all phosphosites are used to infer kinase activity (Luminal_A and Basal_A). The intensity of red indicates kinases with a positive KSEA score (i.e. hyperactive in the respective subtype) and the intensity of blue indicates kinases with a negative score (i.e. hypoactive in the respective subtype). Kinases that have different patterns of differential activity between subtypes in the modules versus all phosphosites are marked by a star, and their survival analysis using gene expression data is presented in (b). (Color version of this figure is available at Bioinformatics online.)
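The sign convention of the KSEA score can be illustrated with the commonly used z-score form of KSEA. The exact scoring variant used for this figure is not stated here, so the formula, the use of the sample standard deviation and the toy fold-change values below are assumptions.

```python
from math import sqrt
from statistics import mean, stdev

def ksea_zscore(substrate_fc, background_fc):
    """z = (mean of substrate fold changes - mean of background) * sqrt(m) / sd,
    where m is the number of substrates of the kinase and sd is the sample
    standard deviation of the background fold changes."""
    m = len(substrate_fc)
    return (mean(substrate_fc) - mean(background_fc)) * sqrt(m) / stdev(background_fc)

# Hypothetical log fold changes of the background phosphosites.
background = [-1.0, 0.0, 1.0]

# A kinase whose substrate is more phosphorylated than the background gets a
# positive score (hyperactive); the opposite case gets a negative score.
z_hyper = ksea_zscore([1.0], background)
z_hypo = ksea_zscore([-1.0], background)
```

Under this formulation, the two strategies compared in panel (a) differ only in which phosphosites are passed in: sites restricted to a Co-P module, or all quantified sites.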
