Investigating the correspondence between transcriptomic and proteomic expression profiles using coupled cluster models
- PMID: 18974169
- PMCID: PMC4141638
- DOI: 10.1093/bioinformatics/btn553
Abstract
Motivation: Modern transcriptomics and proteomics enable us to survey the expression of RNAs and proteins at large scales. While these data are usually generated and analyzed separately, there is increasing interest in comparing and co-analyzing transcriptome and proteome expression data. A major open question is whether transcriptome and proteome expression are linked and, if so, how they are coordinated.
Results: Here we have developed a probabilistic clustering model that permits analysis of the links between transcriptomic and proteomic profiles in a sensible and flexible manner. Our coupled mixture model defines a prior probability distribution over the component to which a protein profile should be assigned conditioned on which component the associated mRNA profile belongs to. We apply this approach to a large dataset of quantitative transcriptomic and proteomic expression data obtained from a human breast epithelial cell line (HMEC). The results reveal a complex relationship between transcriptome and proteome with most mRNA clusters linked to at least two protein clusters, and vice versa. A more detailed analysis incorporating information on gene function from the Gene Ontology database shows that a high correlation of mRNA and protein expression is limited to the components of some molecular machines, such as the ribosome, cell adhesion complexes and the TCP-1 chaperonin involved in protein folding.
Availability: Matlab code is available from the authors on request.
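The coupling described in the Results, a prior over the protein component conditioned on the mRNA component, can be illustrated with a small sketch. This is not the authors' Matlab code. The function names, the isotropic Gaussian component densities with fixed variance, and the parameter names (`pi`, `A`, `mu_m`, `mu_p`) are all assumptions made for illustration, and the fitting procedure is omitted entirely. The sketch only shows how a conditional prior `A[k, l]` would enter the joint posterior over the pair (mRNA component k, protein component l) for a single gene.

```python
import numpy as np

def log_gauss(x, means, var):
    """Log density of x under K isotropic Gaussians (assumed form).

    x: (D,) profile, means: (K, D) component means, var: scalar variance.
    Returns an array of shape (K,).
    """
    d = x.shape[0]
    sq = ((x - means) ** 2).sum(axis=1)
    return -0.5 * (d * np.log(2 * np.pi * var) + sq / var)

def joint_responsibilities(m, p, pi, A, mu_m, mu_p, var_m=1.0, var_p=1.0):
    """Posterior over (mRNA component k, protein component l) for one gene.

    pi[k]   : prior that the mRNA profile belongs to component k
    A[k, l] : prior that the protein profile belongs to component l,
              conditioned on the mRNA profile belonging to component k
    All entries of pi and A are assumed strictly positive here.
    """
    log_joint = (
        np.log(pi)[:, None]
        + np.log(A)
        + log_gauss(m, mu_m, var_m)[:, None]
        + log_gauss(p, mu_p, var_p)[None, :]
    )
    log_joint -= log_joint.max()  # subtract the max for numerical stability
    r = np.exp(log_joint)
    return r / r.sum()
```

Under these assumptions, a row of `A` that spreads its mass over several columns corresponds to one mRNA cluster being linked to several protein clusters. If every row of `A` is identical, the returned posterior factorizes into independent mRNA and protein parts, which is the uncoupled case.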