Robust Co-clustering to Discover Toxicogenomic Biomarkers and Their Regulatory Doses of Chemical Compounds Using Logistic Probabilistic Hidden Variable Model
- PMID: 30450112
- PMCID: PMC6225736
- DOI: 10.3389/fgene.2018.00516
Abstract
Detecting biomarker genes and their regulatory doses of chemical compounds (DCCs) is one of the most important tasks in toxicogenomic studies, as well as in drug design and development. The online computational platform "Toxygates" identifies biomarker genes and their regulatory DCCs by a co-clustering approach. However, its algorithm is based on hierarchical clustering (HC), which does not share two-way gene-DCC information while co-clustering genes and DCCs, and it is sensitive to outlying observations. The platform may therefore produce misleading results in some cases. The probabilistic hidden variable model (PHVM) is a more effective co-clustering approach that shares two-way information simultaneously, but it is also sensitive to outlying observations. Because gene expression data are often contaminated by outlying observations, in this paper we propose the logistic probabilistic hidden variable model (LPHVM) for robust co-clustering of genes and DCCs. We investigated the performance of the proposed LPHVM co-clustering approach against the conventional PHVM using simulated datasets and against the Toxygates co-clustering approach using real-life TGP gene expression datasets. Simulation results show that the proposed method outperforms the conventional PHVM in the presence of outliers and performs equally well in their absence. In the real-life TGP data analysis, the Toxygates online platform incorrectly co-clustered three DCCs (glibenclamide-low, perhexiline-low, and hexachlorobenzene-medium) for the glutathione metabolism pathway dataset and two DCCs (acetaminophen-medium and methapyrilene-low) for the PPAR signaling pathway dataset, whereas the proposed LPHVM approach incorrectly co-clustered only one DCC (hexachlorobenzene-low) for the glutathione metabolism pathway dataset. Our findings from the real data analysis are also supported by other findings in the literature.
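The abstract names a logistic transformation as the source of LPHVM's robustness but does not give its form. The following is only a minimal sketch of the general idea, not the authors' method: it assumes each gene (row) is centered by its median and scaled by its median absolute deviation before a logistic function maps the values into a bounded range, so a single extreme value cannot dominate a later co-clustering step. The function name `logistic_transform`, the median/MAD standardization, and the toy matrix `expr` are all illustrative assumptions.

```python
import numpy as np


def logistic_transform(x, axis=1):
    """Map values into [0, 1] with a logistic function after robust scaling.

    Assumption (not from the paper): each row is centered by its median and
    scaled by its median absolute deviation (MAD) before squashing.
    """
    x = np.asarray(x, dtype=float)
    med = np.median(x, axis=axis, keepdims=True)
    mad = np.median(np.abs(x - med), axis=axis, keepdims=True)
    # Guard against a zero MAD (constant rows) to avoid division by zero.
    scale = np.where(mad > 0, mad, 1.0)
    z = (x - med) / scale
    # 0.5 * (1 + tanh(z / 2)) equals 1 / (1 + exp(-z)) but does not overflow.
    return 0.5 * (1.0 + np.tanh(z / 2.0))


# Hypothetical toy matrix: rows are genes, columns are DCCs.
# The value 50.0 plays the role of an outlying observation.
expr = np.array([
    [0.1, -0.2, 0.3, 0.0, 50.0],
    [1.0, 1.2, 0.9, 1.1, 1.0],
])

transformed = logistic_transform(expr)
print(transformed.round(3))
```

After the transformation the outlying entry is bounded by 1 like every other entry, and each row's median maps to 0.5. How the transformed matrix enters the PHVM likelihood is not stated in the abstract and is left out of this sketch.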
Keywords: co-clustering; doses of chemical compounds (DCCs); logistic probabilistic hidden variable model (LPHVM); logistic transformation; outlying observations; probabilistic hidden variable model (PHVM); toxicogenomic biomarker.