Identification of Type 2 Diabetes Biomarkers From Mixed Single-Cell Sequencing Data With Feature Selection Methods
- PMID: 35721855
- PMCID: PMC9201257
- DOI: 10.3389/fbioe.2022.890901
Abstract
Diabetes is one of the most common diseases and a major threat to human health. Type 2 diabetes (T2D) accounts for about 90% of all cases. With the development of high-throughput sequencing technologies, more and more of the fundamental pathogenesis of T2D has been revealed at the genetic and transcriptomic levels. Recent single-cell sequencing can further reveal the cellular heterogeneity of complex diseases in an unprecedented way. To explore the molecular basis of T2D across multiple cell types, we investigated the expression profiles of 1,600 single cells (949 cells from T2D patients and 651 cells from normal controls) and identified differential expression patterns and transcriptomic characteristics that can distinguish the two groups at the single-cell level. The expression profiles were analyzed with several machine learning algorithms, including Monte Carlo feature selection, support vector machine, and repeated incremental pruning to produce error reduction (RIPPER). On the one hand, some T2D-associated genes (MTND4P24, MTND2P28, and LOC100128906) were discovered. On the other hand, we revealed novel potential pathogenic mechanisms in the form of classification rules; these mechanisms involve newly recognized genes and are missed by traditional bulk sequencing techniques. In particular, the newly identified T2D genes were shown to follow specific quantitative rules with potential for predicting diabetes, and these rules further indicated several potential functional crosstalks involved in T2D.
Keywords: Monte Carlo feature selection; RIPPER; single-cell sequencing; support vector machine; type 2 diabetes.
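The abstract states that the identified genes follow quantitative rules that can predict diabetes status. As a rough illustration of what an ordered rule list of the kind RIPPER produces looks like when applied to a cell, the Python sketch below checks each rule in turn and falls back to a default class. Everything in it is an assumption made for illustration: the gene names (`GENE_A`, `GENE_B`, `GENE_C`), the thresholds, the example cells, and the helper names `condition_holds` and `classify` are hypothetical and are not taken from the study.

```python
from typing import Dict, List, Tuple

# One condition: (gene, operator, threshold). Gene names and thresholds
# below are hypothetical placeholders, not values reported by the study.
Condition = Tuple[str, str, float]
# One rule: a conjunction of conditions plus the label it predicts.
Rule = Tuple[List[Condition], str]


def condition_holds(cell: Dict[str, float], cond: Condition) -> bool:
    """Check a single expression threshold against one cell's profile."""
    gene, op, threshold = cond
    value = cell.get(gene, 0.0)  # assumption: an unmeasured gene counts as zero
    if op == ">=":
        return value >= threshold
    if op == "<=":
        return value <= threshold
    raise ValueError(f"unsupported operator: {op}")


def classify(cell: Dict[str, float], rules: List[Rule], default: str) -> str:
    """Return the label of the first rule whose conditions all hold.

    Rules are tried in order; if none fires, the default label is returned.
    """
    for conditions, label in rules:
        if all(condition_holds(cell, c) for c in conditions):
            return label
    return default


# Hypothetical rule list: two rules predicting T2D, default class "control".
RULES: List[Rule] = [
    ([("GENE_A", ">=", 2.5), ("GENE_B", "<=", 1.0)], "T2D"),
    ([("GENE_C", ">=", 4.0)], "T2D"),
]

# Made-up expression profiles, for illustration only.
cells = {
    "cell_1": {"GENE_A": 3.1, "GENE_B": 0.4, "GENE_C": 0.0},
    "cell_2": {"GENE_A": 1.0, "GENE_B": 0.2, "GENE_C": 5.2},
    "cell_3": {"GENE_A": 1.0, "GENE_B": 3.0, "GENE_C": 0.5},
}

predictions = {
    name: classify(profile, RULES, default="control")
    for name, profile in cells.items()
}
```

In this toy setup the first cell satisfies the first rule, the second cell falls through to the second rule, and the third cell matches neither and receives the default label. Learning such rules from data (growing, pruning, and ordering them) is the part RIPPER itself performs and is not shown here.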
Copyright © 2022 Li, Pan and Cai.
Conflict of interest statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.