Dimensionality reduction by UMAP reinforces sample heterogeneity analysis in bulk transcriptomic data
- PMID: 34320340
- DOI: 10.1016/j.celrep.2021.109442
Abstract
Transcriptomic analysis plays a key role in biomedical research. Linear dimensionality reduction methods, especially principal-component analysis (PCA), are widely used to detect sample-to-sample heterogeneity, while more recently developed non-linear methods, such as t-distributed stochastic neighbor embedding (t-SNE) and uniform manifold approximation and projection (UMAP), can efficiently cluster heterogeneous samples in single-cell RNA sequencing analysis. Yet the application of t-SNE and UMAP to bulk transcriptomic analysis, and their comparison with conventional methods, has not yet been carried out. We compare four major dimensionality reduction methods (PCA, multidimensional scaling [MDS], t-SNE, and UMAP) on 71 large bulk transcriptomic datasets. UMAP is superior to PCA and MDS, and shows some advantages over t-SNE, in differentiating batch effects, identifying pre-defined biological groups, and revealing in-depth clusters in two-dimensional space. Importantly, UMAP generates sample clusters that uncover biological features and clinical meaning. We recommend using UMAP to visualize and analyze sizable bulk transcriptomic datasets to reinforce sample heterogeneity analysis.
Keywords: PCA; UMAP; bulk transcriptomics; clustering structure; dimensionality reduction; heterogeneity analysis; t-SNE.
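As a rough illustration of the kind of check the abstract describes (projecting samples into two dimensions and asking whether pre-defined groups stay together), the following is a minimal sketch using only the Python standard library. It covers PCA only; t-SNE and UMAP would need external packages and are not shown. The synthetic data, the function names, and the nearest-neighbor purity score are all assumptions made for illustration and are not taken from the paper, which may use different data and different evaluation metrics.

```python
import math
import random


def toy_expression(n_per_group=15, n_genes=20, shift=4.0, seed=0):
    """Hypothetical toy data: two sample groups; group 1 is shifted in 5 genes."""
    rng = random.Random(seed)
    X, labels = [], []
    for group in (0, 1):
        for _ in range(n_per_group):
            row = [rng.gauss(0.0, 1.0) + (shift if group == 1 and j < 5 else 0.0)
                   for j in range(n_genes)]
            X.append(row)
            labels.append(group)
    return X, labels


def center_columns(X):
    """Subtract each gene's mean across samples."""
    n = len(X)
    means = [sum(col) / n for col in zip(*X)]
    return [[x - m for x, m in zip(row, means)] for row in X]


def principal_axes(Xc, k=2, iters=300, seed=0):
    """Approximate the top-k principal axes by power iteration with deflation."""
    rng = random.Random(seed)
    d = len(Xc[0])
    axes = []
    for _ in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for _ in range(iters):
            # Multiply by the (unnormalized) covariance: w = Xc^T (Xc v)
            scores = [sum(a * b for a, b in zip(row, v)) for row in Xc]
            w = [sum(s * row[j] for s, row in zip(scores, Xc)) for j in range(d)]
            # Remove the directions of axes already found
            for u in axes:
                dot = sum(a * b for a, b in zip(w, u))
                w = [a - dot * b for a, b in zip(w, u)]
            norm = math.sqrt(sum(a * a for a in w))
            v = [a / norm for a in w]
        axes.append(v)
    return axes


def project(Xc, axes):
    """Coordinates of each sample along the given axes."""
    return [[sum(a * b for a, b in zip(row, u)) for u in axes] for row in Xc]


def nn_purity(embedding, labels):
    """Fraction of samples whose nearest neighbor (excluding itself) shares their label."""
    hits = 0
    for i, p in enumerate(embedding):
        nearest = min((j for j in range(len(embedding)) if j != i),
                      key=lambda j: math.dist(p, embedding[j]))
        hits += labels[nearest] == labels[i]
    return hits / len(embedding)


if __name__ == "__main__":
    X, labels = toy_expression()
    Xc = center_columns(X)
    embedding = project(Xc, principal_axes(Xc))
    print("nearest-neighbor purity in 2D PCA space:", nn_purity(embedding, labels))
```

The same `nn_purity` helper could be applied to any two-dimensional embedding of the same samples, which is one simple way to put several methods on a common footing; it is a stand-in for illustration, not a replacement for the clustering assessments reported in the article.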
Copyright © 2021 The Author(s). Published by Elsevier Inc. All rights reserved.
Conflict of interest statement
Declaration of interests: The authors declare no competing interests.