Leveraging gene correlations in single cell transcriptomic data
- PMID: 39294560
- PMCID: PMC11411778
- DOI: 10.1186/s12859-024-05926-z
Abstract
Background: Many approaches have been developed to overcome technical noise in single cell RNA-sequencing (scRNAseq). As researchers dig deeper into data (looking for rare cell types, subtleties of cell states, and details of gene regulatory networks), there is a growing need for algorithms with controllable accuracy and fewer ad hoc parameters and thresholds. Impeding this goal is the fact that an appropriate null distribution for scRNAseq cannot simply be extracted from data in which ground truth about biological variation is unknown (i.e., usually).
Results: We approach this problem analytically, assuming that scRNAseq data reflect only cell heterogeneity (what we seek to characterize), transcriptional noise (temporal fluctuations randomly distributed across cells), and sampling error (i.e., Poisson noise). We analyze scRNAseq data without normalization (a step that skews distributions, particularly for sparse data) and calculate p values associated with key statistics. We develop an improved method for selecting features for cell clustering and identifying gene-gene correlations, both positive and negative. Using simulated data, we show that this method, which we call BigSur (Basic Informatics and Gene Statistics from Unnormalized Reads), captures even weak yet significant correlation structures in scRNAseq data. Applying BigSur to data from a clonal human melanoma cell line, we identify thousands of correlations that, when clustered without supervision into gene communities, align with known cellular components and biological processes, and highlight potentially novel cell biological relationships.
Conclusions: New insights into functionally relevant gene regulatory networks can be obtained using a statistically grounded approach to the identification of gene-gene correlations.
Keywords: Gene co-expression network; Gene regulatory network; Gene–gene correlation; Melanoma; Single cell RNA sequencing.
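The idea in the Results of testing a gene-gene correlation on raw counts against a sampling-noise null can be illustrated with a minimal sketch. This is not the BigSur method, which the abstract describes as analytical; it is a simple Monte Carlo stand-in under a much cruder assumption, namely that under the null each gene's counts are independent Poisson draws around that gene's mean, with no cell heterogeneity, no transcriptional noise, and no sequencing-depth differences. The function names `pearson_rows` and `poisson_null_pvalue` and the parameters `n_sim` and `seed` are hypothetical and chosen only for illustration.

```python
import numpy as np


def pearson_rows(a, b):
    """Row-wise Pearson correlation; rows with zero variance get r = 0."""
    ac = a - a.mean(axis=1, keepdims=True)
    bc = b - b.mean(axis=1, keepdims=True)
    num = (ac * bc).sum(axis=1)
    den = np.sqrt((ac ** 2).sum(axis=1) * (bc ** 2).sum(axis=1))
    out = np.zeros_like(num)
    # Divide only where the denominator is positive; elsewhere keep 0.
    np.divide(num, den, out=out, where=den > 0)
    return out


def poisson_null_pvalue(x, y, n_sim=2000, seed=0):
    """Two-sided Monte Carlo p-value for corr(x, y) on unnormalized counts.

    Assumed null (an illustration, not the paper's model): both genes are
    independent Poisson noise around their own observed mean.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r_obs = pearson_rows(x[None, :], y[None, :])[0]

    # Simulate n_sim independent gene pairs with matching means.
    sim_x = rng.poisson(x.mean(), size=(n_sim, x.size)).astype(float)
    sim_y = rng.poisson(y.mean(), size=(n_sim, y.size)).astype(float)
    r_null = pearson_rows(sim_x, sim_y)

    # Add-one correction so the estimated p-value is never exactly zero.
    p = (1 + np.sum(np.abs(r_null) >= abs(r_obs))) / (n_sim + 1)
    return r_obs, p


if __name__ == "__main__":
    demo_rng = np.random.default_rng(42)
    # Two genes driven by a shared, cell-specific rate (simulated data).
    shared_rate = demo_rng.gamma(2.0, 1.0, size=500)
    gene_a = demo_rng.poisson(shared_rate)
    gene_b = demo_rng.poisson(shared_rate)
    r, p = poisson_null_pvalue(gene_a, gene_b)
    print(f"r = {r:.3f}, Monte Carlo p = {p:.4f}")
```

Because sparse genes can produce simulated vectors with zero variance, the sketch assigns those a correlation of 0 rather than letting a division by zero propagate as NaN. A null that also modeled depth variation and transcriptional noise, as the abstract describes, would give different p values.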
© 2024. The Author(s).
Conflict of interest statement
The authors declare that they have no competing interests.
Update of
- Leveraging gene correlations in single cell transcriptomic data. bioRxiv [Preprint]. 2023 Nov 1:2023.03.14.532643. doi: 10.1101/2023.03.14.532643. Update in: BMC Bioinformatics. 2024 Sep 18;25(1):305. doi: 10.1186/s12859-024-05926-z. PMID: 36993765.