PLoS Comput Biol. 2023 Mar 9;19(3):e1010342.
doi: 10.1371/journal.pcbi.1010342. eCollection 2023 Mar.

Detection of genes with differential expression dispersion unravels the role of autophagy in cancer progression

Christophe Le Priol et al. PLoS Comput Biol. 2023.

Abstract

The majority of gene expression studies focus on the search for genes whose mean expression is different between two or more populations of samples in the so-called "differential expression analysis" approach. However, a difference in variance in gene expression may also be biologically and physiologically relevant. In the classical statistical model used to analyze RNA-sequencing (RNA-seq) data, the dispersion, which defines the variance, is only considered as a parameter to be estimated prior to identifying a difference in mean expression between conditions of interest. Here, we propose to evaluate four recently published methods, which detect differences in both the mean and dispersion in RNA-seq data. We thoroughly investigated the performance of these methods on simulated datasets and characterized parameter settings to reliably detect genes with a differential expression dispersion. We applied these methods to The Cancer Genome Atlas datasets. Interestingly, among the genes with an increased expression dispersion in tumors and without a change in mean expression, we identified some key cellular functions, most of which were related to catabolism and were overrepresented in most of the analyzed cancers. In particular, our results highlight autophagy, whose role in cancerogenesis is context-dependent, illustrating the potential of the differential dispersion approach to gain new insights into biological processes and to discover new biomarkers.
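To make the distinction between a shift in mean and a shift in dispersion concrete, here is a minimal Python sketch. It assumes the common negative binomial parameterization in which the variance is `mu + phi * mu**2`; the abstract only says that the dispersion defines the variance, so this exact form is an assumption. The function names are hypothetical, and `levene_statistic` is a plain median-centered (Brown-Forsythe) variant of Levene's statistic written for illustration, not the implementation used by the authors or by any of the evaluated packages.

```python
import math
from statistics import mean, median


def nb_variance(mu: float, phi: float) -> float:
    """Variance of a negative binomial count with mean mu and dispersion phi
    (assumed parameterization: Var = mu + phi * mu^2)."""
    return mu + phi * mu ** 2


def log2_fold_change(value_b: float, value_a: float) -> float:
    """log2 ratio of a parameter (mean or dispersion) in condition B vs. A."""
    return math.log2(value_b / value_a)


def levene_statistic(group_a, group_b) -> float:
    """Median-centered Levene-type W statistic for two groups.

    Larger values indicate a larger difference in spread between groups.
    No p-value is computed here; that would require an F distribution.
    """
    groups = [list(group_a), list(group_b)]
    # Absolute deviation of each observation from its own group median.
    devs = []
    for g in groups:
        center = median(g)
        devs.append([abs(x - center) for x in g])
    n_total = sum(len(d) for d in devs)
    k = len(devs)
    grand_mean = mean([x for d in devs for x in d])
    between = sum(len(d) * (mean(d) - grand_mean) ** 2 for d in devs)
    within = sum((x - mean(d)) ** 2 for d in devs for x in d)
    return (n_total - k) / (k - 1) * between / within


# Hypothetical gene: same mean in both conditions, higher dispersion in tumors.
var_normal = nb_variance(mu=100.0, phi=0.1)
var_tumor = nb_variance(mu=100.0, phi=0.4)
dispersion_lfc = log2_fold_change(0.4, 0.1)

# Made-up expression values with equal means (10) but different spread.
normal = [10, 11, 9, 10, 12, 8]
tumor = [10, 16, 4, 10, 19, 1]
w = levene_statistic(normal, tumor)
```

A classical differential expression analysis would see nothing in this hypothetical gene, because the mean log2-fold change is zero, whereas the dispersion log2-fold change and the Levene-type statistic both reflect the wider spread in the tumor group.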

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Ability to identify differentially dispersed genes.
The performance of Levene’s test, MDSeq, DiPhiSeq, GAMLSS and DiffDist in detecting differential dispersion in gene expression data, measured by the area under the ROC curve (AUC), was assessed using 10 replicates of simulated datasets composed of highly and lowly differentially expressed genes between two sample populations of equal size.
Fig 2. Ability to detect differential dispersion for lowly differentially expressed genes.
False discovery rate (FDR) and true positive rate (TPR) of Levene’s test, MDSeq, DiPhiSeq, GAMLSS and DiffDist for differential dispersion detection in simulated datasets composed of lowly differentially expressed genes between two sample populations of equal size. Performance was assessed using 10 replicates of simulated datasets per parameter setting.
Fig 3. Differentially dispersed genes correctly identified by the evaluated methods among lowly differentially expressed genes.
(A) Intersections of sets of differentially dispersed (DD) genes correctly identified by Levene’s test, MDSeq, DiPhiSeq, GAMLSS and DiffDist. (B) Correctness of dispersion log2-fold change sign of DD genes correctly identified by the different methods. (C) Real mean and dispersion log2-fold changes and estimated dispersion log2-fold changes of DD genes correctly identified by GAMLSS and DiffDist. (D) Correctness of dispersion log2-fold change signs according to Levene’s test, MDSeq and DiPhiSeq for DD genes correctly identified by GAMLSS and DiffDist with incorrect dispersion log2-fold change sign. Simulated datasets are composed of lowly differentially expressed genes with a mean fold change of expression between 1 and 1.5 between two populations of 50 samples. Values indicated in the middle of the bars are percentages of the corresponding categories of genes over the entire sets of analyzed genes. All results relate to 10 replicates of simulated datasets, i.e. the counts and percentages are averaged over all the replicates.
Fig 4. Differentially dispersed genes among non-differentially expressed genes for each TCGA dataset.
(A) Number of differentially expressed (DE) genes separated between those upregulated in tumors (DE+) and those downregulated in tumors (DE-) detected by MDSeq per TCGA dataset. (B) Number of differentially dispersed (DD) genes among non-DE genes separated between those overdispersed in tumors (DD+) and those underdispersed in tumors (DD-), as detected by Levene’s test, MDSeq, DiPhiSeq, GAMLSS and DiffDist, per TCGA dataset.
Fig 5. Overdispersed genes in tumors identified by the evaluated methods among non-differentially expressed genes.
Intersections of sets of overdispersed genes in tumors identified by Levene’s test, MDSeq, DiPhiSeq, GAMLSS and DiffDist among non-differentially expressed genes for (A) the kidney renal clear cell carcinoma dataset (TCGA-KIRC) and (B) the kidney renal papillary cell carcinoma dataset (TCGA-KIRP). Non-differentially expressed genes were identified by MDSeq.
Fig 6. Enriched GO terms among overdispersed genes in tumors identified by the evaluated methods.
Top 40 representative enriched Gene Ontology (GO) terms among overdispersed genes in tumors (DD+) among non-differentially expressed (non-DE) genes, ordered first by the number of datasets for which they are enriched (decreasing order) and second by the mean p-values of enrichment across all datasets (increasing order). Non-DE genes were identified using MDSeq, and DD+ genes were identified among non-differentially expressed genes by at least one of the evaluated methods, i.e. Levene’s test, MDSeq, DiPhiSeq, GAMLSS and DiffDist.
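
The evaluation metrics named in the captions of Fig 1 and Fig 2 (AUC, FDR and TPR) can be sketched in a few lines of Python. This is an illustrative sketch only: the function names, the significance threshold `alpha` and the toy p-values are hypothetical, and the captions do not say how the authors computed these quantities (for example, whether p-values were adjusted for multiple testing first). The AUC is computed here through its rank interpretation, the probability that a truly differentially dispersed gene receives a higher score than a gene that is not, with ties counted as one half.

```python
def tpr_fdr(p_values, is_true_dd, alpha=0.05):
    """True positive rate and false discovery rate at threshold alpha.

    p_values   : one p-value per gene (assumed already adjusted if needed)
    is_true_dd : ground-truth flags, True if the gene is truly DD
    """
    pairs = list(zip(p_values, is_true_dd))
    tp = sum(1 for p, truth in pairs if p <= alpha and truth)
    fp = sum(1 for p, truth in pairs if p <= alpha and not truth)
    n_true = sum(1 for _, truth in pairs if truth)
    tpr = tp / n_true if n_true else float("nan")
    # By convention, FDR is 0 when no gene is called.
    fdr = fp / (tp + fp) if (tp + fp) else 0.0
    return tpr, fdr


def auc(scores, is_true_dd):
    """Area under the ROC curve via pairwise comparison of scores.

    Higher scores must indicate stronger evidence of differential
    dispersion (e.g. 1 - p-value). Ties contribute one half.
    """
    pos = [s for s, truth in zip(scores, is_true_dd) if truth]
    neg = [s for s, truth in zip(scores, is_true_dd) if not truth]
    if not pos or not neg:
        raise ValueError("need at least one DD and one non-DD gene")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example with four simulated genes (values are made up).
p_vals = [0.01, 0.20, 0.03, 0.60]
truth = [True, True, False, False]
tpr, fdr = tpr_fdr(p_vals, truth, alpha=0.05)
area = auc([1 - p for p in p_vals], truth)
```

The pairwise AUC is quadratic in the number of genes, which is fine for a small illustration; a sort-based rank computation would be the usual choice for genome-scale simulations.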
