BMC Med Genomics. 2019 Jul 11;12(Suppl 5):97.
doi: 10.1186/s12920-019-0515-6.

CONFIGURE: A pipeline for identifying context specific regulatory modules from gene expression data and its application to breast cancer

Sungjoon Park et al. BMC Med Genomics.

Abstract

Background: Gene expression data are widely used for identifying subtypes of diseases such as cancer. Differentially expressed gene analysis and gene set enrichment analysis are commonly used to identify biological mechanisms at the gene level and the gene set level, respectively. However, the results of differentially expressed gene analysis are difficult to interpret, and gene set enrichment analysis does not consider the interactions among genes in a gene set.

Results: We present CONFIGURE, a pipeline that identifies context specific regulatory modules from gene expression data. First, CONFIGURE takes gene expression data and context label information as inputs and constructs regulatory modules. Then, CONFIGURE builds a regulatory module enrichment score (RMES) matrix, which holds the enrichment score of each regulatory module in each sample, computed with the single-sample GSEA method. Finally, CONFIGURE calculates an importance score for each regulatory module in each context and uses these scores to rank the modules. We evaluated CONFIGURE on The Cancer Genome Atlas (TCGA) breast cancer RNA-seq dataset to determine whether it can produce biologically meaningful regulatory modules for breast cancer subtypes. We first evaluated whether RMESs are useful for differentiating breast cancer subtypes using a multi-class classifier and one-vs-rest binary SVM classifiers. The multi-class and one-vs-rest binary classifiers were trained using the RMESs as features and outperformed baseline classifiers. Furthermore, we conducted literature surveys on the basal-like type specific regulatory modules obtained by CONFIGURE and showed that highly ranked modules were associated with the phenotypes of basal-like type breast cancers.
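
The RMES matrix and ranking steps described above can be illustrated with a minimal sketch. This is not the authors' implementation: the enrichment function is a simplified rank-based score in the spirit of single-sample GSEA, and the importance score (mean RMES inside a context minus mean RMES outside it) is a stand-in chosen for illustration, since the abstract does not state how importance is computed. All function names, gene names, module names, and the toy data are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def ssgsea_like_score(expr: Dict[str, float], module: Set[str],
                      alpha: float = 0.25) -> float:
    """Simplified single-sample enrichment score for one module in one sample.

    Assumption: a rank-weighted running-sum statistic, not the exact ssGSEA code.
    """
    # Rank genes from highest to lowest expression.
    ranked = sorted(expr, key=lambda g: expr[g], reverse=True)
    n = len(ranked)
    hits = [g in module for g in ranked]
    n_miss = n - sum(hits)
    # Module genes are weighted by rank**alpha (the top gene has rank n).
    weights = [(n - i) ** alpha if h else 0.0 for i, h in enumerate(hits)]
    total = sum(weights)
    if total == 0 or n_miss == 0:
        return 0.0
    score = hit_cdf = miss_cdf = 0.0
    for w, h in zip(weights, hits):
        if h:
            hit_cdf += w / total
        else:
            miss_cdf += 1.0 / n_miss
        # Accumulate the gap between the two empirical distributions.
        score += hit_cdf - miss_cdf
    return score


def rmes_matrix(samples: Dict[str, Dict[str, float]],
                modules: Dict[str, Set[str]]) -> Dict[str, Dict[str, float]]:
    """Return module -> sample -> enrichment score."""
    return {m: {s: ssgsea_like_score(expr, genes) for s, expr in samples.items()}
            for m, genes in modules.items()}


def rank_modules(rmes: Dict[str, Dict[str, float]], labels: Dict[str, str],
                 context: str) -> List[Tuple[str, float]]:
    """Rank modules for one context by a hypothetical importance score."""
    ranked = []
    for module, scores in rmes.items():
        inside = [v for s, v in scores.items() if labels[s] == context]
        outside = [v for s, v in scores.items() if labels[s] != context]
        importance = sum(inside) / len(inside) - sum(outside) / len(outside)
        ranked.append((module, importance))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Toy data: two "basal" and two "luminal" samples over six made-up genes.
samples = {
    "s1": {"g1": 10.0, "g2": 9.0, "g3": 5.0, "g4": 4.0, "g5": 1.0, "g6": 0.5},
    "s2": {"g1": 9.5, "g2": 8.0, "g3": 4.0, "g4": 5.0, "g5": 0.7, "g6": 1.2},
    "s3": {"g1": 1.0, "g2": 0.5, "g3": 5.0, "g4": 4.0, "g5": 10.0, "g6": 9.0},
    "s4": {"g1": 0.8, "g2": 1.1, "g3": 4.5, "g4": 5.5, "g5": 9.0, "g6": 8.5},
}
labels = {"s1": "basal", "s2": "basal", "s3": "luminal", "s4": "luminal"}
modules = {"M1": {"g1", "g2"}, "M2": {"g5", "g6"}}

rmes = rmes_matrix(samples, modules)
basal_ranking = rank_modules(rmes, labels, "basal")
```

In this toy setup the module whose genes are highly expressed in the "basal" samples ranks first for that context. The resulting RMES matrix could also serve as the feature matrix for a classifier, as the evaluation above describes.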

Conclusions: We showed that enrichment scores of regulatory modules are useful for differentiating breast cancer subtypes and validated the basal-like type specific regulatory modules by literature surveys. In doing so, we found regulatory module candidates that have not been reported in previous literature. This demonstrates that CONFIGURE can be used to predict novel regulatory markers that can be validated by downstream wet lab experiments. We validated CONFIGURE on the breast cancer RNA-seq dataset in this work, but CONFIGURE can be applied to any gene expression dataset containing context information.

Keywords: Breast cancer subtype; Context specific regulatory module; Feature importance score; Gene regulatory network inference; Single sample GSEA.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1. Overview of CONFIGURE
Fig. 2. An illustration of a regulatory module

Cited by

  • The Atlas of Inflammation Resolution (AIR).
    Serhan CN, Gupta SK, Perretti M, Godson C, Brennan E, Li Y, Soehnlein O, Shimizu T, Werz O, Chiurchiù V, Azzi A, Dubourdeau M, Gupta SS, Schopohl P, Hoch M, Gjorgevikj D, Khan FM, Brauer D, Tripathi A, Cesnulevicius K, Lescheid D, Schultz M, Särndahl E, Repsilber D, Kruse R, Sala A, Haeggström JZ, Levy BD, Filep JG, Wolkenhauer O. Mol Aspects Med. 2020 Aug;74:100894. doi: 10.1016/j.mam.2020.100894. Epub 2020 Sep 3. PMID: 32893032. Free PMC article. Review.
