Bioinformatics. 2023 Jan 1;39(1):btac745.

doi: 10.1093/bioinformatics/btac745.

rGREAT: an R/bioconductor package for functional enrichment on genomic regions

Zuguang Gu¹, Daniel Hübschmann¹,²,³,⁴

Affiliations

¹ Molecular Precision Oncology Program, National Center for Tumor Diseases (NCT), Heidelberg 69120, Germany.
² Heidelberg Institute of Stem Cell Technology and Experimental Medicine (HI-STEM), Heidelberg 69120, Germany.
³ German Cancer Consortium (DKTK), Heidelberg 69120, Germany.
⁴ Department of Pediatric Immunology, Hematology and Oncology, University Hospital Heidelberg, Heidelberg, 69120, Germany.

PMID: 36394265
PMCID: PMC9805586
DOI: 10.1093/bioinformatics/btac745

Zuguang Gu et al. Bioinformatics. 2023.

Abstract

Summary: GREAT (Genomic Regions Enrichment of Annotations Tool) is a widely used tool for functional enrichment on genomic regions. However, as an online tool, it is limited by outdated annotation data, a small number of supported organisms and gene set collections, and a lack of extensibility for users. Here, we developed a new R/Bioconductor package named rGREAT which implements the GREAT algorithm locally. By default, rGREAT supports more than 600 organisms and a large number of gene set collections, as well as user-provided gene sets and organisms. Additionally, it implements a general method for dealing with background regions.

Availability and implementation: The package rGREAT is freely available from the Bioconductor project: https://bioconductor.org/packages/rGREAT/. The development version is available at https://github.com/jokergoo/rGREAT. Gene Ontology gene sets for more than 600 organisms retrieved from Ensembl BioMart are presented in an R package BioMartGOGeneSets which is available at https://github.com/jokergoo/BioMartGOGeneSets.

Supplementary information: Supplementary data are available at Bioinformatics online.


Figures

**Fig. 1.**
The binomial model of the *GREAT* analysis. (A) Basal domain and extensions around transcription start sites of genes. (B) A region set which is associated with genes in a specific gene set (green segments). The basal domain and its extensions are shown as a single segment in the figure. (C) Overlapping regions in the region set are merged. The fraction of the genome that is covered by the region set is defined as p. (D) For a list of N input regions, the number of input regions that fall into the region set follows a binomial distribution. (E) When background regions are provided, the fraction of the background regions that is covered by the region set (within the red rectangles) is denoted as p₂. (F) For a list of N input regions, only the N₂ input regions that fall into the background are considered. The number of input regions that fall in both the region set and the background also follows a binomial distribution. The figure is adapted from the original *GREAT* paper. (A color version of this figure appears in the online version of this article.)
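
The model in panels C to F can be illustrated with a minimal Python sketch. This is not the rGREAT implementation: it assumes a single toy chromosome with half-open integer intervals, treats an input region as falling into the region set when its midpoint does, and reports the upper-tail probability P(X ≥ k) as the enrichment p-value. All function names here are hypothetical.

```python
from math import comb

def merge_intervals(intervals):
    """Merge overlapping or adjacent half-open (start, end) intervals (panel C)."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

def intersect(a, b):
    """Intersect two sorted, already merged interval lists."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        lo, hi = max(a[i][0], b[j][0]), min(a[i][1], b[j][1])
        if lo < hi:
            out.append((lo, hi))
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out

def total_length(intervals):
    return sum(end - start for start, end in intervals)

def contains(intervals, pos):
    return any(start <= pos < end for start, end in intervals)

def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def great_binomial(inputs, domains, genome_size, background=None):
    """Return (p, n, k, p_value) for one gene set.

    Without background: p is the genome fraction covered by the region set
    and n is the number of input regions (panels C and D).
    With background: p is the covered fraction of the background (p2) and
    n counts only input regions inside the background (N2) (panels E and F).
    """
    region_set = merge_intervals(domains)
    # Assumption: an input region is represented by its midpoint.
    midpoints = [(s + e) // 2 for s, e in inputs]
    if background is None:
        p = total_length(region_set) / genome_size
    else:
        bg = merge_intervals(background)
        region_set = intersect(region_set, bg)
        p = total_length(region_set) / total_length(bg)
        midpoints = [m for m in midpoints if contains(bg, m)]
    n = len(midpoints)
    k = sum(contains(region_set, m) for m in midpoints)
    return p, n, k, binom_upper_tail(k, n, p)

# Toy data on a 1000 bp "genome" (made-up coordinates).
domains = [(100, 200), (150, 300), (600, 700)]
inputs = [(110, 130), (240, 260), (400, 420), (640, 660), (900, 920)]
print(great_binomial(inputs, domains, genome_size=1000))
print(great_binomial(inputs, domains, genome_size=1000, background=[(0, 500)]))
```

With these toy coordinates, the genome-wide call gives p = 0.3 with 3 of 5 input regions in the region set, and the background call gives p₂ = 0.4 with 2 of N₂ = 3. Restricting both the coverage fraction and the counted input regions to the background is what keeps the two sides of the test comparable.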


Cited by

Implications of noncoding regulatory functions in the development of insulinomas.
Ramos-Rodríguez M, Subirana-Granés M, Norris R, Sordi V, Fernández Á, Fuentes-Páez G, Pérez-González B, Berenguer Balaguer C, Raurell-Vila H, Chowdhury M, Corripio R, Partelli S, López-Bigas N, Pellegrini S, Montanya E, Nacher M, Falconi M, Layer R, Rovira M, González-Pérez A, Piemonti L, Pasquali L. Cell Genom. 2024 Aug 14;4(8):100604. doi: 10.1016/j.xgen.2024.100604. Epub 2024 Jul 2. PMID: 38959898.
JMnorm: a novel joint multi-feature normalization method for integrative and comparative epigenomics.
Xiang G, Guo Y, Bumcrot D, Sigova A. Nucleic Acids Res. 2024 Jan 25;52(2):e11. doi: 10.1093/nar/gkad1146. PMID: 38055833.
Activity of the pleiotropic drug resistance transcription factors Pdr1p and Pdr3p is modulated by binding site flanking sequences.
Buechel ER, Pinkett HW. FEBS Lett. 2024 Jan;598(2):169-186. doi: 10.1002/1873-3468.14762. Epub 2023 Nov 8. PMID: 37873734.
Missense mutations in CRX homeodomain cause dominant retinopathies through two distinct mechanisms.
Zheng Y, Sun C, Zhang X, Ruzycki PA, Chen S. Elife. 2023 Nov 14;12:RP87147. doi: 10.7554/eLife.87147. PMID: 37963072.
Colorectal Adenoma Subtypes Exhibit Signature Molecular Profiles: Unique Insights into the Microenvironment of Advanced Precancerous Lesions for Early Detection Applications.
Mancuso FM, Higareda-Almaraz JC, Canal-Noguer P, Bertossi A, Perera-Lluna A, Roehrl MHA, Kruusmaa K. Cancers (Basel). 2025 Feb 14;17(4):654. doi: 10.3390/cancers17040654. PMID: 40002249.




Grants and funding

National Center for Tumor Diseases (NCT) Molecular Precision Oncology Program.
