Review
Comput Struct Biotechnol J. 2019 Jun 14;17:821-831.
doi: 10.1016/j.csbj.2019.06.012. eCollection 2019.

Computational Biology Solutions to Identify Enhancers-target Gene Pairs

Judith Mary Hariprakash et al. Comput Struct Biotechnol J.

Abstract

Enhancers are non-coding regulatory elements that are distant from their target genes. Their characterization remains elusive, largely because of the difficulty of comprehensively pairing enhancers with their target genes. A number of computational biology solutions have been proposed to address this problem, leveraging the increasing availability of functional genomics data and the improved mechanistic understanding of enhancer action. In this review, we focus on computational methods for the genome-wide definition of enhancer-target gene pairs. We outline the different classes of methods, as well as their main advantages and limitations. The types of information integrated by each method, along with details on their applicability, are presented and discussed. We especially highlight the unresolved technical challenges that still hamper a satisfactory and comprehensive solution. We expect this field to keep evolving in the coming years, driven by the ever-growing availability of data and increasing insight into the crucial role of enhancers in regulating genome functionality.

Figures

Fig. 1
Timeline of the enhancer-target gene pairing algorithms. The main methods described in the review (tool name in bold, if defined) are listed to highlight the timeline of their publication over the years (horizontal axis).
Fig. 2
Features used in ETG pairing tools. The figure summarizes the main types of features used to define ETG pairs by the tools discussed in this review. For each feature, its frequency (y-axis, number of methods) and the year of its first adoption by the tools discussed in this review (x-axis) are reported. The size of each dot is also proportional to the frequency (number of methods). The colors represent the category of the data: genomic annotations independent of cell type (dark green); epigenomics data (orange); transcriptomic data (mauve).
Fig. 3
Main classes of ETG pairing methods. The cartoon highlights the main principles underlying the four main classes of ETG pairing methods discussed in this review. (a) Correlation-based methods assess the correlation between the activity of individual enhancer-promoter pairs across multiple cell types. Their activity is measured by one or more types of functional epigenomics or transcriptomics data. (b) Supervised learning-based methods instead build a predictor from a known set of true positive and negative ETG pairs. For each of these, several features (e.g. functional genomics data) are considered to describe enhancer and promoter activity across multiple cell types. These can also be enriched with other features directly associated with the ETG pair, such as their genomic distance or synteny conservation. (c) Regression-based methods simultaneously assess the quantitative contribution of multiple enhancers within the considered genomic window to the activity of a promoter. These methods leverage a large number of genomic features and functional data. Regression methods can provide a weight for the contribution of individual enhancers (represented by ETG pairing lines of different thickness in the cartoon). (d) Score-based methods integrate information from a large set of genomic features and functional data into a single quantitative score, which quantifies the strength of individual ETG pairs. In the cartoons for all methods, enhancers and promoters are represented as boxes labelled “E” or “P”, respectively. The TSS is marked with an arrow. Colored curves (purple or green) represent the quantitative functional genomics data used to infer the activity level of enhancers or promoters, respectively. They are meant to hint at the peaks of various intensity that would be associated with such features in genomics data such as ChIP-seq.
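
As an illustration of the correlation-based class in panel (a), the following is a minimal sketch and not the implementation of any tool covered in the review. It assumes each enhancer and promoter has a genomic position and one activity value per cell type; the function names, the 500 kb window, the 0.8 correlation cutoff, and the toy signals are all hypothetical choices made for the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length activity profiles."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        # A flat profile carries no information about co-variation.
        return 0.0
    return cov / sqrt(vx * vy)


def pair_enhancers(enhancers, promoters, max_distance=500_000, min_r=0.8):
    """Return (enhancer, promoter, r) for pairs within the window whose
    activity correlates across cell types at or above min_r.

    Both inputs map a name to (position, [activity per cell type]).
    The window and cutoff are illustrative assumptions.
    """
    pairs = []
    for e_name, (e_pos, e_signal) in enhancers.items():
        for p_name, (p_pos, p_signal) in promoters.items():
            if abs(e_pos - p_pos) > max_distance:
                continue  # outside the genomic window considered
            r = pearson(e_signal, p_signal)
            if r >= min_r:
                pairs.append((e_name, p_name, round(r, 3)))
    return pairs


# Toy data: activity in four hypothetical cell types.
enhancers = {
    "E1": (100_000, [1.0, 2.0, 3.0, 4.0]),
    "E2": (150_000, [4.0, 1.0, 3.0, 2.0]),
}
promoters = {
    "P1": (300_000, [2.0, 4.0, 6.0, 8.0]),
    "P2": (900_000, [1.0, 2.0, 3.0, 4.0]),
}

pairs = pair_enhancers(enhancers, promoters)
print(pairs)
```

In this toy example only E1 is paired with P1: E2 is within range of P1 but its activity does not co-vary with it, and P2 lies outside the window of both enhancers even though its profile matches E1. A real analysis would also need a significance estimate for each correlation, which this sketch omits.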