Next-generation genomics: an integrative approach

R David Hawkins¹, Gary C Hon, Bing Ren

Affiliations

¹ Ludwig Institute for Cancer Research, Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, 9500 Gilman Drive, La Jolla, California 92093-0653, USA.

PMID: 20531367
PMCID: PMC3321268
DOI: 10.1038/nrg2795

Review

R David Hawkins et al. Nat Rev Genet. 2010 Jul;11(7):476-86. doi: 10.1038/nrg2795.

Abstract

Integrating results from diverse experiments is an essential process in our effort to understand the logic of complex systems, such as development, homeostasis and responses to the environment. With the advent of high-throughput methods, including genome-wide association (GWA) studies, chromatin immunoprecipitation followed by sequencing (ChIP-seq) and RNA sequencing (RNA-seq), acquisition of genome-scale data has never been easier. Epigenomics, transcriptomics, proteomics and genomics each provide an insightful, and yet one-dimensional, view of genome function; integrative analysis promises a unified, global view. However, the large amount of information and diverse technology platforms pose multiple challenges for data access and processing. This Review discusses emerging issues and strategies related to data integration in the era of next-generation genomics.

Figures

**Figure 1. Annotating the genome through detecting transcription factor binding sites and histone modification states**
Promoters can be mapped by the localization of general transcription machinery and transcription factors (TFs) such as RNA polymerase II (Pol II) or TAF1, or by the localization of H3K4me3. The bodies of transcribed genes and noncoding RNAs are marked by H3K36me3. Enhancers can be found by distal TF binding sites or by H3K4me1. This modification often coincides with H3K4me2, which has been shown to be necessary to recruit pioneering transcription factors to enhancer elements. In addition, H3K4me1 sites overlap acetylated histone lysines, in agreement with acetylation islands outside of promoters identifying functional enhancer elements. Insulators are bound by CTCF. Nucleosomes are shown as cylinders and example histone tails are in grey. Different TFs are shown in different colours. Factors bound to the insulator include CTCF and subunits of cohesin.
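The mark-to-element associations in this caption can be written down as a simple lookup. The following is a minimal sketch, not the authors' method: the table `MARK_TO_ELEMENT` and the function `annotate_region` are hypothetical names, and the rules are a deliberate simplification that ignores signal strength, peak shape and distance from the transcription start site.

```python
# Hypothetical rule table taken from the associations listed in the caption.
# A real annotation would also weigh signal levels and genomic context.
MARK_TO_ELEMENT = {
    "H3K4me3": "promoter",
    "Pol II": "promoter",
    "TAF1": "promoter",
    "H3K36me3": "transcribed region",
    "H3K4me1": "enhancer",
    "CTCF": "insulator",
}


def annotate_region(marks):
    """Return the sorted candidate element types for the marks seen at one region.

    Marks that are not in the rule table are ignored.
    """
    return sorted({MARK_TO_ELEMENT[m] for m in marks if m in MARK_TO_ELEMENT})


if __name__ == "__main__":
    print(annotate_region(["H3K4me1", "H3K27ac"]))
    print(annotate_region(["CTCF", "H3K4me3"]))
```

A region can receive more than one candidate label, which matches the caption's point that marks co-occur; deciding between labels would need additional evidence.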

**Figure 2. Identification of regulatory SNPs (rSNPs)**
The sequence of a transcription factor (TF) binding site is shown with the position of an A/T polymorphism. By integrating chromatin signatures of enhancers or transcription factor binding sites with SNP data, SNPs falling within the region would be predicted as rSNPs. These could then be correlated to changes in gene expression.
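The integration step described in this caption amounts to an interval-overlap test between SNP positions and predicted regulatory regions. Below is a minimal sketch under stated assumptions: the function name `candidate_rsnps`, the tuple layouts and the identifiers in the example are all hypothetical, and regions are assumed to use half-open coordinates (start included, end excluded). It uses a simple linear scan, which is adequate for illustration but not for genome-scale data.

```python
def candidate_rsnps(snps, regions):
    """Return IDs of SNPs whose position falls inside any predicted region.

    snps:    list of (snp_id, chrom, pos) tuples
    regions: list of (chrom, start, end) tuples, half-open: start <= pos < end
    """
    hits = []
    for snp_id, chrom, pos in snps:
        # Keep the SNP if at least one region on the same chromosome contains it.
        if any(c == chrom and start <= pos < end for c, start, end in regions):
            hits.append(snp_id)
    return hits


if __name__ == "__main__":
    # Made-up coordinates and IDs, for illustration only.
    regions = [("chr1", 100, 200), ("chr2", 50, 80)]
    snps = [
        ("snp_a", "chr1", 150),
        ("snp_b", "chr1", 200),
        ("snp_c", "chr2", 50),
        ("snp_d", "chr3", 60),
    ]
    print(candidate_rsnps(snps, regions))
```

The returned SNPs are only candidates; as the caption notes, they would still need to be correlated with changes in gene expression.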

**Figure 3. Data Visualization**
The UCSC Genome Browser is a tool for viewing genomic datasets, and a vast amount of data is available through it. This example from the browser shows numerous data types, in K562 cells, from the ENCODE Consortium. A randomly selected gene, *KATNAL1*, illustrates several points that can be identified by using this tool. The promoter has a typical chromatin structure (a peak of H3K4me3 between the bimodal peaks of H3K4me1), is bound by Pol II, and is DNase hypersensitive. The gene is transcribed, as indicated by RNA-seq data as well as H3K36me3 localization. The gene lies between two CTCF-bound sites that could be tested for insulator activity. An intronic H3K4me1 peak (highlighted) predicts an enhancer element, corroborated by the DNase hypersensitive site (DHS) peak. There is a broad repressive domain of H3K27me3 downstream, which could have an open chromatin structure in another cell type.

**Figure 4. Flow chart for data analysis**
This example shows a workflow for ChIP-seq data analysis that a bench scientist can carry out using current resources. A similar strategy could be used for other types of next-generation sequencing (NGS) data. Blue boxes show steps that can be performed using Galaxy. Integration or cross-sectioning of data can often be done in the UCSC browser or by joining lists in Galaxy (purple box). Downstream steps such as known-motif analysis and gene ontology (GO) analysis can be achieved with online or stand-alone tools (red boxes). Galaxy can also be used to establish analytical pipelines for calling SNPs that could then be integrated with sequencing-based data such as ChIP-seq.
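The list-joining step mentioned in this caption can be illustrated outside any particular tool. The sketch below is a plain-Python analogue and makes no claim about how Galaxy implements its join; the function name `join_gene_lists` and the gene identifiers are hypothetical placeholders.

```python
def join_gene_lists(peak_genes, other_genes):
    """Return genes found in both lists, in the order of the first list, without duplicates."""
    other = set(other_genes)
    seen = set()
    shared = []
    for gene in peak_genes:
        if gene in other and gene not in seen:
            seen.add(gene)
            shared.append(gene)
    return shared


if __name__ == "__main__":
    # Placeholder names: genes near ChIP-seq peaks vs. genes from a second dataset.
    near_peaks = ["GENE_A", "GENE_B", "GENE_A", "GENE_C"]
    second_set = ["GENE_C", "GENE_A"]
    print(join_gene_lists(near_peaks, second_set))
```

Keeping the order of the first list makes it easy to carry along any ranking (for example by peak strength) that the input already had.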
