PLoS One. 2013 Sep 19;8(9):e75146. doi: 10.1371/journal.pone.0075146. eCollection 2013.

NuChart: an R package to study gene spatial neighbourhoods with multi-omics annotations

Ivan Merelli et al. PLoS One. 2013.

Abstract

Long-range chromosomal associations between genomic regions, and their repositioning in the 3D space of the nucleus, are now considered key contributors to the regulation of gene expression, and important links have been highlighted with other genomic features involved in DNA rearrangements. Recent Chromosome Conformation Capture (3C) measurements performed with high-throughput sequencing (Hi-C) and molecular dynamics studies show a strong correlation between colocalization and coregulation of genes, but this important research is hampered by the lack of biologist-friendly analysis and visualisation software. Here, we describe NuChart, an R package that allows the user to annotate and statistically analyse a list of input genes with information derived from Hi-C data, integrating knowledge about genomic features that are involved in chromosome spatial organization. NuChart works directly with sequenced reads to identify the related Hi-C fragments, with the aim of creating gene-centric neighbourhood graphs on which multi-omics features can be mapped. Predictions about CTCF binding sites, isochores and cryptic Recombination Signal Sequences are provided directly with the package for mapping, although other annotation data in bed format can be used (such as methylation profiles and histone patterns). Gene expression data can be automatically retrieved and processed from the Gene Expression Omnibus and ArrayExpress repositories to highlight the expression profile of genes in the identified neighbourhood. Moreover, statistical inferences about the graph structure and correlations between its topology and multi-omics features can be performed using Exponential-family Random Graph Models. The Hi-C fragment visualisation provided by NuChart allows the comparison of cells in different conditions, making it possible to identify novel biomarkers.
NuChart is compliant with the Bioconductor standard and it is freely available at ftp://fileserver.itb.cnr.it/nuchart.
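
The idea of a gene-centric neighbourhood graph can be illustrated with a minimal Python sketch. It is not NuChart's implementation or API (NuChart is an R package), and every name and value in it (`GENES`, `READ_PAIRS`, `gene_at`, `contact_graph`, `neighbourhood`) is hypothetical toy material. The sketch assumes that each paired-end read has already been mapped to a genomic position, and that two genes are linked whenever the two ends of a read fall inside their coordinates; fragment identification and normalization are left out.

```python
from collections import defaultdict, deque

# Hypothetical toy annotation: gene name -> (chromosome, start, end).
GENES = {
    "A": ("chr1", 100, 200),
    "B": ("chr1", 5000, 5200),
    "C": ("chr2", 300, 450),
    "D": ("chr3", 10, 90),
}

# Hypothetical paired-end contacts: ((chrom, pos), (chrom, pos)).
READ_PAIRS = [
    (("chr1", 150), ("chr1", 5100)),   # A - B
    (("chr1", 180), ("chr1", 5010)),   # A - B again
    (("chr1", 5050), ("chr2", 400)),   # B - C
    (("chr3", 50), ("chr3", 60)),      # both ends in D: ignored
    (("chr1", 9000), ("chr2", 350)),   # first end hits no gene: ignored
]


def gene_at(chrom, pos, genes):
    """Return the name of the gene containing the position, or None."""
    for name, (gene_chrom, start, end) in genes.items():
        if gene_chrom == chrom and start <= pos <= end:
            return name
    return None


def contact_graph(read_pairs, genes):
    """Build an undirected graph whose edge weights count supporting reads."""
    adjacency = defaultdict(lambda: defaultdict(int))
    for (chrom1, pos1), (chrom2, pos2) in read_pairs:
        gene1 = gene_at(chrom1, pos1, genes)
        gene2 = gene_at(chrom2, pos2, genes)
        if gene1 is None or gene2 is None or gene1 == gene2:
            continue
        adjacency[gene1][gene2] += 1
        adjacency[gene2][gene1] += 1
    return adjacency


def neighbourhood(seed, adjacency, max_level):
    """Breadth-first expansion from the seed gene up to max_level steps."""
    levels = {seed: 0}
    queue = deque([seed])
    while queue:
        gene = queue.popleft()
        if levels[gene] == max_level:
            continue  # do not expand beyond the requested level
        for neighbour in adjacency.get(gene, {}):
            if neighbour not in levels:
                levels[neighbour] = levels[gene] + 1
                queue.append(neighbour)
    return levels


graph = contact_graph(READ_PAIRS, GENES)
print(neighbourhood("A", graph, 2))
```

In this toy setting, annotation tracks such as bed-format intervals could be attached to the nodes with the same kind of coordinate overlap test used in `gene_at`.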

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. Neighbourhood graph of the gene LMO2 according to the Lieberman-Aiden et al. Hi-C experiment.
Figure 2. Representation of the OCT4 (official name POU5F1) neighbourhood graphs according to the Dixon et al. experiments with multi-omics annotations. Panel a) represents CTCF binding sites mapping, panel b) cryptic RSSs mapping, panel c) isochores mapping, and panel d) DNase hypersensitive sites mapping.
Figure 3. Normalization of chromosome 17 Hi-C data according to the Lieberman-Aiden et al. experiment. Panel a) shows the Hu et al. normalization, while panel b) shows the read-based normalization performed with NuChart (threshold 0.9), illustrating its reproducibility with respect to the Hu et al. approach. Panel c) represents the NuChart read-based normalization performed using a more restrictive threshold (threshold 0.99).
Figure 4. Neighbourhood graph of the gene BRCA1 according to the Lieberman-Aiden et al. experiment. Gene expression data from the colon cancer experiment GDS3160 have been mapped on the graph to show the enhanced descriptive (and predictive) power of the graph representation for gene co-expression, compared with the approach relying on genomic coordinates.
Figure 5. Neighbourhood graph of the 17q21.32 cytoband concerning the Homeobox B cluster (HOXB) of genes according to the Lieberman-Aiden et al. Hi-C experiment.
Figure 6. Representation of the OCT4 (official name POU5F1) neighbourhood graphs in four different runs from the Hi-C experiments of Dixon et al. to show inter- and intra-run modifications. In panels a) and b), in the top part of the figure, the sequencing runs are from human embryonic stem cells (hESC), while in panels c) and d) they are from human foetal lung fibroblasts (IMR-90).
Figure 7. Goodness-of-fit diagnostic charts for three topological features of the stochastic estimator for the HOXB cluster of genes neighbourhood graph according to the Lieberman-Aiden et al. Hi-C experiment. The thick black line represents the real data for the analysed graph, while the boxplots show the statistical properties of the estimator obtained through stochastic simulations. The top panel shows the analysis of the estimated model in relation to the degree distribution of the HOXB neighbourhood graph; the central panel, in relation to the weighted edge-wise shared partner statistic; the bottom panel, in relation to the minimum geodesic distance of the HOXB neighbourhood graph.
Figure 8. Simulation details (left) and related statistics (right) of the stochastic analysis of the two-component estimator for the graph describing the HOXB cluster of genes according to the Lieberman-Aiden et al. Hi-C experiment: in the top panel, the topological distribution (edges) component; in the bottom panel, the degree distribution (degree) component.
