PLoS Comput Biol. 2013;9(8):e1003118.
doi: 10.1371/journal.pcbi.1003118. Epub 2013 Aug 8.

Software for computing and annotating genomic ranges

Michael Lawrence et al. PLoS Comput Biol. 2013.

Abstract

We describe Bioconductor infrastructure for representing and computing on annotated genomic ranges and integrating genomic data with the statistical computing features of R and its extensions. At the core of the infrastructure are three packages: IRanges, GenomicRanges, and GenomicFeatures. These packages provide scalable data structures for representing annotated ranges on the genome, with special support for transcript structures, read alignments and coverage vectors. Computational facilities include efficient algorithms for overlap and nearest neighbor detection, coverage calculation and other range operations. This infrastructure directly supports more than 80 other Bioconductor packages, including those for sequence analysis, differential expression analysis and visualization.
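
To make the idea of a coverage vector concrete, the following is a minimal sketch in plain Python, not the Bioconductor implementation. It assumes ranges are 1-based with inclusive ends and that every range fits within the sequence length; the function name `coverage` and the sample reads are hypothetical, chosen only for illustration.

```python
def coverage(ranges, length):
    """Return per-position depth for 1-based, inclusive (start, end) ranges.

    Assumes 1 <= start <= end <= length for every range.
    """
    # Difference array: +1 where a range begins, -1 just past where it ends.
    diff = [0] * (length + 2)
    for start, end in ranges:
        diff[start] += 1
        diff[end + 1] -= 1

    # A running sum of the difference array gives the depth at each position.
    depth = 0
    result = []
    for pos in range(1, length + 1):
        depth += diff[pos]
        result.append(depth)
    return result


# Hypothetical reads on a sequence of length 8.
reads = [(1, 4), (3, 6), (5, 5)]
cov = coverage(reads, 8)
```

The difference-array approach touches each range once and each position once, so its cost grows with the number of ranges plus the sequence length rather than with their product.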

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. Tabular (top) and visual (bottom) representation of the exons for the human KRAS gene, derived from the UCSC known gene annotation.
In the table, the columns seqnames, start and end locate the exons in the genome. The strand column indicates the direction of transcription. The exons are grouped into transcripts by tx_id, and the exon IDs are given by exon_id. Virtually all genomic data sets fit this pattern: genomic location, followed by a series of columns, often including strand and/or score, that annotate that location. In the plot, the rectangles represent exonic regions, and the arrows represent the introns, as well as the strand.
Figure 2. Illustration of the reduce and disjoin operations on the last exon from each of the KRAS transcripts.
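
The two operations named in this caption can be sketched in plain Python as follows. This is an illustrative sketch, not the packages' own code: it assumes 1-based ranges with inclusive ends, treats adjacent ranges as mergeable in `reduce_ranges`, and uses hypothetical function names and sample coordinates rather than the actual KRAS exons.

```python
def reduce_ranges(ranges):
    """Merge overlapping or adjacent ranges into the smallest covering set."""
    merged = []
    for start, end in sorted(ranges):
        # Extend the previous range if this one overlaps or directly abuts it.
        if merged and start <= merged[-1][1] + 1:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(r) for r in merged]


def disjoin_ranges(ranges):
    """Split the covered region at every range boundary into disjoint pieces."""
    # Breakpoints: every start, and the position just past every end.
    cuts = sorted({s for s, _ in ranges} | {e + 1 for _, e in ranges})
    pieces = []
    for left, right in zip(cuts, cuts[1:]):
        piece = (left, right - 1)
        # Keep the piece only if some input range contains it (skips gaps).
        if any(s <= piece[0] and piece[1] <= e for s, e in ranges):
            pieces.append(piece)
    return pieces


# Hypothetical ranges, not real exon coordinates.
example = [(1, 4), (3, 6), (10, 12)]
reduced = reduce_ranges(example)
disjoint = disjoin_ranges(example)
```

In this sketch, `reduce_ranges` answers "which regions are covered at all", while `disjoin_ranges` keeps every boundary, so each output piece is covered by the same subset of input ranges along its whole length.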
Figure 3. Illustration of overlap (top) and adjacency (bottom) relationships.
The any mode detects hits with partial or complete overlap, while within requires that the query range represents a subregion of the subject range.
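
The difference between the any and within modes can be sketched in plain Python. This is a brute-force illustration under stated assumptions (1-based inclusive ranges, 0-based hit indices, hypothetical function names `overlaps` and `find_hits`); it compares every pair and is not the efficient algorithm the packages provide.

```python
def overlaps(query, subject, mode="any"):
    """Test one query range against one subject range."""
    q_start, q_end = query
    s_start, s_end = subject
    if mode == "any":
        # Partial or complete overlap: the ranges share at least one position.
        return q_start <= s_end and s_start <= q_end
    if mode == "within":
        # The query must be a subregion of the subject.
        return s_start <= q_start and q_end <= s_end
    raise ValueError(f"unknown mode: {mode!r}")


def find_hits(queries, subjects, mode="any"):
    """Return (query_index, subject_index) pairs that satisfy the mode."""
    return [
        (i, j)
        for i, q in enumerate(queries)
        for j, s in enumerate(subjects)
        if overlaps(q, s, mode)
    ]


# Hypothetical ranges for illustration.
queries = [(5, 10), (1, 3)]
subjects = [(4, 12), (8, 20)]
any_hits = find_hits(queries, subjects, mode="any")
within_hits = find_hits(queries, subjects, mode="within")
```

Every within hit is also an any hit, so switching from any to within can only shrink the result.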
Figure 4. Illustration of overlap computations between two GRangesList objects.
Each set of rectangles linked by solid lines represents a compound range, i.e., an element of the list. Ranges in the query (top) are being matched against ranges in the subject (bottom). The labels between them indicate the type of overlap (any, within, none).
Figure 5. Visualization of the coverage of bases by GFP- and CTCF-bound fragments (top) in the context of part of the gene model for Rrp1, Entrez gene 18114 (bottom).
Figure 6. Top panels: distributions of alternate nucleotide proportions for on- and off-SNP allele-dependent CTCF binding events. Bottom panels: relationships between average call quality values and alternate nucleotide proportions, depicted using a 2D density estimate (darker regions correspond to higher density).
