Nucleic Acids Res. 2018 Feb 28;46(4):1661-1673.
doi: 10.1093/nar/gkx1266.

A survey of localized sequence rearrangements in human DNA

Martin C Frith et al. Nucleic Acids Res. 2018.

Abstract

Genomes mutate and evolve in ways simple (substitution or deletion of bases) and complex (e.g. chromosome shattering). We do not fully understand what types of complex mutation occur, and we cannot routinely characterize arbitrarily-complex mutations in a high-throughput, genome-wide manner. Long-read DNA sequencing methods (e.g. PacBio, nanopore) are promising for this task, because one read may encompass a whole complex mutation. We describe an analysis pipeline to characterize arbitrarily-complex 'local' mutations, i.e. intrachromosomal mutations encompassed by one DNA read. We apply it to nanopore and PacBio reads from one human cell line (NA12878), and survey sequence rearrangements, both real and artifactual. Almost all the real rearrangements belong to recurring patterns or motifs: the most common is tandem multiplication (e.g. heptuplication), but there are also complex patterns such as localized shattering, which resembles DNA damage by radiation. Gene conversions are identified, including one between hemoglobin gamma genes. This study demonstrates a way to find intricate rearrangements with any number of duplications, deletions, and repositionings. It demonstrates a probability-based method to resolve ambiguous rearrangements involving highly similar sequences, as occurs in gene conversion. We present a catalog of local rearrangements in one human cell line, and show which rearrangement patterns occur.
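As an illustration of the most common pattern named above, tandem multiplication, the following is a minimal sketch of how copy number might be estimated from the aligned pieces of a single read. It is not the authors' pipeline: the `Segment` structure, its field names, and the function `count_tandem_copies` are hypothetical, coordinates are assumed to be half-open, and only forward-strand segments are considered.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Segment:
    """One aligned piece of a read (hypothetical structure, half-open coordinates)."""
    read_start: int
    read_end: int
    ref_start: int
    ref_end: int
    strand: str  # "+" for same-strand, "-" for opposite-strand


def count_tandem_copies(segments: List[Segment]) -> int:
    """Estimate how many tandem copies of a reference region one read contains.

    Segments are walked in read order. Each time a forward-strand segment
    starts before the previous segment ended in the reference, while still
    overlapping the previous segment's reference span, one extra copy is counted.
    """
    ordered = sorted(segments, key=lambda s: s.read_start)
    copies = 1
    for prev, cur in zip(ordered, ordered[1:]):
        forward = prev.strand == "+" and cur.strand == "+"
        jumps_back = cur.ref_start < prev.ref_end   # re-reads reference bases
        overlaps = cur.ref_end > prev.ref_start     # stays in the same locus
        if forward and jumps_back and overlaps:
            copies += 1
    return copies


# A read that covers reference 100-300, then 200-300 again, then 200-400:
# the region 200-300 appears three times in tandem.
triplication = [
    Segment(0, 200, 100, 300, "+"),
    Segment(200, 300, 200, 300, "+"),
    Segment(300, 500, 200, 400, "+"),
]
print(count_tandem_copies(triplication))
```

A collinear read, or one with a deletion, yields a count of 1 under this rule. Inversions and the probability-based resolution of highly similar sequences described in the abstract are outside the scope of this sketch.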

Figures

Figure 1.
Left: sketch of sequence evolution by complex mutations. (A) An ancestral sequence; (B) and (C) are derived. Same-color blocks (e.g. 3 and 4 in A) indicate similar sequences. The solid red block is ‘spontaneously generated’, i.e. not descended from any ancestral sequence. Right: alignment between A and B. Dashed lines indicate: duplication (vertical orange), deletion (vertical teal), and spontaneous generation (horizontal red).
Figure 2.
Orphan rearrangements. (A–D) R9.4 tandem duplications (TDs). (E–H) R9.4 non-TD rearrangements. (I–L) P5-C3 tandem duplications. (M–P) P5-C3 non-TD rearrangements. Diagonal lines indicate alignments between a segment of a DNA read (vertical), and a segment of the reference human genome (horizontal). Red lines indicate same-strand alignments; blue lines indicate opposite-strand alignments. The vertical stripes indicate features in the reference genome; pink: forward-strand transposable element, blue: reverse-strand transposable element, purple: low-complexity or tandem repeat, green: exon.
Figure 3.
Examples of tandem multiplication. Please see the description of Figure 2. Here, each of (A), (B) and (C) shows two R9.4 DNA reads, one above the other.
Figure 4.
Examples of recurring rearrangement patterns. Diagonal lines indicate alignments between a segment of a DNA read (vertical), and a segment of the reference human genome (horizontal). Red lines indicate same-strand alignments; blue lines indicate opposite-strand alignments. The vertical stripes indicate features in the reference genome; pink: forward-strand transposable element, blue: reverse-strand transposable element, purple: low-complexity or tandem repeat, green: exon, dark green: protein-coding sequence. Some of the alignments (diagonal lines) are tiny: it may help to view this on a screen and zoom in.
Figure 5.
A segment of a DNA read aligned to HBG1 and HBG2. It is likely that this part of the read is paralogous to HBG1, and should rather be aligned to HBG2.
Figure 6.
Unique rearrangements. Please see the description of Figure 4.
Figure 7.
Lengths of rearrangements in the R9.4 dataset.
Figure 8.
Number of R9.4 rearrangements that overlap variants in DGV.

