PLoS Comput Biol. 2007 Sep;3(9):1807-18.
doi: 10.1371/journal.pcbi.0030181.

Organization and evolution of primate centromeric DNA from whole-genome shotgun sequence data

Can Alkan et al. PLoS Comput Biol. 2007 Sep.

Abstract

The major DNA constituent of primate centromeres is alpha satellite DNA. As much as 2%-5% of the sequence generated by primate genome sequencing projects consists of this material, which is fragmented or left unassembled in published genome sequences because of its highly repetitive nature. Here, we develop computational methods to rapidly recover and categorize alpha-satellite sequences from previously uncharacterized whole-genome shotgun sequence data. We present an algorithm that predicts potential higher-order array structure from paired-end sequence data, and we then validate the predicted organization and distribution experimentally. Using whole-genome shotgun data from the human, chimpanzee, and macaque genomes, we examine the phylogenetic relationship of these sequences and provide further support for a model of their evolution and mutation over the last 25 million years. Our results confirm fundamental differences in the dispersal and evolution of centromeric satellites between the Old World monkey and ape lineages.

Conflict of interest statement

Competing interests. The authors have declared that no competing interests exist.

Figures

Figure 1. Composition of Human Centromeric DNA
(A–B) Represented are ∼171-bp monomers: (A) in HOR; (B) in monomeric tracks. The divergence of the higher-order monomers marked with the same subscript is less than 2%.
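The 2% divergence criterion in the caption can be illustrated with a minimal sketch. It assumes the two monomers are already aligned to the same length with no gaps, and it uses a simple mismatch fraction; the function names and the `threshold` parameter are hypothetical and are not taken from the paper.

```python
def percent_divergence(a: str, b: str) -> float:
    """Percentage of mismatched positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must be non-empty")
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return 100.0 * mismatches / len(a)


def same_hor_position(a: str, b: str, threshold: float = 2.0) -> bool:
    """True when two monomers diverge by less than `threshold` percent.

    The 2% default mirrors the figure caption; treating it as a hard
    cutoff is an assumption made for this illustration.
    """
    return percent_divergence(a, b) < threshold


# Toy 100-bp sequences (not real alpha-satellite monomers).
reference = "ACGT" * 25
one_change = "T" + reference[1:]       # 1 mismatch in 100 positions
three_changes = "TTT" + reference[3:]  # 3 mismatches in 100 positions

print(percent_divergence(reference, one_change))
print(same_hor_position(reference, three_changes))
```

A real comparison of ∼171-bp monomers would need a pairwise alignment step first, since insertions and deletions shift positions.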
Figure 2. Flowchart of the HORdetect Algorithm
Given a WGS sequence library, we first extracted alpha-satellite monomers from WGS sequence reads; performed hierarchical clustering to group highly similar monomers; encoded each WGS read with a unique cluster ID; and merged similar pattern sets. WGS sequences with the same encoded pattern set are assembled via PHRAP. The corresponding sequence (contig) is analyzed (paired-end and adjacency analysis).
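The encoding and grouping steps of the flowchart can be sketched as follows. This is an illustration under stated assumptions, not the HORdetect implementation: monomer extraction and hierarchical clustering are taken as already done (the `cluster_of` mapping is hypothetical), only reads with identical pattern sets are grouped rather than merging similar ones, and the PHRAP assembly step is omitted.

```python
from collections import defaultdict


def encode_read(monomers, cluster_of):
    """Replace each monomer in a read by the ID of the cluster it belongs to."""
    return tuple(cluster_of[m] for m in monomers)


def group_by_pattern_set(encoded_reads):
    """Group read IDs whose encodings use exactly the same set of cluster IDs."""
    groups = defaultdict(list)
    for read_id, encoding in encoded_reads.items():
        groups[frozenset(encoding)].append(read_id)
    return dict(groups)


def smallest_period(encoding):
    """Smallest shift p such that the encoding repeats itself every p monomers.

    Returns the full length when no shorter period exists.
    """
    n = len(encoding)
    for p in range(1, n):
        if all(encoding[i] == encoding[i + p] for i in range(n - p)):
            return p
    return n


# Hypothetical clustering result: monomer name -> cluster ID.
cluster_of = {"m1": "A", "m2": "B", "m3": "C"}

# Hypothetical reads, each listed as its monomers in order.
reads = {
    "r1": ["m1", "m2", "m3", "m1", "m2"],
    "r2": ["m2", "m3", "m1", "m2", "m3"],
    "r3": ["m1", "m1"],
}

encoded = {rid: encode_read(ms, cluster_of) for rid, ms in reads.items()}
groups = group_by_pattern_set(encoded)
```

Here `r1` and `r2` share the pattern set {A, B, C} and repeat with period 3, which is what a trimeric higher-order repeat would look like after encoding; `r3` falls into a separate group.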
Figure 3. Novel HORs in Human
(A) Paired-end sequence confirmation. Mate-pairs corresponding to a previously undescribed human pattern set (predicted 8-mer higher-order array) are shown. Black lines represent the left and right end sequences of each insert mapping to the same repeating encoded pattern set (red bars); dashed lines correspond to the unsequenced portion of the insert (40 kb in this case). The majority of end-sequence pairs map to the same repeat, confirming long-range tandem repeat organization. (B) Adjacency statistics for the new human higher-order array. The adjacency statistic counts the number of times a specified monomer within a WGS sequence read maps adjacent to another specified monomer within the predicted HOR unit. The table shows data for a new human HOR sequence on Chromosomes 14 and 22. (C) FISH mapping (fosmid probe 3219L12) of the predicted human higher-order array against a metaphase spread of human chromosomes shows signals specific to the centromeres of Chromosomes 14 and 22. Multiple clones from this encoded pattern set (3343N10, 3361F03, 3355D04, and 3355D08) showed identical results.
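The adjacency count described in panel (B) can be sketched in a few lines. The sketch assumes each read has already been encoded as an ordered sequence of monomer cluster IDs; the function name and the toy reads are hypothetical and the data are not from the paper.

```python
from collections import Counter


def adjacency_counts(encoded_reads):
    """Count how often each monomer ID is immediately followed by another within a read."""
    counts = Counter()
    for encoding in encoded_reads:
        # Pair every monomer with its right-hand neighbour in the same read.
        for left, right in zip(encoding, encoding[1:]):
            counts[(left, right)] += 1
    return counts


# Two hypothetical reads drawn from a trimeric repeat A-B-C.
toy_reads = [
    ("A", "B", "C", "A"),
    ("B", "C", "A", "B"),
]
counts = adjacency_counts(toy_reads)
```

For a true higher-order repeat, the counts concentrate on the pairs that follow the repeat order (here A→B, B→C, C→A), while pairs that break the order, such as A→C, stay at or near zero.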
Figure 4. Examples of Restriction Enzymatic Digestion on Primate Fosmid Clones Containing HOR Alpha-Satellite DNA
Partial and complete digestion of the chimpanzee fosmid clone CH1251-783f21 (HOR3; columns 2 and 3, respectively), chimpanzee fosmid clone CH1251-518E17 (HOR4; columns 4 and 5, respectively), and macaque fosmid clone MQAD-1143O3 (macaque-HOR; columns 6 and 7, respectively). Partial digests confirm HOR structure, while nearly complete digests confirm the expected size of the predominant repeat units within the array. Columns 1 and 8: log-2 DNA ladder and 1-kb ladder markers, respectively.
Figure 5. Nonhuman Primate Alpha-Satellite FISH
Chimpanzee fosmid probes (A) CH1251-2018k17 and (B) CH1251-1027N15 containing putative HOR alpha-satellite repeats showed specific centromeric and pericentromeric signals when hybridized to chimpanzee chromosomes. (C) Baboon probe (RPCI-100L5) and (D) macaque BAC (CHORI250-102K3) show a pancentromeric distribution when tested against metaphases from the corresponding species. Similar results were obtained for all putative HORs identified from the macaque (unpublished data). All the reported FISH experiments were performed at high stringency: three washes with 0.1× SSC at 60 °C.
Figure 6. Primate Phylogenetic Analyses of Alpha-Satellite Sequences
Neighbor-joining methods were used to construct (A) a phylogenetic tree of human monomeric alpha-satellite sequences from Chromosome 8 (blue); putative HOR sequences from human (red), chimp (cyan), and gibbon (gray); and random samples from macaque (yellow) and baboon (green); and (B) a phylogenetic tree comparing all human HORs versus macaque HOR sequences identified in this study; and (C) a phylogenetic tree comparing randomly ascertained alpha-satellite monomers from four different Old World monkey species. New World monkey alpha-satellite sequences (dark green) are included as an outgroup in these analyses. Bootstrap values (n = 100 replicates) greater than 75 are indicated on the branches.
