Cell. 2012 May 11;149(4):912-22.
doi: 10.1016/j.cell.2012.03.033. Epub 2012 May 3.

Evolution of human-specific neural SRGAP2 genes by incomplete segmental duplication

Megan Y Dennis et al. Cell. 2012.

Abstract

Gene duplication is an important source of phenotypic change and adaptive evolution. We leverage a haploid hydatidiform mole to identify highly identical sequences missing from the reference genome, confirming that the cortical development gene Slit-Robo Rho GTPase-activating protein 2 (SRGAP2) duplicated three times exclusively in humans. We show that the promoter and first nine exons of SRGAP2 duplicated from 1q32.1 (SRGAP2A) to 1q21.1 (SRGAP2B) ∼3.4 million years ago (mya). Two larger duplications later copied SRGAP2B to chromosome 1p12 (SRGAP2C) and to proximal 1q21.1 (SRGAP2D) ∼2.4 and ∼1 mya, respectively. Sequence and expression analyses show that SRGAP2C is the most likely duplicate to encode a functional protein and is among the most fixed human-specific duplicate genes. Our data suggest a mechanism where incomplete duplication created a novel gene function, antagonizing parental SRGAP2 function, immediately "at birth" 2–3 mya, a time corresponding to the transition from Australopithecus to Homo and the beginning of neocortex expansion.

Conflict of interest statement

J.A.R. and L.S. are employees of Signature Genomic Laboratories, a subsidiary of PerkinElmer, Inc. E.E.E. is on the scientific advisory boards for Pacific Biosciences, Inc. and SynapDx Corp.

Figures

Figure 1. Genomic characterization and sequence resolution of SRGAP2 loci
(A) FISH analysis shows three distinct copies of SRGAP2 on metaphase human chromosome 1, compared to a single copy in chimpanzee and orangutan (see Figure 2A for location of FISH probe; Figure S1 and Table S1 for details of additional FISH assays). (B) SRGAP2 genomic loci were sequenced and assembled using a BAC library (CHORI-17) created from human haploid genomic source material (complete hydatidiform mole). The absence of allelic variation allowed paralogous sequences to be resolved with high confidence based on near-perfect sequence identity overlap (>99.9%). (C) Regions highly identical to the reference genome (GRCh37/hg19) are colored in red (identity = 99.8–100%) and orange (99.6–99.8%), while regions completely absent from the current assembly are shaded gray (with region sizes indicated). Arrows show the orientation of the reference genome sequence with respect to the contigs (e.g., a left directional arrow indicates the reverse strand) indicating that even the ancestral (SRGAP2A) gene locus was missing sequence data, misassembled, and incorrectly orientated over 400 kbp of the current high-quality reference assembly. Genomic coordinates correspond to the representative human reference region with corresponding genes within these regions mapped along each contig.
Figure 2. Evolutionary characterization of SRGAP2 duplications
(A) A depiction of the gene structure of SRGAP2 with respect to the three assembled contigs. Homologous segments are shown using Miropeats (Parsons, 1995) where green lines indicate nearly identical segments (s = 1,000) shared between SRGAP2A and the duplicate SRGAP2 paralogs; blue lines delineate the larger (>515 kbp) extent of homology between SRGAP2B and SRGAP2C. The 244.2 kbp genomic region shared among all three contigs is highlighted (red box) with clusters of Alu repeats at the breakpoints (arrows). Also see Figure S2 for detailed representation of Alu elements and segmental duplications across duplicated regions. (B) An unrooted neighbor-joining tree was constructed based on a 244.2 kbp multiple sequence alignment of the three loci. Both 1p12 and 1q21.1 branches show accelerated rates of substitution (p = 0.00001 and p = 0.0249; Tajima’s relative rate test). The actual (no parentheses) and adjusted (parentheses) number of substitutions for locus-specific acceleration is indicated above each branch along with the bootstrap support at each node. We estimate the timing assuming chimpanzee and human diverged 6 mya. Also see Table S2 for molecular evolution of the shared SRGAP2 coding regions. (C) FISH experiments on metaphase human chromosome 1, as well as the orthologous chimpanzee and orangutan chromosomes, were performed to discern the order of duplication events. Locations of probes with respect to the contigs are shown in part (A). A probe (yellow) targeting sequence adjacent to the original SRGAP2 duplicate region hybridizes to 1q21.1 in chimpanzee and orangutan, suggesting the original SRGAP2 duplicate paralog maps to the region homologous with nonhuman primate 1q21.1. 
A probe (green) targeting unique sequence on the p-arm of chromosome 1 proximal to SRGAP2C hybridizes to the chromosome 1p-arm in orangutan, refuting the possibility that SRGAP2C moved to the p-arm via a simple pericentromeric inversion (Szamalek et al., 2006) and distinguishing the p-arm from the genomic region at 1q21.1 where the original SRGAP2 duplicate paralog maps. A probe (blue) was used to distinguish the chromosome 1q-arm.
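The relative rate test cited for panel (B) can be illustrated with a minimal Python sketch. It assumes the standard form of Tajima's test: for two ingroup sequences and an outgroup, count the sites where only the first ingroup sequence differs (m1) and the sites where only the second differs (m2), then compare (m1 - m2)^2 / (m1 + m2) to a chi-square distribution with one degree of freedom. The function name and toy sequences are hypothetical, and this is not the authors' analysis pipeline.

```python
import math


def tajima_relative_rate(seq1, seq2, outgroup):
    """Tajima's relative rate test on three aligned sequences.

    Returns (m1, m2, chi2, p), where m1 and m2 count sites at which only
    seq1 or only seq2 differs from the other two sequences.
    """
    if not (len(seq1) == len(seq2) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")

    m1 = m2 = 0
    for a, b, o in zip(seq1.upper(), seq2.upper(), outgroup.upper()):
        # Skip alignment gaps and ambiguous bases (an assumption of this sketch).
        if "-" in (a, b, o) or "N" in (a, b, o):
            continue
        if a != b and b == o:
            m1 += 1  # change unique to the seq1 lineage
        elif a != b and a == o:
            m2 += 1  # change unique to the seq2 lineage

    if m1 + m2 == 0:
        return m1, m2, 0.0, 1.0

    chi2 = (m1 - m2) ** 2 / (m1 + m2)
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return m1, m2, chi2, p


if __name__ == "__main__":
    # Toy alignment, not SRGAP2 data.
    print(tajima_relative_rate("CCCCAAAA", "AAAAAAAA", "AAAAAAAA"))
```

A small p-value indicates that one lineage has accumulated substitutions faster than the other relative to the outgroup, which is the sense in which the caption describes the 1p12 and 1q21.1 branches as accelerated.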
Figure 3. Paralog-specific SRGAP2 gene expression
(A) Long-range RT-PCR products from pooled fetal brain RNA are shown next to the gene models. A single band was amplified from the ancestral paralog, while three bands were amplified from duplicate paralogs using primers designed to target alternative isoforms. 96 cDNA transcripts were cloned and sequenced. (B) Fixed paralog-specific variants were used to assign transcripts to respective genomic loci allowing both polymorphic and fixed putative amino acid changes to be deduced. Exonic sequence specific to the ancestral copy (SRGAP2A; green) and the duplicate loci (SRGAP2B/C/D; purple) are shown. The locations of stop codons encoded by isoforms missing exons are represented with an “x”. Exons missing from transcripts are indicated (diagonal lines) and likely correspond to the genomic deletion within SRGAP2D in the case of the exon 2–3 deleted isoform. (C) Paralog-specific expression profiling was performed using RNA-Seq data mapped to unique sequence identifiers. The specificity of next-generation sequence data and the determination of fixed single base-pair difference between the copies was necessary to tease apart the expression profiles of these virtually identical copies. Chimpanzee and macaque RNA-Seq data affirm the specificity of this assay. Also see Figure S3 and Table S3 for additional expression results.
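The assignment of transcripts to loci by fixed paralog-specific variants, described in panel (B), can be sketched as follows. The variant table, positions, and function name are hypothetical placeholders rather than real SRGAP2 differences; the sketch only shows the idea that a transcript is assigned to a paralog when every informative position it covers matches exactly one paralog.

```python
# Hypothetical fixed differences: alignment position -> base carried by each paralog.
# These positions and bases are illustrative only, not real SRGAP2 variants.
FIXED_VARIANTS = {
    10: {"SRGAP2A": "A", "SRGAP2B": "G", "SRGAP2C": "G"},
    25: {"SRGAP2A": "C", "SRGAP2B": "C", "SRGAP2C": "T"},
    40: {"SRGAP2A": "T", "SRGAP2B": "G", "SRGAP2C": "T"},
}


def assign_paralog(observed, variants=FIXED_VARIANTS):
    """Assign a transcript to a paralog from its bases at diagnostic positions.

    `observed` maps alignment position -> base seen in the transcript.
    Returns the paralog name, or None if no single paralog matches every
    covered diagnostic position.
    """
    matches = {}
    informative = 0
    for pos, by_paralog in variants.items():
        base = observed.get(pos)
        if base is None:
            continue  # transcript does not cover this diagnostic position
        informative += 1
        for paralog, expected in by_paralog.items():
            if expected == base:
                matches[paralog] = matches.get(paralog, 0) + 1

    # Keep only paralogs consistent with all covered positions.
    consistent = [p for p, n in matches.items() if n == informative]
    if informative > 0 and len(consistent) == 1:
        return consistent[0]
    return None


if __name__ == "__main__":
    print(assign_paralog({10: "G", 25: "T", 40: "T"}))
    print(assign_paralog({10: "G"}))
```

A transcript covering only one diagnostic position can remain ambiguous, so this sketch returns `None` rather than guessing. The same logic applies to short RNA-Seq reads, which typically cover fewer diagnostic positions than cloned cDNAs.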
Figure 4. SRGAP2 copy-number diversity in human populations
(A) Diploid copy-number estimates of SRGAP2 paralogs for 661 sequenced human genomes from 14 distinct populations (1000 Genomes Project) and from nonhuman primates. (B) SRGAP2A and SRGAP2C paralogs clearly are fixed at a copy number of 2, while SRGAP2B is polymorphic showing four distinct copy-number states. Note, we also detect polymorphism for SRGAP2D and have identified individuals homozygously deleted for this paralog. (C) FISH validation of three HapMap individuals genotyped for SRGAP2B [circled in red in part (A)]. All samples falling at the lower and upper tails of copy-number distributions for all three paralogs were experimentally genotyped using a paralog-specific qPCR assay; in all cases, SRGAP2A and SRGAP2C were validated as diploid copy number 2. Also refer to Figure S5.
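As a rough illustration of how a diploid copy-number estimate can be derived from sequencing data, the sketch below scales read depth at paralog-specific positions by read depth over control regions assumed to be present in two copies. This is a generic read-depth calculation under stated assumptions, not the estimator used for the figure; the function names and depth values are hypothetical.

```python
import math


def estimate_copy_number(paralog_depth, control_depth, control_copies=2):
    """Estimate diploid copy number from mean read depths.

    paralog_depth: mean depth over paralog-specific positions.
    control_depth: mean depth over regions assumed to have `control_copies` copies.
    """
    if control_depth <= 0:
        raise ValueError("control_depth must be positive")
    if paralog_depth < 0:
        raise ValueError("paralog_depth cannot be negative")
    return control_copies * paralog_depth / control_depth


def call_genotype(estimate):
    """Round a continuous estimate to the nearest integer copy-number state."""
    return math.floor(estimate + 0.5)


if __name__ == "__main__":
    # Hypothetical depths for one individual.
    for name, depth in [("SRGAP2A", 30.5), ("SRGAP2B", 61.0), ("SRGAP2D", 0.0)]:
        cn = estimate_copy_number(depth, control_depth=30.0)
        print(name, round(cn, 2), call_genotype(cn))
```

Estimates that fall midway between integer states are the natural candidates for experimental follow-up, which matches the caption's note that samples at the tails of the distributions were genotyped by qPCR.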
Figure 5. Model for SRGAP2 evolution
Schematic depicts location and orientation (blue triangles) of SRGAP2 paralogs on human chromosome 1 with putative protein products indicated above each based on cDNA sequencing. Arrows trace the evolutionary history of SRGAP2 duplication events. Copy-number polymorphism and expression analyses suggest both paralogs at 1q21.1 (SRGAP2B and SRGAP2D) are pseudogenes, whereas the 1q32.1 (SRGAP2A) and 1p12 (SRGAP2C) paralogs are likely to encode functional proteins.
