Am J Hum Genet. 2018 Sep 6;103(3):421-430.
doi: 10.1016/j.ajhg.2018.07.011. Epub 2018 Aug 9.

Characterization of a Human-Specific Tandem Repeat Associated with Bipolar Disorder and Schizophrenia

Janet H T Song et al. Am J Hum Genet. 2018.

Abstract

Bipolar disorder (BD) and schizophrenia (SCZ) are highly heritable diseases that affect more than 3% of individuals worldwide. Genome-wide association studies have strongly and repeatedly linked risk for both of these neuropsychiatric diseases to a 100 kb interval in the third intron of the human calcium channel gene CACNA1C. However, the causative mutation is not yet known. We have identified a human-specific tandem repeat in this region that is composed of 30 bp units, often repeated hundreds of times. This large tandem repeat is unstable using standard polymerase chain reaction and bacterial cloning techniques, which may have resulted in its incorrect size in the human reference genome. The large 30-mer repeat region is polymorphic in both size and sequence in human populations. Particular sequence variants of the 30-mer are associated with risk status at several flanking single-nucleotide polymorphisms in the third intron of CACNA1C that have previously been linked to BD and SCZ. The tandem repeat arrays function as enhancers that increase reporter gene expression in a human neural progenitor cell line. Different human arrays vary in the magnitude of enhancer activity, and the 30-mer arrays associated with increased psychiatric disease risk status have decreased enhancer activity. Changes in the structure and sequence of these arrays likely contribute to changes in CACNA1C function during human evolution and may modulate neuropsychiatric disease risk in modern human populations.

Keywords: CACNA1C; GWAS; bipolar disorder; chimpanzee; copy-number variation; human evolution; minisatellite; psychiatric disease; schizophrenia; tandem repeat.

Figures

Figure 1
Human-Specific Tandem Repeat Region Is Composed of 30-mer Sequence Units Repeated Head-to-Tail in Multi-kilobase Arrays
(A) The tandem repeat is located in the third intron of CACNA1C. The human reference assembly predicts ten copies of the 30 bp segment while chimpanzees and other simians have a single copy of the 30 bp segment. More distantly related placental mammals, out to Afrotheria, have a region that aligns to the 30 bp segment, but with insertions or deletions. There is an abnormally large number of genomic DNA sequencing reads mapping to the tandem repeat region, consistent with this repeat being further expanded in human individuals. The repeat region also shows enrichment for p300/CBP binding and DNase I hypersensitivity in the developing human brain.
(B) We performed Southern blot analysis on 18 human individuals by probing for the 30 bp repeat after digesting with BlpI. We also included two controls: mouse DNA (no orthologous sequence) and the 8 kb vector from which the probe was transcribed. The human reference genome predicts a BlpI fragment of approximately 900 bp. In contrast, all humans tested show much larger BlpI fragment sizes (4,000 to 35,000 bp), and many individuals show dual bands indicating distinct alleles at the locus.
(C) Frequency distribution of 362 repeat allele lengths detected by Southern blot analysis.
(D) The 30-mer sequence logo calculated from the 30-mer variants present in human repeat arrays that were sequenced with long-read (PacBio) technology. Some positions are nearly invariant, whereas others vary from 30-mer to 30-mer.
(E) Structure and composition of tandem repeat arrays sequenced by PacBio long-read technology. Each row represents a different sequenced array, and each color represents a distinct 30-mer variant. Black regions indicate gaps that we have introduced to maximize repeat alignments between arrays. Many regions are organized similarly in all arrays, but common variable regions distinguish array subtypes.
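The fragment sizes in panel (B) can be turned into a rough repeat copy number. The sketch below is illustrative only and is not the authors' method: it assumes that the non-repeat flanking sequence inside the BlpI fragment has the same length as in the reference (900 bp minus ten 30 bp units) and that all extra length comes from additional 30 bp units. The function name `estimate_copies` is hypothetical.

```python
UNIT_BP = 30            # repeat unit length, from the caption
REF_COPIES = 10         # copies predicted by the human reference assembly
REF_FRAGMENT_BP = 900   # approximate BlpI fragment predicted by the reference


def estimate_copies(fragment_bp):
    """Estimate 30-mer copy number from a BlpI fragment size in bp.

    Assumption: the flanking sequence within the fragment is the same
    length as in the reference, so any size difference is repeat units.
    """
    flank_bp = REF_FRAGMENT_BP - REF_COPIES * UNIT_BP
    return (fragment_bp - flank_bp) / UNIT_BP


# Smallest and largest fragment sizes reported for the Southern blot.
for size in (4000, 35000):
    print(size, "bp ->", round(estimate_copies(size)), "copies (approx.)")
```

Under these assumptions the observed fragments correspond to roughly a hundred to more than a thousand copies, which is in line with the abstract's statement that the unit is often repeated hundreds of times.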
Figure 2
30-mer Repeat Variants Are Associated with Protective or Risk Status at GWAS SNPs Linked to Neuropsychiatric Disease
(A) Genome browser view of the third intron of CACNA1C. A red line marks the location of the repeat region. The human-specific 30-mer repeats are embedded in a region defined by four SNPs that are repeatedly associated with BD and SCZ.
(B) We identified individuals from the 1000 Genomes Project that have the protective genotype at all four GWAS SNPs (protective haplotype) and individuals with the risk genotype at all four GWAS SNPs (risk haplotype). We used only European and East Asian individuals because GWASs have only been done with these populations. For each possible 30-mer repeat unit, we determined what fraction of 30-mers in the reads that map to this locus in each individual exactly match that particular variant. The 30-mer sequence on the left is significantly associated with the protective haplotype (“prot”), whereas the 30-mer variant on the right is significantly associated with the risk haplotype (“risk”). Base pair differences between the two 30-mer variants presented here are underlined. Shown are standard box-and-whisker plots where the box represents the lower quartile, median, and upper quartile, and the whiskers represent the range of the measurements. Outliers (“+”) are data points that are outside the nearest quartile + 1.5× the interquartile range.
(C) The table lists the mean and standard deviation of the fraction of reads that exactly match a given 30-mer for individuals with the protective or risk haplotype. Repeats enriched in the protective haplotype group are shown in yellow, and repeats enriched in the risk haplotype group are shown in purple. The p values were calculated using the Wilcoxon rank-sum test with Bonferroni correction (see Supplemental Methods).
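The per-individual quantity in panel (B), the fraction of 30-mers that exactly match a given variant, can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure from the Supplemental Methods: it scans every 30 bp window of each read, counts exact matches to a set of candidate variants, and normalizes by the total number of matches. The two variant sequences are made-up placeholders, not sequences from the paper, and the names `variant_fractions` and `bonferroni` are hypothetical.

```python
from collections import Counter

UNIT = 30  # repeat unit length in bp, from the caption


def variant_fractions(reads, variants):
    """Return {variant: fraction of exact 30-mer matches} across reads.

    Assumption: the denominator is the total number of windows that
    exactly match any candidate variant.
    """
    wanted = set(variants)
    counts = Counter()
    for read in reads:
        for start in range(len(read) - UNIT + 1):
            window = read[start:start + UNIT]
            if window in wanted:
                counts[window] += 1
    total = sum(counts.values())
    return {v: (counts[v] / total if total else 0.0) for v in variants}


def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted p value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Placeholder 30-mers (NOT real CACNA1C repeat units).
variant_a = "A" * 10 + "C" * 10 + "G" * 10
variant_b = "A" * 10 + "C" * 10 + "T" * 10

# One toy read containing two copies of variant_a and one of variant_b.
reads = [variant_a + variant_a + variant_b]
fractions = variant_fractions(reads, [variant_a, variant_b])
print(fractions[variant_a], fractions[variant_b])
```

The resulting per-individual fractions would then be compared between the protective and risk haplotype groups, one test per variant, with the p values adjusted for the number of variants tested.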
Figure 3
Human-Specific Repeat Arrays Act as Enhancers in Neural Cells
(A) The single 30-mer found in chimpanzees (30 bp) and 21 different human repeat arrays (3.5–6 kb) were cloned upstream of a minimal promoter driving expression of the luciferase reporter gene. 30-mer variants significantly associated with the protective haplotype are colored yellow, 30-mer variants significantly associated with the risk haplotype are purple, and non-significant variants are black. The fraction of total 30-mer variants associated with either the risk or the protective haplotype varies between the protective-associated and risk-associated repeat arrays as expected, although the differences are subtle.
(B) Constructs were assayed for luciferase activity in a human neural progenitor cell line (ReNcell Cx), as described in Supplemental Methods. Human repeat alleles drove significantly higher luciferase activity compared to the single 30-mer found in chimpanzees (p < 10^-8). Protective arrays drove significantly higher luciferase activity than risk arrays (p = 0.001). The p values were calculated using the Wilcoxon rank-sum test.
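For readers unfamiliar with the Wilcoxon rank-sum test named in panel (B), the sketch below shows one common form of it: a two-sided test using the normal approximation, with average ranks for ties and no tie or continuity correction. Whether the authors used this variant or an exact test is not stated here, so treat it as an assumption. The luciferase values are made-up toy numbers, and `rank_sum_test` is a hypothetical helper name.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Returns (z, p). Tied values receive average ranks; no tie or
    continuity correction is applied.
    """
    pooled = sorted(
        (value, group) for group, sample in enumerate((x, y)) for value in sample
    )
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        n_from_x = sum(1 for k in range(i, j) if pooled[k][1] == 0)
        rank_sum_x += avg_rank * n_from_x
        i = j
    n1, n2 = len(x), len(y)
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (rank_sum_x - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Toy relative luciferase activities (illustrative values only).
protective = [5.1, 6.3, 5.8, 7.0]
risk = [4.2, 4.9, 3.8, 5.0]
z, p = rank_sum_test(protective, risk)
print(f"z = {z:.3f}, p = {p:.4f}")
```

A positive z here means the first group tends to rank higher than the second; with small samples an exact test may be preferable to the normal approximation.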
