Am J Hum Genet. 2016 Jan 7;98(1):58-74.
doi: 10.1016/j.ajhg.2015.11.023. Epub 2015 Dec 31.

Genome Sequencing of Autism-Affected Families Reveals Disruption of Putative Noncoding Regulatory DNA

Tychele N Turner et al. Am J Hum Genet. 2016.

Abstract

We performed whole-genome sequencing (WGS) of 208 genomes from 53 families affected by simplex autism. For the majority of these families, no copy-number variant (CNV) or candidate de novo gene-disruptive single-nucleotide variant (SNV) had been detected by microarray or whole-exome sequencing (WES). We integrated multiple CNV and SNV analyses and extensive experimental validation to identify additional candidate mutations in eight families. We report that compared to control individuals, probands showed a significant (p = 0.03) enrichment of de novo and private disruptive mutations within fetal CNS DNase I hypersensitive sites (i.e., putative regulatory regions). This effect was only observed within 50 kb of genes that have been previously associated with autism risk, including genes where dosage sensitivity has already been established by recurrent disruptive de novo protein-coding mutations (ARID1B, SCN2A, NR3C2, PRKCA, and DSCAM). In addition, we provide evidence of gene-disruptive CNVs (in DISC1, WNT7A, RBFOX1, and MBD5), as well as smaller de novo CNVs and exon-specific SNVs missed by exome sequencing in neurodevelopmental genes (e.g., CANX, SAE1, and PIK3CA). Our results suggest that the detection of smaller, often multiple CNVs affecting putative regulatory elements might help explain additional risk of simplex autism.

Figures

Figure 1
SNV and Indel Analysis (A) Venn diagram of SNV calls from FreeBayes and GATK and Venn diagram of indel calls from FreeBayes, GATK, and Platypus. (B) Concordance by sample between variant calls from FreeBayes, GATK, or their intersection (FreeBayes and GATK) and exome-chip data. Variants shown are those passing filters. These data were available for all samples (n = 232,961 variants, of which 44,732 were identified in the genome VCF file). SNP concordance with exome SNP microarrays was 99.80% ± 0.03%, 99.90% ± 0.01%, and 99.95% ± 0.03% for FreeBayes, GATK, and the intersection of FreeBayes and GATK call sets, respectively.
Figure 2
CNV Analysis (A) Venn diagram of CNV calls from dCGH, GenomeSTRiP, and VariationHunter. (B) Density of size for deletions and duplications. CNVs were initially merged if they had at least 25% reciprocal overlap and their breakpoints were <1,500 nucleotides away from each other at both ends. To get a very minimal set, we subsequently merged CNVs by using a greedy merge in BEDTools.
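The initial merge criterion in this caption (at least 25% reciprocal overlap, with breakpoints less than 1,500 nucleotides apart at both ends) can be sketched as follows. This is an illustrative sketch only, not the authors' pipeline: the function names, the `(chrom, start, end)` tuple layout, and the half-open coordinate convention are assumptions introduced here.

```python
def reciprocal_overlap(a_start, a_end, b_start, b_end):
    """Return the smaller of the two overlap fractions, or 0.0 if disjoint.

    Coordinates are assumed to be half-open intervals [start, end).
    """
    overlap = min(a_end, b_end) - max(a_start, b_start)
    if overlap <= 0:
        return 0.0
    return min(overlap / (a_end - a_start), overlap / (b_end - b_start))


def should_merge(a, b, min_ro=0.25, max_bp_dist=1500):
    """Decide whether two CNV calls meet the caption's initial merge rule.

    a and b are hypothetical (chrom, start, end) tuples. Both conditions
    must hold: reciprocal overlap >= min_ro, and the start and end
    breakpoints each differ by less than max_bp_dist nucleotides.
    """
    if a[0] != b[0]:
        return False
    ro = reciprocal_overlap(a[1], a[2], b[1], b[2])
    close = (abs(a[1] - b[1]) < max_bp_dist
             and abs(a[2] - b[2]) < max_bp_dist)
    return ro >= min_ro and close
```

For instance, two calls on the same chromosome spanning 10,000 to 20,000 and 10,500 to 20,800 would be merged under this rule, whereas a call spanning 10,500 to 30,000 would not, because its end breakpoint lies too far away even though the overlap is sufficient.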
Figure 3
Paternal Age and De Novo Events (A) Paternal age for de novo SNVs and indels. Probands generally have older fathers. The number of variants was significantly correlated with paternal age (Pearson correlation p = 7.9 × 10⁻⁹, r = 0.59). The data fit a linear trend (adjusted r² = 0.34) with advancing paternal age (p = 7.9 × 10⁻⁹) such that there are on average 1.4 [0.98, 1.9] de novo mutations for each year of a father’s life. In this study, the father’s age at the time of the child’s birth ranged from 29.4 to 57.9 years. (B) Paternal age for de novo SNVs and indels within the exome. The number of de novo events detected in each individual is plotted against the father’s age when the individual was born. Shown in blue are siblings, and in red are probands.
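The two statistics reported in this caption, a Pearson correlation and the slope of a least-squares line (de novo events per year of paternal age), can be computed as in the sketch below. It assumes only paired lists of paternal ages and per-child de novo counts; the function name is hypothetical, no study data are included, and this is not necessarily how the authors ran their analysis.

```python
import math


def pearson_and_slope(ages, counts):
    """Return (r, slope, intercept) for counts regressed on ages.

    r is the Pearson correlation coefficient; slope and intercept come
    from an ordinary least-squares fit of counts = slope * age + intercept.
    """
    n = len(ages)
    if n != len(counts) or n < 2:
        raise ValueError("need two equal-length sequences with >= 2 points")
    mean_x = sum(ages) / n
    mean_y = sum(counts) / n
    sxx = sum((x - mean_x) ** 2 for x in ages)
    syy = sum((y - mean_y) ** 2 for y in counts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, counts))
    if sxx == 0 or syy == 0:
        raise ValueError("both variables must vary")
    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return r, slope, intercept
```

The slope is the quantity that corresponds to the "mutations for each year of a father's life" figure in the caption; the confidence interval and p value would require an additional inference step that this sketch omits.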
Figure 4
Exome versus Genome: GC Bias (A) Percentage of GC content in exome-specific sequencing and genome-specific sequencing regions within the exome. (B) Percentage of average GC content by sample of exome-specific sequencing and genome-specific sequencing regions within the exome. Genome-specific regions are defined as those at >10× coverage in the genome and <10× coverage in the exome. Exome-specific regions are defined as those at >10× coverage in the exome and <10× coverage in the genome. Genome-specific regions are higher in GC content (Wilcoxon p < 2.2 × 10⁻¹⁶).
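The coverage-based definitions in this caption translate directly into a small classifier. The sketch below assumes a mean coverage value per region from each assay; the function name, the label strings, and the "neither" category for regions that meet neither definition are illustrative assumptions, not terms from the paper.

```python
def classify_region(genome_cov, exome_cov, threshold=10.0):
    """Label a region using the caption's >10x / <10x coverage definitions.

    genome_cov and exome_cov are coverage depths for the same region in
    genome and exome sequencing. Regions that satisfy neither definition
    (for example, well covered in both assays) are labeled "neither".
    """
    if genome_cov > threshold and exome_cov < threshold:
        return "genome-specific"
    if exome_cov > threshold and genome_cov < threshold:
        return "exome-specific"
    return "neither"
```

Because both definitions use strict inequalities, a region at exactly 10× in either assay falls into the "neither" category under this reading.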
Figure 5
Exome versus Genome: Gene Coverage (A) Number of genes that have regions covered in exome-specific sequencing or genome-specific sequencing regions of the exome and the number of samples in which they occur. Genome-specific regions of the exome add an additional ∼2 Mb (5%) of sequence, whereas exome-specific regions add an additional ∼40 kb. Genome sequencing detected 1,854 genes missing sequences in >90% of individuals targeted by exome sequencing, whereas exome sequencing identified only two genes missing sequences in >90% of samples targeted by genome sequencing. Among the genes ascertained only by WGS, those of interest in relation to autism include ACHE (MIM: 100740), AGAP2 (MIM: 605476), ARID1B, CACNA2D3 (MIM: 606399), DEAF1 (MIM: 602635), EFR3A (MIM: 611798), FOXP1 (MIM: 605515), LAMC3 (MIM: 604349), MYO1E (MIM: 601479), PRKAR1B, RANBP17 (MIM: 606141), RUFY3 (MIM: 611194), SHANK3 (MIM: 606230), and TRIO. (B) Density plots of the number of variants (intersection of FreeBayes and GATK) called in the exome by exome sequencing and by genome sequencing. The number of calls in the genome data is uniform, with ancestry the only contributor to differences between samples. By contrast, the number of variant calls in the exome data is highly variable.
Figure 6
All Variants Shown for Each Family Were Validated by Appropriate Methods (A) Family 14153. The events are a de novo exonic deletion of the promoter and first exon of CANX and two de novo missense SNVs in CBL (MIM: 165360) and FAF2. The location of the de novo deletion is also shown with respect to CANX. (B) Family 13874. The event is a de novo exonic duplication in SAE1. A mock representation of the de novo duplication with respect to SAE1 is also shown. (C) Family 12793. The event is a promoter and exonic WNT7A deletion passed from the mother to the male proband. As shown in the mock representation of the inherited deletion, it removes the 5′ UTR and the first exon of WNT7A. This proband and the mother both have macrocephaly, which is in concordance with maternally inherited deletion of WNT7A. (D) Family 11572. The event is a DSCAM deletion encompassing CNS DNase I hypersensitive sites passed from the mother to the male proband. This individual also suffers from nonfebrile seizures, in concordance with disruption of DSCAM. (E) Family 13539. The event is a duplication upstream of TRIO. It encompasses CNS DNase I hypersensitive sites and was passed from the mother to the male proband. (F) Family 11804. A conserved missense de novo mutation (phyloP = 2.57) was found in PIK3CA. This individual has macrocephaly, which is in concordance with disruption of PIK3CA. (G) Family 11712. We found a maternally inherited rare, private exonic deletion of parts of DISC1 and a 35 kb paternally inherited rare intronic deletion of MBD5. The deletion affecting DISC1 is around 150 kb, deletes a few coding exons of this gene, and is not seen in over 15,000 genotyped control individuals. (H) Family 13122. We found two large rare deletions that intersect NTM and RBFOX1, one inherited from the father and the other inherited from the mother. The maternally inherited rare deletion is 240 kb and deletes most of the first intron of NTM. This is an extremely rare deletion, given that it was not observed in over 15,000 control individuals. The paternally inherited rare deletion is 170 kb and deletes an intron of RBFOX1. (I) Family 11709. We found three rare, private deletions affecting genes of interest. First, we found a 30 kb paternally inherited rare, private exonic deletion of CACNA2D4. We also found two maternally inherited rare, private deletions that affect genes of interest. One is a 120 kb rare, private exonic deletion less than 5 kb downstream of SCN2A, and the other is a 12 kb rare deletion of an intron of ARID1B.
Figure 7
Functional Analysis of CNS DNase I Hypersensitivity Sites in the DSCAM Deletion (A) Schematic of the 14 kb DSCAM deletion observed in family 11572. The diagram illustrates the 12 DNase I hypersensitivity sites (HSs) contained within the deletion, as well as the nine sequence intervals encompassing them. These sequence intervals were tested for their potential to direct reporter expression. (B–D) Bright-field images of representative 48 hpf mosaic zebrafish embryos injected with DSCAM-6 (B), DSCAM-2 (C), and DSCAM-1 (D). (E–G) tdTomato expression in representative 48 hpf mosaic embryos injected with DSCAM-6 (E), DSCAM-2 (F), and DSCAM-1 (G). Expression was seen in the forebrain (E–G), hindbrain (E–G), midbrain (E and F), spinal cord (F and G), and amacrine cells (F and G).
