BMC Genomics. 2017 May 24;18(1):403.
doi: 10.1186/s12864-017-3671-0.

Comprehensive whole genome sequence analyses yields novel genetic and structural insights for Intellectual Disability

Farah R Zahir et al. BMC Genomics.

Abstract

Background: Intellectual Disability (ID) is among the most common global disorders, yet etiology is unknown in ~30% of patients despite clinical assessment. Whole genome sequencing (WGS) is able to interrogate the entire genome, providing potential to diagnose idiopathic patients.

Methods: We conducted WGS on eight children with idiopathic ID and brain structural defects and on their unaffected parents, and carried out extensive data analyses using both standard and discovery approaches.

Results: We verified de novo pathogenic single nucleotide variants (SNVs) in ARID1B c.1595delG and PHF6 c.820C>T, potentially causative de novo two-base indels in SQSTM1 c.115_116delinsTA and UPF1 c.1576_1577delinsA, and de novo SNVs of uncertain significance in CACNB3 c.1289G>A and SPRY4 c.508T>A. We report results from a large secondary control study of 2081 exomes probing the pathogenicity of the above genes. We analyzed structural variation with four different algorithms, including de novo genome assembly. We confirmed a likely contributory 165 kb de novo heterozygous 1q43 microdeletion missed by clinical microarray. The de novo assembly unmasked hidden genome instability that was missed by standard re-alignment based algorithms. We also interrogated regulatory sequence variation for known and hypothesized ID genes and present useful strategies for WGS data analyses of non-coding variation.

Conclusion: This study provides an extensive analysis of WGS in the context of ID, providing genetic and structural insights into ID and yielding diagnoses.

Keywords: 1q43 microdeletion; ARID1B; CACNB3; Genome assembly; Intellectual Disability; PHF6; SPRY4; SQSTM1; UPF1; Whole genome sequencing.
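
The trio design described in the Methods (affected child plus both parents) is what makes de novo calls possible. As a minimal sketch of that idea, and not the authors' pipeline, the Python below keeps only variants that the child carries and that both parents are explicitly called homozygous reference for. The genotype encoding, the function name `candidate_de_novo`, and the toy coordinates are all assumptions made for illustration; a real analysis would also filter on read depth, genotype quality and population frequency.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding: a variant is (chrom, pos, ref, alt) and a genotype is
# the number of alternate alleles (0 = hom-ref, 1 = het, 2 = hom-alt).
Variant = Tuple[str, int, str, str]
Genotypes = Dict[Variant, int]


def candidate_de_novo(child: Genotypes, mother: Genotypes, father: Genotypes) -> List[Variant]:
    """Return variants present in the child and called hom-ref in both parents.

    A variant with no call in either parent is skipped rather than assumed
    hom-ref, because a missing call may simply reflect low coverage.
    """
    hits = []
    for variant, child_gt in child.items():
        if child_gt == 0:
            continue  # child does not carry the alternate allele
        if mother.get(variant) == 0 and father.get(variant) == 0:
            hits.append(variant)
    return sorted(hits)


# Toy data with made-up coordinates (not taken from the study).
child = {
    ("chr1", 100, "G", "A"): 1,  # absent from both parents -> candidate
    ("chr2", 200, "C", "T"): 1,  # inherited from the mother
    ("chr3", 300, "T", "A"): 1,  # parents have no call -> skipped
}
mother = {("chr1", 100, "G", "A"): 0, ("chr2", 200, "C", "T"): 1}
father = {("chr1", 100, "G", "A"): 0, ("chr2", 200, "C", "T"): 0}

print(candidate_de_novo(child, mother, father))
```

Requiring an explicit parental hom-ref call is the conservative choice here: it trades some sensitivity for fewer false de novo calls caused by uncovered parental sites.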

Figures

Fig. 1
Schematic of complete study design. Abbreviations: CNV = copy number variant; SV = structural variant; SNV = single nucleotide variant; DDD = Deciphering Developmental Disorders study; UPP = Ubiquitin Proteolysis Pathway; IGV = Integrative Genomics Viewer; DGV = Database of Genomic Variants
Fig. 2
Details of CNV analyses. a IGV images for the heterozygous deletion CNV in patient 51, showing the proximal and distal breakpoints. The CNV involves the whole of CEP170 and part of SDCCAG8. Top, middle and bottom panels are the child’s, mother’s and father’s .bam files, respectively. Read-depth coverage shows that the CNV is de novo (red ovals). b Cartoon of the breakpoint junction sequence showing a 24 bp chromosome 16 sequence (green box) and a 107 bp chromosome 5 sequence (yellow box) inserted between the proximal and distal breakpoints on chromosome 1q43. The yellow shaded segment shows sequence microhomology: this 14 bp sequence (TTGGGAGTAGAGGG) is found at chromosome 5:40,069,598-40,069,612 and at chromosome 1:243,447,747-243,447,761 (hg19). Sanger sequence trace images are overlaid, confirming the CNV breakpoint. Grey arrows denote PCR forward and reverse primers. N denotes DNA repeat sequence. c Genomic interval involved in the de novo CNV detected in patient 51, shown in the UCSC Genome Browser (hg19). The red highlighted box shows the region deleted in our patient. Yellow boxes show the critical region for 1q43-44 syndrome defined by Nagamani et al. The green box shows the critical region as defined by Perlman et al. N.B. Nagamani et al. also highlight ZBTB18 (old name ZNF238) in their critical region
Fig. 3
Validation study. a Incidence of potentially damaging SNVs (PDSs) in the positive control (UK10K) and negative control (1000G) cohorts. * denotes statistical significance (p < 0.05, Fisher’s exact test). b and c Results of bootstrap analyses for PDSs in 6 randomly selected genes. The red vertical bar shows the mean and median result for PDSs in our 6 candidate genes
Fig. 4
Venn diagram showing CNVs found by each algorithm (G = gain, L = loss)
Fig. 5
Schematic of filtration pipeline for variants in non-coding regions. a Schematic for SNVs. b Schematic for CNVs. Abbreviations: SNV = single nucleotide variant; TFBS = transcription factor binding site; FANTOM = enhancer sequence as annotated by the FANTOM consortium; UTR = untranslated region; DDD = Deciphering Developmental Disorders; UPP = ubiquitin proteasome degradation pathway; CN = copy number. Patient 42 had DVPRRs in the UTRs of two genes, CBL and UBE3B. Patient 59 had a DVPRR in the promoter of UBE3A, patient 43 had a DVPRR in the promoter of CUL4B, and patient 42 had DVPRRs in the promoters of UBE3A, CUL4B and CUL7 (Additional file 10: Table S8)
Fig. 6
a Sanger sequencing verification of the translocation in patient 51, with a karyotype cartoon of the balanced translocation. The PCR amplicon trace file shows sequence mapping across the chromosome X-2 translocation boundary. The zoomed-in view shows a single base addition at the breakpoint junction. b Circos plot showing mutation burden for patient 51 called by ABySS genome assembly
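
The Fig. 3 caption reports significance by Fisher’s exact test when comparing PDS incidence between the positive and negative control cohorts. The sketch below shows how a two-sided Fisher’s exact test on a 2x2 table can be computed from the hypergeometric distribution using only the Python standard library. The function name and the example counts are assumptions for illustration and are not data from the study; in practice a library routine such as `scipy.stats.fisher_exact` would normally be used.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows could be cohorts (e.g. case-like vs control) and columns carrier
    status (carries a PDS vs does not). The p-value sums the probabilities of
    all tables with the same margins that are no more likely than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x carriers falling in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance so ties with the observed table are included.
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Invented counts: 3 of 4 carriers in one cohort versus 1 of 4 in the other.
p = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p, 4))
```

With cohorts of a few thousand exomes the binomial coefficients become large, but Python integers are arbitrary precision, so the exact calculation remains feasible.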
