Am J Hum Genet. 2023 Aug 3;110(8):1343-1355.
doi: 10.1016/j.ajhg.2023.07.007.

Genome sequencing and comprehensive rare-variant analysis of 465 families with neurodevelopmental disorders

Alba Sanchis-Juan et al. Am J Hum Genet. 2023.

Abstract

Despite significant progress in unraveling the genetic causes of neurodevelopmental disorders (NDDs), a substantial proportion of individuals with NDDs remain without a genetic diagnosis after microarray and/or exome sequencing. Here, we aimed to assess the power of short-read genome sequencing (GS), complemented with long-read GS, to identify causal variants in participants with NDD from the National Institute for Health and Care Research (NIHR) BioResource project. Short-read GS was conducted on 692 individuals (489 affected and 203 unaffected relatives) from 465 families. Additionally, long-read GS was performed on five affected individuals who had structural variants (SVs) in technically challenging regions, had complex SVs, or required distal variant phasing. Causal variants were identified in 36% of affected individuals (177/489), and a further 23% (112/489) had a variant of uncertain significance after multiple rounds of re-analysis. Among all reported variants, 88% (333/380) were coding nuclear SNVs or insertions and deletions (indels), and the remainder were SVs, non-coding variants, and mitochondrial variants. Furthermore, long-read GS facilitated the resolution of challenging SVs and invalidated variants of difficult interpretation from short-read GS. This study demonstrates the value of short-read GS, complemented with long-read GS, in investigating the genetic causes of NDDs. GS provides a comprehensive and unbiased method of identifying all types of variants throughout the nuclear and mitochondrial genomes in individuals with NDD.
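The percentages quoted in the abstract can be checked against the raw counts it reports. The following is a minimal sketch, not part of the study's analysis; the function name `percent` and the dictionary keys are illustrative assumptions, while the counts are taken directly from the abstract.

```python
# Counts as reported in the abstract (names below are illustrative only).
AFFECTED = 489
UNAFFECTED_RELATIVES = 203
SEQUENCED_TOTAL = 692
REPORTED_VARIANTS = 380


def percent(numerator: int, denominator: int) -> int:
    """Return numerator/denominator as a percentage rounded to a whole number."""
    return round(100 * numerator / denominator)


# Cohort size: affected individuals plus unaffected relatives.
cohort_size = AFFECTED + UNAFFECTED_RELATIVES

yields = {
    # Affected individuals with a causal variant.
    "causal": percent(177, AFFECTED),
    # Affected individuals with a variant of uncertain significance.
    "vus": percent(112, AFFECTED),
    # Reported variants that were coding nuclear SNVs or indels.
    "coding_snv_indel": percent(333, REPORTED_VARIANTS),
}

# Remaining reported variants (SVs, non-coding, and mitochondrial variants).
other_variants = REPORTED_VARIANTS - 333

if __name__ == "__main__":
    print(cohort_size, yields, other_variants)
```

The rounded values agree with the abstract's 36%, 23%, and 88%, and the two cohort groups sum to the 692 sequenced individuals.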

Keywords: long-read sequencing; neurodevelopmental disorders; structural variants; whole-genome sequencing.

Conflict of interest statement

Declaration of interests: K.J.C. and K.M. are currently employees of AstraZeneca.

Figures

Figure 1
Factors affecting variant discovery and diagnostic yield.
(A) Diagnostic yield is affected by the sequenced family structure. Boxes show the number of affected individuals in each class of family structure. Singletons have no sequenced relatives, trios have both parents sequenced, proband parents have one parent sequenced, siblings have one sibling sequenced, and quads have both parents and one sibling sequenced. “Solved” refers to an affected individual with a P or LP variant. “Partially solved” refers to an affected individual with a P or LP variant that only partially explains the phenotype. “VUS” refers to an affected individual with a variant of uncertain significance. “Unsolved” refers to an affected individual with no identified P or LP variants or VUSs.
(B) Diagnostic yield is affected by phenotype. Boxes show the number of affected individuals with each phenotype. These numbers overlap because many individuals have more than one phenotype. ASD, autism spectrum disorder; CNS, central nervous system.
(C) The proportion of identified variants that are P or LP is affected by the mode of inheritance. Boxes show the number of identified variants in each class. XLR, X-linked recessive; XLD, X-linked dominant; MT, mitochondrial; VUS, variant of uncertain significance; P, pathogenic; LP, likely pathogenic.
(D) The number of identified variants that are P or LP is affected by the round of analysis, where new variants were identified in each successive round, demonstrating the value of re-analysis. Boxes show the number of variants identified in each round (cumulative). Round 1 was March 2016 to January 2018, round 2 was July 2018, and round 3 was July 2019.
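The family-structure classes defined in panel (A) amount to a lookup on how many parents and siblings were sequenced alongside the affected individual. The sketch below illustrates that mapping under stated assumptions: the function name `classify_family`, its parameters, and the fallback label "other" are hypothetical and do not come from the study, which does not describe how combinations outside the five named classes were handled.

```python
def classify_family(parents_sequenced: int, siblings_sequenced: int) -> str:
    """Map sequenced-relative counts to the family-structure labels in Figure 1A.

    The "other" fallback is an assumption for combinations the caption
    does not define (for example, one parent plus one sibling).
    """
    classes = {
        (0, 0): "singleton",       # no sequenced relatives
        (1, 0): "proband parent",  # one parent sequenced
        (2, 0): "trio",            # both parents sequenced
        (0, 1): "siblings",        # one sibling sequenced
        (2, 1): "quad",            # both parents and one sibling sequenced
    }
    return classes.get((parents_sequenced, siblings_sequenced), "other")


if __name__ == "__main__":
    for counts in [(0, 0), (1, 0), (2, 0), (0, 1), (2, 1), (1, 1)]:
        print(counts, classify_family(*counts))
```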
Figure 2
Complex structural variants resolved by lrGS.
(A and B) Circular layout plots of the complex rearrangements in (A) participant 6 (NGC00375_01), involving 37 breakpoints between chromosomes 7, 10, and 12, and in (B) participant 7 (G012664), involving 26 duplicated fragments from 14 chromosomes. Both plots were generated with Circos; the outer ring shows the chromosomes (coordinates in Mbp), and the inner ring shows the depth of coverage of the individual, normalized against 250 unrelated individuals in the cohort. In the scatterplot, deletions are shown in red, and duplications are shown in blue. Breakpoint junction links are shown in black (interchromosomal) and green (intrachromosomal).
(C) Variant phasing performed on participant 8 (G013428) demonstrated the absence of an inversion called in the srGS data. The ideogram for chromosome X highlighting the region involved is at the top, below which are the genes present within this region and the inversion coordinates (in green). Zoomed-in panels for the start (S) and end (E) of the inversion are shown next for the srGS and lrGS data. Both breakpoints lie within LINE-1 retrotransposon repeats (Rep) and are not supported by the lrGS data.
(D) Variant phasing performed on participant 10 (G000973) facilitated the resolution of a complex event involving a retroelement of KIF5C. The ideogram of chromosome 2 is at the top, below which are the KIF5C transcripts and a zoomed-in region with the srGS calls; deletions are shown in red, inversions are in green, and the duplication is in blue. The following two panels show the coverage (Cov) and IGV visualization of the short reads and the lrGS alignments. Split reads and discordant pairs are present in the srGS data and absent in the lrGS data, consistent with the retroelement insertion. srGS, short-read genome sequencing; lrGS, long-read genome sequencing.
