[Preprint]. 2023 Oct 5:2023.10.05.23296595.
doi: 10.1101/2023.10.05.23296595.

Exome copy number variant detection, analysis and classification in a large cohort of families with undiagnosed rare genetic disease

Gabrielle Lemire, Alba Sanchis-Juan, Kathryn Russell, Samantha Baxter, Katherine R Chao, Moriel Singer-Berk, Emily Groopman, Isaac Wong, Eleina England, Julia Goodrich, Lynn Pais, Christina Austin-Tse, Stephanie DiTroia, Emily O'Heir, Vijay S Ganesh, Monica H Wojcik, Emily Evangelista, Hana Snow, Ikeoluwa Osei-Owusu, Jack Fu, Mugdha Singh, Yulia Mostovoy, Steve Huang, Kiran Garimella, Samantha L Kirkham, Jennifer E Neil, Diane D Shao, Christopher A Walsh, Emanuela Argili, Carolyn Le, Elliott H Sherr, Joseph Gleeson, Shirlee Shril, Ronen Schneider, Friedhelm Hildebrandt, Vijay G Sankaran, Jill A Madden, Casie A Genetti, Alan H Beggs, Pankaj B Agrawal, Kinga M Bujakowska, Emily Place, Eric A Pierce, Sandra Donkervoort, Carsten G Bönnemann, Lyndon Gallacher, Zornitza Stark, Tiong Tan, Susan M White, Ana Töpf, Volker Straub, Mark D Fleming, Martin R Pollak, Katrin Õunap, Sander Pajusalu, Kirsten A Donald, Zandre Bruwer, Gianina Ravenscroft, Nigel G Laing, Daniel G MacArthur, Heidi L Rehm, Michael E Talkowski, Harrison Brand, Anne O'Donnell-Luria
Gabrielle Lemire et al. medRxiv.

Update in

  • Exome copy number variant detection, analysis, and classification in a large cohort of families with undiagnosed rare genetic disease.
    Lemire G, Sanchis-Juan A, Russell K, Baxter S, Chao KR, Singer-Berk M, Groopman E, Wong I, England E, Goodrich J, Pais L, Austin-Tse C, DiTroia S, O'Heir E, Ganesh VS, Wojcik MH, Evangelista E, Snow H, Osei-Owusu I, Fu J, Singh M, Mostovoy Y, Huang S, Garimella K, Kirkham SL, Neil JE, Shao DD, Walsh CA, Argilli E, Le C, Sherr EH, Gleeson JG, Shril S, Schneider R, Hildebrandt F, Sankaran VG, Madden JA, Genetti CA, Beggs AH, Agrawal PB, Bujakowska KM, Place E, Pierce EA, Donkervoort S, Bönnemann CG, Gallacher L, Stark Z, Tan TY, White SM, Töpf A, Straub V, Fleming MD, Pollak MR, Õunap K, Pajusalu S, Donald KA, Bruwer Z, Ravenscroft G, Laing NG, MacArthur DG, Rehm HL, Talkowski ME, Brand H, O'Donnell-Luria A. Am J Hum Genet. 2024 May 2;111(5):863-876. doi: 10.1016/j.ajhg.2024.03.008. Epub 2024 Apr 1. PMID: 38565148.

Abstract

Copy number variants (CNVs) are significant contributors to the pathogenicity of rare genetic diseases and, with new methods, can now be reliably identified from exome sequencing. Challenges remain in the accurate classification of CNV pathogenicity. CNV calling using GATK-gCNV was performed on exomes from a cohort of 6,633 families (15,759 individuals) with heterogeneous phenotypes and variable prior genetic testing, collected at the Broad Institute Center for Mendelian Genomics of the GREGoR consortium. Each family's CNV data were analyzed using the seqr platform, and candidate CNVs were classified using the 2020 ACMG/ClinGen CNV interpretation standards. We developed additional evidence criteria to address situations not covered by the current standards. The addition of CNV calling to exome analysis identified causal CNVs for 173 families (2.6%). The estimated sizes of CNVs ranged from 293 bp to 80 Mb, and an estimated 44% would not have been detected by standard chromosomal microarrays. The causal CNVs consisted of 141 deletions, 15 duplications, 4 suspected complex structural variants (SVs), 3 insertions, and 10 complex SVs, the latter two groups being identified by orthogonal validation methods. We interpreted 153 CNVs as likely pathogenic/pathogenic and 20 CNVs as high-interest variants of uncertain significance. Calling CNVs from existing exome data increases the diagnostic yield for individuals undiagnosed after standard testing approaches, providing a higher-resolution alternative to arrays at a fraction of the cost of genome sequencing. Our improvements to the classification approach advance the systematic framework for assessing the pathogenicity of CNVs.
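The headline figures in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the authors' pipeline: the variable names are hypothetical, and the numbers are copied directly from the abstract.

```python
# Counts as reported in the abstract; variable names are hypothetical.
families = 6633

causal_by_type = {
    "deletion": 141,
    "duplication": 15,
    "suspected complex SV": 4,
    "insertion": 3,
    "complex SV": 10,
}

classification = {
    "likely pathogenic/pathogenic": 153,
    "high-interest VUS": 20,
}

# Both breakdowns should describe the same set of causal CNVs.
total_causal = sum(causal_by_type.values())
total_classified = sum(classification.values())

# Added diagnostic yield: families with a causal CNV over all families.
yield_pct = round(100 * total_causal / families, 1)

print(total_causal, total_classified, yield_pct)
```

Both breakdowns sum to the 173 families reported, and the ratio to 6,633 families rounds to the stated 2.6%.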


Conflict of interest statement

Declaration of interests: H.L.R. has received support from Illumina and Microsoft for rare disease gene discovery and diagnosis. A.O-D.L. has consulted for Tome Biosciences and Ono Pharma USA Inc. D.G.M. is a paid advisor to GlaxoSmithKline, Insitro, Variant Bio and Overtone Therapeutics, and has received research support from AbbVie, Astellas, Biogen, BioMarin, Eisai, Google, Merck, Microsoft, Pfizer, and Sanofi-Genzyme. C.A.W. is a paid advisor to Maze Therapeutics. M.E.T. receives research funding from Microsoft Inc, Illumina Inc and Levo Therapeutics. The remaining authors declare no competing interests.

Figures

Figure 1. Exome copy number plot and reads visualization for examples of causal CNVs in the Broad CMG cohort.
(A) Individual affected with retinitis pigmentosa with a homozygous single exon deletion in CRB1 (chr1:197438450–197439442×0, Quality score (QS) = 120) identified on exome. To evaluate the quality of the CNV, the patient’s copy number (CN) level (in red) was compared to a cluster of other samples with similar read depth that act as controls. The proband’s CN is decreased compared to the background cluster, compatible with a homozygous deletion. Y axis: CN. (B) As breakpoints fell within the exome data, manual inspection of read data from the individual from (A) using the Integrative Genomics Viewer (IGV) showed discordant read pairs, split reads and complete absence of coverage, compatible with a homozygous exon 10 deletion also including part of upstream exon 9 in CRB1 (chr1:197435257–197441674×0 (NM_201253.3)). Cov = coverage. (C) Individual with multiple congenital anomalies and a heterozygous deletion of 4 exons in RAB3GAP1 (Warburg micro syndrome) (red, chr2:135162318–135164794×1, QS = 92) in trans with a frameshift variant in RAB3GAP1 (not shown, NM_012233.3: c.2393_2394del, p.Leu798ArgfsTer7), both identified by exome. The presence of the deletion was validated by droplet digital PCR. Y axis: CN. (D) Individual with a neurodevelopmental disorder with a de novo 2.6 Mb heterozygous 1q43q44 deletion (red, chr1:242523991–245156781×1, QS = 3077) identified on exome. The presence of this deletion was validated by quantitative PCR. Y axis: CN. (E) Individual with a neurodevelopmental disorder with a de novo 2.1 Mb 22q11.2 duplication (red, chr22:18985739–21081116×3, QS = 3077) identified on exome. The presence of this duplication was validated by chromosomal microarray. Y axis: CN. All coordinates on GRCh38.
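The Figure 1 caption writes each CNV as `chromosome:start–end×copy number`. As a minimal sketch of how such strings could be read programmatically, the hypothetical helper below (`parse_cnv` is not from the paper or from any library) extracts the fields and estimates the size. It assumes that size is approximated as `end - start`, since the caption does not state whether the coordinates are inclusive.

```python
import re

# Matches strings such as "chr1:242523991–245156781×1".
# Accepts either an en dash or a plain hyphen between the coordinates.
CNV_PATTERN = re.compile(r"^(chr[0-9XYM]+):(\d+)[–-](\d+)×(\d+)$")


def parse_cnv(text):
    """Parse a caption-style CNV string into its fields (hypothetical helper)."""
    match = CNV_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized CNV notation: {text!r}")
    start = int(match.group(2))
    end = int(match.group(3))
    return {
        "chrom": match.group(1),
        "start": start,
        "end": end,
        "copy_number": int(match.group(4)),
        # Assumption: approximate size, ignoring inclusive/exclusive conventions.
        "size_bp": end - start,
    }


# Panel D: the de novo 1q43q44 heterozygous deletion.
deletion = parse_cnv("chr1:242523991–245156781×1")
print(deletion["size_bp"] / 1e6)  # size in Mb

# Panel A: the homozygous single exon CRB1 deletion.
crb1 = parse_cnv("chr1:197438450–197439442×0")
print(crb1["size_bp"], crb1["copy_number"])
```

Under this assumption the panel D interval spans about 2.6 Mb, matching the caption, and the panel A exome call spans just under 1 kb with a copy number of 0.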
Figure 2. Flowchart illustrating how points were scored for CNVs that followed X-linked inheritance.
We incorporated sex of proband, parental genotype and parental affected status to score both the proband in which the X-linked variant was identified and, if applicable, any individual in the published literature or public databases that had variants of similar genomic content to the variant of interest. The points for each case could be increased or decreased based on phenotype specificity, up to 0.45 points.
Figure 3. Characteristics of CNVs across the entire callset and the subset of causal CNVs.
(A) Number of high-confidence CNVs by estimated size that were identified in the Broad CMG exome callset of 6,633 families sequenced between 2016 and 2021. Large CNVs tend to be fragmented into multiple small GATK-gCNV calls, which is why there are no CNVs in the >10 Mb category of the graph. These CNVs were interpreted as being part of the same underlying event when looking at the copy number plot and/or validation methods and are presented that way in Figure 3B and 3C. DEL: deletion; DUP: duplication. (B) Mode of inheritance and number of genes involved in each CNV in 173 families in which the CNV was interpreted as causal. The number of genes included in each interval was chosen based on cutoffs suggested for CNV scoring in section 3 of the Riggs et al. ACMG/ClinGen standards. (C) CNV classification by estimated size in 173 families in which the CNV was interpreted as causal by the multidisciplinary team. The causal CNVs consisted of 141 deletions, 15 duplications, 3 insertions (miscalled as deletions by GATK-gCNV), and 14 complex structural variants (SVs). We interpreted 153 CNVs as likely pathogenic/pathogenic and 20 CNVs as VUS.


