Genome Med. 2022 Oct 27;14(1):122.
doi: 10.1186/s13073-022-01123-w.

The multiple de novo copy number variant (MdnCNV) phenomenon presents with peri-zygotic DNA mutational signatures and multilocus pathogenic variation

Haowei Du et al. Genome Med. 2022.

Abstract

Background: The multiple de novo copy number variant (MdnCNV) phenotype is defined as four or more constitutional de novo CNVs (dnCNVs) arising independently throughout the human genome within one generation. It is a rare peri-zygotic mutational event, previously reported in about 1 in every 12,000 individuals referred for genome-wide chromosomal microarray analysis due to congenital abnormalities. These rare families provide a unique opportunity to understand the genetic factors underlying peri-zygotic genome instability and the impact of dnCNVs on human disease.

Methods: Chromosomal microarray analysis (CMA), array-based comparative genomic hybridization (aCGH), and short- and long-read genome sequencing (GS) were performed on the newly identified MdnCNV family to identify de novo mutations, including dnCNVs, de novo single-nucleotide variants (dnSNVs), and indels. Short-read GS was performed on four previously published MdnCNV families for dnSNV analysis. Trio-based rare variant analysis was performed on the newly identified individual and four previously published MdnCNV families to identify potential genetic etiologies contributing to the peri-zygotic genomic instability. Lin semantic similarity scores informed quantitative human phenotype ontology analysis on three MdnCNV families to identify gene(s) driving or contributing to the clinical phenotype.
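The Lin semantic similarity score mentioned above can be illustrated with a minimal sketch. Lin similarity between two ontology terms is twice the information content (IC) of their most informative common ancestor divided by the sum of the two terms' ICs. Everything below is assumed for illustration: the toy ontology, the term names, and the annotation frequencies are hypothetical and are not real HPO data, and the best-match-average aggregation over term sets is one common choice that may differ from what the authors used.

```python
import math

# Hypothetical toy ontology (child -> parents); these are NOT real HPO terms.
PARENTS = {
    "root": set(),
    "neuro": {"root"},
    "growth": {"root"},
    "seizure": {"neuro"},
    "hypotonia": {"neuro"},
    "overgrowth": {"growth"},
}

# Hypothetical fraction of annotated diseases carrying each term or a descendant.
FREQ = {
    "root": 1.0, "neuro": 0.5, "growth": 0.25,
    "seizure": 0.125, "hypotonia": 0.25, "overgrowth": 0.125,
}


def ancestors(term):
    """Return the term itself plus all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def ic(term):
    """Information content: negative log of the term's annotation frequency."""
    return -math.log(FREQ[term])


def lin(a, b):
    """Lin similarity: 2 * IC(most informative common ancestor) / (IC(a) + IC(b))."""
    denom = ic(a) + ic(b)
    if denom == 0:
        return 0.0  # both terms are the uninformative root
    mica_ic = max(ic(t) for t in ancestors(a) & ancestors(b))
    return 2 * mica_ic / denom


def best_match_average(set_a, set_b):
    """Symmetric best-match average between two sets of terms (assumed aggregation)."""
    def one_way(xs, ys):
        return sum(max(lin(x, y) for y in ys) for x in xs) / len(xs)
    return (one_way(set_a, set_b) + one_way(set_b, set_a)) / 2


proband = {"seizure"}
disease = {"hypotonia", "overgrowth"}
print(round(lin("seizure", "hypotonia"), 3))
print(round(best_match_average(proband, disease), 3))
```

In this toy setup, two sibling terms under a shared parent score between 0 and 1, identical terms score 1, and terms that share only the root score 0.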

Results: In the newly identified MdnCNV case, we revealed eight de novo tandem duplications, each ~1 Mb, with microhomology at 6/8 breakpoint junctions. Enrichment of dnSNVs (6/79) and de novo indels (1/12) was found within 4 Mb of the dnCNV genomic regions. An elevated post-zygotic SNV mutation rate was observed in MdnCNV families. Maternal rare variant analyses identified three genes in distinct families that may contribute to the MdnCNV phenomenon. Phenotype analysis suggests that gene(s) within dnCNV regions contribute to the observed proband phenotype in 3/3 cases. CNVs in two cases, a contiguous gene duplication encompassing PMP22 and RAI1 and another duplication affecting NSD1 and SMARCC2, contribute to the clinically observed phenotypic manifestations.
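To give a feel for why 6 of 79 dnSNVs within 4 Mb of the dnCNV regions counts as enrichment, the following is a rough back-of-the-envelope sketch using a one-sided binomial test. The counts 6 and 79 come from the abstract; the window definition (each ~1 Mb duplication plus 4 Mb flanks on both sides, with no overlap), the ~3,100 Mb genome size, and the uniform background mutation density are assumptions made here, and the authors' actual statistical procedure may differ.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# From the abstract: 6 of 79 dnSNVs lie within 4 Mb of dnCNV regions.
observed, total = 6, 79

# Assumptions (not from the source): 8 non-overlapping windows, each a ~1 Mb
# duplication plus 4 Mb flanks on both sides, in a ~3,100 Mb genome with a
# uniform background density of de novo SNVs.
n_cnv, cnv_mb, flank_mb, genome_mb = 8, 1.0, 4.0, 3100.0
window_fraction = n_cnv * (cnv_mb + 2 * flank_mb) / genome_mb

expected = total * window_fraction
p_value = binomial_upper_tail(observed, total, window_fraction)

print(f"window fraction of genome: {window_fraction:.4f}")
print(f"expected dnSNVs in windows: {expected:.2f}, observed: {observed}")
print(f"one-sided binomial P(X >= {observed}): {p_value:.3g}")
```

Under these assumptions the windows cover roughly 2.3% of the genome, so fewer than two of the 79 dnSNVs would be expected there by chance.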

Conclusions: Characteristic features of dnCNVs reported here are consistent with a microhomology-mediated break-induced replication (MMBIR)-driven mechanism during the peri-zygotic period. Maternal genetic variants in DNA repair genes potentially contribute to peri-zygotic genomic instability. Variable phenotypic features were observed across a cohort of three MdnCNV probands, and computational quantitative phenotyping revealed that two out of three had evidence for the contribution of more than one genetic locus to the proband's phenotype, supporting the hypothesis of de novo multilocus pathogenic variation (MPV) in those families.
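Microhomology at a breakpoint junction, the feature cited in support of MMBIR, means that a few bases at the junction are identical at both breakpoints, so the exact join position is ambiguous. A minimal sketch of measuring this on a reference sequence follows. The function name, coordinate convention, and toy sequences are hypothetical; real analyses work from split-read or long-read alignments and must also handle strand orientation and inserted bases, which this sketch ignores.

```python
def microhomology_length(ref, left_end, right_start, max_len=50):
    """Count bases shared by both breakpoints of a junction.

    The junction joins ref[:left_end] to ref[right_start:] (0-based, half-open).
    For a tandem duplication of ref[s:e], left_end = e and right_start = s.
    """
    if left_end == right_start:
        raise ValueError("breakpoints coincide; no junction to inspect")

    # Forward: bases just after the left breakpoint that match the bases
    # starting at the right breakpoint (the join could shift rightward).
    fwd = 0
    while (fwd < max_len
           and left_end + fwd < len(ref)
           and right_start + fwd < len(ref)
           and ref[left_end + fwd] == ref[right_start + fwd]):
        fwd += 1

    # Backward: bases just before each breakpoint that match
    # (the join could shift leftward).
    bwd = 0
    while (bwd < max_len
           and left_end - 1 - bwd >= 0
           and right_start - 1 - bwd >= 0
           and ref[left_end - 1 - bwd] == ref[right_start - 1 - bwd]):
        bwd += 1

    return fwd + bwd


# Toy reference: "GAT" occurs at the start of the duplicated segment (index 3)
# and again immediately after its end (index 11).
toy_ref = "CCTGATAAAAAGATTGG"
print(microhomology_length(toy_ref, left_end=11, right_start=3))

# A blunt junction with no shared bases at the breakpoints.
print(microhomology_length("ACGTTGCA", left_end=6, right_start=2))
```

The first call reports the three shared bases ("GAT"), and the second reports zero for the blunt join.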

Keywords: De novo CNV; De novo SNV; Human Phenotype Ontology; Structural variation; Genomic data integration; Genomic data visualization; MMBIR; Genomic instability; Long-read sequencing; Tandem duplication.

Conflict of interest statement

Baylor College of Medicine (BCM) and Miraca Holdings have formed a joint venture with shared ownership and governance of Baylor Genetics (BG), which performs clinical chromosome microarray analysis (CMA) and other genomic studies (ES, genome sequencing) for patient/family care. J.R.L. serves on the Scientific Advisory Board of BG. J.R.L. has stock ownership in 23andMe, is a paid consultant for the Regeneron Genetics Center, and is a co-inventor on multiple United States and European patents related to molecular diagnostics for inherited neuropathies, eye diseases, genomic disorders, and bacterial genomic fingerprinting. PL and WB are employees of BCM and derive support through a professional service agreement with BG. MP, EH, and SJ are employees of Oxford Nanopore Technologies and are shareholders and/or share option holders. FJS has had travel sponsored on multiple occasions by PacBio and ONT. The remaining authors declare that they have no competing interests.

Figures

Fig. 1
dnCNV and dnSNV identified with multiple genomic approaches. a Pedigree (left) of the MdnCNV family HOU3579. In the middle, the sequencing platform and variant calling pipeline are illustrated. Shown on the right, from top to bottom, are visualizations of an example dnCNV in CMA, 1 M aCGH, short-read genome sequencing read depth, and short-read genome sequencing B-allele frequency, followed by an IGV view of a high-quality dnSNV call. b Log2 ratio of phased dnCNV in genome-wide view with chromosomes along the x-axis. Gains present on chromosomes 4, 6, 12, and 14 are each indicated with a green dot representing duplication on the paternally inherited chromosome. Gains present on chromosomes 5, 10, 13, and 21 are each indicated with a pink dot representing duplication on the maternally inherited chromosome. The text adjacent to each dot denotes the size (in Mb) of each dnCNV. c Pedigree of MdnCNV family (top) with aCGH result for each dnCNV region. Parental origin of each chromosome harboring a dnCNV in the proband is indicated by a “P” (paternal) or “M” (maternal) on each array
Fig. 2
De novo variants detected in BAB9637. a Ratio of transition to transversion is shown at the top of the bar graph. The bar graph represents the relative contribution of each type of SNV. b Horizontal red bars represent each dnCNV that is associated with DNMs in proximity to the breakpoint. All seven DNMs found within 4 Mb of the breakpoints are highlighted with a star at the relative location. Maternal and paternal DNMs are highlighted in pink and green, respectively. The DNMs at chr4, chr6, chr12, chr10, and chr14 are in cis with dnCNVs. c Sanger traces are visualized (proband, mother, and father) for DNMs. * The variant at the breakpoint junction. d Density of unphased DNMs as a function of window size, highlighting the continuous drop-off of observed DNMs after 4 Mb
Fig. 3
The number and mutational pattern of pre-zygotic and post-zygotic de novo mutations in MdnCNV families (blue) versus controls (orange). a The VAF distribution of de novo substitutions in MdnCNV (blue) and control (orange) families. b The proportion of DNMs that are predicted to be post-zygotic mutations (dark orange/blue). c The number of pre-zygotic (germline) mutations is positively correlated with paternal age. The gray area denotes the region covered by the 95% confidence interval of the slope and intercept of the linear regression lines. d The number of post-zygotic mutations shows no correlation with age
Fig. 4
Maternal variants potentially contributing to genome instability. a–c MdnCNV pedigrees with identified rare VUS maternal variants affecting genes involved in DNA repair or replication. d Bar plot shows the contribution of SBS signatures refitted on genome-wide dnSNVs. Predicted protein structure plots show the amino acid change in proximity to previously reported variants in protein ERCC4 (e) and MSH3 (f). Molecular modeling images were acquired from Varsite [33], with pathogenic variants from ClinVar mapped. The amino acid residues in red show the changes caused by the variants reported here, and those in purple or gray show the reported pathogenic variants from ClinVar
Fig. 5
Phenotype similarity score analysis for disease-associated genes and potential gene combinations for the multiple pathogenic variant case (BAB9637). a Heatmap representing color-coded Lin semantic similarity scores of BAB9637 and the database-annotated phenotypes included. Both rows and columns are clustered using pairwise similarity scores and Ward's method. Dendrograms are shown at the top and to the left of the heatmap. Colored columns at the bottom and to the right annotate variant type and affected gene, as defined at the bottom. b Distribution of similarity scores in each known disease-associated gene group with n = 1 to n = 5 genes. A black line connects the max score of subsequent subsets of groups, e.g., max score of groups with one gene to max score of groups with two genes. c Annotation grid showing the phenotypes of individual reported NSD1 duplication probands, the phenotypes of individual SMARCC2 LoF probands, a summary of NSD1 DUP and SMARCC2-associated clinical phenotypes, and the phenotypes of proband BAB9637. HPO phenotypes are aligned from left to right; blue squares indicate the presence of the phenotype, i.e., HPO term, while gray represents the absence of the term. The clinical phenotype summary was based on the cases used in HPO analysis, with the degree of shading indicating the percent of reported cases shown here for which a particular feature has been observed, as defined in the legend in the top right corner
