NPJ Genom Med. 2024 Oct 26;9(1):49.
doi: 10.1038/s41525-024-00436-6.

Comprehensive reanalysis for CNVs in ES data from unsolved rare disease cases results in new diagnoses

German Demidov et al. NPJ Genom Med. 2024.

Abstract

We report the results of a comprehensive copy number variant (CNV) reanalysis of 9171 exome sequencing datasets from 5757 families affected by a rare disease (RD). The data reanalysed was extremely heterogeneous, having been generated using 28 different enrichment kits by 42 different research groups across Europe partnering in the Solve-RD project. Each research group had previously undertaken their own analysis of the data but failed to identify disease-causing variants. We applied three CNV calling algorithms to maximise sensitivity, and rare CNVs overlapping genes of interest, provided by four partner European Reference Networks, were taken forward for interpretation by clinical experts. This reanalysis has resulted in a molecular diagnosis being provided to 51 families in this sample, with ClinCNV performing the best of the three algorithms. We also identified partially explanatory pathogenic CNVs in a further 34 individuals. This work illustrates the value of reanalysing ES cold cases for CNVs.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Violin plot of the median depth of coverage by kit for 9351 ES experiments pertaining to 28 different enrichment kits.
The number of experiments pertaining to each kit is shown above the plots. Coverage is shown on the Y-axis. Thickness of the plotted shape indicates the proportion of experiments that have a particular coverage.
Fig. 2. Distribution of lengths of 7849 CNV calls detected in 3436 affected individuals, separated into deletions (Panel a) and duplications (Panel b).
The x-axis represents the length of the calls identified (log10 scale), and the y-axis the number of events observed. Note that the y-axis scale differs between panels a and b.
Fig. 3. Heat maps illustrating the length of confirmed disease-causing CNVs (Panel a), partially explanatory disease-causing CNVs (Panel b) and candidate disease-causing CNVs (Panel c) identified in this study.
Duplications are shown in blue, and deletions in red. Cyan and pink represent duplication and deletion calls, respectively, that were initially QC-filtered in the workflow for the respective tool and identified post hoc. The approximate length of the event is indicated in the top layer using a log10 scale. The affected gene is indicated along the bottom. Where more than one gene was affected, it is shown as multiple, with the affected chromosome indicated.
Fig. 4. Family pedigree and MLPA confirmation results for a Mexican family extensively affected by Hereditary Gastric Cancer.
a Family tree of the family of proband P0014615 (represented by an arrow). Exome sequencing data from six individuals of the family was submitted to Solve-RD for reanalysis, following prior analysis in 2015 for both SNVs and CNVs, which did not identify any variants of interest. Three of the sequenced family members were affected by diffuse gastric cancer (DGC, black symbols: P0014616, P0014615, P0014613), while the other three were unaffected (P0014617, P0014614, P0014612). Individual III-3 (P0014617) is currently a healthy carrier, perhaps due to the incomplete penetrance previously reported for CDH1. The age shown below affected individuals indicates the age of disease onset, while that below healthy individuals represents their current age. b MLPA validation results using SALSA MLPA-Probemix P083 CDH1 (MRC Holland) in the healthy carrier, III-3, and in the proband, III-5. A ratio above the blue line indicates an elevated number of copies, while a ratio below the red line indicates a decrease in copy number. The shaded blue area represents the position of probes for CDH1 and two neighbouring genes, while the grey area represents reference probes.
Fig. 5. IGV screenshots corresponding to the four illustrative newly diagnosed individuals described in the main text, one from each ERN.
a RND: Heterozygous deletion spanning NAA15, in an individual with intellectual disability, which was found to be inherited from her paucisymptomatic mother. b EURO-NMD: Hemizygous deletion of exons 45-47 of DMD resulting in Becker Muscular Dystrophy. c ITHACA: Heterozygous de novo deletion spanning CSNK2B, resulting in POBINDS. d GENTURIS: Inherited heterozygous deletion affecting CDH1 and TANGO6, resulting in autosomal dominant HDGC. Images show customised coverage tracks and the position of the identified CNV (red bar). Blue dots above the midline indicate elevated coverage, while red dots below the line indicate reduced coverage. The position of genes is indicated at the bottom of the image, while the chromosomal position is indicated at the top of the image.
Fig. 6. Workflow used for CNV calling, filtering, and annotation prior to returning calls to clinical experts for interpretation.
The first line shows the pre-processing generation of coverage profiles for each experiment, prior to these profiles being passed to the three algorithms for CNV calling. The third line indicates the collation of CNVs of different types, which were then annotated and filtered appropriately before being passed to the respective ERNs for prioritisation.
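
The filtering step described above (keeping rare CNV calls that overlap genes of interest) can be illustrated with a minimal Python sketch. This is not the Solve-RD pipeline: the record fields, the caller labels, the gene names and coordinates, and the 1% frequency cut-off are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


@dataclass(frozen=True)
class CnvCall:
    region: Interval
    kind: str           # "DEL" or "DUP"
    caller: str         # hypothetical label for the calling algorithm
    cohort_freq: float  # hypothetical fraction of the cohort carrying a similar call


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals on the same chromosome share at least one base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def filter_calls(calls, genes, max_freq=0.01):
    """Keep rare calls that overlap at least one gene of interest.

    max_freq is an assumed threshold, not a value taken from the study.
    Returns a list of (call, sorted gene names) pairs.
    """
    kept = []
    for call in calls:
        if call.cohort_freq > max_freq:
            continue  # too common in the cohort to be a rare-disease candidate
        hits = sorted(name for name, iv in genes.items() if overlaps(call.region, iv))
        if hits:
            kept.append((call, hits))
    return kept


# Hypothetical gene list and calls, for illustration only.
genes = {
    "GENE_A": Interval("chr1", 200, 400),
    "GENE_B": Interval("chr2", 1000, 2000),
}
calls = [
    CnvCall(Interval("chr1", 100, 500), "DEL", "caller_1", 0.001),  # rare, overlaps GENE_A
    CnvCall(Interval("chr1", 100, 500), "DUP", "caller_2", 0.20),   # common, dropped
    CnvCall(Interval("chr2", 100, 500), "DEL", "caller_3", 0.001),  # rare, no gene overlap
]
kept = filter_calls(calls, genes)
for call, hits in kept:
    print(call.caller, call.kind, hits)
```

In this toy input only the first call survives both filters. A real workflow would also need per-tool QC filtering and annotation, which the sketch leaves out.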
