Genome Med. 2023 Nov 9;15(1):94.
doi: 10.1186/s13073-023-01240-0.

Structural and non-coding variants increase the diagnostic yield of clinical whole genome sequencing for rare diseases

Alistair T Pagnamenta, Carme Camps, Edoardo Giacopuzzi, John M Taylor, Mona Hashim, Eduardo Calpena, Pamela J Kaisaki, Akiko Hashimoto, Jing Yu, Edward Sanders, Ron Schwessinger, Jim R Hughes, Gerton Lunter, Helene Dreau, Matteo Ferla, Lukas Lange, Yesim Kesim, Vassilis Ragoussis, Dimitrios V Vavoulis, Holger Allroggen, Olaf Ansorge, Christian Babbs, Siddharth Banka, Benito Baños-Piñero, David Beeson, Tal Ben-Ami, David L Bennett, Celeste Bento, Edward Blair, Charlotte Brasch-Andersen, Katherine R Bull, Holger Cario, Deirdre Cilliers, Valerio Conti, E Graham Davies, Fatima Dhalla, Beatriz Diez Dacal, Yin Dong, James E Dunford, Renzo Guerrini, Adrian L Harris, Jane Hartley, Georg Hollander, Kassim Javaid, Maureen Kane, Deirdre Kelly, Dominic Kelly, Samantha J L Knight, Alexandra Y Kreins, Erika M Kvikstad, Craig B Langman, Tracy Lester, Kate E Lines, Simon R Lord, Xin Lu, Sahar Mansour, Adnan Manzur, Reza Maroofian, Brian Marsden, Joanne Mason, Simon J McGowan, Davide Mei, Hana Mlcochova, Yoshiko Murakami, Andrea H Németh, Steven Okoli, Elizabeth Ormondroyd, Lilian Bomme Ousager, Jacqueline Palace, Smita Y Patel, Melissa M Pentony, Chris Pugh, Aboulfazl Rad, Archana Ramesh, Simone G Riva, Irene Roberts, Noémi Roy, Outi Salminen, Kyleen D Schilling, Caroline Scott, Arjune Sen, Conrad Smith, Mark Stevenson, Rajesh V Thakker, Stephen R F Twigg, Holm H Uhlig, Richard van Wijk, Barbara Vona, Steven Wall, Jing Wang, Hugh Watkins, Jaroslav Zak, Anna H Schuh, Usha Kini, Andrew O M Wilkie, Niko Popitsch, Jenny C Taylor

Alistair T Pagnamenta et al. Genome Med.

Abstract

Background: Whole genome sequencing is increasingly being used for the diagnosis of patients with rare diseases. However, the diagnostic yields of many studies, particularly those conducted in a healthcare setting, are often disappointingly low, at 25-30%. This is in part because although entire genomes are sequenced, analysis is often confined to in silico gene panels or coding regions of the genome.

Methods: We undertook WGS on a cohort of 122 unrelated rare disease patients and their relatives (300 genomes) who had been pre-screened by gene panels or arrays. Patients were recruited from a broad spectrum of clinical specialties. We applied a bioinformatics pipeline that would allow comprehensive analysis of all variant types. We combined established bioinformatics tools for phenotypic and genomic analysis with our novel algorithms (SVRare, ALTSPLICE and GREEN-DB) to detect and annotate structural, splice site and non-coding variants.

Results: Our diagnostic yield was 43/122 cases (35%), although 47/122 cases (39%) were considered solved when novel candidate genes with supporting functional data were taken into account. Structural, splice site and deep intronic variants contributed to 20/47 (43%) of our solved cases. Five genes that are novel, or were novel at the time of discovery, were identified, whilst a further three genes are putative novel disease genes with evidence of causality. We identified variants of uncertain significance in a further fourteen candidate genes. The phenotypic spectrum associated with RMND1 was expanded to include polymicrogyria. Two patients with secondary findings in FBN1 and KCNQ1 were confirmed to have previously unidentified Marfan and long QT syndromes, respectively, and were referred for further clinical interventions. Clinical diagnoses were changed in six patients and treatment adjustments were made for eight individuals, which for five patients was considered life-saving.
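
The percentages quoted in the Results can be reproduced from the stated counts. The short Python sketch below is illustrative only and is not part of the authors' pipeline; the function name `yield_percent` and the dictionary labels are assumptions introduced here, and only the counts (43, 47, 122 and 20) come from the abstract.

```python
def yield_percent(numerator: int, denominator: int) -> int:
    """Return numerator/denominator as a percentage rounded to a whole number."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return round(100 * numerator / denominator)


# Counts taken from the abstract; the labels are hypothetical names for illustration.
counts = {
    "diagnostic_yield": (43, 122),          # pathogenic / likely pathogenic cases
    "solved_incl_candidates": (47, 122),    # adds novel candidate genes with functional data
    "sv_splice_intronic_share": (20, 47),   # share of solved cases with these variant types
}

percentages = {label: yield_percent(n, d) for label, (n, d) in counts.items()}

for label, pct in percentages.items():
    n, d = counts[label]
    print(f"{label}: {n}/{d} = {pct}%")
```

Run as written, this gives 35%, 39% and 43%, matching the figures reported above.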

Conclusions: Genome sequencing is increasingly being considered as a first-line genetic test in routine clinical settings and can make a substantial contribution to rapidly identifying a causal aetiology for many patients, shortening their diagnostic odyssey. We have demonstrated that structural, splice site and intronic variants make a significant contribution to diagnostic yield and that comprehensive analysis of the entire genome is essential to maximise the value of clinical genome sequencing.

Keywords: Bioinformatics pipeline development; Clinical impact; Diagnostic yield; Genome sequencing; Non-coding; Pipeline optimisation; Rare diseases; Splice site variant; Structural variant.

Conflict of interest statement

JRH is a founder, director, paid consultant and shareholder of Nucleome Therapeutics.

SL reports consulting fees from Sanofi, GLG consulting, Atheneum and Rejuversen. He has received payment or honoraria for lectures, presentations, or educational events from Eisai, Prosigna, Roche, Pfizer, Novartis, Shionogi and Sanofi and was previously employed by Pfizer. He has received travel, accommodation or expenses from Pfizer, Roche, Synthon and Piqur Therapeutics and research funding from CRUK, Against Breast Cancer and Pathios Therapeutics, and is cofounder of Mitox Therapeutics. His institution has received funding for clinical trials for which he is chief investigator or principal investigator from CRUK, Boehringer Ingelheim, Piqur Therapeutics, Astra Zeneca, Carrick Therapeutics, Sanofi, Merck KGaA, Synthon, Roche and Prostate Cancer UK.

GL is a founder and shareholder of Genomics PLC.

JP acknowledges the following: support for scientific meetings and honorariums for advisory work from Merck Serono, Novartis, Chugai, Alexion, Roche, Medimmune, Argenx, UCB, Mitsubishi, Amplo, Janssen, Sanofi; grants from Alexion, Roche, Medimmune, UCB, Amplo biotechnology, Argenx; patent ref P37347WO and licence agreement Numares multimarker MS diagnostics; shares in AstraZeneca; partial funding by highly specialised services NHS England. For CMS itself, support for advisory work and research grants from Argenx and Amplo biotechnology. None are a conflict for this work.

HHU has received research support or consultancy fees from Janssen, UCB Pharma, Eli Lilly, Bristol Myers Squibb BMS, OMass, Mestag, Mirobio, AbbVie and GSK.

Figures

Fig. 1
Overview of OxClinWGS study: workflow, clinical cases and results. A Case selection and referral was mainly done by the Oxford Genomic Medicine Multidisciplinary Team (GM-MDT); a detailed description of this process is provided elsewhere [37]. Selected samples were whole-genome sequenced and analysed by a clinical (yellow) as well as a research pipeline (green). Identified pathogenic candidate variants were validated and reported back to referring clinicians. Unsolved cases were iteratively investigated by a research pipeline incorporating the latest methods for in silico analysis of WGS data. Resulting novel disease candidate variants were regularly discussed by an interdisciplinary expert team and either rejected or forwarded to (functional) validation. B Core statistics of considered pedigrees (n = 122) and individuals (n = 300). Note that some families had more than one affected individual. Criteria for the shown classification of pathogenicity are discussed in the main text. C Considered disease categories, coloured by case status. D Variant types for considered solved cases (including pathogenic, likely pathogenic and cases with evidence of causality, see main text). Small only: all causative variants are SNVs or small INDELs, SSV: at least one causative variant is a splice site variant, SV: at least one causative variant is a large structural variant, Intronic: at least one causative variant is (deep) intronic. E Classification of cases using a previously introduced schema published in [23]: A: variant in novel gene for phenotype with additional (genetic) evidence; B: novel (mechanism) for phenotype; C: known gene for phenotype; D: variant in novel gene for phenotype, further genetic and functional validation studies in progress; SF: secondary finding. Note that two cases have candidate variants in two categories. A detailed description of the categories is provided in Additional file 3: Table S3. 
Abbreviations: SV, structural variant; SSV, splice site variant
Fig. 2
Overview of the OxClinWGS Study: genetic and clinical results. The OxClinWGS RD cohort included 122 cases, of which 47 were considered solved and a further 12 cases had variants of uncertain significance in lead candidates identified. Two cases had secondary findings. Eight novel disease genes have been identified to date, five of which are confirmed disease genes and three of which have evidence of causality. The asterisk denotes that this group includes novel and putative novel genes. The phenotype for one gene was expanded. Revised clinical diagnoses were provided for six patients, whilst for eight patients, the findings led to changes in their clinical management. Colours denote cases with genes that are considered solved (green), have evidence of causality (light green) or are variants of uncertain significance in lead candidates (brown). Abbreviations: PAPA syndrome (pyogenic sterile arthritis, pyoderma gangrenosum, and acne); CNS (central nervous system)
Fig. 3
Validation data for two patients with X-linked structural variants in OxClinWGS study. A–C Patient with ARX deletion: A screenshot showing 125 bp read alignments supporting a de novo deletion of ARX exon 1. Region shown is chrX:25,003,000–25,019,000 (GRCh38). Visualisation is using IGV v2.11.2, with squished and “view as pairs” options. Alignments are coloured by insert size and transcript shown is NM_139058.3. B UCSC genome browser session showing the position of the deletion in relation to the PCR primers and MLPA probes that were used for validation. Also shown is the GC content which rises to > 80% near the distal breakpoint where the coverage drops also in parental genomes. An interactive version is available at https://genome.ucsc.edu/s/AlistairP/ARX_deletion_v6. C The 3 kb deletion was confirmed by the MLPA validation data visualised using coffalyser and shows a drop in signal only for the proband for the exon 1 probe (red arrow). The grey boxes are reference probes and the orange boxes highlight the 95% confidence range of the reference samples used. D–H Complex rearrangement in patient with X-linked neurodevelopmental disorder. D Read count information from short-read sequencing normalised by ngCGH software (https://github.com/seandavi/ngCGH) showing two X chromosome duplications (red arrows). E Split Illumina read-pairs suggest the two duplications are inter-linked. However, two possible configurations can explain the split read pattern. F Circos plot highlights the only SV identified by the Bionano pipeline above the threshold for SV detection. G Genome browser view of the optical maps robustly detects the ~ 600 kb duplication of Xp22.11p21.3 being inserted into Xq27.1, which is present in the carrier mother and both affected male siblings using the Bionano pipeline excluding Complex Multi-Path Regions (CMPR). The red box highlights the duplication inserted into Xq27.1. 
H The Bionano pipeline without masking CMPR detects ~ 102 kb tandem duplication (red boxes) flanking either side of the 600 kb insertion from Xp22.11-Xp21.3 (blue box), therefore, supporting conformation 1 as suggested in E

References

    1. Dawkins HJS, et al. Progress in rare diseases research 2010–2016: an IRDiRC perspective. Clin Transl Sci. 2018;11(1):11–20. doi: 10.1111/cts.12501.
    2. Lionel AC, et al. Improved diagnostic yield compared with targeted gene sequencing panels suggests a role for whole-genome sequencing as a first-tier genetic test. Genet Med. 2018;20(4):435–443. doi: 10.1038/gim.2017.119.
    3. Brittain HK, Scott R, Thomas E. The rise of the genome and personalised medicine. Clin Med (Lond). 2017;17(6):545–551. doi: 10.7861/clinmedicine.17-6-545.
    4. Turnbull C, et al. The 100 000 Genomes Project: bringing whole genome sequencing to the NHS. BMJ. 2018;361:k1687. doi: 10.1136/bmj.k1687.
    5. Boycott KM, et al. Care4Rare Canada: outcomes from a decade of network science for rare disease gene discovery. Am J Hum Genet. 2022;109(11):1947–1959. doi: 10.1016/j.ajhg.2022.10.002.