HGG Adv. 2025 Jul 10;6(3):100469.
doi: 10.1016/j.xhgg.2025.100469. Epub 2025 Jun 16.

Identification of technically challenging variants: Whole-genome sequencing improves diagnostic yield in patients with high clinical suspicion of rare diseases

Hau-Yee Ng et al. HGG Adv. 2025.

Abstract

The total burden of rare diseases is significant worldwide, with over 300 million people affected. Many rare diseases have both well-defined clinical phenotypes and established genetic causes. However, a substantial proportion of patients with high clinical suspicion of a rare disease remain genetically undiagnosed and stuck in the diagnostic odyssey after undergoing a cascade of conventional genetic tests. One of the major contributing factors is that many types of variants are technically intractable to whole-exome sequencing (WES). In this study, the added diagnostic power of whole-genome sequencing (WGS) for patients with clinically suspected rare diseases was assessed by detecting technically challenging variants. A total of 3,169 patients from the Hong Kong Genome Project (HKGP) were reviewed, and 322 individuals with high clinical suspicion of a rare disorder with well-established genetic etiology were identified. Notably, 180 of these patients had undergone at least one previous genetic test. Through PCR-free short-read WGS and a comprehensive in-house analytic pipeline, causative variants were found in 138 patients (138 of 322, 42.9%), 30 of whom (30 of 138, 21.7%) had diagnoses attributed to technically challenging variants. These included 6 variants in low-coverage regions with PCR bias, 2 deep intronic variants, 2 repeat expansions, 19 structural variants, and 2 variants in genes with a homologous pseudogene. The study demonstrated the indispensable diagnostic power of WGS in detecting technically challenging variants and its capability to serve as an all-in-one test for patients with high clinical suspicion of rare diseases.
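The percentages quoted in the abstract can be reproduced from the reported counts. The following is a minimal sketch for checking that arithmetic; the function name `percentage` is a hypothetical helper and not part of the study's pipeline, and the last figure (30 of 322) is derived here from the reported counts rather than stated in the abstract.

```python
def percentage(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    if whole <= 0:
        raise ValueError("whole must be a positive count")
    return round(100 * part / whole, 1)


# Counts as reported in the abstract.
selected = 322      # patients with high clinical suspicion of a rare disease
diagnosed = 138     # patients with a causative variant found by WGS
challenging = 30    # diagnosed patients with technically challenging variants

overall_yield = percentage(diagnosed, selected)        # reported as 42.9%
challenging_share = percentage(challenging, diagnosed)  # reported as 21.7%

# Derived here, not stated in the abstract: share of the whole selected cohort.
challenging_of_selected = percentage(challenging, selected)

print(overall_yield, challenging_share, challenging_of_selected)
```

Rounding to one decimal place matches the precision used in the abstract.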

Keywords: challenging variants; non-coding variants; pseudogenes; rare diseases; repeat expansions; structural variants; whole genome sequencing.

Conflict of interest statement

Declaration of interests: W.-K.J.L. is a director of DRA.

Figures

Figure 1
Overview of study design. Whole-genome sequencing was performed on a total of 3,169 probands. Of these, 322 affected individuals were selected for analysis because they had a high clinical suspicion of a rare disease. Among the selected individuals, 138 had a positive WGS result, and 30 of these were found to carry technically challenging variants.
Figure 2
Overview of the types of technically challenging variants. Technically challenging variants detected in the patients were categorized into four groups. Group 1: SNVs, including exonic variants located in low-coverage regions with PCR bias and intronic variants located >50 bp from the exon-intron junction. Group 2: repeat expansions. Group 3: structural variants with lengths >50 bp. Group 4: variants in genes with pseudogenes sharing high sequence similarity, which impairs the ability of next-generation sequencing (NGS) to accurately detect and map the variants. SNV, single-nucleotide variant.
Figure 3
PKD1 variant located in a low-coverage region with PCR bias. (A) Coverage data from the Genome Aggregation Database (gnomAD) v.4.1.0 showing the fraction of individuals with coverage over 20× in the genomic region chr16: 2119112–2119141, covering the PKD1 c.341_345del variant. The coverage of WES is lower and less uniform than that of WGS in this region. (B) Comparison between the coverage around the c.341_345del variant in the WGS result of P5 and the WES result of one participant in gnomAD. The WGS result showed more uniform coverage than the WES result. The IGV reads are colored by read strand.
Figure 4
Identification of a paternally inherited NF1 variant, c.1392+751T>G (p.?). The likely pathogenic NF1 c.1392+751T>G (p.?) variant detected in P8 is predicted by SpliceAI to have a deleterious effect. (A) Pedigree of P8 showing a strong paternal history of NF1. (B) IGV screenshot showing the heterozygous NF1 variant in the proband (IV.1), affected father (III.2), and brother (IV.2). The unaffected mother (III.3) does not carry the variant. Reads are colored by read strand. (C) Schematic diagram of the predicted effect of the NF1 deep intronic variant in P8. The T-to-G mutation created a cryptic donor site and was predicted to cause retention of 750 bp of intron 12, which is expected to introduce a premature stop codon (TAA, colored in red).
Figure 5
Identification of an ATXN2 repeat expansion. ExpansionHunter identified a pathogenic expansion in ATXN2 in P9. The pileup plot in ATXN2 of P9 shows a heterozygous 30 GCT repeat expansion in exon 1 (95% CI: 30–30). There are multiple supporting reads spanning the whole region of the ATXN2 repeat expansion (highlighted in orange) and the upstream and downstream flanking sequences (highlighted in blue).
Figure 6
Effect of the LINE1 insertion on splicing of SLC25A13. (A) IGV visualization of WGS data of P11 revealed the pathogenic LINE1 insertion in intron 4 of SLC25A13. (B) Schematic diagram of two major alternative splicing variants generated after the LINE1 insertion. The major alternative transcript 1 skipped exons 4 and 5, while transcript 2 skipped exons 3–5. Both resulted in an out-of-frame transcript and are predicted to undergo nonsense-mediated decay (NMD).
Figure 7
Examples of structural variants detected in this study. A paternally inherited inversion affecting PAX3 was found in P13, and an HSD17B3 deletion was found in P28. (A) Pedigree of P13 showing a strong paternal history of Waardenburg syndrome. (B) IGV screenshot showing the heterozygous inversion in the proband (IV.6), affected father (III.8), and paternal grandmother (II.5) but not the unaffected mother (III.9). The breakpoint at chr2:222273459, located in intron 4 of PAX3, resulted in truncation of PAX3. (C) Schematic diagram showing the similar cytoband pattern between normal chromosome 2 and chromosome 2 with the inversion. The inversion is not likely to be detectable by conventional karyotyping. (D) IGV screenshot of the heterozygous HSD17B3 deletion in P28. The deletion spanned ∼1.6 kb, removing the entire exon 1 of HSD17B3. The genomic coordinates of all breakpoints are reported above the schematic diagrams.
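The four-group categorization described in the Figure 2 legend can be expressed as a small rule set. The sketch below is illustrative only and is not the authors' pipeline: the `Variant` fields, the `challenge_group` function, and the order in which the rules are checked are all assumptions made for this example. Only the >50 bp thresholds and the group definitions come from the legend.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Variant:
    """Hypothetical minimal variant record (field names are assumptions)."""
    kind: str                            # "snv", "repeat_expansion", or "sv"
    length_bp: int = 1                   # variant length in base pairs
    intronic_offset_bp: int = 0          # distance from exon-intron junction; 0 if exonic
    low_coverage_pcr_bias: bool = False  # lies in a low-coverage region with PCR bias
    has_homologous_pseudogene: bool = False


def challenge_group(v: Variant) -> Optional[int]:
    """Return the Figure 2 group (1-4), or None if no rule applies.

    The precedence of the rules is an assumption of this sketch.
    """
    if v.kind == "repeat_expansion":
        return 2
    if v.kind == "sv" and v.length_bp > 50:
        return 3
    if v.has_homologous_pseudogene:
        return 4
    if v.kind == "snv" and (v.low_coverage_pcr_bias or v.intronic_offset_bp > 50):
        return 1
    return None


# Examples loosely modeled on variants named in the figure legends.
nf1_deep_intronic = Variant(kind="snv", intronic_offset_bp=751)  # c.1392+751T>G
atxn2_expansion = Variant(kind="repeat_expansion")
hsd17b3_deletion = Variant(kind="sv", length_bp=1600)            # ~1.6 kb deletion

print(challenge_group(nf1_deep_intronic),
      challenge_group(atxn2_expansion),
      challenge_group(hsd17b3_deletion))
```

A variant that meets none of the criteria returns `None`, meaning it would not count as technically challenging under these rules.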
