Int J Mol Sci. 2025 Mar 18;26(6):2725.
doi: 10.3390/ijms26062725.

ONT in Clinical Diagnostics of Repeat Expansion Disorders: Detection and Reporting Challenges

Ludmila Kaplun et al. Int J Mol Sci.

Abstract

While whole-genome sequencing (WGS) using short-read technology has become a standard diagnostic test, this technology has limitations in analyzing certain genomic regions, particularly short tandem repeats (STRs). These repetitive sequences are associated with over 50 diseases, primarily affecting neurological function, including Huntington disease, frontotemporal dementia, and Friedreich's ataxia. We analyzed 2689 cases with movement disorders and dementia-related phenotypes processed at Variantyx in 2023–2024 using a two-tiered approach: initial short-read WGS, followed by ONT long-read sequencing when necessary for variant characterization. Of the 2038 cases (75.8%) with clinically relevant genetic variants, 327 (16.0%) required additional long-read analysis. STR variants were reported in 338 cases (16.6% of positive cases), with approximately half requiring long-read sequencing for definitive classification. The combined approach enabled the precise determination of repeat length, composition, somatic mosaicism, and methylation status. Notable advantages included the detection of complex repeat structures in several genes, such as RFC1, FGF14, and FXN, where long-read sequencing made it possible to determine somatic repeat unit variations and to phase alleles accurately. Further studies are needed to establish technology-specific guidelines for the standardized interpretation of long-read sequencing data in the clinical diagnostics of repeat expansion disorders.
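
The percentages quoted in the abstract can be reproduced from the stated case counts. The following is a minimal sketch of that arithmetic; the variable names and the `pct` helper are illustrative assumptions and are not taken from the article.

```python
# Case counts as stated in the abstract.
total_cases = 2689        # all analyzed cases
positive_cases = 2038     # cases with clinically relevant variants
long_read_cases = 327     # positive cases that needed ONT follow-up
str_cases = 338           # positive cases with a reported STR variant


def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


positive_rate = pct(positive_cases, total_cases)       # share of all cases
long_read_rate = pct(long_read_cases, positive_cases)  # share of positive cases
str_rate = pct(str_cases, positive_cases)              # share of positive cases

print(positive_rate, long_read_rate, str_rate)
```

Note that the first percentage uses all 2689 cases as the denominator, while the other two are fractions of the 2038 positive cases.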

Keywords: Nanopore; ONT; WGS; ataxia; dementia; genetic testing; long reads; repeat expansion.

Conflict of interest statement

Authors Ludmila Kaplun, Greice Krautz-Peterson, Nir Neerman, Yocheved Schindler, Elinor Dehan, Claudia S. Huettner, Brett K. Baumgartner, Christine Stanley, and Alexander Kaplun were employed by the company Variantyx, Framingham, MA 01701, USA.

Figures

Figure 1
Diagnostic value of STR genotypes using ONT long-read sequencing. The x-axis lists STR loci; the y-axis shows each category's share for each STR, as a percentage of all detected STR variants combined.
Figure 2
RFC1, biallelic expansion. (a) Visualization of short-read sequencing results, showing reads flanking the STR regions and those fully contained within the repeat. The bioinformatically predicted genotype based on read statistics is 49/49 * repeats. (b) Visualization of ONT long-read sequencing data, depicting a biallelic expansion of >200/>830 repeats, primarily composed of AAGGG pathogenic repeat units with occasional interrupting sequences. Green-underlined sequences highlight the regions flanking the STR repeat. Due to read length limitations, most of the reads are not able to capture both flanking regions of this repeat expansion. Different repeat units are highlighted in different colors: AAGGG (red), AAAGG (blue), AGAAG (yellow). (c) Integrative Genomics Viewer (IGV) visualization of ONT long-read sequencing data over the RFC1 target region. Purple rectangles indicate insertions. * Results of bioinformatic prediction not depicted in the image.
Figure 3
FGF14, heterozygous variant. (a) Visualization of short-read sequencing results, showing reads flanking the STR regions, as well as those fully contained within the repeat. The bioinformatically predicted genotype based on read statistics is 36/105 * repeats. (b) Visualization of the ONT long-read sequencing results, depicting a heterozygous repeat expansion of 41/420–481 repeats. The expansion shows mosaicism in both length and sequence, with some reads incorporating long stretches of GGA repeats in addition to the canonical GAA. Green-underlined sequences highlight the regions flanking the STR repeat. Different repeat units are highlighted in different colors: GAA (yellow), GGA (blue). The sequence shown in the upper panel continues in the lower panel. Due to limitations of the space and the length of the reads, some of them do not have both flanks visible in the included image. * Results of bioinformatic prediction not depicted in the image.
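
To illustrate what "mosaicism in sequence" means at the level of a single read, here is a minimal sketch that tallies repeat units in the portion of a read lying between the flanks. It is not the authors' pipeline: the function name `motif_composition`, the greedy in-frame scan, and the toy read are all assumptions made for illustration.

```python
from collections import Counter


def motif_composition(repeat_seq: str, motifs: tuple) -> Counter:
    """Greedily walk the repeat tract and count each recognized unit.

    Bases that match none of the motifs at the current position are
    counted one at a time under the key "other".
    """
    counts = Counter()
    i = 0
    while i < len(repeat_seq):
        for motif in motifs:
            if repeat_seq.startswith(motif, i):
                counts[motif] += 1
                i += len(motif)
                break
        else:
            # No motif matched here: record one unclassified base.
            counts["other"] += 1
            i += 1
    return counts


# Toy read (hypothetical, not taken from the figure): canonical GAA
# units with an internal stretch of GGA units.
read = "GAA" * 5 + "GGA" * 3 + "GAA" * 2
comp = motif_composition(read, ("GAA", "GGA"))
print(comp["GAA"], comp["GGA"], comp["other"])
```

A greedy scan like this assumes the tract starts in frame; real nanopore reads contain insertions and deletions, so a production tool would need error-tolerant matching.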
Figure 4
FGF14, with a noncanonical GAAGGA repeat unit. (a) Visualization of short-read sequencing results showing reads flanking the STR regions as well as those fully contained within the repeat. The bioinformatically predicted genotype based on read statistics is 36/56 * repeats. (b) Visualization of the ONT long-read sequencing results, depicting a heterozygous repeat expansion with 35–37/318–342 repeats, where the shorter allele is composed of canonical GAA repeats, while the longer one is predominantly composed of GAAGGA repeats, followed by a short stretch of canonical GAA. Green underlined sequences highlight the regions flanking the STR repeat. Different repeat units are highlighted in different colors: GAA (yellow), GGA (blue). Due to limitations of the space and the length of the reads, some of them do not have both flanks visible in the included image. * Result of bioinformatic prediction, not depicted in the image.
Figure 5
FXN, biallelic repeat expansion. (a) Visualization of short-read sequencing results, showing reads flanking the STR regions as well as those fully contained within the repeat. The bioinformatically predicted genotype based on the read statistics is 76/76 * repeats. (b) Visualization of the ONT long-read sequencing results, depicting a biallelic repeat expansion with one allele of 92–104 repeats and another one of 649–901, composed mainly of GAA units (highlighted in yellow) with occasional GGA interruptions (highlighted in blue). The sequence shown in the upper panel continues in the lower panel. Green-underlined sequences indicate regions flanking the STR repeat. Due to limitations of the space and the length of the reads, some of them do not have both flanks visible in the included image. (c) Integrative Genomics Viewer (IGV) visualization of ONT long-read sequencing data over the FXN target region. Purple rectangles indicate insertions. * Results of bioinformatic prediction not depicted in the image.
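
Allele sizes reported as ranges (for example 92–104 and 649–901) summarize per-read repeat counts that vary because of somatic mosaicism. One simple way to derive such ranges, sketched below, is to split the sorted per-read counts at the largest gap and report the minimum and maximum of each group. This heuristic, the function name `allele_ranges`, and the per-read values are assumptions for illustration only; the article does not describe how its ranges were computed.

```python
def allele_ranges(read_counts):
    """Split per-read repeat counts into two groups at the largest gap
    and return ((min, max), (min, max)) for the shorter and longer allele.
    """
    counts = sorted(read_counts)
    if len(counts) < 2:
        raise ValueError("need at least two reads to separate two alleles")
    # Gap between each pair of neighboring counts in sorted order.
    gaps = [b - a for a, b in zip(counts, counts[1:])]
    split = gaps.index(max(gaps)) + 1
    short, long_ = counts[:split], counts[split:]
    return (short[0], short[-1]), (long_[0], long_[-1])


# Hypothetical per-read counts: only the extremes echo the ranges in the
# caption; the intermediate values are invented for this example.
reads = [97, 649, 104, 901, 92, 730]
short_range, long_range = allele_ranges(reads)
print(short_range, long_range)
```

The largest-gap split only makes sense when two well-separated alleles are present; it would mislabel a homozygous sample or one with overlapping mosaic distributions.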
Figure 6
FMR1, male, premutation. (a) Visualization of short-read sequencing results showing reads flanking the STR regions, as well as those fully contained within the repeat. The bioinformatically predicted genotype based on read statistics is 156 * repeats. (b) Visualization of the ONT long-read sequencing results, showing an expansion of 130–132 repeats (highlighted in yellow). Green-underlined sequences indicate regions flanking the STR repeat. The sequence shown in the upper panel continues in the lower panel. Due to limitations of the space and the length of the reads, some of them do not have both flanks visible in the included image. (c) Integrative Genomics Viewer (IGV) visualization of the ONT long-read sequencing results with cytosines in CpG context color coded according to their epigenetic status: red for methylated and blue for unmethylated cytosines. No 5mC methylation is observed in this region. * Results of bioinformatic prediction not depicted in the image.
Figure 7
FMR1 mosaic expansion, male. (a) Visualization of short-read sequencing results showing reads flanking the STR regions, as well as those fully contained within the repeat. The bioinformatically predicted genotype based on the read statistics is 81 * repeats. (b) Visualization of the ONT long-read sequencing results showing a total expansion length of 270–400 repeats, including 44–396 canonical CGG units and stretches of various noncanonical repeats in some of the reads. Green-underlined sequences indicate regions flanking the STR repeat. Different repeat units are highlighted in different colors: CGG (yellow), TGG (red), AGG (blue), CGCG (green). Due to limitations of the space and the length of the reads, some of them do not have both flanks visible in the included image. (c) Integrative Genomics Viewer (IGV) visualization of the ONT long-read sequencing results with cytosines in CpG context color coded according to their epigenetic status: red for methylated and blue for unmethylated cytosines. 5mC methylation is observed on the reads with expanded canonical repeats, but only in the absence of long stretches of TGG, AGG, CAA interruptions. * Results of bioinformatic prediction not depicted in the image.
Figure 8
ATXN8OS, heterozygous repeat expansion, complex region with CTA-CTG structure. (a) Visualization of short-read sequencing results showing reads flanking the STR regions, as well as those fully contained within the repeat. The bioinformatically predicted genotype based on the read statistics is 9 + 9/12 + 79 * repeats (CAT + CTG/CTA + CTG complex region). (b) Visualization of the ONT long-read sequencing results showing total expansion length and structure: 9 CAT + 9 CTG repeats (shorter, normal allele) and 12 CTA + 83–84 CTG repeats (expanded allele). Green-underlined sequences indicate regions flanking the STR repeat. Different repeat units are highlighted in different colors: CTG (yellow), CTA (blue), GTC (red). * Results of bioinformatic prediction not depicted in the image.
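
A compound structure such as "12 CTA + 83 CTG" is a run-length description of the repeat tract. The sketch below shows one way to obtain it from a sequence: tokenize the tract into known units, then collapse consecutive identical units. The function name `repeat_structure` and the synthetic sequence are assumptions for illustration; this is not the tool used in the article.

```python
import re
from itertools import groupby


def repeat_structure(seq: str, motifs: tuple) -> list:
    """Return [(unit, run_length), ...] for consecutive runs of known units.

    Caveat: bases that match none of the motifs are skipped silently by
    findall, so interruptions are not reported by this simple sketch.
    """
    pattern = re.compile("|".join(re.escape(m) for m in motifs))
    units = pattern.findall(seq)
    # Collapse each run of identical neighboring units into one entry.
    return [(unit, sum(1 for _ in run)) for unit, run in groupby(units)]


# Synthetic tract mirroring the expanded allele described in the caption
# (12 CTA units followed by 83 CTG units); not an actual read.
expanded = "CTA" * 12 + "CTG" * 83
structure = repeat_structure(expanded, ("CTA", "CTG"))
print(" + ".join(f"{n} {unit}" for unit, n in structure))
```

Printing the result in "count unit" form yields the same notation used in the caption, which makes it easy to compare short-read predictions with long-read observations.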
