Integrative analysis of structural variations using short-reads and linked-reads yields highly specific and sensitive predictions
- PMID: 33226985
- PMCID: PMC7721175
- DOI: 10.1371/journal.pcbi.1008397
Abstract
Genetic diseases are driven by aberrations of the human genome. Identifying such aberrations, including structural variations (SVs), is key to understanding these diseases. Conventional short-read whole-genome sequencing (cWGS) can identify SVs at base-pair resolution, but it uses only short-range information and suffers from a high false discovery rate (FDR). Linked-read sequencing (10XWGS) adds long-range information by linking short reads that originate from the same large DNA molecule. This can mitigate alignment-based artefacts, especially in repetitive regions, and should enable better prediction of SVs. However, an unbiased evaluation of this technology is not available. In this study, we performed a comprehensive analysis of different types and sizes of SVs predicted by both technologies and validated them with an independent PCR-based approach. SVs identified by both technologies were highly specific, while the validation rate dropped for events identified by only one technology. A particularly high FDR was observed for SVs found only by 10XWGS. To improve FDR and sensitivity, statistical models were trained for both technologies. Using our approach, we characterized SVs from the MCF7 cell line and a primary breast cancer tumor with high precision. This approach improves SV prediction and can therefore help in understanding the underlying genetics of various diseases.
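The comparison described in the abstract (calls shared by both technologies versus calls unique to one, each scored by validation rate) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the paper: the `SV` record, the 100 bp breakpoint tolerance used to decide that two calls describe the same event, and the toy coordinates. FDR is computed as the fraction of tested calls that failed validation.

```python
from typing import List, NamedTuple, Tuple


class SV(NamedTuple):
    """Hypothetical minimal SV record (not the paper's data model)."""
    chrom: str
    start: int
    end: int
    svtype: str


def same_event(a: SV, b: SV, tol: int = 100) -> bool:
    """Treat two calls as one event if type and chromosome agree and
    both breakpoints lie within `tol` bp (an assumed tolerance)."""
    return (
        a.chrom == b.chrom
        and a.svtype == b.svtype
        and abs(a.start - b.start) <= tol
        and abs(a.end - b.end) <= tol
    )


def partition(
    calls_a: List[SV], calls_b: List[SV], tol: int = 100
) -> Tuple[List[SV], List[SV], List[SV]]:
    """Split calls into (common, only in A, only in B).
    Common events are reported with the coordinates from A."""
    common = [a for a in calls_a if any(same_event(a, b, tol) for b in calls_b)]
    only_a = [a for a in calls_a if not any(same_event(a, b, tol) for b in calls_b)]
    only_b = [b for b in calls_b if not any(same_event(a, b, tol) for a in calls_a)]
    return common, only_a, only_b


def fdr(n_validated: int, n_tested: int) -> float:
    """False discovery rate among tested calls: 1 - validation rate."""
    if n_tested <= 0:
        raise ValueError("n_tested must be positive")
    return 1 - n_validated / n_tested


# Toy call sets with made-up coordinates, for illustration only.
cwgs_calls = [SV("chr1", 1000, 5000, "DEL"), SV("chr2", 200, 900, "DUP")]
linked_calls = [SV("chr1", 1020, 4990, "DEL"), SV("chr3", 50, 70000, "INV")]

common, cwgs_only, linked_only = partition(cwgs_calls, linked_calls)
print(len(common), len(cwgs_only), len(linked_only))
```

In practice, the validated and tested counts passed to `fdr` would come from the independent validation of each category, so the three categories can be compared on the same scale.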
Conflict of interest statement
I have read the journal's policy and the authors of this manuscript have the following competing interests: Ugur Sahin is co-founder and shareholder of TRON, co-founder and CEO of BioNTech SE.