SVFX: a machine learning framework to quantify the pathogenicity of structural variants
- PMID: 33168059
- PMCID: PMC7650198
- DOI: 10.1186/s13059-020-02178-x
Abstract
There is a lack of approaches for identifying pathogenic genomic structural variants (SVs) although they play a crucial role in many diseases. We present a mechanism-agnostic machine learning-based workflow, called SVFX, to assign pathogenicity scores to somatic and germline SVs. In particular, we generate somatic and germline training models, which include genomic, epigenomic, and conservation-based features, for SV call sets in diseased and healthy individuals. We then apply SVFX to SVs in cancer and other diseases; SVFX achieves high accuracy in identifying pathogenic SVs. Predicted pathogenic SVs in cancer cohorts are enriched among known cancer genes and many cancer-related pathways.
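The general idea of the workflow, training on feature vectors from SVs in diseased and healthy individuals and then scoring new SVs, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the three feature columns, the toy values, and the function names are hypothetical, and the abstract does not name the classifier, so a plain logistic regression trained by gradient descent stands in for whatever model SVFX actually uses.

```python
import math

# Hypothetical per-SV feature vectors: [conservation, epigenomic signal, gene overlap].
# These are hand-made toy values, not data from the paper.
DISEASE_SVS = [[0.9, 0.8, 1.0], [0.7, 0.9, 1.0], [0.8, 0.6, 0.0], [0.6, 0.7, 1.0]]
HEALTHY_SVS = [[0.1, 0.2, 0.0], [0.3, 0.1, 0.0], [0.2, 0.4, 1.0], [0.4, 0.3, 0.0]]


def sigmoid(z):
    """Map a real-valued score to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train(features, labels, epochs=2000, lr=0.5):
    """Fit logistic regression with batch gradient descent (illustrative stand-in)."""
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            # Prediction error for this SV under the current model.
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        # Average the gradients over the training set and take one step.
        weights = [w - lr * g / len(features) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(features)
    return weights, bias


def pathogenicity_score(model, x):
    """Return a score in (0, 1); higher means more disease-like under this toy model."""
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Label SVs from the diseased call set as 1 and from the healthy call set as 0.
X = DISEASE_SVS + HEALTHY_SVS
y = [1] * len(DISEASE_SVS) + [0] * len(HEALTHY_SVS)
model = train(X, y)

print(pathogenicity_score(model, [0.85, 0.8, 1.0]))  # SV resembling the disease set
print(pathogenicity_score(model, [0.15, 0.2, 0.0]))  # SV resembling the healthy set
```

In this toy setup an SV whose features resemble the disease call set receives a score above 0.5 and one resembling the healthy call set a score below 0.5. A real application would need many more SVs, carefully derived features, and held-out evaluation.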
Conflict of interest statement
The authors declare that they have no competing interests.