A framework to score the effects of structural variants in health and disease
- PMID: 35197310
- PMCID: PMC8997355
- DOI: 10.1101/gr.275995.121
Abstract
Although technological advances improved the identification of structural variants (SVs) in the human genome, their interpretation remains challenging. Several methods utilize individual mechanistic principles like the deletion of coding sequence or 3D genome architecture disruptions. However, a comprehensive tool using the broad spectrum of available annotations is missing. Here, we describe CADD-SV, a method to retrieve and integrate a wide set of annotations to predict the effects of SVs. Previously, supervised learning approaches were limited due to a small number and biased set of annotated pathogenic or benign SVs. We overcome this problem by using a surrogate training objective, the Combined Annotation Dependent Depletion (CADD) of functional variants. We use human- and chimpanzee-derived SVs as proxy-neutral and contrast them with matched simulated variants as proxy-deleterious, an approach that has proven powerful for short sequence variants. Our tool computes summary statistics over diverse variant annotations and uses random forest models to prioritize deleterious structural variants. The resulting CADD-SV scores correlate with known pathogenic and rare population variants. We further show that we can prioritize somatic cancer variants as well as noncoding variants known to affect gene expression. We provide a website and offline-scoring tool for easy application of CADD-SV.
© 2022 Kleinert and Kircher; Published by Cold Spring Harbor Laboratory Press.
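The training setup described in the abstract (observed SVs as proxy-neutral, matched simulated SVs as proxy-deleterious, summary statistics computed over annotations) can be illustrated with a minimal sketch. This is not the CADD-SV implementation: the function names, the single toy annotation track, and the choice of maximum and mean as summaries are all assumptions made for illustration, and "matched" is reduced here to preserving variant type and length. The rows it produces are the kind of feature table that could then be passed to a random forest classifier, which is not shown.

```python
import random
from statistics import mean

# Hypothetical helpers for illustration only; none of these names come from CADD-SV.

def summarize(track, start, end):
    """Summary statistics of a per-base annotation track over the half-open span [start, end)."""
    values = track[start:end]
    return {"max": max(values), "mean": mean(values)}

def simulate_matched(sv, chrom_length, rng):
    """Draw a simulated variant of the same type and length at a uniformly random position."""
    length = sv["end"] - sv["start"]
    start = rng.randrange(0, chrom_length - length + 1)
    return {"type": sv["type"], "start": start, "end": start + length}

def build_training_rows(observed, track, rng):
    """Pair each observed SV (label 0, proxy-neutral) with one simulated SV (label 1, proxy-deleterious)."""
    rows = []
    for sv in observed:
        sim = simulate_matched(sv, len(track), rng)
        rows.append({**summarize(track, sv["start"], sv["end"]), "label": 0})
        rows.append({**summarize(track, sim["start"], sim["end"]), "label": 1})
    return rows

# Toy data: an assumed conservation-like score, low in the first half of the
# region and high in the second half.
track = [0.0] * 50 + [1.0] * 50
observed = [
    {"type": "DEL", "start": 5, "end": 15},
    {"type": "DEL", "start": 20, "end": 40},
]
rows = build_training_rows(observed, track, random.Random(0))
for row in rows:
    print(row)
```

In this toy setup the observed deletions sit in the low-scoring half of the region, so their summaries are all zero, while the simulated variants can land anywhere. A classifier trained on such rows learns which annotation patterns are depleted among variants that survived selection, which is the surrogate objective the abstract describes.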
Similar articles
- StrVCTVRE: A supervised learning method to predict the pathogenicity of human genome structural variants. Am J Hum Genet. 2022 Feb 3;109(2):195-209. doi: 10.1016/j.ajhg.2021.12.007. Epub 2022 Jan 14. PMID: 35032432. Free PMC article.
- nanotatoR: a tool for enhanced annotation of genomic structural variants. BMC Genomics. 2021 Jan 6;22(1):10. doi: 10.1186/s12864-020-07182-w. PMID: 33407088. Free PMC article.
- CADD: predicting the deleteriousness of variants throughout the human genome. Nucleic Acids Res. 2019 Jan 8;47(D1):D886-D894. doi: 10.1093/nar/gky1016. PMID: 30371827. Free PMC article.
- Structural variant detection in cancer genomes: computational challenges and perspectives for precision oncology. NPJ Precis Oncol. 2021 Mar 2;5(1):15. doi: 10.1038/s41698-021-00155-6. PMID: 33654267. Free PMC article. Review.
- Structural variation in the 3D genome. Nat Rev Genet. 2018 Jul;19(7):453-467. doi: 10.1038/s41576-018-0007-0. PMID: 29692413. Review.
Cited by
- DeepSVP: integration of genotype and phenotype for structural variant prioritization using deep learning. Bioinformatics. 2022 Mar 4;38(6):1677-1684. doi: 10.1093/bioinformatics/btab859. PMID: 34951628. Free PMC article.
- Rare pathogenic structural variants show potential to enhance prostate cancer germline testing for African men. Res Sq [Preprint]. 2024 Jun 13:rs.3.rs-4531885. doi: 10.21203/rs.3.rs-4531885/v1. Update in: Nat Commun. 2025 Mar 10;16(1):2400. doi: 10.1038/s41467-025-57312-9. PMID: 38947031. Free PMC article. Updated. Preprint.
- Rare pathogenic structural variants show potential to enhance prostate cancer germline testing for African men. Nat Commun. 2025 Mar 10;16(1):2400. doi: 10.1038/s41467-025-57312-9. PMID: 40064858. Free PMC article.
- Rare diseases: human genome research is coming home. Cold Spring Harb Mol Case Stud. 2022 Mar 24;8(2):a006210. doi: 10.1101/mcs.a006210. Print 2022 Feb. PMID: 35332074. Free PMC article.
- TADA-a machine learning tool for functional annotation-based prioritisation of pathogenic CNVs. Genome Biol. 2022 Mar 1;23(1):67. doi: 10.1186/s13059-022-02631-z. PMID: 35232478. Free PMC article.