Forecasting risk gene discovery in autism with machine learning and genome-scale data
- PMID: 32165711
- PMCID: PMC7067874
- DOI: 10.1038/s41598-020-61288-5
Erratum in
- Author Correction: Forecasting risk gene discovery in autism with machine learning and genome-scale data. Sci Rep. 2020 Nov 26;10(1):20994. doi: 10.1038/s41598-020-77832-2. Sci Rep. 2020. PMID: 33244169 Free PMC article.
Abstract
Genetics has been one of the most powerful windows into the biology of autism spectrum disorder (ASD). It is estimated that a thousand or more genes may confer risk for ASD when functionally perturbed; however, only around 100 genes currently have sufficient evidence to be considered true "autism risk genes". Massive genetic studies now underway are producing data to implicate additional genes. This approach, although necessary, is costly and slow-moving, which makes identifying putative ASD risk genes from existing data vital. Here, we approach autism risk gene discovery as a machine learning problem, rather than a genetic association problem, by using genome-scale data as predictors to identify new genes with properties similar to those of established autism risk genes. This ensemble method, forecASD, integrates brain gene expression, heterogeneous network data, and previous gene-level predictors of autism association into an ensemble classifier that yields a single score indexing the evidence for each gene's involvement in the etiology of autism. We demonstrate that forecASD performs substantially better than previous predictors of autism association in three independent trio-based sequencing studies. By studying forecASD-prioritized genes, we show that forecASD is a robust indicator of a gene's involvement in ASD etiology, with diverse applications to gene discovery, differential expression analysis, eQTL prioritization, and pathway enrichment analysis.
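The ensemble idea described in the abstract, several evidence sources each scored separately and then combined into a single per-gene score, can be illustrated with a minimal stacking sketch. This is not the forecASD implementation: the abstract does not specify the learners, so plain logistic regression stands in for both the base and meta models, the data are synthetic, and every name here (`make_gene`, `views`, `final_scores`, the feature counts, the two-fold split) is a hypothetical choice made for illustration.

```python
import math
import random

def sigmoid(z):
    # Clamp the input so math.exp cannot overflow on extreme values.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=500):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                gw[j] += err * xi[j]
            gb += err
        w = [wj - lr * gj / n for wj, gj in zip(w, gw)]
        b -= lr * gb / n
    return w, b

def score(model, x):
    """Predicted probability that a feature vector belongs to the positive class."""
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def make_gene(is_risk, rng):
    # Synthetic stand-ins for the three evidence types named in the abstract.
    shift = 1.5 if is_risk else 0.0
    return {
        "expression": [rng.gauss(shift, 1.0) for _ in range(2)],
        "network": [rng.gauss(shift, 1.0) for _ in range(2)],
        "prior": [rng.gauss(shift, 1.0)],
    }

rng = random.Random(0)
labels = [1] * 40 + [0] * 40          # 1 = established risk gene, 0 = background
genes = [make_gene(y, rng) for y in labels]
views = ["expression", "network", "prior"]

# Stage 1: out-of-fold base scores, one logistic model per evidence type.
folds = [i % 2 for i in range(len(genes))]
oof = [[0.0] * len(views) for _ in genes]
for k in (0, 1):
    train = [i for i in range(len(genes)) if folds[i] != k]
    held = [i for i in range(len(genes)) if folds[i] == k]
    for v, view in enumerate(views):
        base = fit_logistic([genes[i][view] for i in train],
                            [labels[i] for i in train])
        for i in held:
            oof[i][v] = score(base, genes[i][view])

# Stage 2: a meta-model combines the base scores into one score per gene.
meta = fit_logistic(oof, labels)
final_scores = [score(meta, row) for row in oof]
```

Fitting the meta-model on held-out base scores, rather than on scores the base models produced for their own training genes, keeps the combined score from simply rewarding whichever base model overfits most.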
Conflict of interest statement
The authors declare no competing interests.
Similar articles
- Brain-specific functional relationship networks inform autism spectrum disorder gene prediction. Transl Psychiatry. 2018 Mar 6;8(1):56. doi: 10.1038/s41398-018-0098-6. Transl Psychiatry. 2018. PMID: 29507298 Free PMC article.
- "Guilt by association" is not competitive with genetic association for identifying autism risk genes. Sci Rep. 2021 Aug 5;11(1):15950. doi: 10.1038/s41598-021-95321-y. Sci Rep. 2021. PMID: 34354131 Free PMC article.
- A Bayesian framework to integrate multi-level genome-scale data for Autism risk gene prioritization. BMC Bioinformatics. 2022 Apr 22;23(1):146. doi: 10.1186/s12859-022-04616-y. BMC Bioinformatics. 2022. PMID: 35459094 Free PMC article.
- Genetics of autism spectrum disorder. Handb Clin Neurol. 2018;147:321-329. doi: 10.1016/B978-0-444-63233-3.00021-X. Handb Clin Neurol. 2018. PMID: 29325621 Review.
- Genetic architecture of autism spectrum disorder: Lessons from large-scale genomic studies. Neurosci Biobehav Rev. 2021 Sep;128:244-257. doi: 10.1016/j.neubiorev.2021.06.028. Epub 2021 Jun 21. Neurosci Biobehav Rev. 2021. PMID: 34166716 Review.
Cited by
- Safety and target engagement of an oral small-molecule sequestrant in adolescents with autism spectrum disorder: an open-label phase 1b/2a trial. Nat Med. 2022 Mar;28(3):528-534. doi: 10.1038/s41591-022-01683-9. Epub 2022 Feb 14. Nat Med. 2022. PMID: 35165451 Clinical Trial.
- Whole-genome sequencing in a family with twin boys with autism and intellectual disability suggests multimodal polygenic risk. Cold Spring Harb Mol Case Stud. 2018 Dec 17;4(6):a003285. doi: 10.1101/mcs.a003285. Print 2018 Dec. Cold Spring Harb Mol Case Stud. 2018. PMID: 30559312 Free PMC article.
- Integration of genome-scale data identifies candidate sleep regulators. Sleep. 2023 Feb 8;46(2):zsac279. doi: 10.1093/sleep/zsac279. Sleep. 2023. PMID: 36462188 Free PMC article.
- SFARI genes and where to find them; modelling Autism Spectrum Disorder specific gene expression dysregulation with RNA-seq data. Sci Rep. 2022 Jun 16;12(1):10158. doi: 10.1038/s41598-022-14077-1. Sci Rep. 2022. PMID: 35710789 Free PMC article.
- A Machine Learning Approach to Predicting Autism Risk Genes: Validation of Known Genes and Discovery of New Candidates. Front Genet. 2020 Sep 10;11:500064. doi: 10.3389/fgene.2020.500064. eCollection 2020. Front Genet. 2020. PMID: 33133139 Free PMC article.