A review of genetic variant databases and machine learning tools for predicting the pathogenicity of breast cancer
- PMID: 38149678
- PMCID: PMC10782903
- DOI: 10.1093/bib/bbad479
Abstract
Studies continue to uncover risk factors that contribute to breast cancer (BC) development, including genetic variants. Advances in machine learning, together with the large datasets generated by genetic sequencing, can now be used to predict BC pathogenicity. However, it is unclear which of the tools developed for pathogenicity prediction is best suited to predicting the impact and pathogenicity of variants. A significant challenge is determining the most suitable data source for each tool, since different tools can yield different predictions when given different data inputs. To this end, this work reviews genetic variant databases and tools used specifically for predicting BC pathogenicity. We describe existing genetic variant databases and, where appropriate, the diseases for which they were established. Through examples, we illustrate how they can be used to predict BC pathogenicity and discuss their advantages and disadvantages. We conclude that tools specialized by training on multiple diverse datasets from different databases for the same disease have enhanced accuracy and specificity, and are therefore more helpful to clinicians in predicting and diagnosing BC as early as possible.
Keywords: artificial intelligence; breast cancer; data science; genetic variants database; machine learning; pathogenicity prediction.
© The Author(s) 2023. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
