Recent advances of automated methods for searching and extracting genomic variant information from biomedical literature
- PMID: 32770181
- PMCID: PMC8138883
- DOI: 10.1093/bib/bbaa142
Abstract
Motivation: To obtain key information for personalized medicine and cancer research, clinicians and researchers in the biomedical field now need to search the biomedical literature for genomic variant information more than ever before. Because genomic variants are written in many different forms, however, it is difficult to locate the right information with a general literature search system. To address this difficulty, researchers have proposed various solutions based on automated literature-mining techniques. No study, however, has yet summarized and compared existing genomic variant literature-mining tools in terms of how easily they let users search the literature for genomic variant information.
Results: In this article, we systematically compare currently available genomic variant recognition and normalization tools, as well as the literature search engines that adopt these literature-mining techniques. First, we explain the problems caused by non-standard formats of genomic variants in the PubMed literature, using examples from the literature, and show how prevalent the problem is. Second, we review and systematically compare literature-mining tools that address the problem by recognizing and normalizing the various forms of genomic variants in the literature. Third, we present and compare existing literature search engines that are designed for genomic variant search and use these literature-mining techniques. We expect this work to be helpful for researchers who seek information about genomic variants from the literature, for developers who integrate genomic variant information from the literature, and beyond.
Keywords: genomic variant; literature mining; literature search; mutation; named-entity normalization; named-entity recognition.
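The recognition and normalization problem described in the abstract can be illustrated with a toy example. The sketch below is not the method of any tool reviewed in the article; it is a minimal illustration, assuming only simple protein substitutions written with one-letter or three-letter amino acid codes, with or without a "p." prefix. The names `PATTERN`, `to_three` and `find_variants` are hypothetical, and the three-letter "p." output form is chosen here only as a convenient common representation.

```python
import re

# Standard three-letter to one-letter amino acid abbreviations.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}
ONE_TO_THREE = {one: three for three, one in THREE_TO_ONE.items()}

AA3 = "|".join(THREE_TO_ONE)   # e.g. "Ala|Arg|..."
AA1 = "".join(ONE_TO_THREE)    # e.g. "ARND..."

# Hypothetical pattern: optional "p." prefix, reference residue,
# 1-based position, alternate residue. Case-sensitive on purpose.
PATTERN = re.compile(
    rf"\b(?:p\.)?(?P<ref>{AA3}|[{AA1}])(?P<pos>[1-9]\d*)(?P<alt>{AA3}|[{AA1}])\b"
)


def to_three(code):
    """Return the three-letter code for a one- or three-letter residue."""
    return code if code in THREE_TO_ONE else ONE_TO_THREE[code]


def find_variants(text):
    """Return (mention, normalized) pairs for substitutions found in text."""
    results = []
    for m in PATTERN.finditer(text):
        normalized = f"p.{to_three(m['ref'])}{m['pos']}{to_three(m['alt'])}"
        results.append((m.group(0), normalized))
    return results


if __name__ == "__main__":
    sentence = "BRAF V600E, also written p.Val600Glu or Val600Glu"
    for mention, normalized in find_variants(sentence):
        print(f"{mention} -> {normalized}")
```

All three mentions in the sample sentence map to the same normalized string, which is what lets a search engine treat them as one variant. A pattern this simple will also match unrelated tokens that happen to look like a substitution and misses DNA-level changes, rsIDs and free-text descriptions, which is part of why dedicated recognition and normalization tools exist.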
© The Author(s) 2020. Published by Oxford University Press.