Review
Brief Bioinform. 2021 May 20;22(3):bbaa142.
doi: 10.1093/bib/bbaa142.

Recent advances of automated methods for searching and extracting genomic variant information from biomedical literature

Kyubum Lee et al. Brief Bioinform. 2021.

Abstract

Motivation: To obtain key information for personalized medicine and cancer research, clinicians and researchers in the biomedical field now need to search the biomedical literature for genomic variant information more than ever before. Because genomic variants are written in many different forms, however, it is difficult to locate the right information with a general literature search system. To address this difficulty, researchers have proposed various solutions based on automated literature-mining techniques. No study, however, has summarized and compared existing tools for genomic variant literature mining in terms of how easily they let users search the literature for information on genomic variants.

Results: In this article, we systematically compare currently available genomic variant recognition and normalization tools, as well as the literature search engines that have adopted these literature-mining techniques. First, we explain the problems caused by the use of non-standard formats of genomic variants in the PubMed literature, using examples from the literature, and show the prevalence of the problem. Second, we review literature-mining tools that address the problem by recognizing and normalizing the various forms of genomic variants in the literature, and we systematically compare them. Third, we present and compare existing literature search engines that are designed for genomic variant search and use these literature-mining techniques. We expect this work to be helpful for researchers who seek information about genomic variants from the literature and for developers who integrate genomic variant information from the literature and beyond.

Keywords: genomic variant; literature mining; literature search; mutation; named-entity normalization; named-entity recognition.

Figures

Figure 1
Bodies of literature that contain genomic variants in PubMed and the PMC Open Access Subset (accessed in February 2020). Normalized forms of genomic variants (RSID + HGVS) account for only 25%. Note that, due to the PMC embargo policy, some full-text articles from 2019 are unavailable.
Figure 2
Comparison between a general search engine and search engines with NER for genomic variant search. Because genomic variants are written in various notations, a general search engine typically needs multiple queries to find the relevant publications. The NER module, however, can find and normalize genomic variants in both publications and query inputs, and can thus provide more complete and precise search results.
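
The idea in Figure 2 can be illustrated with a minimal sketch. It is not the method of any tool covered in the review; it is a toy regular-expression recognizer, written under the assumption that only protein-level substitutions (such as V600E or Val600Glu) need to be handled. The names `find_variants` and `search`, the pattern, and the example documents are all hypothetical. Each mention is rewritten to one HGVS-style key, and the same normalization is applied to the query, so a single query matches documents that use different notations.

```python
import re

# One-letter to three-letter codes for the 20 standard amino acids.
AA3 = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}

# Three-letter codes are listed first so "Val" is tried before "V".
_AA = "|".join(list(AA3.values()) + list(AA3.keys()))

# Toy pattern (an assumption, far narrower than real variant NER):
# optional "p." prefix, reference residue, position, alternate residue.
PATTERN = re.compile(
    rf"(?<![A-Za-z0-9])(?:p\.)?({_AA})(\d+)({_AA})(?![A-Za-z0-9])"
)


def _three_letter(code):
    """Return the three-letter code; three-letter input passes through."""
    return AA3.get(code, code)


def find_variants(text):
    """Recognize substitution mentions and normalize them to p.Xxx123Yyy."""
    return [
        f"p.{_three_letter(m.group(1))}{m.group(2)}{_three_letter(m.group(3))}"
        for m in PATTERN.finditer(text)
    ]


def search(query, docs):
    """Return ids of documents sharing a normalized variant with the query."""
    wanted = set(find_variants(query))
    return [
        doc_id for doc_id, text in docs.items()
        if wanted & set(find_variants(text))
    ]


# Hypothetical example documents using different notations.
docs = {
    "a": "BRAF V600E mutation was detected.",
    "b": "The p.Val600Glu substitution in BRAF was reported.",
    "c": "EGFR L858R was found in the tumor.",
}
print(search("Val600Glu", docs))  # prints ['a', 'b']
```

A plain string search for "Val600Glu" would match only document "b" here; normalizing both sides is what lets one query reach document "a" as well. Real systems also have to handle DNA-level variants, RSIDs, and free-text descriptions, which this sketch does not attempt.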
