[Preprint]. 2023 Sep 21:arXiv:2307.09683v3.

PubMed and Beyond: Biomedical Literature Search in the Age of Artificial Intelligence


Qiao Jin et al. ArXiv.


Abstract

Biomedical research yields a wealth of information, much of which is only accessible through the literature. Consequently, literature search is an essential tool for building on prior knowledge in clinical and biomedical research. Although recent improvements in artificial intelligence have expanded functionality beyond keyword-based search, these advances may be unfamiliar to clinicians and researchers. In response, we present a survey of literature search tools tailored to both general and specific information needs in biomedicine, with the objective of helping readers efficiently fulfill their information needs. We first examine the widely used PubMed search engine, discussing recent improvements and continued challenges. We then describe literature search tools catering to five specific information needs: (1) identifying high-quality clinical research for evidence-based medicine; (2) retrieving gene-related information for precision medicine and genomics; (3) searching by meaning, including natural language questions; (4) locating related articles with literature recommendation; and (5) mining literature to discover associations between concepts such as diseases and genetic variants. Additionally, we cover practical considerations and best practices for choosing and using these tools. Finally, we provide a perspective on the future of literature search engines, considering recent breakthroughs in large language models such as ChatGPT. In summary, our survey provides a comprehensive view of biomedical literature search functionalities with 36 publicly available tools.


Figures

Figure 1.
Overview of five specialized search scenarios in biomedicine: evidence-based medicine, precision medicine & genomics, semantic search, literature recommendation, and literature mining. Each search scenario is characterized by its unique input interface, search or ranking algorithm, and output display.
Figure 2.
The architecture of a search engine for evidence-based medicine (EBM). EBM search engines should incorporate PICO elements (Population, Intervention, Comparison, and Outcome) within the input query and rank the articles returned based on the quality of the evidence.
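The two ideas in this caption, a PICO-structured query and ranking by evidence quality, can be sketched in a few lines. This is a minimal illustration, not the design of any tool covered by the survey: the `EVIDENCE_RANK` hierarchy, the `PicoQuery` and `Article` classes, the title-only matching, and the toy articles are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical evidence hierarchy (higher = stronger evidence); an assumption
# for illustration, not a ranking taken from the survey.
EVIDENCE_RANK = {
    "systematic review": 4,
    "randomized controlled trial": 3,
    "cohort study": 2,
    "case report": 1,
}


@dataclass
class PicoQuery:
    population: str
    intervention: str
    comparison: str
    outcome: str

    def terms(self):
        """Return the non-empty PICO elements, lowercased."""
        fields = (self.population, self.intervention, self.comparison, self.outcome)
        return [f.lower() for f in fields if f]


@dataclass
class Article:
    title: str
    study_type: str


def search(query, articles):
    """Keep articles matching at least one PICO term; rank by evidence level,
    then by the number of matched PICO terms."""
    scored = []
    for article in articles:
        text = article.title.lower()
        matches = sum(term in text for term in query.terms())
        if matches:
            level = EVIDENCE_RANK.get(article.study_type, 0)
            scored.append((level, matches, article))
    scored.sort(key=lambda s: (s[0], s[1]), reverse=True)
    return [article for _, _, article in scored]


# Toy corpus with made-up titles.
articles = [
    Article("Aspirin versus placebo for stroke in diabetes: a case report", "case report"),
    Article("Aspirin for stroke prevention in diabetes: a systematic review", "systematic review"),
    Article("Statins and cholesterol: a cohort study", "cohort study"),
]
query = PicoQuery("diabetes", "aspirin", "placebo", "stroke")
results = search(query, articles)
```

Here the systematic review is ranked above the case report even though the case report matches more PICO terms, because evidence level is the primary sort key in this sketch.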
Figure 3.
Illustration of the functionality of a search engine for precision medicine and genomics. Search engines for precision medicine and genomics should handle queries containing genomic variants and identify all synonymous references to these variants in the literature.
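One small part of synonym handling for variants can be illustrated with protein-level substitutions, where three-letter and one-letter amino acid notations refer to the same change (for example, p.Val600Glu and V600E). The sketch below is an assumption-laden simplification: the function names and toy documents are hypothetical, and it does not attempt DNA-level names or dbSNP identifiers, which generally require a database lookup rather than string rewriting.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Protein substitutions in three-letter form (p.Val600Glu) and one-letter form (V600E).
PROTEIN_3 = re.compile(r"\b(?:p\.)?([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})\b")
PROTEIN_1 = re.compile(r"\b(?:p\.)?([ACDEFGHIKLMNPQRSTVWY])(\d+)([ACDEFGHIKLMNPQRSTVWY])\b")


def normalize_variants(text):
    """Return protein substitutions found in text, in one-letter form."""
    found = set()
    for ref, pos, alt in PROTEIN_3.findall(text):
        if ref in THREE_TO_ONE and alt in THREE_TO_ONE:
            found.add(f"{THREE_TO_ONE[ref]}{pos}{THREE_TO_ONE[alt]}")
    for ref, pos, alt in PROTEIN_1.findall(text):
        found.add(f"{ref}{pos}{alt}")
    return found


def search_variant(query, docs):
    """Return documents that mention any variant in the query, in either notation."""
    wanted = normalize_variants(query)
    return [d for d in docs if wanted & normalize_variants(d)]


# Toy documents with made-up titles.
docs = [
    "BRAF V600E in melanoma",
    "Response in p.Val600Glu carriers",
    "KRAS G12D in pancreatic cancer",
]
hits = search_variant("Val600Glu", docs)
```

A query written in three-letter notation retrieves documents that use either form, which a literal keyword match would miss.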
Figure 4.
Depiction of semantic search. Unlike traditional keyword-based search engines, semantic search engines process words and phrases according to their meaning rather than the literal text. For instance, “heart attack”, “AMI”, and “myocardial infarction” share similar meanings.
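The contrast between literal and meaning-based matching can be sketched with the caption's own example terms. The code assumes a tiny hand-written synonym table (`CONCEPTS`) and a made-up concept identifier; real semantic search engines typically rely on curated vocabularies or learned embeddings instead.

```python
import re

# Hypothetical synonym table mapping surface forms to a made-up concept ID.
CONCEPTS = {
    "heart attack": "C_MI",
    "ami": "C_MI",
    "myocardial infarction": "C_MI",
}


def to_concepts(text):
    """Return the concept IDs whose surface forms occur as whole words in text."""
    lowered = text.lower()
    return {
        concept
        for term, concept in CONCEPTS.items()
        if re.search(r"\b" + re.escape(term) + r"\b", lowered)
    }


def keyword_search(query, docs):
    """Literal matching: the query string must appear in the document."""
    return [d for d in docs if query.lower() in d.lower()]


def semantic_search(query, docs):
    """Concept matching: query and document must share at least one concept."""
    wanted = to_concepts(query)
    return [d for d in docs if wanted & to_concepts(d)]


# Toy documents with made-up titles.
docs = [
    "Outcomes after AMI in older adults",
    "Early management of myocardial infarction",
    "Vitamin D and bone density",
]
literal_hits = keyword_search("heart attack", docs)
semantic_hits = semantic_search("heart attack", docs)
```

The literal search finds nothing for "heart attack", while the concept-based search returns the first two documents. The word-boundary check keeps "Vitamin" from matching the abbreviation "AMI".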
Figure 5.
Illustration of topic-based and article-based literature recommendation systems. Topic-based systems provide articles relevant to a specific topic, while article-based systems return articles similar to a group of initial (seed) articles and dissimilar to a group of irrelevant articles.
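An article-based recommender of the kind described here can be approximated by scoring each candidate by its similarity to the seed articles minus its similarity to the irrelevant ones. The sketch below uses bag-of-words cosine similarity as a stand-in for whatever representation a real system would use; the function names and toy titles are assumptions for illustration.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts for a whitespace-tokenized, lowercased text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def mean_similarity(vec, group):
    """Average similarity of vec to every text in group (0.0 for an empty group)."""
    if not group:
        return 0.0
    return sum(cosine(vec, vectorize(g)) for g in group) / len(group)


def recommend(candidates, seeds, irrelevant, top_k=3):
    """Rank candidates: similar to the seeds, dissimilar to the irrelevant articles."""
    def score(text):
        vec = vectorize(text)
        return mean_similarity(vec, seeds) - mean_similarity(vec, irrelevant)
    return sorted(candidates, key=score, reverse=True)[:top_k]


# Toy inputs with made-up titles.
seeds = ["brca1 mutation breast cancer risk"]
irrelevant = ["breast cancer screening guidelines"]
candidates = [
    "brca1 mutation carriers and ovarian cancer risk",
    "breast cancer screening uptake",
    "influenza vaccine efficacy",
]
ranked = recommend(candidates, seeds, irrelevant)
```

In this toy run the BRCA1 article ranks first, and the screening article ranks last because it resembles the irrelevant example more than the seed.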
Figure 6.
The architecture of a system for mining entity associations from the biomedical literature. The system retrieves articles relevant to a given query, extracts biomedical entities and their relationships (e.g., variant-causing-disease), and presents the search results as a knowledge graph that visualizes the extracted entities and their relationships.
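The final step of such a pipeline, turning extracted relations into a graph, can be sketched as follows. The example assumes that retrieval and extraction have already produced (article, subject, relation, object) tuples; the tuples, entity names, and article identifiers are placeholders, not real findings.

```python
from collections import defaultdict

# Hypothetical output of an upstream extraction step; all values are placeholders.
extracted = [
    ("article-1", "variant A", "causes", "disease X"),
    ("article-2", "variant A", "causes", "disease X"),
    ("article-2", "gene G", "associated_with", "disease X"),
]


def build_graph(triples):
    """Map subject -> (relation, object) -> set of supporting article IDs."""
    graph = defaultdict(lambda: defaultdict(set))
    for article_id, subj, rel, obj in triples:
        graph[subj][(rel, obj)].add(article_id)
    return graph


def edges(graph):
    """Flatten the graph into (subject, relation, object, support) rows,
    ordered by the number of supporting articles, then by subject."""
    rows = [
        (subj, rel, obj, len(ids))
        for subj, targets in graph.items()
        for (rel, obj), ids in targets.items()
    ]
    return sorted(rows, key=lambda row: (-row[3], row[0]))


graph = build_graph(extracted)
edge_list = edges(graph)
```

Counting the supporting articles per edge gives a simple way to weight or filter relationships before they are drawn.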

