Systematic Evaluation of Research Progress on Natural Language Processing in Medicine Over the Past 20 Years: Bibliometric Study on PubMed
- PMID: 32012074
- PMCID: PMC7005695
- DOI: 10.2196/16816
Abstract
Background: Natural language processing (NLP) is a long-established field in computer science, but its application in medical research has faced many challenges. With the extensive digitalization of medical information globally and the increasing importance of understanding and mining big data in the medical field, NLP is becoming more crucial.
Objective: The goal of this research was to perform a systematic review of the use of NLP in medical research, with the aim of understanding global progress in NLP research outcomes, content, methods, and the study groups involved.
Methods: A systematic review was conducted using the PubMed database as a search platform. All published studies on the application of NLP in medicine (except biomedicine) during the 20 years between 1999 and 2018 were retrieved. The data obtained from these published studies were cleaned and structured. Excel (Microsoft Corp) and VOSviewer (Nees Jan van Eck and Ludo Waltman) were used to perform bibliometric analysis of publication trends, author orders, countries, institutions, collaboration relationships, research hot spots, diseases studied, and research methods.
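As a rough illustration of the retrieval step, the sketch below builds a date-restricted PubMed query URL for the NCBI E-utilities `esearch` endpoint. The abstract does not state the search string the authors used, so the MeSH-based term here is purely an assumption, as are the helper name `build_esearch_url` and the `retmax` value. The sketch only constructs the URL and does not send a request.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term, start_year=1999, end_year=2018, retmax=100):
    """Return an esearch URL limited to a publication-date window.

    Hypothetical helper: the search term is supplied by the caller and is
    NOT the query used in the study, which the abstract does not report.
    """
    params = {
        "db": "pubmed",             # search the PubMed database
        "term": term,               # caller-supplied query string
        "datetype": "pdat",         # filter on publication date
        "mindate": str(start_year),
        "maxdate": str(end_year),
        "retmax": retmax,           # maximum number of IDs returned
        "retmode": "json",
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


# Assumed example term; the authors' actual search strategy may differ.
url = build_esearch_url('"natural language processing"[MeSH Terms]')
print(url)
```

Records returned by such a query would still need the manual screening described above, since a keyword or MeSH match alone does not establish that a study meets the inclusion criteria.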
Results: A total of 3498 articles were obtained during initial screening, and 2336 articles were found to meet the study criteria after manual screening. The number of publications increased every year, with significant growth after 2012 (the number of publications ranged from 148 to a maximum of 302 annually). The United States has held the leading position since the inception of the field, contributing 63.01% (1472/2336) of all publications, followed by France (5.44%, 127/2336) and the United Kingdom (3.51%, 82/2336). The author with the largest number of published articles was Hongfang Liu (70), while Stéphane Meystre (17) and Hua Xu (33) published the largest number of articles as first and corresponding author, respectively. Among first authors' affiliated institutions, Columbia University published the largest number of articles, accounting for 4.54% (106/2336) of the total. Approximately one-fifth (17.68%, 413/2336) of the articles involved research on specific diseases, and the subject areas primarily focused on mental illness (16.46%, 68/413), breast cancer (5.81%, 24/413), and pneumonia (4.12%, 17/413).
Conclusions: NLP is in a period of robust development in the medical field, with an average of approximately 100 publications annually. Electronic medical records were the most used research materials, but social media such as Twitter have become important research materials since 2015. Cancer (24.94%, 103/413) was the most common subject area in NLP-assisted medical research on diseases, with breast cancer (23.30%, 24/103) and lung cancer (14.56%, 15/103) accounting for the highest proportions of studies. Columbia University and the researchers trained there were the most active and prolific research force in medical NLP.
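The proportions quoted in the Results and Conclusions can be recomputed from the counts given alongside them. The following is a minimal sketch of that check. The function name `pct` and the row labels are illustrative choices, and the counts are copied from the abstract rather than from the underlying dataset.

```python
def pct(numerator, denominator):
    """Percentage rounded to two decimals, matching the abstract's precision."""
    return round(100 * numerator / denominator, 2)


TOTAL = 2336    # articles meeting the study criteria
DISEASE = 413   # articles on specific diseases
CANCER = 103    # disease-specific articles on cancer

# (label, count, denominator, percentage reported in the abstract)
reported = [
    ("United States", 1472, TOTAL, 63.01),
    ("France", 127, TOTAL, 5.44),
    ("United Kingdom", 82, TOTAL, 3.51),
    ("Columbia University", 106, TOTAL, 4.54),
    ("Specific diseases", DISEASE, TOTAL, 17.68),
    ("Mental illness", 68, DISEASE, 16.46),
    ("Breast cancer (of disease articles)", 24, DISEASE, 5.81),
    ("Pneumonia", 17, DISEASE, 4.12),
    ("Cancer", CANCER, DISEASE, 24.94),
    ("Breast cancer (of cancer articles)", 24, CANCER, 23.30),
    ("Lung cancer (of cancer articles)", 15, CANCER, 14.56),
]

# Collect any row whose recomputed value differs from the reported one.
mismatches = [
    (label, pct(count, denom), stated)
    for label, count, denom, stated in reported
    if abs(pct(count, denom) - stated) > 0.005
]

for label, count, denom, stated in reported:
    print(f"{label}: {count}/{denom} = {pct(count, denom):.2f}% (reported {stated:.2f}%)")
print("Mismatches:", mismatches)
```

Note that breast cancer appears twice with different denominators: once as a share of all disease-specific articles (24/413) and once as a share of cancer articles (24/103).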
Keywords: clinical; electronic medical record; information extraction; medicine; natural language processing.
©Jing Wang, Huan Deng, Bangtao Liu, Anbin Hu, Jun Liang, Lingye Fan, Xu Zheng, Tong Wang, Jianbo Lei. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 23.01.2020.
Conflict of interest statement
Conflicts of Interest: None declared.