Emergence and evolution of big data science in HIV research: Bibliometric analysis of federally sponsored studies 2000-2019
- PMID: 34481301
- PMCID: PMC8529625
- DOI: 10.1016/j.ijmedinf.2021.104558
Abstract
Background: The rapid growth of inherently complex and heterogeneous data in HIV/AIDS research underscores the importance of Big Data Science. Big Data techniques have recently seen increasing uptake in the basic, clinical, and public health fields of HIV/AIDS research. However, no study has systematically described how the applications of Big Data in HIV/AIDS research have evolved. We sought to explore the emergence and evolution of Big Data Science in HIV/AIDS-related publications funded by US federal agencies.
Methods: We identified HIV/AIDS and Big Data related publications funded by seven federal agencies from 2000 to 2019 by integrating data from National Institutes of Health (NIH) ExPORTER, MEDLINE, and MeSH. Building on bibliometric and Natural Language Processing (NLP) methods, we constructed co-occurrence networks from the bibliographic metadata (e.g., countries, institutes, MeSH terms, and keywords) of the retrieved publications. We then detected clusters in the networks and traced their temporal dynamics, followed by expert evaluation and an assessment of clinical implications.
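The co-occurrence step described in the Methods can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not name the tools or the clustering algorithm used, so the records below are hypothetical toy data, the function names are invented for illustration, and connected components over thresholded edges stand in for whatever cluster detection the study actually applied.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy records; the study's real input was drawn from
# NIH ExPORTER, MEDLINE, and MeSH.
records = [
    {"year": 2012, "terms": ["HIV Infections", "Genomics", "Data Mining"]},
    {"year": 2016, "terms": ["HIV Infections", "Electronic Health Records", "Machine Learning"]},
    {"year": 2018, "terms": ["HIV Infections", "Machine Learning", "Electronic Health Records"]},
    {"year": 2019, "terms": ["Social Media", "HIV Infections"]},
]


def cooccurrence_edges(records):
    """Count how often each unordered pair of terms appears in the same record."""
    edges = Counter()
    for rec in records:
        # sorted(set(...)) drops duplicates and gives each pair a canonical order
        for a, b in combinations(sorted(set(rec["terms"])), 2):
            edges[(a, b)] += 1
    return edges


def clusters(edges, min_weight=1):
    """Group terms into connected components over edges with weight >= min_weight.

    Assumption: a simple stand-in for cluster detection, not the study's method.
    """
    neighbors = {}
    for (a, b), weight in edges.items():
        if weight >= min_weight:
            neighbors.setdefault(a, set()).add(b)
            neighbors.setdefault(b, set()).add(a)
    seen, groups = set(), []
    for start in neighbors:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:  # depth-first walk of one component
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbors[node] - group)
        seen |= group
        groups.append(group)
    return groups


edges = cooccurrence_edges(records)
strong = clusters(edges, min_weight=2)  # keep only pairs seen at least twice
```

Raising `min_weight` prunes incidental pairings so that only repeatedly co-occurring terms remain linked; repeating the computation on records filtered by `year` would give a rough view of how clusters change over time.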
Results: We retrieved nearly 600,000 publications related to HIV/AIDS, of which 19,528 relating to Big Data were included in the bibliometric analysis. Results showed that (1) the number of Big Data publications has been increasing since 2000, (2) US institutes have collaborated closely with those in China, Canada, and Germany, (3) some institutes (e.g., University of California system, MD Anderson Cancer Center, and Harvard Medical School) are among the most productive and began using Big Data in HIV/AIDS research early, (4) Big Data research was not active in public health disciplines until 2015, and (5) research topics such as genomics, HIV comorbidities, population-based studies, Electronic Health Records (EHR), social media, and precision medicine, and methodologies such as machine learning, deep learning, radiomics, and data mining, have emerged quickly in recent years.
Conclusions: We identified rapid growth in cross-disciplinary research on HIV/AIDS and Big Data over the past two decades. Our findings reveal patterns and trends in prevailing research topics and Big Data applications in HIV/AIDS research and point to several fast-evolving areas of Big Data Science in this field, including secondary analysis of EHR, machine learning, deep learning, predictive analysis, and NLP.
Keywords: AIDS; Bibliometrics; Big data; Data mining; Electronic health records; HIV; PLWH.
Copyright © 2021 Elsevier B.V. All rights reserved.
Conflict of interest statement
None declared.