A detailed open access model of the PubMed literature
- PMID: 33219227
- PMCID: PMC7680135
- DOI: 10.1038/s41597-020-00749-y
Abstract
Portfolio analysis is a fundamental practice of organizational leadership and a necessary precursor of strategic planning. Successful application requires a highly detailed model of research options. We have constructed a model, the first of its kind, that accurately characterizes these options for the biomedical literature. The model comprises over 18 million PubMed documents from 1996 to 2019. Document relatedness was measured using a hybrid approach that combines citation analysis with text similarity. The resulting 606.6 million document-to-document links were used to create 28,743 document clusters and an associated visual map. Clusters are characterized using metadata (e.g., phrases, MeSH) and over 20 indicators (e.g., funding, patent activity). The map and cluster-level data are embedded in Tableau to provide an interactive model enabling in-depth exploration of a research portfolio. Two example use cases are provided: the first identifies specific research opportunities related to coronavirus, and the second identifies research strengths of a large cohort of African American and Native American researchers at the University of Michigan Medical School.
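As a rough illustration of what a hybrid relatedness measure can look like, the sketch below scores document pairs by mixing a direct-citation indicator with a bag-of-words cosine similarity. The abstract does not specify the authors' actual formula, so everything here is an assumption made for illustration: the function names `cosine` and `hybrid_relatedness`, the mixing weight `alpha`, the whitespace tokenization, and the three toy documents are all hypothetical.

```python
from collections import Counter
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def hybrid_relatedness(docs, citations, alpha=0.5):
    """Score each unordered document pair.

    docs      -- dict mapping document id to its text
    citations -- set of (citing_id, cited_id) pairs
    alpha     -- hypothetical weight on the citation signal (not from the paper)
    """
    vectors = {doc_id: Counter(text.lower().split()) for doc_id, text in docs.items()}
    ids = sorted(docs)
    scores = {}
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            # Direct citation in either direction counts as a link.
            cited = 1.0 if (a, b) in citations or (b, a) in citations else 0.0
            text_sim = cosine(vectors[a], vectors[b])
            score = alpha * cited + (1 - alpha) * text_sim
            if score > 0:
                scores[(a, b)] = score  # keep only pairs with some relatedness
    return scores


# Toy data, invented for this example.
docs = {
    "A": "coronavirus spike protein structure",
    "B": "coronavirus spike protein vaccine",
    "C": "soil microbiome nitrogen cycle",
}
citations = {("B", "A")}

links = hybrid_relatedness(docs, citations)
for pair, score in sorted(links.items()):
    print(pair, round(score, 3))
```

In this toy setup only documents A and B end up linked, because they share both a citation and vocabulary, while C has neither signal in common with the others. Links of this kind would then feed a clustering step; scoring all pairs, as done here, is only feasible for tiny inputs and would not scale to millions of documents.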
Conflict of interest statement
Two of the authors (K.W.B. and R.K.) are employed by a small company that received the award mentioned above under which this work was funded. The authors declare no competing interests.