J Am Med Inform Assoc. 2015 Nov;22(6):1153-63.
doi: 10.1093/jamia/ocv157.

Trends in biomedical informatics: automated topic analysis of JAMIA articles

Dong Han et al. J Am Med Inform Assoc. 2015 Nov.

Abstract

Biomedical informatics is a growing interdisciplinary field in which research topics and citation trends have been evolving rapidly in recent years. To analyze these data in a fast, reproducible manner, automation of certain processes is needed. JAMIA is a "generalist" journal for biomedical informatics, and its articles reflect the wide range of topics in informatics. In this study, we retrieved Medical Subject Headings (MeSH) terms and citations of JAMIA articles published between 2009 and 2014. We used tensors (i.e., multidimensional arrays) to represent the interaction among topics, time, and citations, and applied tensor decomposition to automate the analysis. The trends represented by tensors were then carefully interpreted, and the results were compared with previous findings based on manual topic analysis. A list of the most cited JAMIA articles, their topics, and publication trends over recent years is presented. The analyses confirmed previous studies and showed that, from 2012 to 2014, articles related to the MeSH terms Methods, Organization & Administration, and Algorithms increased significantly in both number of publications and number of citations. Citation trends varied widely by topic, with Natural Language Processing having a large number of citations in particular years, and Medical Record Systems, Computerized remaining a very popular topic in all years.

Keywords: bioinformatics; biomedical informatics; citation analysis; medical subject headings; tensor factorization.

Figures

Figure 1:
An example of a three-dimensional tensor and its decomposition using tensor factorization, where the tensor dimensions are time (in 6-month periods), citations per month (CPM), and MeSH terms. The extracted tensor factors (Types 1, 2,…, R) represent the interaction among these dimensions, and the parameter λi for i = 1, 2,…, R denotes the weight of each decomposed tensor factor. A larger value of λi corresponds to a more important factor (λ1 is the highest value and determines the most important tensor factor, which is similar in interpretation to the first component in principal component analysis). The bias tensor factor captures the baseline characteristics of the articles. The weighted combination of these tensor factors (depicted by the + operators) can be used to approximate (depicted by the ≈ operator) the original tensor.
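The weighted combination described in this caption can be illustrated with a minimal sketch. It assumes each tensor factor is a rank-one product of one time vector, one CPM vector, and one MeSH vector, and it leaves out the bias tensor factor. The function names and all numeric values are hypothetical and are not taken from the article.

```python
# Illustrative sketch only: rebuild a 3-way tensor (time x CPM x MeSH)
# as a weighted sum of rank-one factors. All values below are made up.

def outer3(a, b, c):
    """Rank-one tensor whose entry [i][j][k] equals a[i] * b[j] * c[k]."""
    return [[[x * y * z for z in c] for y in b] for x in a]

def cp_reconstruct(weights, time_f, cpm_f, mesh_f):
    """Sum lambda_r * (time_r o cpm_r o mesh_r) over the R factors."""
    n_time, n_cpm, n_mesh = len(time_f[0]), len(cpm_f[0]), len(mesh_f[0])
    total = [[[0.0] * n_mesh for _ in range(n_cpm)] for _ in range(n_time)]
    for lam, a, b, c in zip(weights, time_f, cpm_f, mesh_f):
        component = outer3(a, b, c)
        for i in range(n_time):
            for j in range(n_cpm):
                for k in range(n_mesh):
                    total[i][j][k] += lam * component[i][j][k]
    return total

# Toy setup: 2 time periods, 2 CPM bins, 3 MeSH terms, R = 2 factors.
weights = [3.0, 1.0]                         # lambda_1 is the largest weight
time_f = [[1.0, 0.0], [0.0, 1.0]]            # one time vector per factor
cpm_f = [[0.5, 0.5], [1.0, 0.0]]             # one CPM vector per factor
mesh_f = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]  # one MeSH vector per factor

approx = cp_reconstruct(weights, time_f, cpm_f, mesh_f)
```

In this toy setup the first factor, with the larger weight, contributes most of the mass of the reconstructed tensor, which mirrors the role the caption assigns to λ1.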
Figure 2:
Breakdown of the percentage of articles published in each year (PAPY) in line plots (right vertical axis), as well as of the average number of citations per article (ACPA) in bar plots (left vertical axis), grouped by the year in which citations are observed (x-axis). Subfigure (a) depicts all articles, and subfigures (b–j) depict articles in the nine most frequent topics. In subfigure (a), PAPY* is defined as (no. of articles published in a given year)/(total no. of articles published from 2009 to 2014). In subfigures (b–j), PAPY is defined as (no. of articles published under the given topic in a given year)/(total no. of articles published in the same year).
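The two PAPY definitions in this caption differ only in their denominators, which a short sketch can make concrete. The function names and the article counts below are hypothetical placeholders, not figures from the study, and the sketch returns fractions rather than percentages.

```python
# Illustrative sketch only: the counts are invented, not data from the article.

def papy_star(articles_by_year):
    """PAPY*: articles in a year / all articles across the whole period."""
    total = sum(articles_by_year.values())
    return {year: n / total for year, n in articles_by_year.items()}

def papy_topic(topic_by_year, articles_by_year):
    """PAPY for one topic: topic articles in a year / all articles that year."""
    return {
        year: topic_by_year.get(year, 0) / n_all
        for year, n_all in articles_by_year.items()
    }

# Hypothetical yearly article counts for 2009 to 2014.
all_articles = {2009: 100, 2010: 100, 2011: 150, 2012: 150, 2013: 250, 2014: 250}
# Hypothetical counts for a single topic (years without articles omitted).
topic_articles = {2012: 30, 2013: 50, 2014: 25}

overall_share = papy_star(all_articles)
topic_share = papy_topic(topic_articles, all_articles)
```

With these placeholder counts, PAPY* values sum to 1 across the six years, whereas the per-topic PAPY values need not, because each year is normalized by its own total.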
Figure 3:
Histogram of articles with different citations per month (CPM) values ranging from 0 to 3.05 in the nine most frequent topics, where CPM values on a logarithmic scale were evenly divided into 11 bins.
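One way to divide CPM values evenly on a logarithmic scale is sketched below. The caption does not say how a CPM of 0 was handled, so the log(1 + CPM) transform used here is an assumption, as are the function names and the sample CPM values.

```python
import math

# Illustrative sketch only. Assumption: log(1 + CPM) is used so that CPM = 0
# is defined; the source does not state which transform was applied.

def log_bin_edges(max_cpm, n_bins):
    """Bin edges evenly spaced in log(1 + CPM), mapped back to CPM units."""
    top = math.log1p(max_cpm)
    return [math.expm1(top * i / n_bins) for i in range(n_bins + 1)]

def bin_index(cpm, edges):
    """Index of the bin containing cpm; the maximum goes in the last bin."""
    for i in range(len(edges) - 1):
        if cpm < edges[i + 1]:
            return i
    return len(edges) - 2

def histogram(cpm_values, edges):
    """Count how many CPM values fall in each bin."""
    counts = [0] * (len(edges) - 1)
    for value in cpm_values:
        counts[bin_index(value, edges)] += 1
    return counts

edges = log_bin_edges(3.05, 11)          # range and bin count from the caption
sample_cpm = [0.0, 0.1, 0.1, 0.8, 3.05]  # hypothetical CPM values
counts = histogram(sample_cpm, edges)
```

Under this assumption the bins are narrow near 0 and widen toward 3.05, which keeps the many low-CPM articles from collapsing into a single bar.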
Figure 4:
The top 5 MeSH terms with corresponding non-normalized weights based on 1417 MeSH terms, obtained using time-guided tensor factorization. The top 10 MeSH terms in the bias tensor, with their non-normalized weights, are also provided as reference. The bias tensor represents the baseline characteristics that are common across articles. Note that Electronic Health Records only entered the MeSH vocabulary in 2010; previously, relevant articles would have been indexed under Medical Record Systems, Computerized. EHR: Electronic Health Records; MRS: Medical Record Systems, Computerized; NLP: Natural Language Processing; AI: Artificial Intelligence; ISR/Methods: Information Storage and Retrieval/Methods; ID: Information Dissemination; DSS: Decision Support Systems, Clinical; DM: Data Mining/Methods/Classification; MOES: Medical Order Entry Systems; MRL: Medical Record Linkage; PCC: Patient-Centered Care; PPP: Physician’s Practice Patterns; PP: Pharmaceutical Preparations; UCI: User-Computer Interaction; O&A: Organization & Administration.

