Int J Biol Sci. 2019 Aug 6;15(10):2065-2074.
doi: 10.7150/ijbs.35743. eCollection 2019.

Trends in Alzheimer's Disease Research Based upon Machine Learning Analysis of PubMed Abstracts

Renchu Guan et al. Int J Biol Sci.

Abstract

About 29.8 million people worldwide had been diagnosed with Alzheimer's disease (AD) in 2015, and the number is projected to triple by 2050. In 2018, AD was the fifth leading cause of death among Americans aged 65 years or older, yet progress in AD drug research has been very limited. Identifying the key factors and research trends of AD would help guide more effective future studies. We proposed a framework named LDAP, which combines the latent Dirichlet allocation (LDA) model and the affinity propagation (AP) algorithm to extract research topics from 95,876 AD-related papers published from 2007 to 2016. Trend and hotspot analyses were performed on the LDAP results. We found that the focal points of AD research over the past 10 years include 15 diseases; 15 amino acids, peptides, and proteins; 9 enzymes and coenzymes; 7 hormones; 7 carbohydrates; 5 lipids; 2 organophosphonates; 18 chemicals; 11 compounds; 13 symptoms; and 20 phenomena. The LDAP framework allowed us to trace the evolution of research trends and the most popular areas of interest (hotspots) for diseases, proteins, symptoms, and phenomena. In addition, 556 AD-related genes were identified, which are enriched in 12 KEGG pathways, including the AD pathway and the nitrogen metabolism pathway. Our results are freely available at https://www.keaml.cn/Alzheimer.

Keywords: Affinity Propagation; Alzheimer's disease; Latent Dirichlet Allocation.
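
The clustering half of the LDAP idea (grouping LDA topics with affinity propagation) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the LDA step has already produced topic-word distributions, and the six toy topics, the six-word vocabulary they imply, the negative squared Euclidean similarity, the median preference, and the damping and iteration settings are all assumptions made for illustration.

```python
from statistics import median


def neg_sq_dist(u, v):
    """Similarity assumed here: negative squared Euclidean distance."""
    return -sum((a - b) ** 2 for a, b in zip(u, v))


def affinity_propagation(S, damping=0.5, iterations=200):
    """Cluster n items from an n x n similarity matrix S.

    The diagonal of S holds the preference of each item to be an exemplar.
    Returns labels, where labels[i] is the index of item i's exemplar.
    """
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)
    for _ in range(iterations):
        # r(i, k) = s(i, k) - max over j != k of [a(i, j) + s(i, j)]
        for i in range(n):
            for k in range(n):
                rival = max(A[i][j] + S[i][j] for j in range(n) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - rival)
        # a(k, k) = sum of positive r(j, k) over j != k
        # a(i, k) = min(0, r(k, k) + that sum without item i's own term)
        for k in range(n):
            pos = [max(0.0, R[j][k]) for j in range(n)]
            total = sum(pos) - pos[k]
            for i in range(n):
                new = total if i == k else min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    score = [R[k][k] + A[k][k] for k in range(n)]
    exemplars = [k for k in range(n) if score[k] > 0]
    if not exemplars:  # nothing emerged: fall back to the best single candidate
        exemplars = [max(range(n), key=lambda k: score[k])]
    # Exemplars label themselves; every other item joins its most similar exemplar.
    return [i if i in exemplars else max(exemplars, key=lambda k: S[i][k])
            for i in range(n)]


# Hypothetical topic-word distributions, standing in for LDA output.
# Each row is one topic; each column is the weight of one vocabulary word.
topics = [
    [0.50, 0.40, 0.10, 0.00, 0.00, 0.00],
    [0.45, 0.45, 0.10, 0.00, 0.00, 0.00],
    [0.40, 0.50, 0.10, 0.00, 0.00, 0.00],
    [0.00, 0.00, 0.10, 0.60, 0.20, 0.10],
    [0.00, 0.00, 0.10, 0.50, 0.30, 0.10],
    [0.00, 0.00, 0.10, 0.40, 0.40, 0.10],
]

n = len(topics)
S = [[neg_sq_dist(topics[i], topics[j]) for j in range(n)] for i in range(n)]
# Assumed preference: the median of the off-diagonal similarities.
preference = median(S[i][j] for i in range(n) for j in range(n) if i != j)
for i in range(n):
    S[i][i] = preference

labels = affinity_propagation(S)
print(labels)
```

Affinity propagation does not need the number of clusters in advance; the preference value on the diagonal controls how many exemplars emerge, which is one plausible reason to pair it with LDA when the number of research themes is unknown.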

Conflict of interest statement

Competing Interests: The authors have declared that no competing interest exists.

Figures

Figure 1
Proposed framework. The framework consists of data processing, methods, and trend analyses. Data downloaded from PubMed were pre-processed and fed into an LDA model. The AP algorithm was used to cluster the topics generated by LDA. Trend analyses were then performed on the resulting topic words.
Figure 2
Paper counts on AD from 2007 to 2016. The number of papers on AD increased over time; the ten-year total is 95,876.
Figure 3
AD research topics of 2016. The LDAP results contain 14 clusters, each represented by 20 words. In the word cloud, word size reflects weight: the more weight a word carries, the more prominent it is in AD terminology.
Figure 4
Disease associations with AD in each year. Colors represent categories of diseases. Neurotoxicity, stroke, and diabetes each belong to two disease categories.
Figure 5
AD hotspots across 15 diseases. Arrows indicate the different diseases, and the word clouds show their hotspots.
Figure 6
Hotspot words on proteins in each year. Colors represent the frequencies of different kinds of proteins. The numbers in braces are the frequencies with which the proteins appeared.
Figure 7
Hotspot words on symptoms and phenomena in each year. Colors represent categories of symptoms or phenomena.
Figure 8
Pathway of Alzheimer's disease. The genes we uploaded are marked with red stars. Genes shown in red are labeled by KEGG and also appear in our results.
Figure 9
Key topics of AD from 2016 to 2018. (A) 2016 key topics. (B) 2017 key topics. (C) 2018 key topics. Word size reflects word weight: the higher the weight, the larger the word.
Figure 10
Proportion of each hotspot category from 2007 to 2018. Colors represent categories. The height of each color band indicates the proportion of that category.
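
The per-year category proportions described for Figure 10 can be computed along the following lines. This is only a sketch under stated assumptions: the function name `category_proportions`, the input layout (a mapping from year to the category label of each hotspot word), and the toy labels are hypothetical and do not come from the paper's data.

```python
from collections import Counter


def category_proportions(hotspots_by_year):
    """Return, for each year, the share of hotspot words in each category."""
    proportions = {}
    for year, categories in sorted(hotspots_by_year.items()):
        counts = Counter(categories)
        total = sum(counts.values())
        if total == 0:  # a year with no hotspot words has no proportions
            proportions[year] = {}
            continue
        proportions[year] = {cat: n / total for cat, n in counts.items()}
    return proportions


# Hypothetical input: one category label per hotspot word found in that year.
hotspots_by_year = {
    2007: ["disease", "protein", "protein", "symptom"],
    2008: ["disease", "disease", "protein", "phenomenon", "phenomenon"],
}

result = category_proportions(hotspots_by_year)
for year, shares in result.items():
    print(year, shares)
```

Because the shares within a year sum to 1, stacking them per year gives the kind of proportional band chart the caption describes.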
