Front Genet. 2019 Feb 14:10:70.
doi: 10.3389/fgene.2019.00070. eCollection 2019.

Global Genetics Research in Prostate Cancer: A Text Mining and Computational Network Theory Approach

Md Facihul Azam et al. Front Genet. 2019.

Abstract

Prostate cancer is the most common cancer type in men in Finland and the second most common worldwide. In this paper, we analyze almost 150,000 published papers about prostate cancer, authored by tens of thousands of scientists worldwide, with an integrated text mining and computational network theory approach. We demonstrate how to integrate text mining with network analysis to investigate the research contributions of countries and the collaborations within and between countries. Furthermore, we study the time evolution of individually and collectively studied genes. Finally, we investigate a collaboration network of Finland and compare the genes studied there with globally studied genes in prostate cancer genetics. Overall, our results provide a global overview of prostate cancer research in genetics. In addition, we present a specific discussion for Finland. Our results shed light on trends within the last 30 years and are useful for translational researchers across the full range from genetics to public health management and health policy.

Keywords: biomedical text mining; computational network theory; genetics; meta-analysis; natural language processing; network science; prostate cancer; text mining.
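
The counting of collaborations within and between countries mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the toy publication list, the function names, and the choice of denominator for the self-collaboration percentage (all publications involving the country) are assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy input: one set of author countries per publication.
publications = [
    {"Finland"},
    {"Finland", "USA"},
    {"USA"},
    {"USA", "Germany", "Finland"},
]

def collaboration_counts(pubs):
    """Count single-country publications and country pairs that share a publication."""
    self_counts = Counter()
    pair_counts = Counter()
    for countries in pubs:
        if len(countries) == 1:
            # All authors come from the same country: a self-collaboration.
            self_counts[next(iter(countries))] += 1
        # Every unordered pair of countries on one paper is one edge occurrence.
        for a, b in combinations(sorted(countries), 2):
            pair_counts[(a, b)] += 1
    return self_counts, pair_counts

def self_collaboration_pct(pubs, country):
    """Percentage of a country's publications whose authors are all from that country.

    Assumption: the denominator is every publication involving the country.
    """
    involving = [c for c in pubs if country in c]
    if not involving:
        return 0.0
    own = sum(1 for c in involving if len(c) == 1)
    return 100.0 * own / len(involving)

self_counts, pair_counts = collaboration_counts(publications)
finland_pct = self_collaboration_pct(publications, "Finland")
```

The pair counts can serve as edge weights of a country-level collaboration network, with the self-collaboration counts attached to the nodes.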

Figures

Figure 1
(A) Entity recognition step using the becas web API giving an annotation of PubMed abstracts (Nunes et al., 2013). (B) Data collection steps and the corresponding publications found.
Figure 2
The process used to identify the country associated with an author.
Figure 3
The number of published articles about prostate cancer in the world and prostate cancer statistics in Finland. The blue curve shows the number of published articles about prostate cancer in genetics and the violet curve shows the number of published articles about prostate cancer outside genetics. The mortality rate (red) and the incidence rate (green) per 100,000 are shown for Finland.
Figure 4
(A) Country-specific contributions to prostate cancer research from 12 countries. Dark blue corresponds to results for prostate cancer research in genetics and light blue corresponds to results for general prostate cancer research. The percentages in blue give the country-specific contributions to genetics research. (B) Trends of genetics research from 1987 to 2018 showing the percentage of genetics research. The time intervals correspond to A: 1987–1996, B: 1997–2006, and C: 2007–2018.
Figure 5
(A) Summary of worldwide collaborations between countries for publications about prostate cancer research in genetics. (B–J) Pairwise collaborations between the top nine countries. Self-collaborations give the percentage of publications with all authors from the same country.
Figure 6
Collaborations between Finland and top 10 countries in prostate cancer research. (A,B) show results about prostate cancer research in general and (C,D) about prostate cancer research in genetics.
Figure 7
Results for the 19 most frequently studied genes in prostate cancer research from 1987 to 2017. (A) Genes studied worldwide. (B) Genes studied in Finland.
Figure 8
Literature-based gene-gene network for Finland. (A) Twenty most frequently studied genes in Finland. (B) Community network resulting from all studied genes in Finland.
Figure 9
Collaboration network of Finland for genetics research in prostate cancer. Nodes correspond to cities and edges correspond to jointly published articles. The numbers provide information about the number of collaborations between cities (white background) and self-collaborations within cities (yellow background).
Figure 10
Gene set enrichment analysis for the three cities in Finland that publish the most articles in prostate cancer research (see Table 1). We apply a Bonferroni correction because we are testing 19 hypotheses (one hypothesis for each threshold, γ) simultaneously.
Figure 11
(A) Hierarchical clustering of cities in Finland showing the similarity of their research interests in prostate cancer related genes. The results are based on the top 19 genes. (B) Proportions of mentions of the top 19 genes in publications by the cities of Finland.
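
The Bonferroni correction mentioned in the caption of Figure 10 can be sketched as follows. The significance level of 0.05 and the p-values are made up for illustration and do not come from the paper; only the number of hypotheses (19) is taken from the caption.

```python
def bonferroni(p_values, alpha=0.05):
    """Apply a Bonferroni correction to a list of p-values.

    Returns the per-test threshold alpha / m, the adjusted p-values
    min(1, p * m), and a list of booleans marking rejected hypotheses.
    """
    m = len(p_values)
    threshold = alpha / m
    adjusted = [min(1.0, p * m) for p in p_values]
    rejected = [p <= threshold for p in p_values]
    return threshold, adjusted, rejected

# Hypothetical p-values, one per threshold gamma (19 tests in total).
p_values = [0.001, 0.01] + [0.5] * 17

threshold, adjusted, rejected = bonferroni(p_values)
```

With 19 simultaneous tests and an assumed level of 0.05, each individual test has to satisfy p ≤ 0.05 / 19 ≈ 0.0026, so a raw p-value of 0.01 no longer counts as significant.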
