BMC Bioinformatics. 2019 Feb 8;20(1):70.
doi: 10.1186/s12859-019-2634-7.

Analyzing a co-occurrence gene-interaction network to identify disease-gene association

Amira Al-Aamri et al. BMC Bioinformatics.

Abstract

Background: Understanding genetic networks and their role in chronic diseases (e.g., cancer) is an important objective of biological research. In this work, we present a text mining system that constructs a gene-gene interaction network for the entire human genome and then performs network analysis to identify disease-related genes. We recognize interacting genes based on their co-occurrence frequency within the biomedical literature and by employing linear and non-linear rare-event classification models. We analyze the constructed network by using different network centrality measures to decide on the importance of each gene. Specifically, we apply betweenness, closeness, eigenvector, and degree centrality metrics to rank the central genes of the network and to identify possible cancer-related genes.
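The centrality-based ranking described above can be illustrated with a minimal sketch. It is not the authors' implementation: the gene names G1 to G5 and the edge list are hypothetical placeholders, only two of the four named metrics (degree and closeness) are shown, and the network is assumed to be undirected and connected.

```python
from collections import deque

# Toy undirected network; G1..G5 and these edges are placeholders, not real data.
EDGES = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"), ("G4", "G5")]

def build_adjacency(edges):
    """Build an adjacency map {gene: set of neighbouring genes}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def degree_centrality(adj):
    """Fraction of the other genes that each gene is directly linked to."""
    n = len(adj)
    return {g: len(nbrs) / (n - 1) for g, nbrs in adj.items()}

def closeness_centrality(adj):
    """Reachable-node count divided by the sum of shortest-path lengths (BFS)."""
    scores = {}
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        scores[src] = (len(dist) - 1) / total if total > 0 else 0.0
    return scores

def rank(scores):
    """Sort genes by descending score, breaking ties by name."""
    return sorted(scores, key=lambda g: (-scores[g], g))

adj = build_adjacency(EDGES)
degree_rank = rank(degree_centrality(adj))
closeness_rank = rank(closeness_centrality(adj))
```

In this toy network G1 ranks first under both metrics because it touches the most genes and sits closest to all others. Betweenness and eigenvector centrality would be computed over the same adjacency structure.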

Results: We evaluated the top 15 ranked genes for different cancer types (i.e., prostate, breast, and lung cancer). The average precisions for identifying breast, prostate, and lung cancer genes range from 80% to 100%. In a prostate cancer case study, the system predicted an average of 80% prostate-related genes.
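As a rough illustration of how a top-15 evaluation could be scored, the sketch below assumes that precision means the fraction of the top-k ranked genes that appear in a set of known disease genes. The abstract does not give the exact definition, and the function name, gene names, and known-gene set are hypothetical.

```python
def precision_at_k(ranked_genes, known_disease_genes, k=15):
    """Fraction of the first k ranked genes found in the known-gene set."""
    top_k = ranked_genes[:k]
    if not top_k:
        return 0.0
    hits = sum(1 for gene in top_k if gene in known_disease_genes)
    return hits / len(top_k)

# Placeholder ranking and reference set, not real data.
ranked = ["G1", "G4", "G2", "G3", "G5"]
known = {"G1", "G2", "G4", "G9"}
p = precision_at_k(ranked, known, k=5)
```

Averaging this value over several rankings (for example, one per centrality metric) would give one average precision per cancer type.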

Conclusions: The results show that our system has the potential for improving the prediction accuracy of identifying gene-gene interaction and disease-gene associations. We also conduct a prostate cancer case study by using the threshold property in logistic regression, and we compare our approach with some of the state-of-the-art methods.
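The threshold property of logistic regression mentioned here can be sketched as follows. This is an assumption-laden illustration, not the authors' model: the gene pairs and their linear scores (logits) are made-up placeholders, and a plain sigmoid stands in for whichever weighted or kernel variant the paper uses.

```python
import math

def sigmoid(z):
    """Map a linear score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_pairs(pair_logits, threshold):
    """Return the gene pairs whose predicted probability meets the threshold."""
    return [pair for pair, z in pair_logits.items() if sigmoid(z) >= threshold]

# Hypothetical logits for four gene pairs; placeholders, not real data.
pair_logits = {
    ("G1", "G2"): 2.0,
    ("G1", "G3"): 0.5,
    ("G2", "G3"): -1.0,
    ("G4", "G5"): 3.0,
}

thresholds = (0.3, 0.5, 0.7, 0.9)
counts = {t: len(predict_pairs(pair_logits, t)) for t in thresholds}
```

Raising the threshold can only shrink the set of pairs labeled as interacting, so a stricter threshold yields a sparser network at the likely cost of recall.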

Keywords: Biological NLP; Biomedical literature; Disease-gene association; Genetic network; Text mining.


Conflict of interest statement

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Figures

Fig. 1. Number of new cases and deaths for each common cancer type from NIH [2]
Fig. 2. ROC curve for training the data using WLR. TPR is increased at low FPR
Fig. 3. Precision-recall curve using WLR
Fig. 4. Precision-recall curve using WKLR
Fig. 5. The process of network analysis and disease-gene identification
Fig. 6. The prediction is made over several thresholds. As the threshold increases, fewer pairs are assigned to the positive class

References

    1. Centers for Disease Control and Prevention. Leading causes of death and numbers of deaths, by sex, race, and Hispanic origin: United States, 1980 and 2014 (Table 19). Health, United States, 2015. https://www.cdc.gov/nchs/data/hus/hus15.pdf. Accessed 22 Sept 2017.
    2. National Cancer Institute at the National Institutes of Health. Common Cancer Types. Atlanta; 2016. https://www.cancer.gov/types/common-cancers. Accessed 23 Aug 2017.
    3. American Cancer Society. Cancer Facts and Figures 2017. Atlanta: American Cancer Society; 2017. https://www.cancer.org/research/cancer-facts-statistics/all-cancer-facts.... Accessed 23 Aug 2017.
    4. Mohammad RM, Muqbil I, Lowe L, Yedjou C, Hsu H-Y, Lin L-T, Siegelin MD, Fimognari C, Kumar NB, Dou QP, et al. Broad targeting of resistance to apoptosis in cancer. Seminars in Cancer Biology. United States: Elsevier; 2015.
    5. Feitelson MA, Arzumanyan A, Kulathinal RJ, Blain SW, Holcombe RF, Mahajna J, Marino M, Martinez-Chantar ML, Nawroth R, Sanchez-Garcia I, et al. Sustained proliferation in cancer: Mechanisms and novel therapeutic targets. Seminars in Cancer Biology. United States: Elsevier; 2015.
