BMC Bioinformatics. 2021 Oct 4;22(1):482.
doi: 10.1186/s12859-021-04397-w.

SicknessMiner: a deep-learning-driven text-mining tool to abridge disease-disease associations

Nícia Rosário-Ferreira et al. BMC Bioinformatics.

Abstract

Background: Blood cancers (BCs) cause over 720,000 deaths worldwide each year. Their prevalence and mortality rate underscore the relevance of BC research. Although several resources establish Disease-Disease Associations (DDAs), this knowledge is scattered and not readily accessible to the scientific community. Here, we propose SicknessMiner, a biomedical Text-Mining (TM) approach for centralizing DDAs. Our methodology comprises Named Entity Recognition (NER) and Named Entity Normalization (NEN) steps, and the retrieved DDAs were compared against the DisGeNET resource both qualitatively and quantitatively.
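As an illustration of co-mention-based association, the sketch below counts how often a queried disease appears in the same document as another disease. It assumes NER and NEN have already run and that each document is reduced to a set of normalized disease names. The function name `co_mention_counts` and the toy documents are hypothetical and are not taken from the SicknessMiner code.

```python
from collections import Counter
from itertools import product


def co_mention_counts(docs, query_diseases):
    """Count documents in which a query disease co-occurs with another disease.

    docs: iterable of sets of normalized disease names (assumed NER + NEN output).
    query_diseases: the diseases of interest, e.g. blood cancer types.
    Returns a Counter keyed by (query_disease, other_disease).
    """
    query = set(query_diseases)
    counts = Counter()
    for entities in docs:
        entities = set(entities)
        present = entities & query   # query diseases mentioned in this document
        others = entities - query    # every other disease mentioned alongside them
        for q, other in product(present, others):
            counts[(q, other)] += 1
    return counts


# Toy, made-up corpus: each set stands for one document after normalization.
docs = [
    {"leukemia", "anemia"},
    {"leukemia", "lymphoma", "anemia"},
    {"lymphoma", "hepatitis"},
    {"anemia"},
]
counts = co_mention_counts(docs, {"leukemia", "lymphoma"})
```

Counting each pair once per document, rather than once per mention, is a design choice made here for simplicity; the abstract does not state how co-mentions are weighted.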

Results: We obtained DDAs either via co-mention using SicknessMiner or via gene- or variant-disease similarity in DisGeNET. SicknessMiner retrieved around 92% of the DisGeNET results, and nearly 15% of the SicknessMiner results were specific to our pipeline.
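The two percentages above can be read as set-overlap ratios. The sketch below shows one plausible way to compute such ratios, assuming each resource yields a set of associated disease names. The exact definitions used in the paper are not given in the abstract, so the function `compare_association_sets`, its keys, and the toy data are assumptions for illustration only.

```python
def compare_association_sets(tool_ddas, reference_ddas):
    """Compare associations found by a tool against a reference resource.

    Returns the fraction of reference entries the tool also found and the
    fraction of tool entries absent from the reference.
    """
    tool, ref = set(tool_ddas), set(reference_ddas)
    shared = tool & ref
    return {
        # share of the reference that the tool recovered
        "reference_recovered": len(shared) / len(ref) if ref else 0.0,
        # share of the tool's results not present in the reference
        "tool_specific": len(tool - ref) / len(tool) if tool else 0.0,
    }


# Toy, made-up disease sets (not real SicknessMiner or DisGeNET output).
tool_results = {"anemia", "hepatitis", "sepsis", "diabetes"}
reference_results = {"anemia", "hepatitis", "sepsis", "asthma"}
summary = compare_association_sets(tool_results, reference_results)
```

Note that the two ratios use different denominators (reference size versus tool size), which is why they need not sum to 100%.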

Conclusions: SicknessMiner is a valuable tool for extracting disease-disease relationships from a raw input corpus.

Keywords: Biomedical text-mining; Blood cancers; Deep learning; Disease-disease associations; Natural language processing.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1. SicknessMiner pipeline: a TM approach for DDAs. SicknessMiner is a two-step pipeline integrating sequential modules for NER and NEN. First, from the raw text input, a list of entities is plotted according to co-mentions with more than one BC type. To evaluate SicknessMiner, we used the BC5CDR evaluation kit. Finally, DDAs were assessed via both SicknessMiner and DisGeNET, and further evaluation was performed to better understand the key improvements obtained herein.
Fig. 2. SicknessMiner top 100 co-mentioned entries (in this case all entries are related to the 3 BCs, since the query was combined).
Fig. 3. DisGeNET top 100 entries that are correlated with 2 or 3 BCs and share at least 20 genes or variants between diseases.
Fig. 4. SicknessMiner and DisGeNET comparison.
