Bioinformatics. 2023 Jul 1;39(7):btad440.
doi: 10.1093/bioinformatics/btad440.

RDBridge: a knowledge graph of rare diseases based on large-scale text mining

Huadong Xing et al. Bioinformatics. 2023.

Abstract

Motivation: Despite low prevalence, rare diseases affect 300 million people worldwide. Research on pathogenesis and drug development lags due to limited commercial potential, insufficient epidemiological data, and a dearth of publications. The unique characteristics of rare diseases, including limited annotated data, intricate processes for extracting pertinent entity relationships, and difficulties in standardizing data, represent challenges for text mining.

Results: We developed a rare disease data acquisition framework using text mining and knowledge graphs and constructed the most comprehensive rare disease knowledge graph to date, Rare Disease Bridge (RDBridge). RDBridge offers search functions for genes, potential drugs, pathways, literature, and medical imaging data that will support mechanistic research, drug development, diagnosis, and treatment for rare diseases.

Availability and implementation: RDBridge is freely available at http://rdb.lifesynther.com/.

Conflict of interest statement

None declared.

Figures

Figure 1.
Framework of RDBridge. The workflow processes scientific documents in successive stages to extract rare disease information: document downloading, pre-processing, named entity recognition, relation extraction, data matching, binary classification model training, model retraining, knowledge discovery, and the use of web servers.
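
As a rough illustration of how the pre-processing, named entity recognition, and relation extraction stages fit together, the sketch below turns free text into knowledge graph edges. It is not the RDBridge implementation: a dictionary lookup stands in for entity recognition, sentence-level co-occurrence stands in for relation extraction, and the lexicons, the function names, the `associated_with` label, and the sample text are all hypothetical.

```python
import re
from itertools import product

# Hypothetical toy lexicons; the vocabularies used by RDBridge are not given here.
DISEASES = {"amyotrophic lateral sclerosis"}
GENES = {"SOD1", "TARDBP"}


def split_sentences(text):
    """Pre-processing: naive split after sentence-ending punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def find_entities(sentence):
    """Dictionary lookup standing in for named entity recognition."""
    lowered = sentence.lower()
    diseases = sorted(d for d in DISEASES if d in lowered)
    genes = sorted(
        g for g in GENES if re.search(rf"\b{re.escape(g)}\b", sentence)
    )
    return diseases, genes


def extract_relations(text):
    """Sentence-level co-occurrence standing in for relation extraction."""
    triples = set()
    for sentence in split_sentences(text):
        diseases, genes = find_entities(sentence)
        for disease, gene in product(diseases, genes):
            triples.add((disease, "associated_with", gene))
    return triples


# Made-up sample text, used only to exercise the functions above.
doc = (
    "Mutations in SOD1 cause familial amyotrophic lateral sclerosis. "
    "TARDBP was sequenced in a separate cohort."
)
edges = extract_relations(doc)
```

Only the first sentence mentions both a disease and a gene, so it is the only one that yields an edge. Co-occurrence alone produces many false positives on real literature, which is presumably where the binary classification and retraining stages named in the caption come in.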
