Stud Health Technol Inform. 2004;107(Pt 1):555-9.

Automating terminological networks to link heterogeneous biomedical databases

Xiaoyan Wang et al. Stud Health Technol Inform. 2004.

Abstract

As cross-disciplinary research escalates, researchers face the challenge of linking disparate biomedical databases that were developed without common indexes. Manually indexing these large-scale databases is laborious and often impractical. Solutions involving mediating terminologies have been proposed, but coordinating terms from the databases of interest with these mediating terminologies is also laborious, and regular synchronization between indexes is an additional problem. In this study we describe a novel method of linking heterogeneous databases using terminology networks constructed with automated mapping methods. Linkage was established between two disparate biomedical databases (SNOMED-CT and HDG) using two relevant intermediating databases (UMLS and OMIM). A gold standard of 514 distinct matches was used as a proof of principle. As hypothesized, 1) manually curated pathways provide high precision but low recall, and 2) automated terminology pathways can significantly increase recall at acceptable precision. Taken together, these findings suggest that combined manual and automated terminology networks could offer recall and precision in an incremental manner.
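The evaluation idea in the abstract (chaining term mappings through an intermediating database, then scoring the resulting links against a gold standard) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the dictionary representation of mappings, and all identifiers in the toy data (`s1`, `m1`, `t1`, and so on) are hypothetical and chosen only for illustration.

```python
from typing import Dict, Set, Tuple

# A mapping sends each source term to the set of terms it is linked to.
Mapping = Dict[str, Set[str]]
Pairs = Set[Tuple[str, str]]


def compose(first: Mapping, second: Mapping) -> Mapping:
    """Chain two mappings: a -> b in `first` and b -> c in `second` gives a -> c."""
    result: Mapping = {}
    for source, mids in first.items():
        targets: Set[str] = set()
        for mid in mids:
            targets |= second.get(mid, set())
        if targets:
            result[source] = targets
    return result


def to_pairs(mapping: Mapping) -> Pairs:
    """Flatten a mapping into a set of (source, target) links."""
    return {(s, t) for s, ts in mapping.items() for t in ts}


def precision_recall(predicted: Pairs, gold: Pairs) -> Tuple[float, float]:
    """Precision and recall of predicted links against gold-standard links."""
    true_pos = len(predicted & gold)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical toy data (not taken from the study).
gold = {("s1", "t1"), ("s2", "t2"), ("s3", "t3"), ("s4", "t4")}

# A small manually curated path: few links, all correct.
manual = {("s1", "t1")}

# An automated path through a hypothetical intermediating terminology.
source_to_mid = {"s1": {"m1"}, "s2": {"m2", "m3"}, "s3": {"m4"}}
mid_to_target = {"m1": {"t1"}, "m2": {"t2"}, "m3": {"t9"}}
automated = to_pairs(compose(source_to_mid, mid_to_target))

# Combining paths is a set union of their links.
combined = manual | automated

for name, links in [("manual", manual), ("automated", automated), ("combined", combined)]:
    p, r = precision_recall(links, gold)
    print(f"{name}: precision={p:.2f} recall={r:.2f}")
```

In this toy setup the manual path finds one of four gold links with no errors, while the automated path recovers more links at the cost of one spurious match, which mirrors the precision and recall trade-off the abstract describes.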

Figures

Figure 1. The network created to link disparate databases.
Figure 2. Precision versus recall of each of the linking paths in the network.

