BMC Bioinformatics. 2008 Apr 29;9 Suppl 5(Suppl 5):S3.
doi: 10.1186/1471-2105-9-S5-S3.

Mapping proteins to disease terminologies: from UniProt to MeSH

Anaïs Mottaz et al. BMC Bioinformatics.

Abstract

Background: Although the UniProt KnowledgeBase is not a medically oriented database, it contains information on more than 2,000 human proteins involved in pathologies. However, these annotations are not standardized, which impairs interoperability between biological and clinical resources. To make these data easily accessible to clinical researchers, we have developed a procedure to link diseases described in the UniProtKB/Swiss-Prot entries to the MeSH disease terminology.

Results: We mapped disease names extracted either from the UniProtKB/Swiss-Prot entry comment lines or from the corresponding OMIM entry to MeSH. Different methods were assessed on a benchmark set of 200 disease names manually mapped to MeSH terms. The precision and recall of the retained procedure were 86% and 64%, respectively. Using the same procedure, more than 3,000 disease names in Swiss-Prot were mapped to MeSH with comparable efficiency.
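The evaluation reported in the Results can be illustrated with a minimal sketch. It is not the authors' procedure: the normalisation rules (lowercasing, punctuation removal, word sorting), the three-term MeSH vocabulary, and the four benchmark pairs are all hypothetical, chosen only to show how precision and recall are computed when some disease names remain unmapped.

```python
import re

def normalise(term):
    """Lowercase, drop punctuation, and sort words (an assumed normalisation)."""
    words = re.findall(r"[a-z0-9]+", term.lower())
    return " ".join(sorted(words))

def map_to_mesh(name, mesh_terms):
    """Return the MeSH term whose normalised form equals that of `name`, else None."""
    index = {normalise(t): t for t in mesh_terms}
    return index.get(normalise(name))

def precision_recall(predicted, gold):
    """predicted: name -> MeSH term or None; gold: name -> correct MeSH term."""
    proposed = {n: t for n, t in predicted.items() if t is not None}
    correct = sum(1 for n, t in proposed.items() if gold.get(n) == t)
    precision = correct / len(proposed) if proposed else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall

# Hypothetical toy vocabulary and manually mapped benchmark.
mesh_terms = ["Diabetes Mellitus, Type 2", "Cystic Fibrosis", "Anemia, Sickle Cell"]
gold = {
    "cystic fibrosis": "Cystic Fibrosis",
    "sickle cell anemia": "Anemia, Sickle Cell",
    "type 2 diabetes mellitus": "Diabetes Mellitus, Type 2",
    "mucoviscidosis": "Cystic Fibrosis",  # synonym that exact matching misses
}

predicted = {name: map_to_mesh(name, mesh_terms) for name in gold}
precision, recall = precision_recall(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case every proposed mapping is correct, but the synonym is left unmapped, so precision exceeds recall. The reported 86% and 64% show the same pattern of precision above recall.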

Conclusions: This study is a first attempt to link proteins in UniProtKB to medical resources. The indexing we provide will help clinicians and researchers navigate efficiently from diseases to genes and from genes to diseases. The mapping is available at: http://research.isb-sib.ch/unimed.

Figures

Figure 1
Procedure for mapping UniProtKB/Swiss-Prot disease comment lines to MeSH terms.
Figure 2
Disease comment lines in a UniProtKB/Swiss-Prot entry.
Figure 3
Recall-precision curves for partial matches of Swiss-Prot disease names (A) and OMIM titles and alternative titles (B) to the disease MeSH terms, with term normalisation (blue squares), without normalisation (green empty squares), and with the method developed by Ha-Thuc (red triangles). The data are ordered by score, and the precision is calculated at increasing recall intervals.
Figure 4
F-measure as a function of the score of partial matching to MeSH terms with Swiss-Prot disease names (blue triangles) or OMIM terms (red squares).
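The relationship plotted in Figure 4 can be sketched as follows, assuming the standard balanced F-measure, F = 2PR / (P + R), and a cutoff that keeps only candidate mappings scoring at or above a threshold. The scores, correctness labels, and thresholds below are hypothetical and do not reproduce the paper's data or scoring function.

```python
def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0

def f_by_threshold(scored, n_gold, thresholds):
    """scored: list of (score, is_correct) pairs for candidate mappings.
    n_gold: number of names in the benchmark that have a correct mapping."""
    results = {}
    for t in thresholds:
        kept = [ok for score, ok in scored if score >= t]
        correct = sum(kept)
        precision = correct / len(kept) if kept else 0.0
        recall = correct / n_gold if n_gold else 0.0
        results[t] = f_measure(precision, recall)
    return results

# Hypothetical partial-match scores with manual correctness labels.
scored = [(0.9, True), (0.8, True), (0.6, False), (0.5, True), (0.3, False)]
curve = f_by_threshold(scored, n_gold=4, thresholds=[0.25, 0.5, 0.75])
best = max(curve, key=curve.get)
print(curve, "best threshold:", best)
```

Raising the threshold trades recall for precision, so the F-measure typically peaks at an intermediate score, which is the kind of cutoff such a curve helps select.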
