JAMIA Open. 2024 Nov 18;7(4):ooae118.
doi: 10.1093/jamiaopen/ooae118. eCollection 2024 Dec.

Implications of mappings between International Classification of Diseases clinical diagnosis codes and Human Phenotype Ontology terms

Amelia L M Tan et al. JAMIA Open. 2024.

Abstract

Objective: Integrating electronic health record (EHR) data with other resources is essential in rare disease research because low disease prevalence limits the data available at any single site. Such integration depends on the alignment of the ontologies used for data annotation. The International Classification of Diseases (ICD) is used to annotate clinical diagnoses, while the Human Phenotype Ontology (HPO) is used to annotate phenotypes. Although these ontologies overlap in the biomedical entities they describe, the extent to which they are interoperable is unknown. We investigate how well aligned these ontologies are and whether such alignments facilitate EHR data integration.

Materials and methods: We conducted an empirical analysis of the coverage of mappings between ICD and HPO. We interpret this mapping coverage as a proxy for how easily clinical data can be integrated with research ontologies such as HPO. We quantify how exhaustively ICD codes are mapped to HPO by analyzing mappings in the unified medical language system (UMLS) Metathesaurus. We analyze the proportion of ICD codes mapped to HPO within a real-world EHR dataset.
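The coverage measure described above can be illustrated with a minimal sketch. It assumes that an ICD code counts as directly mapped when it shares a UMLS concept identifier (CUI) with at least one HPO term; the abstract does not spell out the exact procedure, so this definition, the function name `icd_to_hpo_coverage`, and the source labels `"ICD10CM"` and `"HPO"` are assumptions, and the rows in the usage note are invented toy data.

```python
from collections import defaultdict


def icd_to_hpo_coverage(rows, icd_sab="ICD10CM", hpo_sab="HPO"):
    """Return (mapped ICD codes, fraction of ICD codes mapped).

    rows: iterable of (cui, source, code) tuples, e.g. extracted from a
    Metathesaurus concept table. The source labels are assumptions.
    """
    sources_by_cui = defaultdict(set)  # CUI -> set of source vocabularies
    cuis_by_icd = defaultdict(set)     # ICD code -> set of CUIs it belongs to

    for cui, source, code in rows:
        sources_by_cui[cui].add(source)
        if source == icd_sab:
            cuis_by_icd[code].add(cui)

    # An ICD code is "mapped" if any of its concepts also contains an HPO term.
    mapped = {
        code
        for code, cuis in cuis_by_icd.items()
        if any(hpo_sab in sources_by_cui[cui] for cui in cuis)
    }
    total = len(cuis_by_icd)
    coverage = len(mapped) / total if total else 0.0
    return mapped, coverage
```

For example, with the toy rows `[("C1", "ICD10CM", "X1"), ("C1", "HPO", "HP:1"), ("C2", "ICD10CM", "X2")]`, only `X1` shares a concept with an HPO term, so the coverage is 0.5.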

Results and discussion: Our analysis revealed that only 2.2% of ICD codes have direct mappings to HPO in UMLS. Within our EHR dataset, fewer than 50% of ICD codes have mappings to HPO terms. ICD codes that are used frequently in EHR data tend to have mappings to HPO; ICD codes that represent rarer medical conditions are seldom mapped.
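The relationship between code frequency and mapping status can be sketched as follows. The frequency thresholds follow the Figure 1 caption (common: more than 1% of the cohort; infrequent: 0.1% to 1%; rare: less than 0.1%), and the three statuses follow its matched, unmatched, and others categories. The function names and input structures are hypothetical; this is an illustration under those assumptions, not the authors' pipeline.

```python
from collections import Counter


def frequency_bin(n_patients, cohort_size):
    """Bin a code by the share of the cohort it was assigned to."""
    # Integer comparisons avoid floating-point issues at the boundaries.
    if n_patients * 100 > cohort_size:      # more than 1%
        return "common"
    if n_patients * 1000 >= cohort_size:    # 0.1% to 1%
        return "infrequent"
    return "rare"                           # less than 0.1%


def mapping_status(code, mapped_codes, known_codes):
    """Classify a code as matched, unmatched, or others."""
    if code not in known_codes:             # no corresponding ICD code found
        return "others"
    return "matched" if code in mapped_codes else "unmatched"


def summarize(patients_per_code, cohort_size, mapped_codes, known_codes):
    """Count codes per (frequency bin, mapping status) pair."""
    counts = Counter()
    for code, n_patients in patients_per_code.items():
        key = (
            frequency_bin(n_patients, cohort_size),
            mapping_status(code, mapped_codes, known_codes),
        )
        counts[key] += 1
    return counts
```

Dividing each count by the total for its frequency bin gives proportions of the kind plotted in Figure 1.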

Conclusion: We find that interoperability between ICD and HPO via UMLS is limited. While other mapping sources could be incorporated, there are no established conventions for what resources should be used to complement UMLS.

Keywords: data interoperability; ontology; ontology interoperability.

Conflict of interest statement

R.G. owns shares, stocks, or consults for: 23andMe, Scorpion Tx, BioMap, Myia Labs, Pheno.AI, and intrECate.

Figures

Figure 1.
The proportion of ICD/diagnosis codes that are matched to an HPO term, unmatched to any HPO term, or classified as others (codes that do not have a corresponding ICD code in the UMLS dictionary). The proportions were calculated for common codes that are used in >1% of the cohort, infrequently used codes that are assigned to 0.1%-1% of the cohort, and rare codes that are assigned to <0.1% of the cohort. These were calculated separately for (A) admitted patients, (B) ICU patients, and (C) outpatients. HPO, human phenotype ontology; ICD, international classification of diseases; UMLS, unified medical language system.
Figure 2.
Patient coverage of mapped and unmapped terms across different ICD categories. The left column shows codes that are used in fewer than 100 patients, while the right column shows codes that are used in more than 100 patients in our dataset. Yellow indicates the frequency of usage of ICD codes that have an HPO mapping, green the usage of ICD codes that do not have an HPO mapping, and purple the codes in the “others” category, which are in the ICD code list of our EHR dataset but did not match existing ICD10-CM codes. The numbers on the right of each column specify the number of times that codes from that category are used. HPO, human phenotype ontology; ICD, international classification of diseases.

