J Allergy Clin Immunol. 2022 Jan;149(1):369-378.
doi: 10.1016/j.jaci.2021.04.033. Epub 2021 May 12.

Curation and expansion of Human Phenotype Ontology for defined groups of inborn errors of immunity

Matthias Haimel et al. J Allergy Clin Immunol. 2022 Jan.

Abstract

Background: Accurate, detailed, and standardized phenotypic descriptions are essential to support diagnostic interpretation of genetic variants and to discover new diseases. The Human Phenotype Ontology (HPO), extensively used in rare disease research, provides a rich vocabulary of standardized phenotypic descriptions in a hierarchical structure. However, HPO has not yet been widely adopted in the field of inborn errors of immunity (IEIs), mainly due to a lack of comprehensive IEI-related terms.

Objectives: We sought to systematically review available terms in HPO for the depiction of IEIs, to expand HPO, yielding more comprehensive sets of terms, and to reannotate IEIs with HPO terms to provide accurate, standardized phenotypic descriptions.

Methods: We initiated a collaboration involving expert clinicians, geneticists, researchers working on IEIs, and bioinformaticians. Multiple branches of the HPO tree were restructured and extended on the basis of expert review. Our ontology-guided machine learning coupled with a 2-tier expert review was applied to reannotate defined subgroups of IEIs.

Results: We revised and expanded 4 main branches of the HPO tree. Here, we reannotated 73 diseases from 4 International Union of Immunological Societies-defined IEI disease subgroups with HPO terms. We achieved a 4.7-fold increase in the number of phenotypic terms per disease. Given the new HPO annotations, we demonstrated improved ability to computationally match selected IEI cases to their known diagnosis, and improved phenotype-driven disease classification.

Conclusions: Our targeted expansion and reannotation enhance the precision of disease annotation, will enable superior HPO-based IEI characterization, and hence will benefit both IEI diagnostic and research activities.

Keywords: HPO; diagnostic support; disease classification; genetic analysis; immunodeficiencies; inborn errors of immunity; ontology; patient matching; phenotype; rare diseases.

Conflict of interest statement

Disclosure of potential conflict of interest: The authors declare that they have no relevant conflicts of interest.

Figures

FIG 1.
Pipeline for standardized reannotation of IEI diseases. First, scientific publications were collected by experts for each disease within the subgroups. Second, HPO terms were extracted from the provided publications for each disease using machine learning and summarized into Excel documents. Third, a 2-tier expert review evaluated the text-mined terms, suggested additional terms if required, and the responsible working group agreed on the final HPO annotations for each disease. Fourth, data were collated, and the agreed terms were submitted to HPO.
FIG 2.
Revision and expansion of the HPO tree. A, Schematic representation of the restructuring of the HPO tree. Main branches of the HPO tree where restructuring was performed are marked with light green. B, “Abnormality of temperature,” “Abnormality of immunoglobulin level,” and “Unusual infections” as examples of revised branches of the HPO tree. New additions and suggestions are marked with green, and repositioned terms are marked with yellow.
FIG 3.
Results of disease reannotation. A, HPO annotation availability in the subset of 72 diseases. B, Distribution of number of available HPO terms per disease. C, Pipeline for the reannotation process. D, Distribution of the number of articles used per disease for the reannotation pipeline. E, Number of mined terms per disease. Each dot represents a disease. F, All mined vs all accepted terms. G, Number of available terms per disease before and after reannotation. Each dot represents a disease. H, Mean information content available per disease before and after reannotation. I, The aggregate mean annotation per disease after reannotation. J, All text-mined terms from PAD publications. K, Frequency distribution of different PAD terms according to the experts.
FIG 4.
Patient-disease matching. A, Schematic overview of the different steps of patient-to-disease matching. First, the phenotypes were identified in a patient’s clinical history. Second, these phenotypes were translated to HPO terms. Finally, patient phenotype to disease matching was measured by Lin similarity. B, Matching patient 1 to a diagnosis. C, Similarity of patients in patient cohort to genetic diagnosis before and after reannotation. D, The correct clinical diagnosis ranks in the top 10 of matched diseases more often after reannotation. E, Improvement of ranks of clinical diagnosis before and after reannotation. Significance was assessed by Student t test.
FIG 5.
Phenotypic similarity of diseases before and after reannotation. Diseases are annotated with the IUIS disease group (inner circle), subgroup (outer circle), and OMIM identifier. A, Clustering of diseases based on phenotypic similarity before reannotation. B, Clustering of diseases based on phenotypic similarity after reannotation. IUIS, International Union of Immunological Societies; OMIM, Online Mendelian Inheritance in Man; SCID, severe combined immunodeficiency.
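
The information content and Lin similarity measures named in the captions for FIG 3 and FIG 4 can be illustrated on a toy example. The sketch below is a minimal illustration, not the authors' pipeline: the ontology, term names, and disease annotations are hypothetical, information content is estimated from annotation frequency across the toy diseases, and the aggregation of term-level scores into a patient-to-disease score (mean of the best match per patient term) is an assumption, since the abstract does not specify it.

```python
import math

# Hypothetical toy ontology (child -> parents); these are not real HPO terms.
PARENTS = {
    "root": [],
    "immune": ["root"],
    "infection": ["immune"],
    "low_igg": ["immune"],
    "resp_infection": ["infection"],
    "skin_infection": ["infection"],
}

# Hypothetical disease annotations, for illustration only.
DISEASES = {
    "disease_A": {"resp_infection", "low_igg"},
    "disease_B": {"skin_infection"},
    "disease_C": {"low_igg"},
}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content(diseases):
    """IC(t) = log(N / n_t), where n_t counts diseases annotated with t or a descendant."""
    counts = {}
    for terms in diseases.values():
        implied = set()
        for term in terms:
            implied |= ancestors(term)
        for term in implied:
            counts[term] = counts.get(term, 0) + 1
    total = len(diseases)
    return {term: math.log(total / n) for term, n in counts.items()}


def lin(t1, t2, ic):
    """Lin similarity: 2 * IC(most informative common ancestor) / (IC(t1) + IC(t2))."""
    denom = ic.get(t1, 0.0) + ic.get(t2, 0.0)
    if denom == 0.0:
        return 0.0
    common = ancestors(t1) & ancestors(t2)
    mica_ic = max(ic.get(term, 0.0) for term in common)
    return 2.0 * mica_ic / denom


def patient_disease_score(patient_terms, disease_terms, ic):
    """Assumed aggregation: mean over patient terms of the best Lin match in the disease."""
    best = [max(lin(p, d, ic) for d in disease_terms) for p in sorted(patient_terms)]
    return sum(best) / len(best)


def rank_diseases(patient_terms, diseases):
    """Return (disease, score) pairs sorted from most to least similar."""
    ic = information_content(diseases)
    scores = {
        name: patient_disease_score(patient_terms, terms, ic)
        for name, terms in diseases.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, score in rank_diseases({"resp_infection", "low_igg"}, DISEASES):
        print(f"{name}: {score:.3f}")
```

In this toy setting, adding more specific terms to a disease raises the score of patients who carry those terms, which is the intuition behind comparing diagnosis ranks before and after reannotation.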
