Genetics. 2025 Mar 17;229(3):iyaf027.
doi: 10.1093/genetics/iyaf027.

The Unified Phenotype Ontology: a framework for cross-species integrative phenomics


Nicolas Matentzoglu et al. Genetics.

Abstract

Phenotypic data are critical for understanding biological mechanisms and consequences of genomic variation, and are pivotal for clinical use cases such as disease diagnostics and treatment development. For over a century, vast quantities of phenotype data have been collected in many different contexts covering a variety of organisms. The emerging field of phenomics focuses on integrating and interpreting these data to inform biological hypotheses. A major impediment in phenomics is the wide range of distinct and disconnected approaches to recording the observable characteristics of an organism. Phenotype data are collected and curated using free text, single terms or combinations of terms, using multiple vocabularies, terminologies, or ontologies. Integrating these heterogeneous and often siloed data enables the application of biological knowledge both within and across species. Existing integration efforts are typically limited to mappings between pairs of terminologies; a generic knowledge representation that captures the full range of cross-species phenomics data is much needed. We have developed the Unified Phenotype Ontology (uPheno) framework, a community effort to provide an integration layer over domain-specific phenotype ontologies, as a single, unified, logical representation. uPheno comprises (1) a system for consistent computational definition of phenotype terms using ontology design patterns, maintained as a community library; (2) a hierarchical vocabulary of species-neutral phenotype terms under which their species-specific counterparts are grouped; and (3) mapping tables between species-specific ontologies. This harmonized representation supports use cases such as cross-species integration of genotype-phenotype associations from different organisms and cross-species informed variant prioritization.
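As a rough illustration of component (2), the sketch below groups species-specific phenotype terms under a shared species-neutral key and then looks up genes annotated to any term in that group. It is a minimal sketch under stated assumptions, not the uPheno implementation: the term IDs (`EXH:1`, `EXZ:1`, `EXZ:2`), the gene names, the pattern names, and both function names are hypothetical, and the grouping key (pattern name plus affected entity) is a simplification of the logical definitions that uPheno actually uses.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PhenotypeTerm:
    term_id: str   # hypothetical placeholder ID, not a real ontology ID
    species: str
    pattern: str   # name of the design pattern the term is assumed to follow
    entity: str    # affected entity filled into that pattern


def group_by_species_neutral(terms):
    """Group species-specific terms that share a pattern and an affected entity."""
    groups = defaultdict(list)
    for term in terms:
        groups[(term.pattern, term.entity)].append(term)
    return dict(groups)


def genes_with_similar_phenotypes(groups, annotations, key):
    """Return {species: sorted gene names} for all terms grouped under `key`."""
    by_species = defaultdict(set)
    for term in groups.get(key, []):
        by_species[term.species].update(annotations.get(term.term_id, ()))
    return {species: sorted(genes) for species, genes in by_species.items()}


# Illustrative data only: IDs and gene names are invented for this sketch.
TERMS = [
    PhenotypeTerm("EXH:1", "human", "increasedSize", "heart"),
    PhenotypeTerm("EXZ:1", "zebrafish", "increasedSize", "heart"),
    PhenotypeTerm("EXZ:2", "zebrafish", "abnormal", "gall bladder"),
]
ANNOTATIONS = {"EXH:1": ["geneA"], "EXZ:1": ["geneb", "genea"]}

if __name__ == "__main__":
    groups = group_by_species_neutral(TERMS)
    print(genes_with_similar_phenotypes(groups, ANNOTATIONS, ("increasedSize", "heart")))
```

Keying the group on the pattern and entity, rather than on term labels, mirrors the idea that terms from different species are aligned through a shared computational definition instead of through pairwise string matching.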

Keywords: integration; ontology; phenotype; semantics.


Conflict of interest statement

Conflicts of interest: The author(s) declare no conflicts of interest.

Figures

Fig. 1.
Distribution of entity types in the uPheno pattern library. All phenotype definitions reference at least one affected entity. The percentage of patterns using an entity type relative to all pattern templates is indicated. The main entity categories in uPheno phenotype pattern templates include: anatomical entity (UBERON:0001062), biological process (GO:0008150), cellular component (GO:0005575), chemical entity (CHEBI:24431), cell (CL:0000000), role (CHEBI:50906), behavior process (NBO:0000313), molecular function (GO:0003674), other entities (BFO:0000001).
Fig. 2.
Structure of the uPheno ontology. uPheno is a framework for consistent and logical definition of phenotype categories using ontology design patterns that provides a hierarchical vocabulary of species-neutral phenotype terms under which their species-specific counterparts are grouped. The ontology design templates are based on shared features of existing phenotypic descriptions from various model organisms and represent community consensus. The phenotype pattern template-adherent terms are adopted by species-specific ontologies, thereby contributing to the community-built uPheno framework. uPheno accelerates cross-species inference and computationally amenable comparative phenotype analysis. For example, the interoperable representation of heart phenotypes characterized by increased size, compared with wild-type in distinct species, such as zebrafish and humans, allows the cross-species identification of genes whose alleles can cause similar phenotypes. uPheno contextual hierarchy for increased size of the heart as displayed in the OLS.
Fig. 3.
Current degree of alignment of phenotype ontologies with uPheno. Visualization used to quantify the degree of alignment of species-specific phenotype ontologies with uPheno patterns: proportion of terms that follow a defined uPheno pattern (uPheno-conformant EQ); terms that follow an EQ-style definition (EQ, not uPheno); and terms that do not have a logical definition (no EQ definition). Note that this visualization only quantifies automatic (pattern-based) term alignment and does not include terms aligned using manually defined mappings such as MP to HPO mappings from the MGI database.
Fig. 4.
DOSDP pattern for the representation of abnormal anatomical entity phenotypes. Species-specific phenotype ontologies implement this pattern in phenotype terms such as “Abnormality of the cardiovascular system” (HP:0001626) and “gall bladder quality, abnormal” (ZP:0006529).
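The way a DOSDP-style pattern turns one variable into a term name and a logical definition can be sketched as follows. This is a hedged illustration only: the `PATTERN` dictionary imitates the general shape of a design pattern template (a list of variables plus printf-style text fields), but its field contents, the relation names in the `equivalentTo` expression, and the helper functions `fill` and `instantiate` are assumptions made for this example and are not copied from the uPheno pattern library.

```python
# Assumed, simplified template in the spirit of a DOSDP pattern.
# The logical expression below is illustrative and may differ from the real pattern.
PATTERN = {
    "pattern_name": "abnormalAnatomicalEntity",
    "vars": ["anatomical_entity"],
    "name": {
        "text": "abnormal %s",
        "vars": ["anatomical_entity"],
    },
    "equivalentTo": {
        "text": (
            "'has part' some ('quality' and "
            "('characteristic of' some '%s') and "
            "('has modifier' some 'abnormal'))"
        ),
        "vars": ["anatomical_entity"],
    },
}


def fill(field, bindings):
    """Substitute the bound labels into one printf-style text field."""
    values = tuple(bindings[var] for var in field["vars"])
    return field["text"] % values


def instantiate(pattern, bindings):
    """Build a term name and logical definition from a pattern and variable bindings."""
    missing = [var for var in pattern["vars"] if var not in bindings]
    if missing:
        raise KeyError(f"unbound pattern variables: {missing}")
    return {
        "pattern": pattern["pattern_name"],
        "name": fill(pattern["name"], bindings),
        "equivalentTo": fill(pattern["equivalentTo"], bindings),
    }


if __name__ == "__main__":
    term = instantiate(PATTERN, {"anatomical_entity": "gall bladder"})
    print(term["name"])
    print(term["equivalentTo"])
```

Because every term generated from the same template shares one logical structure, a reasoner can group terms from different species-specific ontologies whenever their filled-in entities are related, which is the property the caption above relies on.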

Update of

  • The Unified Phenotype Ontology (uPheno): A framework for cross-species integrative phenomics.
    Matentzoglu N, Bello SM, Stefancsik R, Alghamdi SM, Anagnostopoulos AV, Balhoff JP, Balk MA, Bradford YM, Bridges Y, Callahan TJ, Caufield H, Cuzick A, Carmody LC, Caron AR, de Souza V, Engel SR, Fey P, Fisher M, Gehrke S, Grove C, Hansen P, Harris NL, Harris MA, Harris L, Ibrahim A, Jacobsen JOB, Köhler S, McMurry JA, Munoz-Fuentes V, Munoz-Torres MC, Parkinson H, Pendlington ZM, Pilgrim C, Robb SM, Robinson PN, Seager J, Segerdell E, Smedley D, Sollis E, Toro S, Vasilevsky N, Wood V, Haendel MA, Mungall CJ, McLaughlin JA, Osumi-Sutherland D. bioRxiv [Preprint]. 2024 Sep 22:2024.09.18.613276. doi: 10.1101/2024.09.18.613276. Update in: Genetics. 2025 Mar 17;229(3):iyaf027. doi: 10.1093/genetics/iyaf027. PMID: 39345458. Free PMC article. Preprint.
