BMC Bioinformatics. 2009 May 6;10 Suppl 5(Suppl 5):S2.
doi: 10.1186/1471-2105-10-S5-S2.

Practical application of ontologies to annotate and analyse large scale raw mouse phenotype data

Tim Beck et al. BMC Bioinformatics.

Abstract

Background: Large-scale international projects are underway to generate collections of knockout mouse mutants and subsequently to perform high-throughput phenotype assessments, raising new challenges for computational researchers due to the complexity and scale of the phenotype data. Phenotypes can be described using ontologies in two different ways. Traditionally, an individual phenotypic character has been defined either by a single compound term, originating from a dedicated species-specific phenotype ontology, or by a combinatorial annotation, which uses concepts from a range of disparate ontologies to define a phenotypic character as an entity with an associated quality (EQ). Both methods have their merits: the dedicated approach allows the use of community standard terminology, and the combinatorial approach facilitates cross-species comparison of phenotypic statements. Previously, databases have favoured one approach over the other. The EUMODIC project will generate large amounts of mouse phenotype data by executing a set of Standard Operating Procedures (SOPs), and will implement both ontological approaches to capture that data.

Results: For all SOPs a four-tier annotation is made: a high-level description of the SOP, to broadly define the type of data generated by the SOP; individual parameter annotation using the EQ model; annotation of the qualitative data generated for each mouse; and the annotation of mutant lines after statistical analysis. The qualitative assessments of phenodeviance are made at the point of data entry, using PATO qualities that are children of the parameter quality. To facilitate data querying by scientists more familiar with single compound terms to describe phenotypes, the mappings between the Mammalian Phenotype (MP) ontology and the EQ PATO model are exploited to allow querying via MP terms.
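
As a rough illustration of the parameter and per-mouse tiers, the sketch below models an EQ statement, checks that an observed quality is a child of the parameter quality, and resolves a query made with a compound term through a term-to-EQ mapping. It is a minimal sketch, not the EUMODIC implementation: the ontology fragment, the labels, the `MP:placeholder_sparse_hair` identifier and all function names are hypothetical stand-ins for real MA, PATO and MP terms.

```python
from dataclasses import dataclass

# Hypothetical child -> parent "is_a" links (placeholder labels, not real PATO IDs).
IS_A = {
    "sparse": "distribution",
    "dense": "distribution",
    "distribution": "quality",
}


@dataclass(frozen=True)
class EQ:
    entity: str   # the thing being described, e.g. an anatomy term
    quality: str  # the quality it bears, e.g. a PATO term


def is_child_of(term, ancestor):
    """Return True if `term` is a strict descendant of `ancestor` via is_a links."""
    parent = IS_A.get(term)
    while parent is not None:
        if parent == ancestor:
            return True
        parent = IS_A.get(parent)
    return False


def annotate_mouse(parameter, observed_quality):
    """Build the per-mouse EQ; the observed quality must sit below the parameter quality."""
    if not is_child_of(observed_quality, parameter.quality):
        raise ValueError(f"{observed_quality!r} is not a child of {parameter.quality!r}")
    return EQ(parameter.entity, observed_quality)


# Hypothetical mapping from a compound phenotype term to its EQ equivalent.
MP_TO_EQ = {"MP:placeholder_sparse_hair": EQ("coat hair", "sparse")}


def query_by_mp(mp_term, annotations):
    """Return the sorted IDs of mice whose EQ annotation matches the mapped term."""
    target = MP_TO_EQ[mp_term]
    return sorted(mouse for mouse, eq in annotations.items() if eq == target)


parameter = EQ("coat hair", "distribution")
annotations = {
    "mouse-1": annotate_mouse(parameter, "sparse"),
    "mouse-2": annotate_mouse(parameter, "dense"),
}
matches = query_by_mp("MP:placeholder_sparse_hair", annotations)
```

Storing only EQ statements and translating compound terms at query time keeps a single annotation per observation while still letting users search with the vocabulary they know.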

Conclusion: Well-annotated and comparable phenotype databases can be achieved through the use of ontologically derived comparable phenotypic statements and have been implemented here by means of OBO compatible EQ annotations. The implementation we describe also sees scientists working seamlessly with ontologies through the assessment of qualitative phenotypes in terms of PATO qualities and the ability to query the database using community-accepted compound MP terms. This work represents the first time the combinatorial and single-dedicated approaches have both been implemented to annotate a phenotypic dataset.


Figures

Figure 1
Levels of ontology annotation of mouse phenotype data. The relationships between the levels of annotation are shown along with real annotations taken from the Hotplate SOP. a) The SOP is annotated using MP. Each parameter, representing a mouse trait, is defined using EQ. At the point of annotating individual mouse data, qualitative and quantitative parameters are handled differently. Qualitative parameters have a quality assigned to them, with a child-to-parent "is a" relationship to the parameter quality, and may be described using MP where a relevant concept exists. Quantitative parameters have a numerical value assigned to them. After comparison of the mutant line (cohort of individual mice) to the baseline data, statistically significant lines are annotated dynamically. Qualitative EQ data are annotated as being present at an increased or decreased frequency, and quantitative data are annotated using an "increased" or "decreased" EQ statement in which the quality is a child of the parameter quality. In both cases, if a relevant MP term exists to define the direct phenotype, it is assigned. The annotation of inferred phenotypes using MP terms is explained in the Discussion. b) Ontology terms are used to define two Hotplate SOP parameters. Example data are used to illustrate possible annotations of the mouse and the mutant strain. The annotations of the direct phenotypes allow association with an inferred phenotype.
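
The dynamic annotation of a quantitative parameter described in this caption could be sketched as follows. This is an illustrative assumption rather than the project's method: the source does not state which statistical test is used, so the p-value is taken as a precomputed input, and the 0.05 threshold, the function name `annotate_line` and the example values are all hypothetical.

```python
from statistics import mean


def annotate_line(entity, parameter_quality, mutant_values, baseline_values,
                  p_value, alpha=0.05):
    """Return an (entity, quality) pair for a significant line, otherwise None.

    `p_value` comes from whatever test compared the mutant cohort with the
    baseline (not specified here); `alpha` is an assumed threshold.
    """
    if p_value >= alpha:
        return None  # not significant: no line-level annotation
    mutant_mean = mean(mutant_values)
    baseline_mean = mean(baseline_values)
    if mutant_mean == baseline_mean:
        return None  # no direction to report
    direction = "increased" if mutant_mean > baseline_mean else "decreased"
    # The directional quality stands in for a child of the parameter quality.
    return (entity, f"{direction} {parameter_quality}")


# Made-up example values for a tail length parameter.
significant = annotate_line("tail", "length", [9.0, 9.5], [7.0, 7.2], p_value=0.01)
not_significant = annotate_line("tail", "length", [7.1, 7.3], [7.0, 7.2], p_value=0.5)
```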
Figure 2
Relationship between qualitative parameter and data. An example dysmorphology parameter and the corresponding value options are shown. The central photo shows a mouse with a sparse distribution of coat hair. A portion of the MA ontology is shown on the left and a portion of PATO (Revision 1.118) is shown on the right (visualised using OBO-Edit [21]). The highlighted terms are used to define the coat hair distribution parameter and the resulting phenotype annotation, illustrating the relationship between the parameter quality and the phenotype quality.
Figure 3
Incorporation of assay data into the annotation framework for an individual mouse. A simplified operating procedure is shown marked up using EXACT ontology concepts (left box). The phenotype and procedural data capture framework, to describe an instance of tail length, incorporates the experimental action from within the SOP where the phenotype data was obtained. Also shown is the relationship of parameter assayed_by SOP returns_value value, which is a child quality of the parameter quality qualified_by raw data.

