Bioinformatics. 2011 Apr 15;27(8):1101-7.
doi: 10.1093/bioinformatics/btr105. Epub 2011 Feb 25.

Automatically identifying and annotating mouse embryo gene expression patterns

Liangxiu Han et al. Bioinformatics.

Abstract

Motivation: Deciphering the regulatory and developmental mechanisms of multicellular organisms requires detailed knowledge of gene interactions and gene expression. The availability of large datasets with both spatial and ontological annotation of the spatio-temporal patterns of gene expression in the mouse embryo provides a powerful resource for discovering the biological function of embryo organization. Ontological annotation of gene expression consists of labelling images with terms from the anatomy ontology for mouse development. If the spatial genes of an anatomical component are expressed in an image, the image is tagged with the term for that anatomical component. Annotation is currently done manually by domain experts, which is both time-consuming and costly. In addition, the level of detail is variable, and errors inevitably arise from the tedious nature of the task. In this article, we present a new method to automatically identify and annotate gene expression patterns in the mouse embryo with anatomical terms.

Results: The method takes images from in situ hybridization studies and the ontology for the developing mouse embryo; it then combines machine learning and image processing techniques to produce classifiers that automatically identify and annotate gene expression patterns in these images. We evaluate our method on image data from the EURExpress study, where we use it to automatically classify nine anatomical terms: humerus, handplate, fibula, tibia, femur, ribs, petrous part, scapula and head mesenchyme. The accuracy of our method lies between 70% and 80%, with few exceptions. We show that other known methods have lower classification performance than ours. We have investigated the images misclassified by our method and found several cases where the original annotation was not correct. This shows that our method is robust against this kind of noise.

Availability: The annotation result and the experimental dataset in the article can be freely accessed at http://www2.docm.mmu.ac.uk/STAFF/L.Han/geneannotation/.

Contact: l.han@mmu.ac.uk

Supplementary information: Supplementary data are available at Bioinformatics online.


Figures

Fig. 1.
Formulation of automatic identification and annotation of spatial gene expression patterns. (a) An example of annotating an image using a term from the anatomy ontology for the developing mouse embryo. (b) A high-level overview of the method for automatic annotation of images from ISH studies.
Fig. 2.
A mouse embryo image before and after median filtering. (a) Before applying the median filter. (b) After applying the median filter.
Fig. 3.
An example of the process and result of wavelet decomposition. (a) Wavelet decomposition on 2D-array. (b) Wavelet decomposition on an image.
Fig. 4.
The overall classification performance in terms of sensitivity × specificity (higher is better) of the five classification methods (LDA, SVM, LSVM, ANN and CNN) for each anatomical component.
Fig. 5.
Several examples where, according to the data, our technique has falsely classified the images, but where, on closer inspection, the human annotation is not consistent with the image. (a) Euxassay 010351, Section 1: erroneous false positive for humerus; the curator annotated humerus on Sections 2 and 3 but missed expression on Section 1. (b) Euxassay 010262, Section 1: erroneous false positive for humerus; there is no annotation owing to curator error, although expression is clearly visible in several sections. (c) Euxassay 003591, Section 1: erroneous false negative for humerus; the curator annotated based on gene expression seen in other sections, whereas none is found here.
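
The captions for Figs. 2 to 4 name three building blocks: a median filter, a 2D wavelet decomposition, and a sensitivity × specificity score. The following is a minimal Python sketch of each, not the authors' implementation. Several details are assumptions, since the page does not state them: the 3×3 filter window, the Haar wavelet in its unnormalized averaging form, the single decomposition level, and all function and sub-band names.

```python
from statistics import median


def median_filter(image, radius=1):
    """Replace each pixel by the median of its (2*radius+1)^2 neighbourhood.

    Border pixels use only the neighbours that lie inside the image
    (an assumed border policy; the source does not describe one).
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                image[i][j]
                for i in range(max(0, r - radius), min(rows, r + radius + 1))
                for j in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            out[r][c] = median(window)
    return out


def haar_dwt2(image):
    """One level of a 2D Haar decomposition (averaging form, assumed).

    Each 2x2 block [[a, b], [c, d]] yields one coefficient in each of four
    sub-bands: approximation, row detail, column detail, diagonal detail.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image dimensions must be even")
    approx, row_detail, col_detail, diag_detail = [], [], [], []
    for r in range(0, rows, 2):
        a_row, r_row, c_row, d_row = [], [], [], []
        for col in range(0, cols, 2):
            a, b = image[r][col], image[r][col + 1]
            c, d = image[r + 1][col], image[r + 1][col + 1]
            a_row.append((a + b + c + d) / 4)  # local average
            r_row.append((a + b - c - d) / 4)  # top row minus bottom row
            c_row.append((a - b + c - d) / 4)  # left column minus right column
            d_row.append((a - b - c + d) / 4)  # diagonal difference
        approx.append(a_row)
        row_detail.append(r_row)
        col_detail.append(c_row)
        diag_detail.append(d_row)
    return approx, row_detail, col_detail, diag_detail


def sensitivity_specificity_product(y_true, y_pred):
    """Sensitivity x specificity for binary labels (truthy = term present)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity * specificity
```

For example, `sensitivity_specificity_product([1, 1, 1, 1, 0, 0, 0, 0], [1, 1, 1, 0, 0, 0, 1, 1])` gives 0.75 × 0.5 = 0.375. The product is high only when both rates are high, so a classifier that labels every image as negative scores 0 even when positives are rare.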

