BMC Bioinformatics. 2013 Dec 3;14:350.
doi: 10.1186/1471-2105-14-350.

Image-level and group-level models for Drosophila gene expression pattern annotation

Qian Sun et al. BMC Bioinformatics. 2013.

Abstract

Background: Drosophila melanogaster is an established model organism for investigating developmental gene interactions. Its spatio-temporal gene expression patterns can be visualized by in situ hybridization and documented as digital images. Automated and efficient tools for analyzing these expression images can provide biological insights into gene functions, interactions, and networks. To facilitate pattern recognition and comparison, many web-based resources have been created for comparative analysis based on body part keywords and the associated images. As high-throughput techniques produce images at an increasing rate, manual inspection of images seriously impedes the pace of biological discovery. It is thus imperative to design an automated system for efficient image annotation and comparison.

Results: We present a computational framework for annotating Drosophila gene expression images with anatomical keywords. Local image patches are represented with a spatial sparse coding approach, which we compare with the well-known bag-of-words (BoW) method. Three pooling functions, max pooling, average pooling, and Sqrt pooling (the square root of the mean of squared values), transform the sparse codes into image features. Based on the constructed features, we develop both an image-level scheme and a group-level scheme to tackle the key challenges in annotating Drosophila gene expression pattern images automatically. To deal with the imbalanced data distribution inherent in image annotation tasks, we apply undersampling together with majority vote. Results on Drosophila embryonic expression pattern images verify the efficacy of our approach.
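
The three pooling functions named above can be illustrated with a minimal sketch. It assumes each image yields a matrix of sparse codes with one row per local patch and one column per dictionary atom, and that pooling is applied column-wise; the function names and the toy values are hypothetical and are not taken from the paper.

```python
import math

def max_pool(codes):
    """Column-wise maximum over all patch codes of one image."""
    return [max(col) for col in zip(*codes)]

def average_pool(codes):
    """Column-wise mean over all patch codes of one image."""
    return [sum(col) / len(col) for col in zip(*codes)]

def sqrt_pool(codes):
    """Column-wise square root of the mean of squared values."""
    return [math.sqrt(sum(v * v for v in col) / len(col)) for col in zip(*codes)]

# Hypothetical sparse codes: 3 patches (rows) x 4 dictionary atoms (columns).
codes = [
    [0.0, 3.0, 0.0, 1.0],
    [4.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0],
]

max_feature = max_pool(codes)       # one value per dictionary atom
avg_feature = average_pool(codes)
sqrt_feature = sqrt_pool(codes)
```

Whatever the number of patches, each function returns a single vector whose length equals the dictionary size, which is how pooling turns a variable-size set of codes into a fixed-length image feature.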

Conclusion: In our experiments, the three pooling functions perform comparably well in feature dimension reduction. Undersampling with majority vote is shown to be effective in tackling the problem of imbalanced data. Moreover, combining sparse coding with the image-level scheme leads to consistent performance improvement in keyword annotation.
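
Undersampling with majority vote can be sketched as follows for a single keyword. The abstract does not name the base classifier, so this sketch substitutes a simple nearest-centroid learner as a stand-in. It also assumes that the positive class is the minority and that each model is trained on all positives plus an equally sized random subset of negatives. All names and data are hypothetical.

```python
import random

def centroid(rows):
    """Mean vector of a list of feature vectors."""
    return [sum(col) / len(col) for col in zip(*rows)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train_nearest_centroid(pos, neg):
    """Stand-in base learner (an assumption): one centroid per class."""
    return centroid(pos), centroid(neg)

def predict_one(model, x):
    pos_c, neg_c = model
    return 1 if sq_dist(x, pos_c) < sq_dist(x, neg_c) else 0

def train_undersampled_ensemble(pos, neg, n_models=5, seed=0):
    """Train n_models learners, each on all positives and a random,
    equally sized subset of the (majority) negatives."""
    rng = random.Random(seed)
    k = min(len(pos), len(neg))
    return [train_nearest_centroid(pos, rng.sample(neg, k))
            for _ in range(n_models)]

def majority_vote(models, x):
    """Predict 1 only if more than half of the models vote 1."""
    votes = sum(predict_one(m, x) for m in models)
    return 1 if 2 * votes > len(models) else 0

# Hypothetical imbalanced data: 2 positives, 10 negatives.
pos = [[5.0, 5.0], [6.0, 5.0]]
neg = [[i * 0.1, 0.0] for i in range(10)]

models = train_undersampled_ensemble(pos, neg)
label_near_pos = majority_vote(models, [5.5, 5.0])
label_near_neg = majority_vote(models, [0.2, 0.1])
```

Each model sees a balanced training set, and voting across several random subsets reduces the dependence on any single draw of negatives.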

Figures

Figure 1
The training process of the group-level scheme and the image-level scheme. In the training process, one group of images is considered as one sample in the group-level scheme, while each image within a group is treated as a sample in the image-level scheme.
Figure 2
The testing process of the group-level scheme and the image-level scheme. In the testing stage, the group-level model will provide prediction for each group of images, while the image-level model will predict keywords for each image within this group. In addition, in the image-level scheme, we make a union of all the predicted keywords in this group, which will form the final prediction for this group of images.
Figure 3
The proposed framework of Drosophila gene pattern annotation. Given an image, the SIFT detector is utilized to generate descriptors for local patches of this image. After SIFT detection, the BoW or sparse coding is applied to transform the descriptors into representers. Then the pooling functions map the representers into features. We then use these features to perform keyword annotation via different schemes.
Figure 4
Average vs majority vote for undersamplings. The y-axis of the figure indicates the AUC.
Figure 5
Image-level scheme with union vs image-level scheme without union for all stage ranges. The y-axis of the figure indicates the AUC.
Figure 6
Group-level scheme vs image-level scheme with union for all stage ranges. The y-axis of the figure indicates the AUC.
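
The union step described in the Figure 2 caption, where the image-level scheme merges the keywords predicted for every image in a group, can be sketched as below. The function name and the placeholder keywords are hypothetical; real keywords would be anatomical terms.

```python
def group_prediction(image_predictions):
    """Union of the keyword sets predicted for each image in one group."""
    keywords = set()
    for predicted in image_predictions:
        keywords |= set(predicted)
    return sorted(keywords)  # sorted only to make the output deterministic

# Hypothetical per-image predictions for one group of three images.
per_image = [
    ["keyword_a", "keyword_b"],
    ["keyword_b", "keyword_c"],
    [],  # an image for which no keyword was predicted
]

final_keywords = group_prediction(per_image)
```

Under this rule, a keyword predicted for any single image is assigned to the whole group, so the group-level output can be compared directly with the group-level scheme's prediction.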
