BMC Genomics. 2010 Jan 15;11:35.
doi: 10.1186/1471-2164-11-35.

Assessing functional annotation transfers with inter-species conserved coexpression: application to Plasmodium falciparum

Laurent Bréhélin et al. BMC Genomics. 2010.

Abstract

Background: Plasmodium falciparum is the main causative agent of malaria. Of the 5,484 predicted genes of P. falciparum, about 57% lack sufficient sequence similarity to characterized genes in other species to warrant functional assignments. Non-homology methods are thus needed to obtain functional clues for these uncharacterized genes. In recent years, gene expression data have been widely used to support functional annotation within a single species via the so-called Guilt By Association (GBA) principle.

Results: We propose a new method that uses gene expression data to assess inter-species annotation transfers. Our approach starts from a set of likely orthologs between a reference species (here S. cerevisiae or D. melanogaster) and a query species (P. falciparum). It aims to identify clusters of coexpressed genes in the query species whose coexpression has been conserved in the reference species. These conserved clusters of coexpressed genes are then used to assess annotation transfers between genes with low sequence similarity, enabling reliable transfers of annotations from the reference to the query species. The approach was applied to transcriptomic data sets of P. falciparum, S. cerevisiae and D. melanogaster, and enabled us to propose with high confidence new or refined annotations for several dozen hypothetical or putative P. falciparum genes. Notably, we revised the annotation of genes related to ribosomal proteins and to ribosome biogenesis and assembly, thus highlighting several potential drug targets.
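The idea of conserved coexpression can be illustrated with a deliberately simplified sketch. It is not the authors' probabilistic model: it replaces clustering with a pairwise Pearson-correlation test, and the function names, the toy profiles and the 0.8 threshold are all hypothetical choices made for illustration. Two putative ortholog pairs are kept only if the query genes are coexpressed and their reference counterparts are coexpressed as well.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a flat profile carries no coexpression signal
    return cov / (sx * sy)


def conserved_pairs(orthologs, query_expr, ref_expr, threshold=0.8):
    """Return query-gene pairs whose coexpression also holds in the reference.

    orthologs maps each query gene to its putative reference ortholog.
    The threshold is an illustrative assumption, not a value from the paper.
    """
    kept = []
    for (q1, r1), (q2, r2) in combinations(sorted(orthologs.items()), 2):
        coexpressed_in_query = pearson(query_expr[q1], query_expr[q2]) >= threshold
        coexpressed_in_ref = pearson(ref_expr[r1], ref_expr[r2]) >= threshold
        if coexpressed_in_query and coexpressed_in_ref:
            kept.append((q1, q2))
    return kept


# Toy data (hypothetical gene names and values).
query_expr = {"qA": [1, 2, 3, 4], "qB": [2, 4, 6, 8.5], "qC": [4, 3, 2, 1]}
ref_expr = {"rA": [1, 2, 3], "rB": [2, 4, 7], "rC": [1, 2, 3.5]}
orthologs = {"qA": "rA", "qB": "rB", "qC": "rC"}

print(conserved_pairs(orthologs, query_expr, ref_expr))
```

In this toy example, qC is anticorrelated with qA and qB in the query species even though rC tracks rA and rB in the reference, so only the (qA, qB) pair counts as conserved coexpression.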

Conclusions: Our approach uses both sequence similarity and gene expression data to support inter-species gene annotation transfers. Experiments show that this strategy improves on the accuracy achieved with sequence similarity alone and outperforms the GBA approach. In addition, our experiments with P. falciparum show that it can infer a function for numerous hypothetical genes.

Figures

Figure 1
The probabilistic model. For each species, only three prototypes are represented here, while several dozen are usually used. Prototypes are modeled with multivariate Gaussian models, represented here by series of means and standard deviations. P. falciparum prototypes are labeled with their prior probabilities. Prior probabilities of the prototypes of the reference species are on the outgoing transitions from the mute state M. Direct transitions corresponding to evolutionary conservation between prototypes are in bold.
Figure 2
Estimate of the method accuracy with GO: D. melanogaster vs. S. cerevisiae. Accuracy achieved on the MF (left) and BP (right) ontologies when using sequence information alone (orange curves, for different e-value cutoffs) and when exploiting the expression context (blue curves, for different ξ thresholds). The x-axis shows the number of gene pairs authorizing annotation transfers. (a) Results achieved on RBH gene pairs. (b) Results achieved on gene pairs with potentially low sequence similarity.
Figure 3
Estimate of the method accuracy with GO: P. falciparum vs. S. cerevisiae. Accuracies achieved on the Le Roch-Gasch (orange curve) and Bozdech-Spellman (blue curve) comparisons, estimated on the MF (left) and BP (right) ontologies with different ξ thresholds (from left to right on each curve: 40, 20, 10, 5, 2, 1). The x-axis shows the number of gene pairs authorizing annotation transfers. The brown curves show the accuracy achieved when associating each gene of P. falciparum with its best BLAST hit in S. cerevisiae, for different e-value thresholds.
Figure 4
Comparison with a GBA approach. Accuracy achieved by the OPI approach (brown curves) on the MF (left) and BP (right) ontologies when using different FDR thresholds. The orange and blue curves show the accuracy achieved on the Le Roch-Gasch and Bozdech-Spellman experiments, respectively.
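The Figure 1 caption describes prototypes as Gaussian models summarized by per-condition means and standard deviations, each with a prior probability. A minimal sketch of how an expression profile could be assigned to such prototypes is given below. It assumes independent conditions (a diagonal covariance), which the caption suggests but does not state, and it ignores the mute state and the transitions between species; the function names and toy values are hypothetical.

```python
from math import exp, log, pi


def log_gaussian_diag(x, means, sds):
    """Log-density of profile x under a Gaussian with independent conditions."""
    return sum(
        -0.5 * log(2 * pi * s * s) - (v - m) ** 2 / (2 * s * s)
        for v, m, s in zip(x, means, sds)
    )


def prototype_posteriors(x, prototypes):
    """Posterior probability of each prototype given profile x.

    prototypes is a list of (prior, means, sds) tuples. The diagonal
    covariance is an assumption of this sketch, not a detail from the paper.
    """
    log_joint = [log(p) + log_gaussian_diag(x, m, s) for p, m, s in prototypes]
    top = max(log_joint)  # subtract the maximum for numerical stability
    weights = [exp(lj - top) for lj in log_joint]
    total = sum(weights)
    return [w / total for w in weights]


# Toy example: two prototypes over two conditions (hypothetical values).
prototypes = [
    (0.5, [0.0, 0.0], [1.0, 1.0]),
    (0.5, [5.0, 5.0], [1.0, 1.0]),
]
posteriors = prototype_posteriors([0.2, -0.1], prototypes)
print(posteriors)
```

A profile close to the first prototype's means receives almost all of the posterior mass for that prototype, which is the behavior a mixture of Gaussian prototypes is expected to show.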
