PLoS One. 2013 Apr 25;8(4):e62204.
doi: 10.1371/journal.pone.0062204. Print 2013.

Bioinformatics analysis identify novel OB fold protein coding genes in C. elegans

Daryanaz Dargahi et al. PLoS One. 2013.

Abstract

Background: The C. elegans genome has been extensively annotated by the WormBase consortium, which uses state-of-the-art bioinformatics pipelines, functional genomics and manual curation. As a result, identifying novel genes in silico in this model organism is becoming more challenging and requires new approaches. The oligonucleotide/oligosaccharide-binding (OB) fold is a highly divergent protein family in which protein sequences, despite having the same fold, share very little sequence identity (5-25%). Evidence from sequence-based annotation may therefore not be sufficient to identify all members of this family. In C. elegans, the number of OB-fold proteins reported is remarkably low (n=46) compared with other evolutionarily related eukaryotes, such as the yeast S. cerevisiae (n=344) or the fruit fly D. melanogaster (n=84). Gene loss during evolution or differences in the level of annotation for this protein family may explain these discrepancies.
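To make the 5-25% figure concrete, the following minimal sketch computes percent identity for a pair of already-aligned sequences. The function name, the choice of denominator (columns where neither sequence has a gap) and the toy sequences are assumptions made for illustration; they are not taken from the paper, and other definitions of percent identity use a different denominator.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences carry a residue ('-' marks a gap).
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Toy aligned fragments (made up for illustration, not real OB-fold sequences).
toy_a = "MKVALSTAGE"
toy_b = "MRIQWDNPGK"
identity = percent_identity(toy_a, toy_b)
```

With these toy fragments, two of ten aligned positions match, which falls inside the low-identity range described above, where pairwise sequence comparison alone gives little evidence of a shared fold.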

Methodology/principal findings: This study examines the possibility that novel OB-fold coding genes exist in the worm. We developed a bioinformatics approach that uses the most sensitive sequence-sequence, sequence-profile and profile-profile similarity search methods, followed by 3D-structure prediction as a filtering step to eliminate false-positive candidate sequences. We predicted 18 coding genes containing the OB-fold which, remarkably, have been only partially characterized in C. elegans.
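The two-stage logic described here (pool candidates from several similarity searches, then discard those without a convincing 3D model) can be sketched as follows. This is an illustrative outline only: the `Hit` record, the function names, the E-value cutoff and the model-score threshold are all hypothetical and are not the authors' tools or parameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    gene_id: str   # hypothetical gene identifier
    method: str    # "seq-seq", "seq-profile" or "profile-profile"
    evalue: float  # E-value reported by that search


def collect_candidates(hits, max_evalue=1e-3):
    """Pool genes reported by any search method under an assumed E-value cutoff."""
    candidates = {}
    for hit in hits:
        if hit.evalue <= max_evalue:
            candidates.setdefault(hit.gene_id, set()).add(hit.method)
    return candidates


def structure_filter(candidates, model_scores, min_score=0.5):
    """Keep candidates whose predicted 3D model reaches an assumed quality score."""
    return sorted(g for g in candidates if model_scores.get(g, 0.0) >= min_score)


# Made-up search results and model-quality scores.
hits = [
    Hit("geneA", "seq-seq", 1e-10),
    Hit("geneB", "profile-profile", 1e-4),
    Hit("geneB", "seq-profile", 1e-5),
    Hit("geneC", "seq-profile", 0.5),  # fails the E-value cutoff
]
model_scores = {"geneA": 0.9, "geneB": 0.3}

candidates = collect_candidates(hits)
predicted = structure_filter(candidates, model_scores)
```

In this toy run, `geneC` is dropped at the search stage and `geneB` at the structure stage, which mirrors the role the abstract assigns to 3D-structure prediction: removing false positives that sensitive similarity searches let through.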

Conclusions/significance: This study raises the possibility that the annotation of highly divergent protein fold families can be improved in C. elegans. Similar strategies could be implemented for large-scale analysis by the WormBase consortium when new versions of the genome sequence of C. elegans, or of other evolutionarily related species, are released. This approach is of general interest to the scientific community since it can be used to annotate any genome.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. Superimposition of the novel OB-fold 3D models with their templates.
(Light blue) Predicted 3D models; (Wheat) PDB template. Names of the form (.XXXX.)-nxxx correspond to the protein name followed by the PDB code of the template.
Figure 2. Discovery pipeline of novel OB-fold protein coding genes.
It contains three discovery modules: SeqDIM, the Sequence alignment DIscovery Module; StrucDIM, the 3D Structure prediction Discovery Module; and FuncDIM, the Functional prediction Discovery Module.
