Nucleic Acids Res. 2009 Jul;37(Web Server issue):W101-5.
doi: 10.1093/nar/gkp327. Epub 2009 May 8.

Orphelia: predicting genes in metagenomic sequencing reads

Katharina J Hoff et al. Nucleic Acids Res. 2009 Jul.

Abstract

Metagenomic sequencing projects yield numerous sequencing reads from a diverse range of uncultivated and mostly as yet unknown microorganisms. In many cases, these sequencing reads cannot be assembled into longer contigs. Thus, gene prediction tools that were originally developed for whole-genome analysis are not suitable for processing metagenomes. Orphelia is a program for predicting genes in short DNA sequences that is available through a web server application (http://orphelia.gobics.de). Orphelia utilizes prediction models that were created with machine learning techniques on the basis of a wide range of annotated genomes. In contrast to other methods for metagenomic gene prediction, Orphelia has fragment length-specific prediction models for the two most popular sequencing techniques in metagenomics, chain-termination sequencing and pyrosequencing. These models ensure highly specific gene predictions.

Figures

Figure 1. Orphelia's ORF scoring model. In Step 1, 7 ORF/fragment features are computed. Step 2 calculates a final gene probability, combining the features by means of a neural network.
Figure 2. Screenshot of the Orphelia web server application submission page.
Figure 3. Venn diagram of the number of million nucleotides predicted as protein encoding by FGENESB, Orphelia (Net700) and MetaGene in the hypersaline microbial mat metagenome samples.
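
The two-step scoring described in the Figure 1 caption (compute seven ORF/fragment features, then combine them into one gene probability with a neural network) can be illustrated with a minimal sketch. This is not Orphelia's implementation: the network shape (one small hidden layer with sigmoid units), the function name `gene_probability`, and all weights and feature values below are hypothetical placeholders chosen only to show the data flow.

```python
import math
from typing import Sequence

N_FEATURES = 7  # the Figure 1 caption states that 7 ORF/fragment features are computed


def sigmoid(x: float) -> float:
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gene_probability(
    features: Sequence[float],
    hidden_weights: Sequence[Sequence[float]],
    hidden_biases: Sequence[float],
    output_weights: Sequence[float],
    output_bias: float,
) -> float:
    """Combine ORF/fragment features into a single probability-like score.

    Assumed architecture (hypothetical): one hidden layer of sigmoid units
    followed by a single sigmoid output unit.
    """
    if len(features) != N_FEATURES:
        raise ValueError(f"expected {N_FEATURES} features, got {len(features)}")
    # Hidden layer: each unit is a weighted sum of the features plus a bias.
    hidden = [
        sigmoid(sum(w * f for w, f in zip(row, features)) + b)
        for row, b in zip(hidden_weights, hidden_biases)
    ]
    # Output unit: weighted sum of the hidden activations, squashed into (0, 1).
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)


# Placeholder parameters (hypothetical, not trained on any genome).
HIDDEN_W = [[0.5] * N_FEATURES, [-0.25] * N_FEATURES]
HIDDEN_B = [0.0, 0.1]
OUT_W = [1.0, -1.0]
OUT_B = 0.0

# Placeholder feature vector for one candidate ORF (hypothetical values).
example_features = [0.2] * N_FEATURES
score = gene_probability(example_features, HIDDEN_W, HIDDEN_B, OUT_W, OUT_B)
print(f"gene probability (illustrative only): {score:.3f}")
```

In a real predictor, the weights would come from training on annotated genomes, and a separate parameter set could be kept for each fragment length, which is the role the abstract assigns to the fragment length-specific models.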

Publication types