New assembly, reannotation and analysis of the Entamoeba histolytica genome reveal new genomic features and protein content information
- PMID: 20559563
- PMCID: PMC2886108
- DOI: 10.1371/journal.pntd.0000716
Abstract
Background: To keep genome information accurate and relevant, original genome annotations need to be evaluated and updated regularly. Manual reannotation of genomes is important because it can significantly reduce the propagation of errors and consequently diminish the time spent on misdirected research. For this reason, five years after the initial submission of the Entamoeba histolytica draft genome publication, we have re-examined the original 23 Mb assembly and the annotation of the predicted genes.
Principal findings: The evaluation of the genomic sequence led to the identification of more than one hundred artifactual tandem duplications that were eliminated by re-assembling the genome. The reannotation was done using a combination of manual and automated genome analysis. The new 20 Mb assembly contains 1,496 scaffolds and 8,201 predicted genes, of which 60% are identical to the initial annotation and the remaining 40% underwent structural changes. Functional classification of 60% of the genes was modified based on recent sequence comparisons and new experimental data. We have assigned putative function to 3,788 proteins (46% of the predicted proteome) based on the annotation of predicted gene families, and have identified 58 protein families of five or more members that share no homology with known proteins and thus could be Entamoeba-specific. Genome analysis also revealed new features such as the presence of segmental duplications of up to 16 kb flanked by inverted repeats, and the tight association of some gene families with transposable elements.
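As a quick consistency check on the figures quoted above, the following minimal Python sketch recomputes the annotated fraction of the proteome and converts the rounded 60%/40% split into approximate gene counts. Only the values 8,201, 3,788, 60% and 40% come from the abstract; the variable and function names are illustrative, and the derived gene counts are estimates from rounded percentages, not numbers reported by the authors.

```python
# Figures quoted in the abstract.
TOTAL_GENES = 8201             # predicted genes in the new assembly
PROTEINS_WITH_FUNCTION = 3788  # proteins assigned a putative function


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


def approx_count(fraction, whole):
    """Convert a rounded fraction into an approximate whole-number count."""
    return round(fraction * whole)


annotated_pct = percent(PROTEINS_WITH_FUNCTION, TOTAL_GENES)

# Assumption: the 60% / 40% split is rounded in the abstract, so these
# counts are only approximate.
unchanged_genes = approx_count(0.60, TOTAL_GENES)
changed_genes = TOTAL_GENES - unchanged_genes

print(f"Proteins with putative function: {annotated_pct:.1f}% of the proteome")
print(f"Genes identical to the initial annotation: about {unchanged_genes}")
print(f"Genes with structural changes: about {changed_genes}")
```

The recomputed fraction rounds to the 46% stated in the abstract.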
Significance: This new genome annotation and analysis represent a more refined and accurate blueprint of the pathogen genome, and provide an upgraded reference tool for the study of many important aspects of E. histolytica biology, such as genome evolution and pathogenesis.
Conflict of interest statement
The authors have declared that no competing interests exist.
