BMC Microbiol. 2009 Nov 28:9:243.

doi: 10.1186/1471-2180-9-243.

Computational prediction of essential genes in an unculturable endosymbiotic bacterium, Wolbachia of Brugia malayi

Alexander G Holman¹, Paul J Davis, Jeremy M Foster, Clotilde K S Carlow, Sanjay Kumar

PMID: 19943957
PMCID: PMC2794283
DOI: 10.1186/1471-2180-9-243

Alexander G Holman et al. BMC Microbiol. 2009.

Affiliation

¹ New England Biolabs, 240 County Road, Ipswich, MA 01938-2723, USA. holman@neb.com

Abstract

Background: Wolbachia (wBm) is an obligate endosymbiotic bacterium of Brugia malayi, a parasitic filarial nematode of humans and one of the causative agents of lymphatic filariasis. New drugs against filarial parasites such as B. malayi are urgently needed. Because wBm is required for B. malayi development and fertility, targeting wBm is a promising approach. However, neither the B. malayi nor the wBm lifecycle can be maintained in vitro. To facilitate selection of potential drug targets, we computationally ranked the genes of the wBm genome by confidence that each is essential for the survival of the bacterium.

Results: wBm protein sequences were aligned using BLAST to the Database of Essential Genes (DEG) version 5.2, a collection of 5,260 experimentally identified essential genes in 15 bacterial strains. A confidence score, the Multiple Hit Score (MHS), was developed to predict each wBm gene's essentiality based on the top alignments to essential genes in each bacterial strain. This method was validated using a jackknife methodology to test the ability to recover known essential genes in a control genome. A second estimation of essentiality, the Gene Conservation Score (GCS), was calculated on the basis of phyletic conservation of genes across Wolbachia's parent order Rickettsiales. Clusters of orthologous genes were predicted within the 27 currently available complete genomes. Druggability of wBm proteins was predicted by alignment to a database of protein targets of known compounds.
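A score built from the best alignment per organism can be sketched as follows. The abstract does not state the MHS formula, so the scoring rule here (the capped −log₁₀ e-value of the best hit in each organism, averaged over all organisms) is an assumption for illustration only. The cap of 200 is borrowed from the Figure 2 axis, and the function names, gene names, and organism names are hypothetical.

```python
import math
from collections import defaultdict

MAX_NEG_LOG_E = 200.0  # cap borrowed from the Figure 2 axis; an assumption here


def neg_log10_evalue(evalue: float) -> float:
    """Return -log10(e-value), clamped to the range [0, MAX_NEG_LOG_E]."""
    if evalue <= 0.0:  # BLAST reports 0.0 for extremely strong hits
        return MAX_NEG_LOG_E
    return min(max(-math.log10(evalue), 0.0), MAX_NEG_LOG_E)


def multi_hit_scores(hits, n_organisms):
    """Score each query gene from (query_gene, organism, evalue) tuples.

    Hypothetical rule, not the paper's formula: keep the strongest hit per
    organism, sum the capped -log10 e-values, and divide by the maximum
    possible total so the score lies in [0, 1]. Organisms with no hit
    contribute 0.
    """
    best = defaultdict(dict)  # gene -> organism -> strongest hit seen so far
    for gene, organism, evalue in hits:
        strength = neg_log10_evalue(evalue)
        if strength > best[gene].get(organism, 0.0):
            best[gene][organism] = strength
    return {
        gene: sum(per_org.values()) / (n_organisms * MAX_NEG_LOG_E)
        for gene, per_org in best.items()
    }


# Made-up alignments for illustration; not data from the paper.
toy_hits = [
    ("geneA", "org1", 1e-50),
    ("geneA", "org1", 1e-10),   # weaker hit to the same organism, ignored
    ("geneA", "org2", 1e-100),
    ("geneB", "org1", 1e-5),
]
scores = multi_hit_scores(toy_hits, n_organisms=2)
ranked = sorted(scores, key=scores.get, reverse=True)  # strongest first
```

Taking only the best hit per organism keeps a gene with many paralogous hits in one genome from outscoring a gene with hits spread across many genomes.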

Conclusion: Ranking wBm genes by either MHS or GCS predicts and prioritizes potentially essential genes. Comparison of the MHS to GCS produces quadrants representing four types of predictions: those with high confidence of essentiality by both methods (245 genes), those highly conserved across Rickettsiales (299 genes), those similar to distant essential genes (8 genes), and those with low confidence of essentiality (253 genes). These data facilitate selection of wBm genes for entry into drug design pipelines.
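The quadrant assignment can be illustrated with a minimal sketch that uses the two thresholds quoted in the Figure 5 caption (7.3 × 10⁻³ for normalized MHS, 29 for GCS). Whether a score exactly on a threshold counts as high is not stated, so the `>=` comparison is an assumption. The quadrant labels and the toy scores are hypothetical.

```python
from collections import Counter

MHS_THRESHOLD = 7.3e-3  # normalized MHS cutoff quoted in the Figure 5 caption
GCS_THRESHOLD = 29      # GCS cutoff quoted in the Figure 5 caption


def quadrant(mhs: float, gcs: float) -> str:
    """Assign a gene to one of the four prediction types.

    Treating a score equal to the threshold as "high" is an assumption.
    """
    high_mhs = mhs >= MHS_THRESHOLD
    high_gcs = gcs >= GCS_THRESHOLD
    if high_mhs and high_gcs:
        return "both"            # high confidence by both methods
    if high_gcs:
        return "gcs_only"        # conserved across Rickettsiales
    if high_mhs:
        return "mhs_only"        # similar to distant essential genes
    return "low_confidence"


# Made-up (MHS, GCS) pairs for illustration; not values from the paper.
toy_genes = {
    "geneA": (0.20, 40),
    "geneB": (0.001, 35),
    "geneC": (0.05, 3),
    "geneD": (0.0001, 1),
}
counts = Counter(quadrant(m, g) for m, g in toy_genes.values())
```

As a consistency check on the reported figures, the four quadrant counts (245 + 299 + 8 + 253) sum to 805, the number of protein-coding genes given in the figure captions.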

Figures

**Figure 1**
**Distribution of MHS values by rank in wBm**. The X-axis indicates the 805 protein-coding genes in the wBm genome, ranked by MHS. The Y-axis shows the value of the MHS for each protein.

**Figure 2**
**E-values of the BLAST alignments producing the top 20 MHS**. The black bars indicate the e-value of the best alignment to each organism within DEG. The Y-axis is a linear scale of the negative log₁₀ of the e-value, ranging from 1 to a maximal alignment of 200. The X-axis bins correspond to the 15 organisms contained within DEG.

**Figure 3**
**Essential gene prediction by MHS was validated through a jackknife methodology**. For each organism within DEG, the ability of the MHS to place experimentally validated essential genes at the top of a ranked genome was evaluated. All graphs correspond to the schematic found in the upper left. The X-axis represents the ranked genome of the organism, ordered from left to right from strongest to weakest prediction of essentiality. The Y-axis is the cumulative count of essential genes encountered moving left to right through the ranked genome. Line A is the ideal sorting, in which all essential genes are placed at the top of the ranking. Line B is the sorting by MHS. Lines C are 10 random assortments of the genome. The percent sorting achieved by MHS and the p-value for the difference between the MHS ranking (B) and 1000 random assortments such as those in C are shown in the lower right. Graphs are ordered by descending genome size of the organism. *E. coli*, *F. novicida*, and *M. genitalium* show 10, 2, and 2 fewer total essential genes, respectively, than shown in Table 1 because the corresponding DEG genes could not be resolved to genomic genes and were omitted from the jackknife analysis.

**Figure 4**
**Distribution of GCS in wBm**. The X-axis indicates the 805 protein-coding genes in the wBm genome, ranked by GCS. The Y-axis shows the value of the GCS for each protein.

**Figure 5**
**Comparison of the prediction of wBm gene essentiality by MHS and GCS**. The X-axis shows normalized MHS on a log scale, while the Y-axis shows GCS. Grey lines indicate empirically determined thresholds for confidence in prediction of essentiality and are set at 7.3 × 10⁻³ for the MHS and 29 for the GCS. Therefore, the upper right quadrant contains genes with high confidence by both metrics. The upper left quadrant contains genes identified only by GCS, while the bottom right quadrant contains genes identified only by MHS. The numbers adjacent to the quadrant lines indicate gene counts in each quadrant. Red dots indicate Wolbachia genes which have significant protein sequence similarity to the targets of approved drugs and are predicted to be druggable.

**Figure 6**
**Number of essential genes versus total number of RefSeq genes**. •: DEG organisms (*V. cholerae* omitted as an outlier). △: wBm essential gene prediction by MHS. ▽: wBm essential gene prediction by GCS.
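The cumulative-count curves described for Figure 3 can be sketched as follows, together with a comparison against random orderings. The captions do not define how the p-value or the percent sorting were computed, so the test statistic used here (the summed cumulative curve) is an assumption. The function names and the toy ranking are hypothetical.

```python
import random
from itertools import accumulate


def cumulative_essential(ranked_flags):
    """Running count of essential genes through a ranked genome.

    ranked_flags is a list of bools, strongest prediction first, where
    True marks an experimentally essential gene (line B in Figure 3).
    """
    return list(accumulate(int(flag) for flag in ranked_flags))


def curve_area(ranked_flags):
    """Sum of the cumulative curve; larger means essentials rank earlier."""
    return sum(cumulative_essential(ranked_flags))


def permutation_p_value(ranked_flags, n_shuffles=1000, seed=0):
    """Fraction of random orderings whose area is at least the observed one.

    Using the area as the test statistic is an assumption, not the paper's
    stated procedure. The +1 terms keep the estimate from being exactly 0.
    """
    rng = random.Random(seed)
    observed = curve_area(ranked_flags)
    flags = list(ranked_flags)  # copy so the caller's list is not shuffled
    at_least = 0
    for _ in range(n_shuffles):
        rng.shuffle(flags)
        if curve_area(flags) >= observed:
            at_least += 1
    return (at_least + 1) / (n_shuffles + 1)


# Toy genome of 20 genes: 5 essential, 4 of them ranked in the top 5.
toy_ranking = [True, True, True, False, True] + [False] * 14 + [True]
curve = cumulative_essential(toy_ranking)                        # like line B
ideal = cumulative_essential(sorted(toy_ranking, reverse=True))  # like line A
p_value = permutation_p_value(toy_ranking)
```

Fixing the seed makes the shuffles reproducible, which helps when comparing rankings produced by different scoring schemes.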

Cited by

The genome of the heartworm, Dirofilaria immitis, reveals drug and vaccine targets.
Godel C, Kumar S, Koutsovoulos G, Ludin P, Nilsson D, Comandatore F, Wrobel N, Thompson M, Schmid CD, Goto S, Bringaud F, Wolstenholme A, Bandi C, Epe C, Kaminsky R, Blaxter M, Mäser P. FASEB J. 2012 Nov;26(11):4650-61. doi: 10.1096/fj.12-205096. Epub 2012 Aug 13. PMID: 22889830 Free PMC article.
Training set selection for the prediction of essential genes.
Cheng J, Xu Z, Wu W, Zhao L, Li X, Liu Y, Tao S. PLoS One. 2014 Jan 22;9(1):e86805. doi: 10.1371/journal.pone.0086805. eCollection 2014. PMID: 24466248 Free PMC article.
A Novel Computational Approach for Identifying Essential Proteins From Multiplex Biological Networks.
Zhao B, Hu S, Liu X, Xiong H, Han X, Zhang Z, Li X, Wang L. Front Genet. 2020 Apr 21;11:343. doi: 10.3389/fgene.2020.00343. eCollection 2020. PMID: 32373163 Free PMC article.
Putative essential and core-essential genes in Mycoplasma genomes.
Lin Y, Zhang RR. Sci Rep. 2011;1:53. doi: 10.1038/srep00053. Epub 2011 Aug 3. PMID: 22355572 Free PMC article.
A new computational strategy for predicting essential genes.
Cheng J, Wu W, Zhang Y, Li X, Jiang X, Wei G, Tao S. BMC Genomics. 2013 Dec 21;14:910. doi: 10.1186/1471-2164-14-910. PMID: 24359534 Free PMC article.
