BMC Microbiol. 2009 Nov 28;9:243.
doi: 10.1186/1471-2180-9-243.

Computational prediction of essential genes in an unculturable endosymbiotic bacterium, Wolbachia of Brugia malayi

Alexander G Holman et al. BMC Microbiol.

Abstract

Background: Wolbachia (wBm) is an obligate endosymbiotic bacterium of Brugia malayi, a parasitic filarial nematode of humans and one of the causative agents of lymphatic filariasis. There is a pressing need for new drugs against filarial parasites, such as B. malayi. As wBm is required for B. malayi development and fertility, targeting wBm is a promising approach. However, the lifecycle of neither B. malayi nor wBm can be maintained in vitro. To facilitate selection of potential drug targets we computationally ranked the wBm genome based on confidence that a particular gene is essential for the survival of the bacterium.

Results: wBm protein sequences were aligned using BLAST to the Database of Essential Genes (DEG) version 5.2, a collection of 5,260 experimentally identified essential genes in 15 bacterial strains. A confidence score, the Multiple Hit Score (MHS), was developed to predict each wBm gene's essentiality based on the top alignments to essential genes in each bacterial strain. This method was validated using a jackknife methodology to test the ability to recover known essential genes in a control genome. A second estimation of essentiality, the Gene Conservation Score (GCS), was calculated on the basis of phyletic conservation of genes across Wolbachia's parent order Rickettsiales. Clusters of orthologous genes were predicted within the 27 currently available complete genomes. Druggability of wBm proteins was predicted by alignment to a database of protein targets of known compounds.

Conclusion: Ranking wBm genes by either MHS or GCS predicts and prioritizes potentially essential genes. Comparison of the MHS to GCS produces quadrants representing four types of predictions: those with high confidence of essentiality by both methods (245 genes), those highly conserved across Rickettsiales (299 genes), those similar to distant essential genes (8 genes), and those with low confidence of essentiality (253 genes). These data facilitate selection of wBm genes for entry into drug design pipelines.
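The quadrant assignment described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the two cutoffs are the ones reported in the Figure 5 caption (7.3 × 10^-3 for normalized MHS, 29 for GCS), while the function name, the quadrant labels, and the choice to treat a score equal to the cutoff as "high" are assumptions made for the example.

```python
# Cutoffs as reported in the Figure 5 caption.
MHS_THRESHOLD = 7.3e-3  # normalized Multiple Hit Score
GCS_THRESHOLD = 29      # Gene Conservation Score

# Gene counts per quadrant as reported in the abstract.
REPORTED_COUNTS = {
    "both": 245,            # high confidence by MHS and GCS
    "gcs_only": 299,        # highly conserved across Rickettsiales
    "mhs_only": 8,          # similar to distant essential genes
    "low_confidence": 253,  # below both cutoffs
}


def classify_gene(mhs, gcs):
    """Assign a gene to one of the four quadrants.

    Assumption: a score exactly at a cutoff counts as high; the abstract
    does not say how ties are handled.
    """
    high_mhs = mhs >= MHS_THRESHOLD
    high_gcs = gcs >= GCS_THRESHOLD
    if high_mhs and high_gcs:
        return "both"
    if high_gcs:
        return "gcs_only"
    if high_mhs:
        return "mhs_only"
    return "low_confidence"


# Hypothetical scores, for illustration only.
print(classify_gene(mhs=0.05, gcs=40))   # both
print(classify_gene(mhs=0.001, gcs=40))  # gcs_only

# The four reported counts add up to the 805 protein coding genes.
print(sum(REPORTED_COUNTS.values()))     # 805
```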

Figures

Figure 1
Distribution of MHS values by rank in wBm. The X-axis indicates the 805 protein coding genes in the wBm genome, ranked by MHS. The Y-axis shows the value of the MHS for each protein.
Figure 2
E-values of the BLAST alignments producing the top 20 MHS. The black bars indicate the e-value of the best alignment to each organism within DEG. The y-axis is a linear scale of the negative log10 of the e-value, ranging from 1 to a maximal alignment of 200. The x-axis bins correspond to the 15 organisms contained within DEG.
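The axis transformation in this caption can be illustrated as follows. This is a sketch under stated assumptions: the caption gives only the range of the scale (1 to 200), so clamping values into that range, and mapping an e-value of zero to the maximum, are assumptions about how the range was enforced. The function name is hypothetical.

```python
import math


def evalue_bar_height(evalue, floor=1.0, cap=200.0):
    """Map a BLAST e-value to -log10(e-value), limited to [floor, cap].

    Assumptions (not stated in the caption): values outside the range are
    clamped, and an e-value reported as 0 is drawn at the cap.
    """
    if evalue <= 0.0:
        return cap
    return min(cap, max(floor, -math.log10(evalue)))


# Hypothetical e-values, for illustration only.
for e in (1e-50, 1e-250, 0.0, 5.0):
    print(e, evalue_bar_height(e))
```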
Figure 3
Essential gene prediction by MHS was validated through a jackknife methodology. For each organism within DEG, the ability of the MHS to place experimentally validated essential genes at the top of a ranked genome was evaluated. All graphs correspond to the schematic found in the upper left. The X-axis represents the ranked genome of the organism, ranked from left to right as strongest to weakest prediction of essentiality. The Y-axis is the cumulative count of essential genes encountered moving left to right through the ranked genome. Line A is the ideal sorting, in which all essential genes are placed at the top of the ranking. Line B is the sorting by MHS. Lines C are 10 random assortments of the genome. The percent sorting achieved by MHS, and the p-value for the difference between the MHS ranking (B) and 1000 random assortments such as those in C, are shown in the lower right. Graphs are ordered by descending genome size of the organism. E. coli, F. novicida, and M. genitalium show 10, 2 and 2 fewer total essential genes, respectively, than shown in Table 1 because the corresponding DEG genes could not be resolved to genomic genes and were omitted from the jackknife analysis.
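The curves in this figure can be reproduced in outline with a short sketch. The cumulative count (Y-axis) and the ideal ordering (line A) follow directly from the caption. The comparison against random assortments is only an assumed stand-in: the caption does not define the "percent sorting" statistic or the test behind the p-value, so the example uses the area under the cumulative curve and a simple permutation p-value. All function names and the toy ranking are hypothetical.

```python
import random


def cumulative_essential(ranked_flags):
    """Running count of essential genes, moving left to right through a
    genome ranked from strongest to weakest prediction of essentiality."""
    total = 0
    curve = []
    for is_essential in ranked_flags:
        total += 1 if is_essential else 0
        curve.append(total)
    return curve


def ideal_curve(ranked_flags):
    """Line A: all essential genes placed at the top of the ranking."""
    return cumulative_essential(sorted(ranked_flags, reverse=True))


def permutation_p_value(ranked_flags, n_shuffles=1000, seed=0):
    """Assumed test, not the paper's: fraction of random assortments whose
    area under the cumulative curve is at least the observed area."""
    rng = random.Random(seed)
    observed = sum(cumulative_essential(ranked_flags))
    flags = list(ranked_flags)
    at_least = 0
    for _ in range(n_shuffles):
        rng.shuffle(flags)
        if sum(cumulative_essential(flags)) >= observed:
            at_least += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least + 1) / (n_shuffles + 1)


# Toy ranking of 10 genes (True = experimentally essential), illustration only.
ranking = [True, True, True, False, True, False, False, False, False, False]
print(cumulative_essential(ranking))  # line B for the toy ranking
print(ideal_curve(ranking))           # line A for the toy ranking
print(permutation_p_value(ranking))
```

A ranking that front-loads essential genes gives a curve close to line A and a small p-value, while a ranking no better than chance falls among the random assortments (lines C).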
Figure 4
Distribution of GCS in wBm. The X-axis indicates the 805 protein coding genes in the wBm genome, ranked by GCS. The Y-axis shows the value of the GCS for each protein.
Figure 5
Comparison of the prediction of wBm gene essentiality by MHS and GCS. The X-axis shows normalized MHS on a log scale, while the Y-axis shows GCS. Grey lines indicate empirically determined thresholds for confidence in prediction of essentiality and are set at 7.3 × 10^-3 for the MHS and 29 for the GCS. Therefore, the upper right quadrant contains genes with high confidence by both metrics. The upper left quadrant contains genes identified only by GCS, while the bottom right quadrant contains genes identified only by MHS. The numbers adjacent to the quadrant lines indicate gene counts in each quadrant. Red dots indicate Wolbachia genes that have significant protein sequence similarity to the targets of approved drugs and are predicted to be druggable.
Figure 6
Number of essential genes versus total number of RefSeq genes. •: DEG organisms (V. cholerae omitted as an outlier). △: wBm essential gene prediction by MHS. ▽: wBm essential gene prediction by GCS.
