BMC Res Notes. 2011 Nov 18;4:504.
doi: 10.1186/1756-0500-4-504.

Predictive models for anti-tubercular molecules using machine learning on high-throughput biological screening datasets

Vinita Periwal et al. BMC Res Notes.

Abstract

Background: Tuberculosis is a contagious disease caused by Mycobacterium tuberculosis (Mtb) that affects more than two billion people around the globe and is one of the major causes of morbidity and mortality in the developing world. Recent reports suggest that Mtb has been developing resistance to the widely used anti-tubercular drugs, resulting in the emergence and spread of multidrug-resistant (MDR) and extensively drug-resistant (XDR) strains throughout the world. In view of this global epidemic, there is an urgent need for fast and efficient lead identification methodologies. Target-based screening of large compound libraries has been widely used as a fast and efficient approach for lead identification, but it is limited by the available knowledge of the target structure. Whole-organism screens, on the other hand, are target-agnostic and are now widely employed as an alternative for lead identification, but they are limited by the time and cost involved in running the screens for large compound libraries. This limitation could possibly be circumvented by using computational approaches to prioritize molecules for screening programmes.

Results: We utilized physicochemical properties of compounds to train four supervised classifiers (Naïve Bayes, Random Forest, J48 and SMO) on three publicly available bioassay screens of Mtb inhibitors and validated the robustness of the predictive models using various statistical measures.
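The abstract does not list the "various statistical measures" used, but the figures report sensitivity, specificity and ROC curves. As an illustration only, and not the authors' code, the following minimal Python sketch shows how these three measures could be computed for a binary active/inactive classifier. The function names, the 0.5 decision threshold and the toy labels and scores are all assumptions made for this example; they are not data from the study.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels, where 1 = active and 0 = inactive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true actives that the classifier recovers (true positive rate)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def specificity(tn: int, fp: int) -> float:
    """Fraction of true inactives that the classifier rejects (true negative rate)."""
    return tn / (tn + fp) if (tn + fp) else 0.0


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a randomly
    chosen active is scored above a randomly chosen inactive (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one active and one inactive compound")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 3 actives and 5 inactives with made-up classifier scores.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1, 0.05]
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]  # assumed 0.5 threshold

tp, fp, tn, fn = confusion_counts(toy_labels, toy_preds)
print("sensitivity:", sensitivity(tp, fn))
print("specificity:", specificity(tn, fp))
print("ROC AUC:", roc_auc(toy_labels, toy_scores))
```

Screening datasets typically contain far fewer actives than inactives, which is one reason to report sensitivity and specificity separately rather than overall accuracy alone.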

Conclusions: This study is a comprehensive analysis of high-throughput bioassay data for anti-tubercular activity and the application of machine learning approaches to create target-agnostic predictive models for anti-tubercular agents.

Figures

Figure 1. Sensitivity plot: sensitivity of each classifier for each dataset.
Figure 2. Specificity plot: specificity of each classifier for each dataset.
Figure 3. ROC plot: ROC curves of the Random Forest models for the three datasets.
Figure 4. Workflow for data collection, descriptor analysis, model building and validation.

