Review
Mol Divers. 2021 Aug;25(3):1315-1360.
doi: 10.1007/s11030-021-10217-3. Epub 2021 Apr 12.

Artificial intelligence to deep learning: machine intelligence approach for drug discovery


Rohan Gupta et al. Mol Divers. 2021 Aug.

Abstract

Drug design and development is an important area of research for pharmaceutical companies and chemical scientists. However, low efficacy, off-target delivery, long timelines, and high cost pose challenges for drug design and discovery. Large, complex data sets from genomics, proteomics, microarray studies, and clinical trials are a further obstacle in the drug discovery pipeline. Artificial intelligence and machine learning play a crucial role in drug discovery and development; in particular, artificial neural networks and deep learning algorithms have modernized the area. Machine learning and deep learning algorithms have been implemented in several drug discovery processes, such as peptide synthesis, structure-based virtual screening, ligand-based virtual screening, toxicity prediction, drug monitoring and release, pharmacophore modeling, quantitative structure-activity relationship, drug repositioning, polypharmacology, and physicochemical activity. Past evidence supports the implementation of artificial intelligence and deep learning in this field. Moreover, novel data mining, curation, and management techniques have provided critical support to recently developed modeling algorithms. In summary, advances in artificial intelligence and deep learning provide an excellent opportunity for rational drug design and discovery, which will eventually impact mankind.
The primary concerns associated with drug design and development are time consumption and production cost. Inefficiency, inaccurate target delivery, and inappropriate dosage are further hurdles that inhibit drug delivery and development. With advances in technology, computer-aided drug design that integrates artificial intelligence algorithms can help overcome the challenges of traditional drug design and development. Artificial intelligence is a superset that includes machine learning, and machine learning in turn comprises supervised learning, unsupervised learning, and reinforcement learning. Deep learning, a subset of machine learning, has been extensively implemented in drug design and development. Artificial neural networks, deep neural networks, support vector machines, classification and regression, generative adversarial networks, symbolic learning, and meta-learning are examples of the algorithms applied to the drug design and discovery process. Artificial intelligence has been applied to different areas of the drug design and development process, ranging from peptide synthesis to molecule design, virtual screening to molecular docking, quantitative structure-activity relationship to drug repositioning, protein misfolding to protein-protein interactions, and molecular pathway identification to polypharmacology. Artificial intelligence principles have been applied to the classification of active and inactive compounds, monitoring of drug release, pre-clinical and clinical development, primary and secondary drug screening, biomarker development, pharmaceutical manufacturing, identification of bioactivity and physicochemical properties, prediction of toxicity, and identification of mode of action.

Keywords: Artificial intelligence; Artificial neural networks; Computer-aided drug design; Deep learning; Drug design and discovery; Drug repurposing; Machine learning; Quantitative structure–activity relationship; Virtual screening.

Conflict of interest statement

There is no conflict of interest declared by the authors.

Figures

Fig. 1
a History of artificial intelligence in healthcare: the first breakthrough of artificial intelligence in healthcare came in 1950 with the development of the Turing test. Later, in 1975, the first research resource on computers in medicine was developed, followed by the NIH's first central AIM workshop, which marked the importance of artificial intelligence in healthcare. With the development of deep learning in the 2000s and the introduction of DeepQA in 2007, the scope of artificial intelligence in healthcare increased. Further, in 2010 CAD was applied to endoscopy for the first time, whereas in 2015 the first Pharmbot was developed. In 2017, the first FDA-approved cloud-based DL application was introduced, which also marked the implementation of artificial intelligence in healthcare. From 2018 to 2020, several AI trials in gastroenterology were performed.
b Classification of artificial intelligence: there are seven classifications of artificial intelligence, namely reasoning and problem solving, knowledge representation, planning and social intelligence, perception, machine learning, robotics (motion and manipulation), and natural language processing, as discussed by Russell and Norvig in their book “Artificial Intelligence: A Modern Approach.” Machine learning is further divided into three significant subsets: supervised learning, unsupervised learning, and deep learning. Vision is divided into two subsets: image recognition and machine vision. Similarly, speech is divided into two subsets: speech to text and text to speech. Natural language processing is classified into five main subsets: classification, machine translation, question answering, text generation, and content extraction.
c Artificial intelligence has five significant applications in the healthcare and pharmaceutical industry, which are changing the field: research and discovery, clinical development, manufacturing and supply chain, patient surveillance, and post-market surveillance
Fig. 2
Application of big data for drug design and discovery: with the increase in biological and chemical data from the literature, in vitro, in vivo, and clinical studies, genomics, proteomics, metabolomics, and gene ontology studies, and molecular pathway data, different data repositories have been developed. For instance, ChemSpider, ChEMBL, ZINC, BindingDB, and PubChem are essential databases for compound synthesis and screening in the drug design and discovery process. The data stored in these databases are curated and screened for the pharmacological and physicochemical properties of compounds needed in drug discovery, rather than for quantum mechanical quantities such as solvation energy, proton affinity, the wave function, atomic forces, and transition states. The high-throughput screening data are filtered based on drug-likeness, PAINS calculation, ADMET analysis, and toxicity. The filtered compounds are passed to artificial intelligence models such as deep learning, random forest, classification and regression, and neural networks for further analysis. These compounds are then subjected to quantitative structure-activity relationship and pharmacophore models, followed by molecular docking and molecular dynamics simulation studies. Afterward, the final predicted compounds are visualized for binding energy calculations and active site identification. The final compound is thus identified and undergoes in vitro and in vivo experimental studies for validation. Although quantum mechanical properties play a crucial role in drug discovery and design, they do not directly hamper the process. QM methods include ab initio, density functional theory, and semi-empirical calculations, where accurate calculations use electron correlation methods. QM will become a more prominent tool in the repertoire of the computational medicinal chemist, and modern QM approaches will therefore play a more direct role in informing and streamlining the drug-discovery process
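The drug-likeness filtration step in the Fig. 2 caption can be sketched as follows. The caption does not say which drug-likeness criterion is used; this sketch assumes Lipinski's rule of five (molecular weight at most 500 Da, logP at most 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors) with the common allowance of one violation. The `Compound` class, the function names, and the descriptor values are hypothetical and only illustrate the idea; in practice the descriptors would come from a cheminformatics toolkit or one of the databases named above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compound:
    """Hypothetical record holding precomputed descriptors for one compound."""
    name: str
    mol_weight: float   # molecular weight in Da
    logp: float         # octanol-water partition coefficient
    h_donors: int       # hydrogen-bond donors
    h_acceptors: int    # hydrogen-bond acceptors


def lipinski_violations(c: Compound) -> int:
    """Count how many of the four rule-of-five limits the compound exceeds."""
    return sum([
        c.mol_weight > 500,
        c.logp > 5,
        c.h_donors > 5,
        c.h_acceptors > 10,
    ])


def drug_like(compounds, max_violations=1):
    """Keep compounds with at most `max_violations` rule-of-five violations."""
    return [c for c in compounds if lipinski_violations(c) <= max_violations]


# Illustrative descriptor values only; these are not real compounds.
library = [
    Compound("cmpd_A", 320.4, 2.1, 2, 5),   # no violations
    Compound("cmpd_B", 612.7, 6.3, 3, 9),   # violates weight and logP limits
    Compound("cmpd_C", 540.2, 4.0, 1, 8),   # violates weight limit only
]

kept = drug_like(library)
print([c.name for c in kept])
```

A filter like this is cheap, so it is usually run before the more expensive PAINS, ADMET, and toxicity checks that the caption lists next.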
Fig. 3
Artificial intelligence in primary and secondary drug screening: in the drug discovery and design pipeline, screening of potential leads is crucial, and artificial intelligence plays a major role in identifying novel and potential lead compounds. There are approximately 106 million chemical structures present in chemical space from different studies such as OMIC studies, clinical and pre-clinical studies, in vivo assays, and microarray analysis. With machine learning models such as reinforcement models, logistic models, regression models, and generative models, these chemical structures are screened based on active sites, structure, and target binding ability. The complete drug discovery process through artificial intelligence will take about 14–18 years, which is comparatively less than the traditional drug discovery process. The first step in the drug discovery process is lead identification, in which a disease-modifying target protein is identified through reverse docking, bioinformatics analysis, and computational chemical biology. In the second step, primary screening of compounds is done to select potential lead compounds that can inhibit the target protein. This can be done through virtual screening and de novo design. The next step includes lead optimization and lead compound identification through focused library design, drug-likeness analysis, drug-target reproducibility, and computational biology. Afterward, secondary screening of compounds is performed, followed by pre-clinical trials. The final step of the drug discovery process is clinical development through cell-culture analysis, animal model experimentation, and patient analysis
Fig. 4
a Ligand-based virtual screening: in the drug design and discovery process, ligand-based virtual screening is the most crucial step and comprises the stages shown in the figure. The initial stage consists of database screening and prediction of the 3-D structural model from the active site of a specific target and the X-ray structures of complexes. Pharmacophore modeling of selected compounds with selected features is then performed, followed by pharmacophore-based and docking-based virtual screening of compounds. The screened compounds are evaluated for toxicity and physicochemical properties. Finally, the lead compounds are subjected to in vitro and in vivo bioassays for validation.
b Structure-based virtual screening: this is another type of virtual screening applied in the drug discovery process, where target structure preparation and chemical compound library preparation are the initial steps. Afterward, structural analysis and binding site prediction are done, followed by molecular docking of compounds with the selected target. Molecular dynamics simulation studies are then carried out to validate the screened compounds in silico, followed by experimental validation through bioassays
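The ligand-based idea in Fig. 4a, ranking library compounds by how closely they resemble a known active, can be illustrated with a small sketch. The caption describes pharmacophore-based and docking-based screening; the code below swaps in a simpler, widely used stand-in, Tanimoto similarity between binary fingerprints, purely for illustration. The fingerprints are hypothetical sets of "on" bit positions, and the names `tanimoto`, `screen`, and the 0.5 threshold are assumptions, not anything taken from the paper.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient: shared bits divided by bits set in either fingerprint."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def screen(query: frozenset, library: dict, threshold: float = 0.5):
    """Return (name, similarity) pairs at or above the threshold, best first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical fingerprints, written as sets of "on" bit positions.
query_fp = frozenset({1, 4, 7, 9})
library_fps = {
    "lig_1": frozenset({1, 4, 7, 9, 12}),   # shares 4 of 5 distinct bits
    "lig_2": frozenset({1, 4, 20, 21}),     # shares 2 of 6 distinct bits
    "lig_3": frozenset({4, 7, 9}),          # shares 3 of 4 distinct bits
}

hits = screen(query_fp, library_fps)
for name, score in hits:
    print(f"{name}: {score:.2f}")
```

Compounds that pass such a similarity screen would then move on to the toxicity, physicochemical, and bioassay stages described in the caption.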
Fig. 5
a Quantitative structure–activity relationship workflow: the initial step is data set compilation, where data from public databases and literature databases are collected, compiled, and divided into different subsets for investigation. Afterward, data set processing is performed, which covers data pre-processing and curation followed by calculation of molecular descriptors. After descriptor calculation, the data are normalized and split into different sets. In the third step, model construction is performed, where data sets such as internal data and external data are assembled and learning algorithms are applied for QSAR modeling. Statistical calculations are then done to measure model robustness. The final step in the quantitative structure–activity relationship workflow is model evaluation, where the model is evaluated by comparison with previous benchmark models, identification of characteristic features, performance evaluation, and interpretation of essential features.
b Drug repurposing or repositioning workflow: the first step is data collection and pre-processing, followed by computational model generation. The models generated are support vector machines, logistic regression, random forest, deep learning, and matrix factorization. Afterward, a proof-of-concept is generated from a literature source. The repositioning models are then evaluated through cross-validation, case analysis, and evaluation metrics. Finally, the repurposed drugs are validated through clinical trials, in vitro studies, and in vivo studies
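The QSAR steps in Fig. 5a (descriptor normalization, data splitting, model construction, statistical evaluation) can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors' method: it uses a single hypothetical descriptor, synthetic activity values, ordinary least squares as the learning algorithm, and the coefficient of determination (R²) on held-out compounds as the robustness statistic. Normalization is done before splitting to follow the order given in the caption; many practitioners would instead fit the scaling on the training set only.

```python
from statistics import mean, pstdev


def zscore(values):
    """Normalize a descriptor to zero mean and unit variance."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def fit_line(x, y):
    """Ordinary least squares for one descriptor; returns (slope, intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(y_true, y_pred):
    """Coefficient of determination of predictions against measured values."""
    my = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - my) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Synthetic data: one hypothetical descriptor and a made-up activity per compound.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
activity = [4.6, 5.0, 5.4, 6.1, 6.5, 6.9, 7.6, 8.0]

# Data set processing: normalize, then hold out every third compound for testing.
x = zscore(descriptor)
test_idx = [i for i in range(len(x)) if i % 3 == 1]
train_idx = [i for i in range(len(x)) if i % 3 != 1]

# Model construction on the training subset.
slope, intercept = fit_line([x[i] for i in train_idx],
                            [activity[i] for i in train_idx])

# Model evaluation on the held-out subset.
predicted = [slope * x[i] + intercept for i in test_idx]
r2 = r_squared([activity[i] for i in test_idx], predicted)
print(f"held-out R^2 = {r2:.3f}")
```

A real QSAR study would use many descriptors, a stronger learner, and an external validation set, but the sequence of steps is the same as in the caption.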
