Review
Comput Struct Biotechnol J. 2020 Nov 14;18:3494-3506.
doi: 10.1016/j.csbj.2020.11.007. eCollection 2020.

Homology modeling in the time of collective and artificial intelligence

Tareq Hameduh et al. Comput Struct Biotechnol J. 2020.

Abstract

Homology modeling is a method for building protein 3D structures from the protein primary sequence, using prior knowledge gained from structural similarities with other proteins. The process proceeds in sequential steps: the sequence/structure alignment is optimized, a backbone is built, and side-chains are then added. Once the low-homology loops are modeled, the whole 3D structure is optimized and validated. Over the past three decades, a few collective and collaborative initiatives have allowed continuous progress in both homology and ab initio modeling. Critical Assessment of protein Structure Prediction (CASP) is a worldwide community experiment that has historically recorded the progress in this field. Folding@Home and Rosetta@Home are examples of crowd-sourcing initiatives in which the community shares computational resources, whereas RosettaCommons is an example of an initiative in which a community shares a codebase for the development of computational algorithms. Foldit is another initiative, in which participants compete with each other in a protein folding video game to predict 3D structure. In the past few years, deep machine learning of contact maps was introduced to the 3D structure prediction process, adding more information and significantly increasing the accuracy of models. In this review, we take the reader on a journey of exploration from the beginnings to the most recent turnabouts, which have revolutionized the field of homology modeling. Moreover, we discuss the new trends emerging in this rapidly growing field.
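
To make the notion of a contact map concrete, the following is a minimal sketch and is not taken from the review. It assumes that a contact map is a symmetric binary matrix marking residue pairs whose representative atoms lie closer than a distance cutoff. The 8 Å default, the function name `contact_map`, and the toy coordinates are illustrative assumptions; prediction tools may use different atoms or thresholds.

```python
import math


def contact_map(coords, threshold=8.0):
    """Return a symmetric binary contact map from per-residue 3D coordinates.

    coords: list of (x, y, z) tuples, one per residue (e.g. C-alpha atoms).
    threshold: distance cutoff in angstroms (8.0 is an assumed default).
    The diagonal is left at 0, i.e. a residue is not counted as its own contact.
    """
    n = len(coords)
    cmap = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) < threshold:
                cmap[i][j] = cmap[j][i] = 1
    return cmap


# Hypothetical toy coordinates in angstroms, not from any real structure.
toy_ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
cmap = contact_map(toy_ca)
for row in cmap:
    print(row)
```

In this toy chain the first three residues are all within the cutoff of one another, while the fourth is isolated, so its row and column contain only zeros.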

Keywords: Artificial intelligence; Collective intelligence; Homology modeling; Machine learning; Protein 3D structure; Structural bioinformatics.


Conflict of interest statement

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figures

Fig. 1
Historical timeline of major developments in homology modeling, taking into consideration the developments in collective and artificial intelligence fields. CASP: Critical Assessment of protein Structure Prediction. GPU: Graphics Processing Unit. TPU: Tensor Processing Unit.
Fig. 2
The seven classical steps of homology modeling. Donut shapes describe the major events influencing some of the homology modeling steps.
Fig. 3
Yearly citations of widely used homology modeling programs (defined as those having > 1000 total citations). MODELLER (red) and SWISS-MODEL (blue), which date back over two decades, are the most popular among researchers, whereas the popularity of I-TASSER (orange) and Phyre2 (green) is on the rise (source: webofknowledge.com). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Fig. 4
Yearly citations of the homology modeling programs that performed highly in CASP13. I-TASSER and RaptorX show the strongest rise in popularity among researchers (source: webofknowledge.com).
Fig. 5
Neural network strategies used in protein 3D structure prediction tools (as described in [133], [139]). (A) Convolutional Neural Networks (CNN) are employed by merging 1D and 2D feature dimensions into residue blocks that are used as the input matrix for convolutional layers. As in an optic nerve, the residue blocks are convolved into smaller and smaller layers. (B) Recurrent Neural Networks (RNN) are trained for generating sequences. (C) A Variational Auto-Encoder (VAE) is used for creating similar structures that are correlated with an input structure. Properties are calculated by constructing a latent space map, which is then used to produce outputs. (D) Generative Adversarial Networks (GAN) use a game-like method in which a discriminator tries to tell real input from fake input produced by a generator. The game continues until the discriminator is unable to distinguish real from fake outputs.
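
The merging of 1D and 2D features into residue blocks mentioned for panel (A) can be sketched as follows. This is only one plausible reading of the caption, assuming that each residue pair (i, j) receives the per-residue features of i and of j concatenated with the pairwise features of (i, j). The function name `pair_blocks` and all feature values are hypothetical and are not taken from the tools cited in [133], [139].

```python
def pair_blocks(feat_1d, feat_2d):
    """Build an L x L grid of per-pair feature vectors.

    feat_1d: L lists of per-residue features (e.g. predicted secondary structure).
    feat_2d: L x L lists of pairwise features (e.g. a co-evolution score).
    Each block (i, j) is feat_1d[i] + feat_1d[j] + feat_2d[i][j], so the grid
    can serve as a multi-channel input matrix for 2D convolutional layers.
    """
    n = len(feat_1d)
    return [
        [list(feat_1d[i]) + list(feat_1d[j]) + list(feat_2d[i][j]) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical toy features for a 3-residue chain.
toy_1d = [[0.1, 0.9], [0.8, 0.2], [0.5, 0.5]]
toy_2d = [
    [[0.0], [0.7], [0.1]],
    [[0.7], [0.0], [0.4]],
    [[0.1], [0.4], [0.0]],
]
blocks = pair_blocks(toy_1d, toy_2d)
print(len(blocks), len(blocks[0]), len(blocks[0][0]))  # grid size and channel count
```

With two 1D features per residue and one 2D feature per pair, every block carries 2 + 2 + 1 = 5 channels.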

References

    1. Hargittai I. Linus Pauling’s quest for the structure of proteins. Struct. Chem. 2009;21(1):1–7.
    2. Muhammed M.T., Aki-Yalcin E. Homology modeling in drug discovery: Overview, current applications, and future perspectives. Chem. Biol. Drug Des. 2019;93(1):12–20.
    3. Hatfield M.P., Lovas S. Conformational sampling techniques. Curr. Pharm. Des. 2014;20(20):3303–3313.
    4. Moult J. A large-scale experiment to assess protein structure prediction methods. Proteins. 1995;23(3):2–4.
    5. Samuel A.L. Some Studies in Machine Learning Using the Game of Checkers. IBM J. Res. Dev. 1959;3(3):210–229.
