Review
2021 Mar 12;22(6):2903.
doi: 10.3390/ijms22062903.

Incorporating Machine Learning into Established Bioinformatics Frameworks

Noam Auslander et al. Int J Mol Sci.

Abstract

The exponential growth of biomedical data in recent years has driven the application of numerous machine learning techniques to emerging problems in biology and clinical research. By enabling automatic feature extraction, feature selection, and the generation of predictive models, these methods can be used to study complex biological systems efficiently. Machine learning techniques are frequently integrated with bioinformatic methods, as well as with curated databases and biological networks, to enhance training and validation, identify the best interpretable features, and enable feature and model investigation. Here, we review recently developed methods that incorporate machine learning within the same framework as techniques from molecular evolution, protein structure analysis, systems biology, and disease genomics. We outline the challenges that biomedicine poses for machine learning and, in particular, for deep learning, and suggest unique opportunities for machine learning techniques integrated with established bioinformatics approaches to overcome some of these challenges.

Keywords: bioinformatics methods; deep learning; machine learning; phylogenetics.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
Machine learning algorithms frequently used in bioinformatics research. An example of the usage of each algorithm and the respective input data are indicated on the right. Abbreviations: SVM, support vector machines; KNN, K-nearest neighbors; CNN, convolutional neural networks; RNN, recurrent neural networks; PCA, principal component analysis; t-SNE, t-distributed stochastic neighbor embedding; NMF, non-negative matrix factorization.
Figure 2
Applications of integrated machine learning techniques with bioinformatics in molecular evolution, protein structure analysis, systems biology, and disease genomics.
