Int J Mol Sci. 2021 Oct 23;22(21):11449.
doi: 10.3390/ijms222111449.

Ensemble of Template-Free and Template-Based Classifiers for Protein Secondary Structure Prediction


Gabriel Bianchin de Oliveira et al. Int J Mol Sci. 2021.

Abstract

Protein secondary structures are important in many biological processes and applications. Owing to advances in sequencing methods, many proteins have been sequenced, but fewer have secondary structures determined by laboratory methods. With the development of computer technology, computational methods have started to become the most important methodologies for predicting secondary structures. Motivated by recent results obtained by computational methods in this task, we evaluated two different approaches to the problem: (i) template-free classifiers, based on machine learning techniques; and (ii) template-based classifiers, based on search tools. Each approach is formed by several sub-classifiers, six for template-free and two for template-based, each with a specific view of the protein. Our results show that these ensembles improve on the results of each approach individually.
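As a rough illustration of the ensemble idea described in the abstract, the sketch below averages per-residue class probabilities from two classifiers and picks the most likely class for each residue. The abstract does not say how the sub-classifiers are actually fused, so the weighted averaging, the three-state alphabet (`H`, `E`, `C`), the function names, and the toy probabilities are all assumptions made for illustration, not the authors' method.

```python
import numpy as np

# Assumed three-state alphabet (helix, strand, coil); chosen only for illustration.
STATES = "HEC"


def ensemble_average(prob_list, weights=None):
    """Weighted average of per-residue class probabilities.

    prob_list: list of arrays, each shaped (sequence_length, n_classes).
    weights:   optional per-classifier weights; uniform if omitted.
    """
    stacked = np.stack(prob_list)  # (n_classifiers, length, n_classes)
    if weights is None:
        weights = np.ones(len(prob_list))
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalize so rows still sum to 1
    return np.tensordot(weights, stacked, axes=1)  # (length, n_classes)


def decode(probs):
    """Map each residue to the class with the highest probability."""
    return "".join(STATES[i] for i in probs.argmax(axis=1))


# Toy outputs for a 3-residue protein (hypothetical values).
template_free = np.array([[0.7, 0.2, 0.1],
                          [0.4, 0.5, 0.1],
                          [0.2, 0.2, 0.6]])
template_based = np.array([[0.6, 0.1, 0.3],
                           [0.6, 0.3, 0.1],
                           [0.1, 0.1, 0.8]])

combined = ensemble_average([template_free, template_based])
prediction = decode(combined)
print(prediction)
```

In this toy case the two classifiers disagree on the second residue, and the averaged probabilities settle the disagreement. Passing `weights` would let a more reliable classifier count for more.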

Keywords: BLAST; deep learning; ensemble; machine learning; protein secondary structure prediction.


Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1. Difference in the number of sequenced proteins deposited in UniProtKB and the number of proteins with determined three-dimensional structures deposited in PDB.
Figure 2. Confusion matrix of the ensemble of template-free and template-based ensembles.
Figure 3. Confusion matrix of the ensemble of template-free and template-based ensembles.
Figure 4. General configuration of each RNN classifier.
Figure 5. Ensemble of RF classifiers.
Figure 6. General configuration of each inception-v4 block classifier.
Figure 7. General configuration of each IRN classifier.
Figure 8. BERT for protein secondary structure prediction task.
Figure 9. General configuration of each CNN classifier.
Figure 10. Class distribution on train, validation, and test set on the CB6133 dataset.
Figure 11. Class distribution on train, validation, and test set on the CB513 dataset.
