MHTAPred-SS: A Highly Targeted Autoencoder-Driven Deep Multi-Task Learning Framework for Accurate Protein Secondary Structure Prediction
- PMID: 39769208
- PMCID: PMC11677681
- DOI: 10.3390/ijms252413444
Abstract
Accurate protein secondary structure prediction (PSSP) plays a crucial role in biopharmaceutics and disease diagnosis. Current prediction methods are mainly based on multiple sequence alignment (MSA) encoding and collaborative operations of diverse networks. However, existing encoding approaches lead to poor feature space utilization, and encoding quality decreases with fewer homologous proteins. Moreover, the performance of simple stacked networks is greatly limited by feature extraction capabilities and learning strategies. To this end, we propose MHTAPred-SS, a novel PSSP framework based on the fusion of six features, including the embedding feature derived from a pre-trained protein language model. First, we propose a highly targeted autoencoder (HTA) as the driver to encode sequences in a homologous protein-independent manner. Second, under the guidance of biological knowledge, we design a protein secondary structure prediction model based on the multi-task learning strategy (PSSP-MTL). Experimental results on six independent test sets show that MHTAPred-SS achieves state-of-the-art performance, with values of 88.14%, 84.89%, 78.74% and 77.15% for Q3, SOV3, Q8 and SOV8 metrics on the TEST2016 dataset, respectively. Additionally, we demonstrate that MHTAPred-SS has significant advantages in single-category and boundary secondary structure prediction, and can finely capture the distribution of secondary structure segments, thereby contributing to subsequent tasks.
Keywords: deep multi-task learning; highly targeted autoencoder; multi-feature fusion; pre-trained protein language model; protein secondary structure prediction.
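The Q3 and Q8 values reported in the abstract are per-residue accuracies over three and eight secondary structure classes. The sketch below is a minimal illustration of how such scores are typically computed, not the authors' evaluation code. The 8-to-3 state mapping (H, G, I to helix; E, B to strand; everything else to coil) is one commonly used DSSP reduction and is an assumption here, as are the function names and the toy sequences. The segment-overlap scores SOV3 and SOV8 are not covered.

```python
# Assumed 8-state -> 3-state reduction (a common DSSP convention;
# the paper may use a different mapping).
EIGHT_TO_THREE = {
    "H": "H", "G": "H", "I": "H",   # helix classes
    "E": "E", "B": "E",             # strand classes
    "T": "C", "S": "C", "L": "C",   # turn, bend, loop -> coil
}


def reduce_to_three(states8: str) -> str:
    """Map an 8-state label string to its 3-state equivalent."""
    return "".join(EIGHT_TO_THREE[s] for s in states8)


def q_accuracy(true_states: str, pred_states: str) -> float:
    """Percentage of residues whose predicted label equals the true label."""
    if len(true_states) != len(pred_states):
        raise ValueError("label strings must have the same length")
    if not true_states:
        raise ValueError("label strings must not be empty")
    matches = sum(t == p for t, p in zip(true_states, pred_states))
    return 100.0 * matches / len(true_states)


# Hypothetical toy example: 15 residues, labels invented for illustration.
true8 = "LHHHHGGTEEEEBSL"
pred8 = "LHHHHHHTEEEELSL"

q8 = q_accuracy(true8, pred8)
q3 = q_accuracy(reduce_to_three(true8), reduce_to_three(pred8))
print(f"Q8 = {q8:.2f}%, Q3 = {q3:.2f}%")
```

In this toy case, confusing G with H costs Q8 accuracy but not Q3, because both map to helix under the assumed reduction. This is one reason Q3 is generally higher than Q8 for the same predictor.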
Conflict of interest statement
The authors declare no conflicts of interest.