PScL-SDNNMAE: Protein Subcellular Localization Prediction Using Classical and Masked Autoencoder-Based Multi-View Features With Ensemble Feature Selection

IEEE Trans Comput Biol Bioinform. 2025 Jul-Aug;22(4):1606-1614.

Shenjian Gu, Matee Ullah, Jiangning Song, Dong-Jun Yu

PMID: 40811330
DOI: 10.1109/TCBBIO.2025.3562809

Abstract

Accurate prediction of protein subcellular localization is critical for understanding cellular functions and guiding drug design. However, current computational methods offer limited performance, and few efficient vision learners based on self-supervised learning exist for extracting deep, informative features. To address this, we propose a novel bioimage-based method, termed PScL-SDNNMAE, to predict the subcellular localizations of proteins in human cells. PScL-SDNNMAE first extracts classical features using traditional image descriptors. Next, a masked autoencoder (MAE) is trained on the training image data and then used to extract MAE-based deep features. In the feature selection phase, PScL-SDNNMAE applies analysis of variance (ANOVA), mutual information (MI), and stepwise discriminant analysis (SDA) to select the optimal features from the classical feature sets. Finally, PScL-SDNNMAE trains a deep neural network (DNN) classifier on the super feature set generated by integrating the optimal classical features with the MAE-based deep features. Extensive benchmark experiments, including 10-fold cross-validation on the training dataset and an independent test on the independent dataset, show that PScL-SDNNMAE outperforms and generalizes better than existing state-of-the-art predictors. The experiments also demonstrate the effectiveness of self-supervised learning for representing immunohistochemistry (IHC) images and the significant potential of pre-training on massive unlabeled datasets in the future.
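
As a rough illustration of the feature selection and integration steps described in the abstract, the sketch below ranks classical features by a one-way ANOVA F-statistic, keeps the top k, and concatenates them with deep features to form a combined feature set. It is not the authors' implementation. The function names (`anova_f_scores`, `select_top_k`, `build_super_feature_set`), the top-k rule, and the synthetic toy data are all assumptions made for illustration. The MI and SDA selectors, the MAE itself, and the DNN classifier are not implemented here, and the random "deep" vectors only stand in for MAE embeddings.

```python
import random


def anova_f_scores(X, y):
    """One-way ANOVA F-statistic for each feature column of X, grouped by label y."""
    n_samples, n_features = len(X), len(X[0])
    classes = sorted(set(y))
    df_between = len(classes) - 1
    df_within = n_samples - len(classes)
    scores = []
    for j in range(n_features):
        column = [row[j] for row in X]
        grand_mean = sum(column) / n_samples
        ss_between = 0.0
        ss_within = 0.0
        for c in classes:
            group = [v for v, label in zip(column, y) if label == c]
            group_mean = sum(group) / len(group)
            ss_between += len(group) * (group_mean - grand_mean) ** 2
            ss_within += sum((v - group_mean) ** 2 for v in group)
        ms_between = ss_between / df_between
        ms_within = ss_within / df_within
        if ms_within == 0.0:
            # Degenerate column: no within-class spread.
            scores.append(float("inf") if ms_between > 0.0 else 0.0)
        else:
            scores.append(ms_between / ms_within)
    return scores


def select_top_k(X, y, k):
    """Return the (sorted) column indices of the k highest-scoring features."""
    scores = anova_f_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def build_super_feature_set(classical, deep, keep):
    """Concatenate the selected classical columns with the deep feature vectors."""
    return [[row[j] for j in keep] + list(d) for row, d in zip(classical, deep)]


# Synthetic toy data (assumption): 30 samples, 3 classes, 6 classical features.
# Columns 0 and 3 are shifted by class label; the rest are pure noise.
rng = random.Random(0)
labels = [i % 3 for i in range(30)]
classical = []
for label in labels:
    row = [rng.gauss(0.0, 1.0) for _ in range(6)]
    row[0] += 5.0 * label
    row[3] -= 5.0 * label
    classical.append(row)

# Placeholder 4-dimensional vectors standing in for MAE-based deep features.
deep = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in labels]

selected = select_top_k(classical, labels, k=2)
super_features = build_super_feature_set(classical, deep, selected)
print("selected classical columns:", selected)
print("super feature set shape:", len(super_features), "x", len(super_features[0]))
```

In a fuller pipeline the combined rows would be passed to a classifier and evaluated with cross-validation. How the abstract's three selectors are combined is not stated there, so the single-criterion ranking above should be read only as one possible ingredient.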
