2024 Oct 30;25(1):1019.
doi: 10.1186/s12864-024-10954-3.

GASIDN: identification of sub-Golgi proteins with multi-scale feature fusion

Jianan Sui et al. BMC Genomics.

Abstract

The Golgi apparatus is a crucial component of the endomembrane system in eukaryotic cells and plays a central role in protein biosynthesis. Dysfunction of the Golgi apparatus has been linked to neurodegenerative diseases, so accurate identification of sub-Golgi protein types is essential for developing effective treatments for such diseases. Because experimental methods for identifying sub-Golgi protein types are expensive and time-consuming, various computational methods have been developed as identification tools. However, most of these methods rely solely on neighboring features in the protein sequence and neglect the spatial structure information of the protein. To identify sub-Golgi proteins more accurately, we developed a model called GASIDN. GASIDN extracts multi-scale features by applying a 1D convolution module to protein sequences and a graph learning module to contact maps constructed from AlphaFold2 predictions. The model uses the deep representation learning model SeqVec to initialize the protein sequence representations. GASIDN achieved accuracies of 98.4% and 96.4% in independent testing and ten-fold cross-validation, respectively, outperforming the majority of previous predictors. To the best of our knowledge, this is the first method that uses multi-scale feature fusion to identify and locate sub-Golgi proteins. To assess the generalizability and scalability of the model, we also applied it to the identification of proteins from other organelles, including plant vacuoles and peroxisomes. The results were promising, indicating that the model is effective and versatile. The source code and datasets can be accessed at https://github.com/SJNNNN/GASIDN.
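As an illustration of the structural input mentioned above, the following minimal sketch shows one common way a residue contact map can be derived from predicted 3D coordinates and turned into graph edges. The abstract does not state how GASIDN builds its contact maps, so everything here is an assumption: the use of Cα coordinates, the 8 Å distance cutoff, and the function names `contact_map` and `edge_list` are hypothetical and are not taken from the GASIDN repository.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def contact_map(ca_coords: Sequence[Coord], threshold: float = 8.0) -> List[List[int]]:
    """Return a symmetric 0/1 matrix marking residue pairs in contact.

    ca_coords: one (x, y, z) position per residue, assumed to be C-alpha atoms.
    threshold: distance cutoff in angstroms (8.0 is an assumed, commonly used value).
    The diagonal is left at 0, i.e. no self-contacts are recorded.
    """
    n = len(ca_coords)
    cmap = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(ca_coords[i], ca_coords[j]) <= threshold:
                cmap[i][j] = 1
                cmap[j][i] = 1
    return cmap


def edge_list(cmap: List[List[int]]) -> List[Tuple[int, int]]:
    """List each undirected contact once as an (i, j) pair with i < j."""
    return [
        (i, j)
        for i, row in enumerate(cmap)
        for j, value in enumerate(row)
        if value and i < j
    ]


# Toy coordinates (not real protein data): three residues spaced 3.8 A apart
# along a line, plus one distant residue that contacts nothing.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
cmap = contact_map(toy_coords)
edges = edge_list(cmap)
print(edges)
```

In a pipeline of the kind the abstract describes, each residue would be a graph node carrying a sequence embedding, and the edges from the contact map would define which nodes exchange information in the graph learning module.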

Keywords: AlphaFold2; Deep representation learning; Graph neural network; Multi-scale features; Sub-Golgi proteins; TextCNN.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. The overall framework of the GASIDN model
Fig. 2. ELMo-based SeqVec architecture diagram
Fig. 3. The architecture of the local sequence feature extraction module
Fig. 4. ROC curve of models on the independent test set
Fig. 5. PR curve of models on the independent test set
Fig. 6. ROC curve of models on the ten-fold cross-validation
Fig. 7. PR curve of models on the ten-fold cross-validation
Fig. 8. Effect of parameter γ on model performance
Fig. 9. Comparison of weighted fusion (γ = 0.5) and cascade fusion model performance
Fig. 10. Effect of multi-scale features on prediction accuracy. 1D features: one-dimensional local sequence features. 3D features: three-dimensional spatial structure features. Multi-scale features: features that combine both types
