BMC Bioinformatics. 2023 Nov 27;24(1):447.
doi: 10.1186/s12859-023-05577-6.

AptaTrans: a deep neural network for predicting aptamer-protein interaction using pretrained encoders

Incheol Shin et al. BMC Bioinformatics. 2023.

Abstract

Background: Aptamers, biomaterials composed of single-stranded DNA or RNA that fold into tertiary structures, have significant potential as next-generation materials, particularly for drug discovery. Systematic evolution of ligands by exponential enrichment (SELEX) is a critical in vitro technique for identifying aptamers that bind specifically to target proteins. Although advanced SELEX-based methods such as Cell-SELEX and HT-SELEX are available, they are often time-consuming and have suboptimal accuracy. Several in silico aptamer discovery methods have been proposed to address these challenges. These methods are designed to predict aptamer-protein interaction (API) using benchmark datasets. However, they often fail to consider the physicochemical interactions between aptamers and proteins within tertiary structures.

Results: In this study, we propose AptaTrans, a pipeline for predicting API using deep learning techniques. AptaTrans uses transformer-based encoders to handle aptamer and protein sequences at the monomer level. Furthermore, pretrained encoders are used for structural representation. After validation with a benchmark dataset, AptaTrans was integrated into a comprehensive toolset. This pipeline combines AptaTrans with Apta-MCTS, a generative algorithm for recommending aptamer candidates.
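The figure captions on this page mention k-mer tokenization of aptamer sequences and an interaction matrix built from dot products between aptamer and protein token embeddings. The following is a minimal sketch of those two steps, not the authors' implementation. Everything in it is an assumption made for illustration: the function names, the overlapping window with stride 1, k = 3, the random vectors standing in for encoder outputs, and the use of k-mers for the protein (the paper uses FCS mining for proteins, which is not reproduced here).

```python
import random


def kmer_tokens(seq, k=3):
    """Split a sequence into overlapping k-mers (sliding window, stride 1).

    The window size and stride are illustrative assumptions.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def toy_embeddings(tokens, dim=4, seed=0):
    """Hypothetical stand-in for a trained encoder.

    Assigns one fixed random vector to each distinct token, so identical
    tokens share an embedding. A real encoder would be context-dependent.
    """
    rng = random.Random(seed)
    table = {t: [rng.uniform(-1.0, 1.0) for _ in range(dim)]
             for t in sorted(set(tokens))}
    return [table[t] for t in tokens]


def interaction_matrix(aptamer_vecs, protein_vecs):
    """Dot product of every (aptamer token, protein token) embedding pair.

    Rows index aptamer tokens; columns index protein tokens.
    """
    return [[sum(a * p for a, p in zip(av, pv)) for pv in protein_vecs]
            for av in aptamer_vecs]


if __name__ == "__main__":
    apt_tokens = kmer_tokens("ACGUACGU", k=3)
    # Assumption: k-mers are used for the protein only to keep the sketch short.
    prot_tokens = kmer_tokens("MKTAYIAK", k=3)
    matrix = interaction_matrix(toy_embeddings(apt_tokens, seed=1),
                                toy_embeddings(prot_tokens, seed=2))
    print(len(matrix), "x", len(matrix[0]))
```

In the architecture described in the Fig. 2 caption, a matrix of this shape is what the convolution blocks consume; the sketch stops at the matrix itself.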

Conclusion: The results show that AptaTrans outperforms existing models for predicting API, and the efficacy of the AptaTrans pipeline has been confirmed through various experimental tools. We expect AptaTrans will enhance the cost-effectiveness and efficiency of SELEX in drug discovery. The source code and benchmark dataset for AptaTrans are available at https://github.com/pnumlb/AptaTrans.

Keywords: Aptamer protein interaction; Pretraining; SELEX; Structural representation; Transformer.

Conflict of interest statement

J.L. and H.Y.K. are full-time employees of Nuclixbio, a biopharmaceutical company that develops cell-penetrating biologics targeting intracellular oncoproteins with proprietary nanocarriers. This commercial affiliation does not alter our adherence to the BMC Bioinformatics policies on sharing data and materials. The other authors declare no competing interests.

Figures

Fig. 1
Sequence tokenization using two algorithms. (A) k-mer algorithm for aptamer sequences. (B) FCS mining algorithm for protein sequences
Fig. 2
Architecture overview of the proposed model, AptaTrans. (A) The AptaTrans architecture consists of four parts: tokenization, transformer-based encoders, convolution blocks, and a fully connected layer. An interaction matrix is generated by computing the dot products between pairs of RNA and amino acid token embedding vectors from the encoders. (B) A transformer-based encoder, which includes an embedding layer, a positional encoder, and a vanilla transformer encoder. (C) Convolution layers. (D) A single convolution layer, which includes batch normalization and an activation function
Fig. 3
Example of pretraining techniques with two encoders. One is for masked token prediction (top) and the other for secondary structure prediction (bottom)
Fig. 4
Candidate aptamer generation process and its analysis using the AptaTrans pipeline (including Apta-MCTS), RNA Composer, and ZDOCK Server
Fig. 5
Performance comparison for API prediction in terms of six metrics: the ROC-AUC, accuracy (ACC), Matthews correlation coefficient (MCC), sensitivity (Sn), specificity (Sp), and F1-score. Our AptaTrans model was compared with PPAI [50] and Li et al.’s model [17]
Fig. 6
ROC curves comparing AptaTrans performance under the baseline and pretraining setups
Fig. 7
Performance comparison for aptamer sequence recommendation
Fig. 8
ZDOCK score comparisons between known aptamers and candidate aptamers for proteins 6GOF and 3NS6
Fig. 9
Visualization of protein complex with known aptamers and top two candidate aptamers generated by AptaTrans pipeline for proteins 6GOF and 3SN6_4
Fig. 10
Visualization of AptaTrans interaction maps for aptamer sequences and target proteins. The x-axis and y-axis indicate protein and aptamer sequence tokens, respectively. Tokens whose values in the interaction map exceed a selected threshold are marked in a bright color. This illustrates that both known and candidate aptamer sequences exhibit high values in similar regions of the protein sequence
Fig. 11
Motif analysis between known aptamers and candidate aptamers for two proteins, (A) 6GOF and (B) 3NS6
Fig. 12
ELISA and ZDOCK simulation results for the protein glutamate carboxypeptidase 2
Fig. 13
Motif analysis between the PS202 aptamer and two candidate aptamers for the GCPII proteins
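The Fig. 5 caption names six evaluation metrics. As a rough illustration, the five threshold-based ones (ACC, MCC, Sn, Sp, F1-score) can be computed from confusion-matrix counts as sketched below. The function name, the zero-division fallbacks, and the example counts are assumptions made for this sketch and are not taken from the paper. ROC-AUC is omitted because it needs ranked prediction scores rather than counts.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Compute ACC, Sn, Sp, F1, and MCC from confusion-matrix counts.

    Undefined ratios (zero denominators) fall back to 0.0, which is a
    convention chosen for this sketch.
    """
    total = tp + fp + tn + fn
    acc = (tp + tn) / total if total else 0.0
    sn = tp / (tp + fn) if (tp + fn) else 0.0          # sensitivity (recall)
    sp = tn / (tn + fp) if (tn + fp) else 0.0          # specificity
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (2 * precision * sn / (precision + sn)) if (precision + sn) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"ACC": acc, "Sn": sn, "Sp": sp, "F1": f1, "MCC": mcc}


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    for name, value in classification_metrics(tp=40, fp=10, tn=45, fn=5).items():
        print(f"{name}: {value:.3f}")
```

MCC is the one metric here that uses all four counts symmetrically, which is why it is often reported alongside accuracy when positive and negative pairs are imbalanced.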

References

    1. Zhou J, Rossi J. Aptamers as targeted therapeutics: current potential and challenges. Nat Rev Drug Discov. 2017;16(3):181–202. doi: 10.1038/nrd.2016.199.
    2. He J, Wang J, Zhang N, Shen L, Wang L, Xiao X, et al. In vitro selection of DNA aptamers recognizing drug-resistant ovarian cancer by cell-SELEX. Talanta. 2019;194:437–445. doi: 10.1016/j.talanta.2018.10.028.
    3. Sun H, Zhu X, Lu PY, Rosato RR, Tan W, Zu Y. Oligonucleotide aptamers: new tools for targeted cancer therapy. Molecular Therapy-Nucleic Acids. 2014;3.
    4. Ning Y, Hu J, Lu F. Aptamers used for biosensors and targeted therapy. Biomed Pharmacother. 2020;132:110902. doi: 10.1016/j.biopha.2020.110902.
    5. Ni S, Zhuo Z, Pan Y, Yu Y, Li F, Liu J, Wang L, Wu X, Li D, Wan Y, Zhang L. Recent progress in aptamer discoveries and modifications for therapeutic applications. ACS Appl Mater Interfaces. 2020;13(8):9500–9519. doi: 10.1021/acsami.0c05750.