PLoS One. 2015 Jul 15;10(7):e0133260.
doi: 10.1371/journal.pone.0133260. eCollection 2015.

SNBRFinder: A Sequence-Based Hybrid Algorithm for Enhanced Prediction of Nucleic Acid-Binding Residues

Xiaoxia Yang et al. PLoS One. 2015.

Abstract

Protein-nucleic acid interactions are central to various fundamental biological processes. Automated methods capable of reliably identifying DNA- and RNA-binding residues in protein sequence are assuming ever-increasing importance. The majority of current algorithms rely on feature-based prediction, but their accuracy remains to be further improved. Here we propose a sequence-based hybrid algorithm SNBRFinder (Sequence-based Nucleic acid-Binding Residue Finder) by merging a feature predictor SNBRFinderF and a template predictor SNBRFinderT. SNBRFinderF was established using the support vector machine whose inputs include sequence profile and other complementary sequence descriptors, while SNBRFinderT was implemented with the sequence alignment algorithm based on profile hidden Markov models to capture the weakly homologous template of query sequence. Experimental results show that SNBRFinderF was clearly superior to the commonly used sequence profile-based predictor and SNBRFinderT can achieve comparable performance to the structure-based template methods. Leveraging the complementary relationship between these two predictors, SNBRFinder reasonably improved the performance of both DNA- and RNA-binding residue predictions. More importantly, the sequence-based hybrid prediction reached competitive performance relative to our previous structure-based counterpart. Our extensive and stringent comparisons show that SNBRFinder has obvious advantages over the existing sequence-based prediction algorithms. The value of our algorithm is highlighted by establishing an easy-to-use web server that is freely accessible at http://ibi.hzau.edu.cn/SNBRFinder.
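
The abstract does not state how the two component predictors are merged, so the following is only a minimal sketch of one plausible scheme, not the authors' method. It assumes a per-residue feature score in [0, 1] from SNBRFinderF, a binary per-residue output from SNBRFinderT (the Fig 3 caption notes that its outputs are binary), and an HHscore between 0 and 100 for the optimal template (Fig 2). The function name `hybrid_score`, the gating rule, and the `threshold` and `weight` parameters are all hypothetical.

```python
from typing import Optional


def hybrid_score(
    feature_score: float,
    template_label: Optional[int],
    hhscore: float,
    threshold: float = 50.0,  # hypothetical HHscore cutoff, not from the paper
    weight: float = 0.5,      # hypothetical mixing weight, not from the paper
) -> float:
    """Combine a feature-based score with a template-based label.

    If no template exists, or the template is too dissimilar to the query
    (HHscore below the cutoff), fall back to the feature score alone.
    Otherwise return a weighted average of the two outputs.
    """
    if template_label is None or hhscore < threshold:
        return feature_score
    return weight * feature_score + (1.0 - weight) * template_label


if __name__ == "__main__":
    # Reliable template that marks the residue as binding.
    print(hybrid_score(0.4, 1, hhscore=80.0))
    # Weak template: the feature score is used unchanged.
    print(hybrid_score(0.4, 1, hhscore=20.0))
```

The gating on template similarity reflects the complementary relationship described in the abstract: the template predictor contributes only when a sufficiently similar template is available, and the feature predictor covers the remaining cases.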

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Fig 1. Flowchart of our SNBRFinder algorithm.
SNBRFinder is a sequence-based hybrid prediction algorithm comprising a feature-based predictor, SNBRFinderF, and a template-based predictor, SNBRFinderT. SNBRFinderF was built using the support vector machine algorithm, whose inputs include comprehensive sequence descriptors, and SNBRFinderT was implemented with the sequence alignment algorithm based on profile hidden Markov models.
Fig 2. HHscore distribution of optimal templates for DB312 and RB264.
The HHscore produced by HHblits ranges from 0 to 100% and is used to measure the similarity between the query sequence and its optimal template.
Fig 3. Comparison of our algorithms and existing approaches on three datasets.
(A) DB33, (B) RB49, (C) RB44. In (A) and (B), Accuracy1 = (TP+TN)/(TP+TN+FP+FN) and Accuracy2 = (Sensitivity+Specificity)/2. In (C), the AUC values of SNBRFinderT, HomPRIP, and PRBR are not provided, because the outputs of these three predictors are binary values. With the exception of SNBRFinder and RNABindRPlus (including the component predictors), the evaluation measures of the other approaches are derived from the recent review articles.
Fig 4. Distribution of the ratio of positive predictions for non-nucleic acid binding and nucleic acid binding proteins.
(A) NB250 annotated by our predictors trained with DB312, (B) NB250 annotated by our predictors trained with RB264. The solid bars represent the prediction results of non-nucleic acid binding sequences, while the hollow bars represent the prediction results of nucleic acid binding sequences.
Fig 5. Snapshots of SNBRFinder web server.
The submission page allows users to input multiple protein sequences and specify the binding nucleic acid type. When the submitted job is finished, SNBRFinder presents the prediction results from three perspectives. The first section provides summary information about the query sequence and its optimal template. The second section is a graphical representation of the prediction results. The last section includes details about the prediction results, such as the outputs from our three predictors.
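
The two accuracy measures defined in the Fig 3 caption can be computed directly from confusion-matrix counts. The sketch below uses the caption's formulas for Accuracy1 and Accuracy2 together with the standard definitions of sensitivity, TP/(TP+FN), and specificity, TN/(TN+FP); the function names and the example counts are illustrative and do not come from the paper.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true binding residues that were predicted as binding."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of non-binding residues that were predicted as non-binding."""
    return tn / (tn + fp)


def accuracy1(tp: int, tn: int, fp: int, fn: int) -> float:
    """Accuracy1 = (TP+TN)/(TP+TN+FP+FN), as given in the Fig 3 caption."""
    return (tp + tn) / (tp + tn + fp + fn)


def accuracy2(tp: int, tn: int, fp: int, fn: int) -> float:
    """Accuracy2 = (Sensitivity+Specificity)/2, as given in the Fig 3 caption."""
    return (sensitivity(tp, fn) + specificity(tn, fp)) / 2


if __name__ == "__main__":
    # Made-up counts for an imbalanced case with few binding residues.
    tp, tn, fp, fn = 30, 900, 50, 20
    print("Accuracy1:", accuracy1(tp, tn, fp, fn))
    print("Accuracy2:", accuracy2(tp, tn, fp, fn))
```

Binding residues are usually a small minority of a protein's residues, so Accuracy1 can stay high even when many binding residues are missed. Accuracy2 averages the per-class rates and is less affected by that imbalance.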
