Nucleic Acids Res. 2023 Mar 21;51(5):e25.
doi: 10.1093/nar/gkac1253.

HybridRNAbind: prediction of RNA interacting residues across structure-annotated and disorder-annotated proteins

Fuhao Zhang et al. Nucleic Acids Res.

Abstract

Sequence-based predictors of RNA-binding residues (RBRs) are trained on either structure-annotated or disorder-annotated binding regions. A recent study of predictors of protein-binding residues shows that they are plagued by high levels of cross-predictions (protein-binding residues are predicted as nucleic acid-binding) and that structure-trained predictors perform poorly on disorder-annotated regions and vice versa. Consequently, we analyze a representative set of structure-trained and disorder-trained predictors of RBRs to comprehensively assess the quality of their predictions. Our empirical analysis, which relies on a new low-similarity benchmark dataset, reveals that the structure-trained predictors of RBRs perform well on structure-annotated proteins, while the disorder-trained predictors provide accurate results on disorder-annotated proteins. However, these methods work only modestly well on the opposite type of annotation, motivating the need for new solutions. Using an empirical approach, we design the HybridRNAbind meta-model, which generates accurate predictions and low amounts of cross-predictions when tested on data that combine structure-annotated and disorder-annotated RBRs. We release this meta-model as a webserver available at https://www.csuligroup.com/hybridRNAbind/.

Figures

Figure 1.
Analysis of cross-predictions and over-predictions based on the ratios of the true positive rate to the cross-prediction rate (TPR/CPR) and to the over-prediction rate (TPR/OPR), measured on the test dataset. Binary predictions use thresholds set so that specificity = 0.95. Predictors are sorted by their TPR/CPR values. Results for the disorder-annotated and the structure-annotated proteins are in Supplementary Figure S3.
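The ratios in this caption can be illustrated with a minimal sketch. It assumes (these definitions are not spelled out in the caption) that CPR is the fraction of residues binding other partners that are predicted as RNA-binding, that OPR is the fraction of non-binding residues predicted as RNA-binding, and that specificity is computed over all non-RNA-binding residues. The label names `"rna"`, `"other"` and `"none"` and all function names are hypothetical.

```python
import math

def threshold_at_specificity(scores, labels, specificity=0.95):
    """Cutoff such that at most (1 - specificity) of non-RNA residues score above it."""
    negatives = sorted((s for s, y in zip(scores, labels) if y != "rna"), reverse=True)
    # Number of false positives we may tolerate (small epsilon guards float rounding).
    allowed_fp = math.floor((1.0 - specificity) * len(negatives) + 1e-9)
    if allowed_fp >= len(negatives):
        return float("-inf")
    return negatives[allowed_fp]

def rate(scores, labels, group, threshold):
    """Fraction of residues in `group` whose score is strictly above the threshold."""
    group_scores = [s for s, y in zip(scores, labels) if y == group]
    if not group_scores:
        return float("nan")
    return sum(s > threshold for s in group_scores) / len(group_scores)

def safe_ratio(numerator, denominator):
    """Ratio that returns inf (or nan for 0/0) instead of raising on a zero denominator."""
    if denominator > 0:
        return numerator / denominator
    return float("inf") if numerator > 0 else float("nan")

def cross_prediction_ratios(scores, labels, specificity=0.95):
    """Compute TPR, CPR, OPR and the two ratios at a specificity-based threshold."""
    thr = threshold_at_specificity(scores, labels, specificity)
    tpr = rate(scores, labels, "rna", thr)    # RNA-binding residues predicted as RBRs
    cpr = rate(scores, labels, "other", thr)  # assumed: residues binding other partners
    opr = rate(scores, labels, "none", thr)   # assumed: residues with no annotated binding
    return {
        "threshold": thr,
        "TPR": tpr,
        "CPR": cpr,
        "OPR": opr,
        "TPR/CPR": safe_ratio(tpr, cpr),
        "TPR/OPR": safe_ratio(tpr, opr),
    }
```

Under these assumptions, higher TPR/CPR and TPR/OPR values indicate that a predictor finds RBRs far more often than it mislabels other-partner-binding or non-binding residues.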
Figure 2.
Flowchart of the HybridRNAbind method.
Figure 3.
TPR values (y-axis) as a function of the number of positions in the sequence between the evaluated residues and the nearest native RBRs (x-axis). TPRs are computed by assuming that putative RBRs within a given number of positions of a native RBR are correct. We show the best predictors, selected based on Table 2 and color-coded: HybridRNAbind (black), NCBRPred (orange), MTDsite (green) and DeepDISOBind (purple). The TPRs are based on two specificity-based thresholds of 0.9 (solid lines) and 0.95 (dashed lines).
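The distance-tolerant TPR described in this caption can be sketched as follows. This is only one possible reading of the caption, not necessarily the authors' exact procedure: a native RBR is counted as recovered when at least one putative RBR lies within `max_distance` sequence positions of it. The function name and the 0/1 list encoding are assumptions made for illustration.

```python
def tolerant_tpr(native, predicted, max_distance):
    """TPR where a native RBR counts as found if a putative RBR is within max_distance positions.

    native, predicted: equal-length sequences of 0/1 flags, one per residue.
    """
    if len(native) != len(predicted):
        raise ValueError("native and predicted must have the same length")
    native_positions = [i for i, flag in enumerate(native) if flag]
    predicted_positions = [i for i, flag in enumerate(predicted) if flag]
    if not native_positions:
        raise ValueError("TPR is undefined without native RBRs")
    recovered = sum(
        1
        for n in native_positions
        if any(abs(n - p) <= max_distance for p in predicted_positions)
    )
    return recovered / len(native_positions)

# Sweep the tolerance to obtain a curve like the one plotted in the figure.
native = [0, 1, 1, 0, 0, 0, 1, 0]
predicted = [0, 1, 0, 0, 1, 0, 0, 0]
curve = [tolerant_tpr(native, predicted, d) for d in range(3)]
```

With `max_distance = 0` this reduces to the ordinary per-residue TPR, and the value can only stay the same or grow as the tolerance increases.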
Figure 4.
Predictions of RBRs for the 60S ribosomal protein L28 (UniProt ID: P02406). Panel A visualizes the putative propensities and binary predictions, where the horizontal axis corresponds to the protein sequence. The black horizontal bar below the axis shows the annotations of the native RBRs. Results produced by different predictors are color-coded: DeepDISOBind and NCBRPred are shown in blue and orange, respectively. Predictions from HybridRNAbind are encoded in green, red, yellow and gray for true positives (TPs), false positives (FPs), false negatives (FNs) and true negatives (TNs), respectively. The plots in the top panel show the putative propensity scores (solid color-coded lines), while the horizontal bars underneath give the corresponding binary predictions. Panel B shows two sides of the corresponding structure of this protein in complex with RNA, available in PDB (PDB ID: 4v88). The structures are drawn using PyMOL. RNA is cropped to include all fragments that are in contact with this protein. Predictions from HybridRNAbind are color-coded in the protein structure (green, red, yellow and gray) using the color scheme described for panel A.
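The per-residue color coding described for panel A can be reproduced with a short sketch. The color mapping follows the caption; the function names and the 0/1 list encoding of native annotations and binary predictions are assumptions made for illustration.

```python
# Colors for each prediction outcome, as stated in the caption.
OUTCOME_COLORS = {"TP": "green", "FP": "red", "FN": "yellow", "TN": "gray"}

def outcome(is_native, is_predicted):
    """Classify one residue from its native annotation and binary prediction."""
    if is_native:
        return "TP" if is_predicted else "FN"
    return "FP" if is_predicted else "TN"

def color_residues(native, predicted):
    """Return one color name per residue, given equal-length 0/1 sequences."""
    if len(native) != len(predicted):
        raise ValueError("native and predicted must have the same length")
    return [OUTCOME_COLORS[outcome(n, p)] for n, p in zip(native, predicted)]
```

The resulting list could then be passed to a plotting or structure-viewing tool to color the sequence bar or the 3D structure.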

