Bioinformatics. 2011 Jul 1;27(13):i85-93.
doi: 10.1093/bioinformatics/btr215.

IPknot: fast and accurate prediction of RNA secondary structures with pseudoknots using integer programming

Kengo Sato et al. Bioinformatics.

Abstract

Motivation: Pseudoknots found in secondary structures of a number of functional RNAs play various roles in biological processes. Recent methods for predicting RNA secondary structures cover certain classes of pseudoknotted structures, but only a few of them achieve satisfying predictions in terms of both speed and accuracy.

Results: We propose IPknot, a novel computational method for predicting RNA secondary structures with pseudoknots based on maximizing the expected accuracy of a predicted structure. IPknot decomposes a pseudoknotted structure into a set of pseudoknot-free substructures and approximates a base-pairing probability distribution that considers pseudoknots, which allows it to model a wide class of pseudoknots while running quite fast. In addition, we propose a heuristic algorithm for refining base-pairing probabilities to improve the prediction accuracy of IPknot. The problem of maximizing expected accuracy is solved by using integer programming with threshold cut. We also extend IPknot so that it can predict the consensus secondary structure with pseudoknots when a multiple sequence alignment is given. IPknot is validated through extensive experiments on various datasets, showing that IPknot achieves better prediction accuracy and faster running time than several competitive prediction methods.
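
As an illustration only, the idea of decomposing a pseudoknotted structure into pseudoknot-free substructures can be sketched with a simple greedy assignment of base pairs to levels. This is not IPknot's integer programming formulation; the function names `crosses` and `decompose` and the example base pairs are assumptions made for this sketch.

```python
from typing import List, Set, Tuple

Pair = Tuple[int, int]  # a base pair (i, j) with i < j, 0-based positions


def crosses(p: Pair, q: Pair) -> bool:
    """Return True if base pairs p and q cross, i.e. form a pseudoknot."""
    (i, j), (k, l) = p, q
    return i < k < j < l or k < i < l < j


def decompose(pairs: Set[Pair]) -> List[List[Pair]]:
    """Greedily place each pair in the first level where it crosses nothing.

    Every returned level is pseudoknot-free by construction. This is a
    hypothetical heuristic for illustration, not the method of the paper.
    """
    levels: List[List[Pair]] = []
    for p in sorted(pairs):
        for level in levels:
            if not any(crosses(p, q) for q in level):
                level.append(p)
                break
        else:
            # p crosses a pair in every existing level: open a new level.
            levels.append([p])
    return levels


# Assumed toy H-type pseudoknot: stem 1 = (0,8),(1,7); stem 2 = (4,12),(5,11).
structure = {(0, 8), (1, 7), (4, 12), (5, 11)}
levels = decompose(structure)
for depth, level in enumerate(levels, start=1):
    print(f"level {depth}: {level}")
```

In this toy case the two stems cross each other, so they end up in two separate levels, each of which is an ordinary nested structure.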

Availability: The program of IPknot is available at http://www.ncrna.org/software/ipknot/. IPknot is also available as a web server at http://rna.naist.jp/ipknot/.

Contact: satoken@k.u-tokyo.ac.jp; ykato@is.naist.jp

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

Fig. 1. A typical H-type pseudoknot. (a) A pseudoknot represented by three-loop nomenclature. Note that Loop 2 typically has zero or one base (Liu et al., 2010). (b) An arc representation of the pseudoknot.
Fig. 2. An illustration of the decomposition of a pseudoknotted secondary structure y ∈ 𝒮(x) into pseudoknot-free substructures (y(1), y(2), y(3)).
Fig. 3. An illustration of the constraints of the IP formulation. The diagrams (a) and (b) correspond to the constraints (5) and (6), respectively. Note that at most one variable shown by a broken curved line can take a value 1. The diagram (c) corresponds to the constraint (7).
Fig. 4. A schematic diagram of the iterative refinement algorithm for the base-pairing probability matrix. A constraint on secondary structure for each level is denoted by a variant of the dot-parenthesis format: a matching parenthesis ‘()’ denotes an allowed base pair, a character ‘x’ indicates an unpaired base, and a dot ‘.’ is used for an unconstrained base.
Fig. 5. The PPV–Sensitivity plots of the experiment on the RS-pk388 dataset.
Fig. 6. The PPV–Sensitivity plots of the experiment on the pk168 dataset.
Fig. 7. The PPV–Sensitivity plots of the experiment on the Rfam-PK dataset. (a) The results for reference alignments. (b) The results for sequence-based alignments produced by Clustal W.
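
For readers unfamiliar with the two axes of these plots, the following minimal sketch computes PPV and sensitivity for one predicted structure against a reference, using the standard definitions PPV = TP / (TP + FP) and sensitivity = TP / (TP + FN) over base pairs. The function name `ppv_sensitivity` and the example pairs are assumptions for illustration; the exact counting rules used in the paper's experiments may differ.

```python
from typing import Set, Tuple

Pair = Tuple[int, int]  # a base pair (i, j) with i < j


def ppv_sensitivity(predicted: Set[Pair], reference: Set[Pair]) -> Tuple[float, float]:
    """Return (PPV, sensitivity) of predicted base pairs against a reference."""
    true_positives = len(predicted & reference)
    # PPV: fraction of predicted pairs that are correct.
    ppv = true_positives / len(predicted) if predicted else 0.0
    # Sensitivity: fraction of reference pairs that were recovered.
    sensitivity = true_positives / len(reference) if reference else 0.0
    return ppv, sensitivity


# Assumed toy data: the prediction misses one of four reference pairs.
reference = {(0, 8), (1, 7), (4, 12), (5, 11)}
predicted = {(0, 8), (1, 7), (4, 12)}
ppv, sen = ppv_sensitivity(predicted, reference)
print(f"PPV = {ppv:.2f}, sensitivity = {sen:.2f}")
```

Each point on a PPV–Sensitivity curve corresponds to one such pair of values, typically averaged over a dataset for a given parameter setting.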

References

    1. Akutsu T. Dynamic programming algorithms for RNA secondary structure prediction with pseudoknots. Discrete Appl. Math. 2000;104:45–62.
    2. Akutsu T. Recent advances in RNA secondary structure prediction with pseudoknots. Current Bioinform. 2006;1:115–129.
    3. Andronescu M., et al. Efficient parameter estimation for RNA secondary structure prediction. Bioinformatics. 2007;23:i19–i28.
    4. Andronescu M., et al. RNA STRAND: the RNA secondary structure and statistical analysis database. BMC Bioinform. 2008;9:340.
    5. Andronescu M., et al. Improved free energy parameters for RNA pseudoknotted secondary structure prediction. RNA. 2010a;16:26–42.
