Nucleic Acids Res. 2005 Apr 28;33(8):2433-9.
doi: 10.1093/nar/gki541. Print 2005.

A benchmark of multiple sequence alignment programs upon structural RNAs

Paul P Gardner et al. Nucleic Acids Res.

Abstract

To date, few attempts have been made to benchmark alignment algorithms on nucleic acid sequences. Sophisticated PAM- or BLOSUM-like models are frequently used to align proteins, yet equivalents are not considered for nucleic acids; instead, rather ad hoc models are generally favoured. Here, we systematically test the performance of existing alignment algorithms on structural RNAs. This work had two goals: (i) to determine the conditions under which it is appropriate to apply common sequence alignment methods to the structural RNA alignment problem, which indicates where and when researchers should consider augmenting the alignment process with auxiliary information such as secondary structure; and (ii) to determine which sequence alignment algorithms perform well under the broadest range of conditions. First, we find that sequence alignment alone, using the current algorithms, is generally inappropriate below 50-60% sequence identity. Second, we note that the probabilistic method ProAlign and the aging Clustal algorithms generally outperform other sequence-based algorithms under the broadest range of applications.

Figures

Figure 1
An overview of the alignment programs used in this work. Programs were classified into the categories described in detail in the Alignment algorithms section.
Figure 2
Both measures of structural RNA alignment correctness, SCI (A) and SPS (B), are plotted as functions of the mean pairwise sequence identity (calculated using the reference alignments). The curves are fitted to dataset 1 (see text for details) using lowess (locally weighted regression) smoothing. At most two curves are plotted for each alignment package: one corresponding to the default parameters, the other to the best parameter combination we could identify.
Figure 3
SCI (A) and SPS (B) as functions of the sequence identity for dataset 2 (see text for details). Five structural alignment methods are shown: the Sankoff-based methods Dynalign, Foldalign, PMcomp and Stemloc, and the base pair probability profile alignment method implemented in PMcomp (fast). These are contrasted with the hand-curated structural alignments and six of the better sequence-based alignment algorithms (ClustalW, MUSCLE, PCMA, POA (gp), ProAlign and Prrn).
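
The captions above plot SPS against mean pairwise sequence identity. As a rough illustration only, the Python sketch below computes both quantities for gapped alignments given as equal-length strings. It is not the authors' code. Two points are assumptions made here rather than details taken from the text: identity is computed over columns where neither sequence has a gap, and SPS is taken in its common sum-of-pairs sense, the fraction of residue pairs aligned in the reference that are also aligned in the test alignment. All function names are hypothetical.

```python
from itertools import combinations

GAP = "-"


def pairwise_identity(a: str, b: str) -> float:
    """Identity over columns where neither row is gapped (assumed definition)."""
    if len(a) != len(b):
        raise ValueError("aligned rows must have equal length")
    matches = 0
    compared = 0
    for x, y in zip(a, b):
        if x == GAP or y == GAP:
            continue  # skip columns that contain a gap in either row
        compared += 1
        if x == y:
            matches += 1
    return matches / compared if compared else 0.0


def mean_pairwise_identity(rows: list[str]) -> float:
    """Average pairwise_identity over all unordered pairs of rows."""
    pairs = list(combinations(rows, 2))
    if not pairs:
        raise ValueError("need at least two aligned sequences")
    return sum(pairwise_identity(a, b) for a, b in pairs) / len(pairs)


def aligned_pairs(rows: list[str]) -> set:
    """Residue pairs sharing a column, keyed by (sequence index, residue index)."""
    pairs = set()
    next_residue = [0] * len(rows)  # ungapped position reached in each row
    for col in range(len(rows[0])):
        present = []
        for i, row in enumerate(rows):
            if row[col] != GAP:
                present.append((i, next_residue[i]))
                next_residue[i] += 1
        pairs.update(combinations(present, 2))
    return pairs


def sum_of_pairs_score(reference: list[str], test: list[str]) -> float:
    """Fraction of reference residue pairs that the test alignment reproduces."""
    ref_pairs = aligned_pairs(reference)
    if not ref_pairs:
        return 0.0
    return len(ref_pairs & aligned_pairs(test)) / len(ref_pairs)


if __name__ == "__main__":
    reference = ["ACGU", "AC-U"]  # toy reference alignment
    test = ["ACGU", "ACU-"]       # same sequences, final residue misaligned
    print(mean_pairwise_identity(reference))
    print(sum_of_pairs_score(reference, test))
```

In the toy example, the test alignment recovers two of the three residue pairs in the reference. Both alignments must contain the same sequences in the same order, because residues are matched by sequence index and ungapped position.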
