Nucleic Acids Res. 2008 Apr;36(7):2406-17.
doi: 10.1093/nar/gkn043. Epub 2008 Feb 26.

PARTS: probabilistic alignment for RNA joinT secondary structure prediction

Arif Ozgun Harmanci et al. Nucleic Acids Res. 2008 Apr.

Abstract

A novel method is presented for joint prediction of alignment and common secondary structures of two RNA sequences. The joint consideration of common secondary structures and alignment is accomplished by structural alignment over a search space defined by the newly introduced motif called matched helical regions. The matched helical region formulation generalizes previously employed constraints for structural alignment and thereby better accommodates the structural variability within RNA families. A probabilistic model based on pseudo free energies obtained from precomputed base pairing and alignment probabilities is utilized for scoring structural alignments. Maximum a posteriori (MAP) common secondary structures, sequence alignment and joint posterior probabilities of base pairing are obtained from the model via a dynamic programming algorithm called PARTS. The advantage of the more general structural alignment of PARTS is seen in secondary structure predictions for the RNase P family. For this family, the PARTS MAP predictions of secondary structures and alignment perform significantly better than prior methods that utilize a more restrictive structural alignment model. For the tRNA and 5S rRNA families, the richer structural alignment model of PARTS does not offer a benefit and the method therefore performs comparably with existing alternatives. For all RNA families studied, the posterior probability estimates obtained from PARTS offer an improvement over posterior probability estimates from a single sequence prediction. When considering the base pairings predicted over a threshold value of confidence, the combination of sensitivity and positive predictive value is better for PARTS than for the single sequence prediction. PARTS source code is available for download under the GNU public license at http://rna.urmc.rochester.edu.
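
The thresholded evaluation mentioned at the end of the abstract can be illustrated with a minimal sketch. It is not taken from the PARTS source code: the function names, the toy probabilities and the exact-match scoring rule are assumptions made for illustration, and the paper's own scoring conventions may differ. Sensitivity is taken here as the fraction of known base pairs that are predicted, and positive predictive value (PPV) as the fraction of predicted base pairs that are known.

```python
from typing import Dict, Set, Tuple

Pair = Tuple[int, int]  # (i, j) indices of two paired nucleotides, i < j


def threshold_pairs(pair_probs: Dict[Pair, float], p_thresh: float) -> Set[Pair]:
    """Keep base pairs whose probability is at least p_thresh (assumed rule)."""
    return {pair for pair, p in pair_probs.items() if p >= p_thresh}


def sensitivity_ppv(predicted: Set[Pair], known: Set[Pair]) -> Tuple[float, float]:
    """Return (sensitivity, PPV), counting only exact (i, j) matches."""
    true_pos = len(predicted & known)
    sensitivity = true_pos / len(known) if known else 0.0
    ppv = true_pos / len(predicted) if predicted else 0.0
    return sensitivity, ppv


# Toy data, invented for illustration only.
known = {(1, 20), (2, 19), (3, 18), (4, 17)}
pair_probs = {
    (1, 20): 0.95,
    (2, 19): 0.80,
    (3, 18): 0.40,
    (5, 15): 0.60,  # a confident but incorrect pair
    (6, 14): 0.10,
}

for p_thresh in (0.05, 0.5, 0.9):
    sens, ppv = sensitivity_ppv(threshold_pairs(pair_probs, p_thresh), known)
    print(f"Pthresh={p_thresh:.2f}  sensitivity={sens:.2f}  PPV={ppv:.2f}")
```

In this toy example, raising the threshold trades sensitivity for PPV, which is the trade-off a PPV versus sensitivity curve summarizes.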

Figures

Figure 1.
Example of a matched helical region. (a) shows the pairing of nucleotides, where bold lines represent hydrogen bonds. The fragments that make up the matched helical region are enclosed by dashed rectangles in (a). (b) shows the alignment of nucleotides in the matched helical region; base pairs are again indicated by bold lines connecting the paired nucleotides. (b) also illustrates alignment of base pairs, insertion of base pairs and alignment of base pairs to unpaired nucleotides.
Figure 2.
Structural alignment of two hypothetical RNA sequences. Sequence alignment and secondary structures are shown in (a), (b) and (c). Matched helical regions are indicated in (b) and (c) inside colored rectangles. (d) illustrates the joint representation of sequence alignment and common secondary structures.
Figure 3.
PARTS algorithm input-output flowchart. The precomputed base pairing probabilities and precomputed alignment probabilities are input to the algorithm. The outputs are the joint posterior base pairing probabilities of the individual sequences and the joint posterior estimates of the individual structures and the sequence alignment.
Figure 4.
Sensitivity and PPV of structure (a) and alignment (b) prediction as a function of the weight parameter κ. The prediction accuracy is averaged over a training data set of 1000 tRNA pairs from the Sprinzl Database (32) and 1000 5S rRNA pairs from the 5S Ribosomal RNA Database (27). κ = 0.45 maximizes structure prediction sensitivity. Structure prediction PPV increases asymptotically with increasing κ. Alignment prediction sensitivity and PPV both decrease slowly for κ > 1.0.
Figure 5.
Structure and alignment prediction accuracy of five methods over the RNase P dataset. ‘Struct. Sensitivity’ and ‘Struct. PPV’ correspond to ‘Structure Sensitivity’ and ‘Structure PPV’, and ‘Align. Sensitivity’ and ‘Align. PPV’ correspond to ‘Alignment Sensitivity’ and ‘Alignment PPV’, respectively.
Figure 6.
Structure and alignment prediction accuracies of seven methods over the tRNA dataset. The results are stratified with respect to percent sequence identity. ‘Struct. Sensitivity’ and ‘Struct. PPV’ correspond to ‘Structure Sensitivity’ and ‘Structure PPV’, and ‘Align. Sensitivity’ and ‘Align. PPV’ correspond to ‘Alignment Sensitivity’ and ‘Alignment PPV’, respectively.
Figure 7.
Structure and alignment prediction accuracies of seven methods over the 5S rRNA dataset. The results are stratified with respect to percent sequence identity. ‘Struct. Sensitivity’ and ‘Struct. PPV’ correspond to ‘Structure Sensitivity’ and ‘Structure PPV’, and ‘Align. Sensitivity’ and ‘Align. PPV’ correspond to ‘Alignment Sensitivity’ and ‘Alignment PPV’, respectively.
Figure 8.
PPV versus sensitivity of predicted base pairs as the pairing probability threshold (Pthresh) is varied, for PARTS and the single sequence partition function run over the RNase P, tRNA and 5S rRNA datasets.
Figure 9.
Known structures of RNase P sequences LGW17 (left) and SM-A05 (right).
Figure 10.
Structures of RNase P sequences LGW17 and SM-A05, from the RNase P dataset, as predicted by PARTS and Dynalign. Heavy lines indicate the correctly predicted base pairs.

References

    1. Eddy SR. Non-coding RNA genes and the modern RNA world. Nat. Rev. 2001;2:919–929.
    2. Eddy SR. Computational genomics of noncoding RNA genes. Cell. 2002;109:137–140.
    3. Uzilov AV, Keegan JM, Mathews DH. Detection of non-coding RNAs on the basis of predicted secondary structure formation free energy change. BMC Bioinformatics. 2006;7:173.
    4. Washietl S, Hofacker IL, Stadler PF. Fast and reliable prediction of noncoding RNAs. Proc. Natl Acad. Sci. USA. 2005;102:2454–2459.
    5. Torarinsson E, Sawera M, Havgaard JH, Fredholm M, Gorodkin J. Thousands of corresponding human and mouse genomic regions unalignable in primary sequence contain common RNA structure. Genome Res. 2006;16:885–889.