BMC Bioinformatics. 2023 Jun 15;23(Suppl 6):575.
doi: 10.1186/s12859-023-05362-5.

Automatic generation of pseudoknotted RNAs taxonomy

Michela Quadrini et al. BMC Bioinformatics.

Abstract

Background: The ability to compare RNA secondary structures is important for understanding their biological function and for grouping similar organisms into families by looking at evolutionarily conserved sequences such as 16S rRNA. Most comparison methods and benchmarks in the literature focus on pseudoknot-free structures because pseudoknots are difficult to map onto classical tree representations. Some approaches allow pseudoknotted RNAs to be clustered, but there is no general framework for evaluating their performance.

Results: We introduce an evaluation framework that combines a similarity/dissimilarity measure, obtained with a comparison method, with agglomerative clustering. Together they automatically partition a set of molecules into groups. To illustrate the framework, we define and make available a benchmark of pseudoknotted (16S and 23S) and pseudoknot-free (5S) rRNA secondary structures belonging to Archaea, Bacteria and Eukaryota. We also consider five comparison methods from the literature that can handle pseudoknots. For each method, we cluster the molecules in the benchmark to obtain the taxa at the phylum rank according to the European Nucleotide Archive curated taxonomy. We compute appropriate metrics for each method and compare how well the methods reconstruct the taxa.

Keywords: Agglomerative clustering; Benchmark; Evaluation framework; RNA comparison methods; RNA secondary structures.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
On the top, an RNA secondary structure illustrated via an arc-diagram. The motif in part a is pseudoknot-free, while the one in part b is pseudoknotted. Pseudoknots are clearly visible as crossings of arcs. On the bottom, the three feasible relations: c concatenation, d nesting and e crossing of two bonds
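The three relations in the caption can be sketched in a few lines of Python. This is only an illustration, not the authors' code: it assumes each bond is given as a pair of distinct nucleotide positions and that the two bonds share no nucleotide. The function names `bond_relation` and `is_pseudoknotted` are hypothetical.

```python
from itertools import combinations


def bond_relation(a, b):
    """Classify two bonds (arcs) as 'concatenation', 'nesting' or 'crossing'.

    Assumption: each bond is a pair of distinct positions, and the two
    bonds do not share a nucleotide.
    """
    # Normalise so that i < j, k < l and the first bond starts first (i < k).
    (i, j), (k, l) = sorted((tuple(sorted(a)), tuple(sorted(b))))
    if len({i, j, k, l}) != 4:
        raise ValueError("bonds must not share a nucleotide")
    if j < k:
        return "concatenation"  # i < j < k < l: one arc ends before the other starts
    if l < j:
        return "nesting"        # i < k < l < j: second arc lies inside the first
    return "crossing"           # i < k < j < l: the arcs cross


def is_pseudoknotted(bonds):
    """A structure is pseudoknotted if at least one pair of bonds crosses."""
    return any(bond_relation(a, b) == "crossing"
               for a, b in combinations(bonds, 2))
```

With this sketch, a structure such as `[(1, 8), (2, 5)]` contains only nested bonds, while `[(1, 5), (3, 8)]` contains a crossing and is therefore reported as pseudoknotted.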
Fig. 2
Execution of the evaluation framework on a set of molecules. The required inputs are: a list of the molecules with the corresponding taxa labels at a chosen rank of a curated taxonomy (1); and a CSV file with the distances between all pairs of molecules in the set, computed with a selected comparison method (2). Agglomerative clustering is applied to the given distances to generate a number of clusters equal to the number of distinct input taxa labels. The result is a list of molecules with assigned cluster labels (3). Metrics that evaluate how well the generated cluster labels match the original ones are then computed and reported (4)
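The four steps in the caption can be sketched as follows. This is a minimal, pure-Python illustration under several assumptions that the text does not specify: the distances are held in an in-memory matrix rather than read from a CSV file, average linkage is used for merging, and the Rand index stands in for the evaluation metrics. The names `agglomerative` and `rand_index` and the toy data are hypothetical.

```python
from itertools import combinations


def agglomerative(dist, n_clusters):
    """Naive agglomerative clustering on a full symmetric distance matrix.

    Assumption: average linkage; the caption does not name a linkage.
    """
    clusters = [[i] for i in range(len(dist))]

    def linkage(a, b):
        # Mean distance over all cross-cluster pairs.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters; combinations guarantees x < y.
        x, y = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        merged = clusters.pop(y)
        clusters[x].extend(merged)

    labels = [0] * len(dist)
    for label, members in enumerate(clusters):
        for m in members:
            labels[m] = label
    return labels


def rand_index(true_labels, pred_labels):
    """Fraction of molecule pairs on which the two labelings agree."""
    pairs = list(combinations(range(len(true_labels)), 2))
    agree = sum((true_labels[i] == true_labels[j]) ==
                (pred_labels[i] == pred_labels[j]) for i, j in pairs)
    return agree / len(pairs)


# (1) Molecules with taxa labels (toy data, not from the benchmark).
molecules = ["m1", "m2", "m3", "m4"]
taxa = ["A", "A", "B", "B"]
# (2) Pairwise distances from some comparison method (toy values).
dist = [[0, 1, 8, 9],
        [1, 0, 7, 8],
        [8, 7, 0, 2],
        [9, 8, 2, 0]]
# (3) As many clusters as there are distinct taxa labels.
cluster_labels = agglomerative(dist, n_clusters=len(set(taxa)))
# (4) Agreement between cluster labels and taxa labels.
score = rand_index(taxa, cluster_labels)
```

The quadratic search for the closest pair keeps the sketch short; on a benchmark of realistic size, a library implementation of agglomerative clustering would be the more practical choice.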
