Bioinformatics. 2021 Oct 25;37(20):3654-3656.
doi: 10.1093/bioinformatics/btab247.

tSFM 1.0: tRNA Structure-Function Mapper

Travis J Lawrence et al. Bioinformatics.

Abstract

Motivation: Structure-conditioned information statistics have proven useful to predict and visualize tRNA Class-Informative Features (CIFs) and their evolutionary divergences. Although permutation P-values can quantify the significance of CIF divergences between two taxa, their naive Monte Carlo approximation is slow and inaccurate. The Peaks-over-Threshold approach of Knijnenburg et al. (2009) promises improvements to both speed and accuracy of permutation P-values, but has no publicly available API.
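
To illustrate why the naive Monte Carlo approximation is slow and coarse, here is a minimal sketch of a fixed-budget permutation P-value. It is not tSFM's implementation: the function name `permutation_pvalue` is hypothetical, and the test statistic (absolute difference in group means) is an assumed stand-in for the information-based divergences discussed above. The point it shows is that with N permutations the estimate can never fall below 1/(N + 1), so resolving small P-values requires a large N for every feature tested.

```python
import random


def permutation_pvalue(group_a, group_b, n_perms=10_000, seed=0):
    """Naive Monte Carlo permutation P-value (illustrative sketch only).

    The statistic is the absolute difference in means, chosen purely
    for illustration; it is NOT the statistic used by tSFM.
    """
    rng = random.Random(seed)

    def stat(a, b):
        return abs(sum(a) / len(a) - sum(b) / len(b))

    observed = stat(group_a, group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    exceedances = 0
    for _ in range(n_perms):
        # Shuffle the group labels by shuffling the pooled values.
        rng.shuffle(pooled)
        if stat(pooled[:n_a], pooled[n_a:]) >= observed:
            exceedances += 1

    # The pseudo-count keeps the estimate strictly positive, which also
    # means it can never be smaller than 1 / (n_perms + 1).
    return (exceedances + 1) / (n_perms + 1)
```

For example, calling `permutation_pvalue(a, b, n_perms=999)` on two well-separated samples cannot return a value below 0.001, however extreme the observed difference is.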

Results: We present tRNA Structure-Function Mapper (tSFM) v1.0, an open-source, multi-threaded application that efficiently computes, visualizes and assesses significance of single- and paired-site CIFs and their evolutionary divergences for any RNA, protein, gene or genomic element sequence family. Multiple estimators of permutation P-values for CIF evolutionary divergences are provided along with confidence intervals. tSFM is implemented in Python 3 with compiled C extensions and is freely available through GitHub (https://github.com/tlawrence3/tSFM) and PyPI.

Availability and implementation: The data underlying this article are available on GitHub at https://github.com/tlawrence3/tSFM.

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

Fig. 1.
Slopegraph of three different P-value estimation algorithms for KLD CIF divergences of L. enriettii clade tRNA genes (n = 160 genes) against human tRNA genes (n = 431 genes). PECDF is the naive Monte Carlo method with 10 000 permutations per feature, using pseudo-counts when there are fewer than S exceedances (by default, 10). PECDF terminates after S exceedances (PECDF) or a maximum of 10 000 permutations, using pseudo-counts unless there are S exceedances. PGPD uses algorithm Approximate with T = 500 target permutations and a maximum of 10 000 permutations. Colors show the harmonic mean of the number of sequences containing a given CIF in the two taxa. Points, but not lines, are jittered to reduce overplotting. See online for color version.
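
The early-terminating estimator described in the caption can be sketched as follows. This is a reading of the caption under stated assumptions, not tSFM's code: the function name `ecdf_pvalue_early_stop` and the `draw_null` callable (which returns one permuted value of the test statistic per call) are hypothetical, and only the stopping rule and the pseudo-count fallback are taken from the text.

```python
def ecdf_pvalue_early_stop(observed, draw_null, s=10, max_perms=10_000):
    """Permutation P-value that stops after `s` exceedances (sketch).

    `draw_null` is a hypothetical zero-argument callable that returns
    the statistic for one fresh permutation of the data.
    Returns (p_value_estimate, permutations_used).
    """
    exceedances = 0
    for n in range(1, max_perms + 1):
        if draw_null() >= observed:
            exceedances += 1
            if exceedances >= s:
                # Enough exceedances: plain ECDF estimate, stop early.
                return exceedances / n, n
    # Budget exhausted with fewer than `s` exceedances:
    # fall back to the pseudo-count estimate.
    return (exceedances + 1) / (max_perms + 1), max_perms
```

Features with clearly non-significant divergences reach S exceedances after few permutations, so most of the permutation budget goes to features whose P-values are small.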

References

    1. Benjamini Y., Hochberg Y. (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B, 57, 289–300.
    2. Campbell B. et al. (2016) Application of the envelope peaks over threshold (EPOT) method for probabilistic assessment of dynamic stability. Ocean Eng., 120, 298–304.
    3. Collins-Hed A.I., Ardell D.H. (2019) Match fitness landscapes for macromolecular interaction networks: selection for translational accuracy and rate can displace tRNA-binding interfaces of non-cognate aminoacyl-tRNA synthetases. Theor. Popul. Biol., 129, 68–80.
    4. Freyhult E. et al. (2006) Visualizing bacterial tRNA identity determinants and antideterminants using function logos and inverse function logos. Nucleic Acids Res., 34, 905–916.
    5. Freyhult E. et al. (2007) New computational methods reveal tRNA identity element divergence between Proteobacteria and Cyanobacteria. Biochimie, 89, 1276–1288.