Virus Evol. 2020 Aug 26;6(2):veaa068.
doi: 10.1093/ve/veaa068. eCollection 2020 Jul.

PuMA: A papillomavirus genome annotation tool

Josh Pace et al. Virus Evol.

Abstract

High-throughput sequencing technologies provide unprecedented power to identify novel viruses from a wide variety of (environmental) samples. The field of 'viral metagenomics' has dramatically expanded our understanding of viral diversity. Viral metagenomic approaches imply that many novel viruses will not be described by researchers who are experts on (the genomic organization of) that virus family. We have developed the papillomavirus annotation tool (PuMA) to provide researchers with a convenient and reproducible method to annotate and report novel papillomaviruses. PuMA currently annotates 99% of papillomavirus genes correctly when benchmarked against the 655 reference genomes in the Papillomavirus Episteme (PaVE). Compared to another viral annotation pipeline, PuMA annotates more viral features and does so more accurately. To demonstrate its general applicability, we also developed a preliminary version of PuMA that can annotate polyomaviruses. PuMA is available on GitHub (https://github.com/KVD-lab/puma) and through the iMicrobe online environment (https://www.imicrobe.us/#/apps/puma).

Keywords: annotation; high-throughput sequencing; metagenomics; papillomavirus; polyomavirus; virome.

Figures

Figure 1.
Screenshot of the PuMA submission form on iMicrobe.
Figure 2.
(A) Flowchart illustrating the PuMA algorithm; see the Methods section for more details. (B) PuMA correctly annotates 99% (blue) of the manual annotations present in the PaVE database. PuMA annotations were compared to the manually curated genomes in the PaVE database. The previously described VAPiD algorithm correctly identifies 79% of all annotations, wrongly annotates 15%, and fails to identify 6% of possible annotations. (C) Proteins that were wrongly identified by PuMA v1.2.1 are plotted.
