Bioinform Adv. 2022 Aug 10;2(1):vbac056.
doi: 10.1093/bioadv/vbac056. eCollection 2022.

ExplorePipolin: reconstruction and annotation of piPolB-encoding bacterial mobile elements from draft genomes

L Chuprikova et al. Bioinform Adv. 2022.

Abstract

Motivation: Detailed and accurate analysis of mobile genetic elements (MGEs) in bacteria is essential to deal with the current threat of multiresistant microbes. The overwhelming use of draft, contig-based genomes hinders the delineation of the genetic structure of these plastic and variable genomic stretches, as in the case of pipolins, a superfamily of MGEs that spans diverse integrative and plasmidic elements, characterized by the presence of a primer-independent DNA polymerase.

Results: ExplorePipolin is a Python-based pipeline that screens for the presence of the element and performs its reconstruction and annotation. The pipeline can be used on virtually any genome from diverse organisms and of diverse quality, obtaining the highest-scoring possible structure, reconstructed from different contigs if necessary. Predicted pipolin boundaries and pipolin-encoded genes are then annotated using a custom database, and the results are returned in standard file formats suitable for comparative genomics of this mobile element.

Availability and implementation: All code is available and can be accessed here: github.com/pipolinlab/ExplorePipolin.

Supplementary information: Supplementary data are available at Bioinformatics Advances online.

Figures

Fig. 1.
Workflow of ExplorePipolin. Main nodes and tasks are indicated. Dataflow throughout the pipeline is managed with Prefect (https://www.prefect.io/)
Fig. 2.
Feature-based mobile element reconstruction from draft genomes. (A) Description of the scoring applied to possible pipolin reconstructions. First, (i) the score of each fragment (S_i^frag) is calculated as a coefficient multiplied by the number of features. The coefficient depends on the features present in the fragment: 1000 for piPolB surrounded by atts, 100 for piPolB with atts on one side, 10 when only piPolB is present and 1 for atts only. Then, (ii) the score of each alternative predicted pipolin (S_pipolin) is a tuple (i.e. an ordered pair) of two numbers: (1) the maximal score among all fragments and (2) the sum of the scores of all fragments. (B) Scoring of alternative pipolin reconstructions. Features are linked with a dashed line, as a reconstruction may use features from one or multiple contigs. The maximal score of the generated pipolin alternatives is indicated. In the simplest example (1), when the piPolB and two att sequences can be identified on the same contig, reconstruction will lead to the P4 alternative. In the case of a fragmented pipolin, reconstruction will lead to the P3, P2 or P1 alternative, depending on the location of the disruptions. As indicated with rightwards arrows, the orientation of the piPolB and the categorization of atts as attL and attR are established as detailed in the text. In a more complex scenario (2), several pipolins may be possible. In this case, P6 has the highest score (3000, 3000). After choosing P6 and removing overlapping variants, P4 would be included in the output as a second pipolin, containing only piPolB. In case (3), four pipolins are possible, but P4 reaches the highest score through the sum of the features included in the fragment. (C) Examples of the most common reconstructed output pipolins, with or without the atts and the overlapping tRNAs. The black arrowhead denotes an assembly gap introduced after the reconstruction step in order to create a single pipolin out of individual fragments located on different contigs
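
The scoring rule described in panel (A) can be illustrated with a short Python sketch. This is not the ExplorePipolin source code: the function names, the representation of a fragment as an ordered list of feature labels ("piPolB", "att"), and the reading of "surrounded by atts" as "at least one att on each side of the piPolB" are all assumptions made for illustration. The sketch also assumes that pipolin scores are compared as ordered pairs, first by the maximal fragment score and then by the sum.

```python
from typing import List, Sequence, Tuple

# Hypothetical representation: a fragment is the ordered list of feature
# labels found on it, e.g. ["att", "piPolB", "att"].
Fragment = Sequence[str]


def fragment_coefficient(features: Fragment) -> int:
    """Return the coefficient from the caption (1000, 100, 10 or 1)."""
    polb_positions = [i for i, f in enumerate(features) if f == "piPolB"]
    if not polb_positions:
        return 1  # atts only
    # Assumption: "surrounded" means an att before the first piPolB
    # and an att after the last piPolB on the same fragment.
    att_left = "att" in features[:polb_positions[0]]
    att_right = "att" in features[polb_positions[-1] + 1:]
    if att_left and att_right:
        return 1000  # piPolB surrounded by atts
    if att_left or att_right:
        return 100   # piPolB with atts on one side
    return 10        # piPolB only


def fragment_score(features: Fragment) -> int:
    """Fragment score: coefficient multiplied by the number of features."""
    return fragment_coefficient(features) * len(features)


def pipolin_score(fragments: List[Fragment]) -> Tuple[int, int]:
    """Pipolin score: (maximal fragment score, sum of fragment scores).

    Requires at least one fragment.
    """
    scores = [fragment_score(f) for f in fragments]
    return (max(scores), sum(scores))


def best_pipolin(candidates: List[List[Fragment]]) -> List[Fragment]:
    """Pick the candidate with the highest score.

    Python compares tuples element by element, so the sum of fragment
    scores only matters when the maximal scores are equal.
    """
    return max(candidates, key=pipolin_score)


if __name__ == "__main__":
    complete = [["att", "piPolB", "att"]]    # everything on one contig
    fragmented = [["att", "piPolB"], ["att"]]  # split across two contigs
    print(pipolin_score(complete))    # (3000, 3000)
    print(pipolin_score(fragmented))  # (200, 201)
```

Under these assumptions, a single fragment carrying piPolB flanked by two atts scores 1000 × 3 = 3000, which gives the pair (3000, 3000) quoted for P6 in panel (B).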
Fig. 3.
Genetic structure of diverse pipolins reconstructed by ExplorePipolin from the genomes of a wide range of bacteria. Predicted protein-coding genes are represented by arrows indicating the direction of transcription, and the most common genes are colored following Prokka annotation (piPolB in red, tyrosine recombinase in brown, UDG in cyan, excisionase in purple and metallohydrolase in magenta). When detected, other features are indicated as colored arrowheads: sequence gaps (black), E. coli-related atts (navy blue), de novo detected direct repeats (steel) and tRNAs (green). The grayscale on the right reflects the percent of amino acid identity between pairs of sequences. The image was generated with the EasyFig software, using tBlastX for element comparison. Selected genome assemblies were downloaded from the GenBank database (IDs GCA_020905835.1, GCA_000700265.1, GCA_917083495.1, GCA_013377875.1, GCA_015790535.1, GCA_009183495.1, GCA_003788595.2, GCA_016628745.1, GCA_016859455.1 and GCA_003703875.1). Names of the analyzed genomes are indicated on the left and colored by taxonomy: magenta, Alphaproteobacteria; red, Gammaproteobacteria; purple, fungi; forest green, Actinobacteria; and green, Firmicutes
