mSystems. 2025 May 20;10(5):e0101024.
doi: 10.1128/msystems.01010-24. Epub 2025 Apr 8.

The presence of multiple variants of IncF plasmid alleles in a single genome sequence can hinder accurate replicon sequence typing using in silico pMLST tools

Michaela Ruzickova et al. mSystems.

Abstract

IncF plasmids are mobile genetic elements found in bacteria from the Enterobacteriaceae family and often carry critical antibiotic resistance and virulence gene cargo. The classification of IncF plasmids using the plasmid Multi-Locus Sequence Typing (pMLST) tool from the Center for Genomic Epidemiology (CGE; https://www.genomicepidemiology.org/) compares the sequences of IncF alleles against a database to create a plasmid sequence type (ST). Accurate identification of plasmid STs is useful as it enables an assessment of IncF plasmid lineages associated with pandemic enterobacterial STs. Our initial observations showed discrepancies in IncF allele variants reported by pMLST in a collection of 898 Escherichia coli ST131 genomes. To evaluate the limitations of the pMLST tool, we interrogated an in-house and public repository of 70,324 E. coli genomes of various STs and other Enterobacteriaceae genomes (n = 1,247). All short-read assemblies and representatives selected for long-read sequencing were used to assess pMLST allele variants and to compare the output of pMLST tool versions. When multiple allele variants occurred in a single bacterial genome, the Python and web versions of the tool randomly selected one allele to report, leading to limited and inaccurate ST identification. Discrepancies were detected in 5,804 of 72,469 genomes (8.01%). Long-read sequencing of 27 genomes confirmed multiple IncF allele variants on one plasmid or two separate IncF plasmids in a single bacterial cell. The pMLST tool was unable to accurately distinguish allele variants and their location on replicons using short-read genome assemblies, or using long-read genome assemblies if the same allele variant was present more than once.

Importance: Plasmid sequence type is crucial for describing IncF plasmids because of their capacity to carry important antibiotic resistance and virulence gene cargo and, consequently, their association with disease-causing enterobacterial lineages that are resistant to clinically relevant antibiotics in humans and food-producing animals. Precise reporting of IncF allele variants in IncF plasmids is therefore necessary. Comparison of the FAB formulae generated by the pMLST tool with annotated long-read genome assemblies identified inconsistencies, including cases where multiple IncF allele variants were present on the same plasmid but missing from the FAB formula, and cases where two IncF plasmids were detected in one bacterial cell but the pMLST output provided information about only one plasmid. Such inconsistencies may cloud interpretation of IncF plasmid replicon type in specific bacterial lineages or lead to inaccurate assumptions of host strain clonality.
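
To illustrate the reporting problem described in the abstract, the following minimal Python sketch builds a FAB-style formula that keeps every perfect-match allele variant instead of reporting only one per replicon. It is not the pMLST tool's code: the function name `fab_formula`, the `(family, allele number, % identity, % coverage)` hit tuples, and the use of "/" to join co-occurring variants (modeled on the F31/F36:A4:B1 notation in Fig 1) are assumptions made for this example, and only the FII, FIA, and FIB replicons are handled.

```python
from collections import defaultdict

# Assumed mapping from replicon family to its FAB formula prefix.
PREFIX = {"FII": "F", "FIA": "A", "FIB": "B"}


def fab_formula(hits, min_identity=100.0, min_coverage=100.0):
    """Build a FAB-style formula that lists every allele variant passing the thresholds.

    `hits` is a hypothetical list of (family, allele_number, identity, coverage) tuples.
    """
    variants = defaultdict(set)
    for family, number, identity, coverage in hits:
        # Keep only hits for the three FAB replicons that meet both thresholds.
        if family in PREFIX and identity >= min_identity and coverage >= min_coverage:
            variants[family].add(number)

    parts = []
    for family in ("FII", "FIA", "FIB"):
        prefix = PREFIX[family]
        numbers = sorted(variants[family])
        if numbers:
            # Join co-occurring variants with "/" rather than dropping all but one.
            parts.append("/".join(f"{prefix}{n}" for n in numbers))
        else:
            # A replicon with no qualifying allele is written with a dash.
            parts.append(f"{prefix}-")
    return ":".join(parts)


def loci_with_multiple_variants(hits, min_identity=100.0, min_coverage=100.0):
    """Return the replicon families for which more than one variant passes the thresholds."""
    seen = defaultdict(set)
    for family, number, identity, coverage in hits:
        if family in PREFIX and identity >= min_identity and coverage >= min_coverage:
            seen[family].add(number)
    return sorted(f for f, numbers in seen.items() if len(numbers) > 1)
```

A formula built this way flags genomes that carry several variants, but from short-read data alone it cannot show whether the variants sit on one plasmid or on two separate plasmids; resolving that still requires complete, long-read assemblies.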

Keywords: Enterobacteriaceae; IncF; antibiotic resistance; pMLST; plasmids.


Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig 1
Presence of two FII alleles located on a single plasmid. (A) pMLST output for short-read assemblies from three versions of the tool, with one of the FII allele variants omitted from the FAB formula. (B) Long-read assembly analysis showing an F31/F36:A4:B1 plasmid that carries both the F31 and F36 alleles, with an incorrectly assigned plasmid ST.
Fig 2
Presence of two IncF plasmids located in one cell with an incorrect FAB formula identification. (A) Discrepancies observed during analysis of the short-read genome assemblies using three versions of pMLST, which resulted in different plasmid STs for the same isolate. (B) Results obtained from long-read sequencing, showing two plasmids, F1:A2:B20 and F2:A-:B-, carried by the cell and a correct assignment of the FAB formula. The FIC allele shown in the output was not taken into consideration because its identity was lower than 100.00% and its coverage reached only 75.00%.
Fig 3
Presence of an FII and FIC allele overlapping one another on a single plasmid. Conflicting results were obtained from short-read assemblies using CGE web pMLST versions, which showed the presence of the F18 allele, while the Docker/Anaconda versions omitted it completely (A). Long-read assembly analysis displayed the same discrepancy between versions and showed a complete plasmid carrying both C4 and F18 alleles with an overlap (B).
Fig 4
Discrepancies in the FIB allele variant observed between short-read and long-read analyses. Short-read analysis shows the presence of the B1 allele with 100% coverage and 100% identity due to the complete allele being present in the middle of a contig. Repeated pMLST analysis of long-read assembly provided two different alleles, B1 and B58, which proved to differ in a single nucleotide located 11 nucleotides from the start of the allele sequence. Due to the circular nature of the complete plasmid, the sequence begins from the start of the replication protein, causing the omission of the 29 bp region at the start of the allele, which contains the varying 11th nucleotide of the allele sequence. All three pMLST tool versions provided the same results; therefore, only the CGE web output is shown.
Fig 5
The most common allele variant combinations obtained by short-read sequencing of the complete collection of 5,804 analyzed Enterobacteriaceae genomes carrying multiple allele variants.
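
The Fig 4 caption describes an allele whose first 29 bp fall on the other side of the point where the circular plasmid was linearized, so a plain left-to-right search of the assembly misses the complete allele. One common way to handle this, sketched below in Python, is to extend the linear sequence with its own leading bases before searching. This is only an illustration of the idea, not how the pMLST tool works: the function name `find_on_circular` is hypothetical, matching is exact, and only the forward strand is searched.

```python
def find_on_circular(plasmid: str, allele: str) -> list[int]:
    """Return 0-based start positions of exact matches of `allele` on a circular `plasmid`.

    Matches that wrap around the end of the linearized sequence are included.
    Only the forward strand is searched (an assumption made for brevity).
    """
    plasmid = plasmid.upper()
    allele = allele.upper()
    if not allele or len(allele) > len(plasmid):
        return []

    # Append the first len(allele) - 1 bases so that a match starting near the
    # end of the linear sequence can continue into its beginning.
    extended = plasmid + plasmid[: len(allele) - 1]

    positions = []
    start = extended.find(allele)
    while start != -1:
        positions.append(start)
        start = extended.find(allele, start + 1)
    return positions
```

For example, with the toy sequence `GGGTTTAAACCC`, the query `CCCGGG` is absent from the linear string but is found at position 9 once the junction is taken into account. A real typing workflow would also need to search the reverse complement and tolerate mismatches.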
