mSphere. 2021 Aug 25;6(4):e0053521.
doi: 10.1128/mSphere.00535-21. Epub 2021 Jul 21.

A Comprehensive Map of Mycobacterium tuberculosis Complex Regions of Difference

D Bespiatykh et al. mSphere.

Abstract

Mycobacterium tuberculosis complex (MTBC) species are classic examples of genetically monomorphic microorganisms due to their low genetic variability. Whole-genome sequencing made it possible to describe both the main species within the complex and M. tuberculosis lineages and sublineages. This differentiation is based on single nucleotide polymorphisms (SNPs) and large sequence polymorphisms in the so-called regions of difference (RDs). Although a number of studies have been performed to elucidate RD localizations, their distribution among MTBC species, and their role in the bacterial life cycle, there are some inconsistencies and ambiguities in the localization of RDs in different members of the complex. To address this issue, we conducted a thorough search for all possible deletions in the WGS data collection comprising 721 samples representing the full MTBC diversity. Discovered deletions were compared with a list of all previously described RDs. As with the SNP-based analysis, we confirmed the specificities of 79 regions at the species, lineage, or sublineage level, 17 of which are described for the first time. We also present RDscan (https://github.com/dbespiatykh/RDscan), an open-source workflow, which detects deletions from short-read sequencing data and correlates the results with high-specificity RDs, curated in this study. Testing of the workflow on a collection comprising ∼7,000 samples showed a high specificity of the found RDs. This study provides novel details that can contribute to a better understanding of the species differentiation within the MTBC and can help to determine how individual clusters evolve within various MTBC species.

IMPORTANCE Reductive genome evolution is one of the most important and intriguing adaptation strategies of different living organisms to their environment. Mycobacterium offers several notorious examples of either naturally reduced (Mycobacterium leprae) or laboratory-reduced (Mycobacterium bovis BCG) genomes. Mycobacterium tuberculosis complex has its phylogeny unambiguously framed by large sequence polymorphisms that present unidirectional unique event changes. In the present study, we curated all known regions of difference and analyzed both Mycobacterium tuberculosis and animal-adapted MTBC species. For 79 loci, we have shown a relationship with phylogenetic units, which can serve as a marker for diagnosing or studying biological effects. Moreover, intersections were found for some loci, which may indicate the nonrandomness of these processes and the involvement of these regions in the adaptation of bacteria to external conditions.
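
The abstract describes comparing discovered deletions with a list of previously described RDs. The sketch below illustrates one common way such a comparison can be done, by reciprocal overlap of genomic intervals. It is not taken from RDscan, and whether RDscan uses this criterion is not stated here. The `Interval` type, the function names, the 0.5 threshold, and all coordinates and region names are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """A named genomic interval (hypothetical type for this sketch)."""
    name: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


def reciprocal_overlap(a: Interval, b: Interval) -> float:
    """Return the smaller of the two overlap fractions (0.0 if disjoint)."""
    overlap = min(a.end, b.end) - max(a.start, b.start)
    if overlap <= 0:
        return 0.0
    return min(overlap / (a.end - a.start), overlap / (b.end - b.start))


def match_deletions(deletions, known_rds, min_overlap=0.5):
    """Map each deletion name to the known regions it reciprocally overlaps.

    min_overlap is an assumed threshold, not a value from the study.
    """
    return {
        d.name: [rd.name for rd in known_rds
                 if reciprocal_overlap(d, rd) >= min_overlap]
        for d in deletions
    }


# Toy coordinates only; these are NOT real RD positions.
toy_rds = [
    Interval("toy_RD_A", 1000, 3000),
    Interval("toy_RD_B", 10000, 12000),
]
toy_deletions = [
    Interval("del_1", 1100, 3050),    # closely matches toy_RD_A
    Interval("del_2", 5000, 5400),    # overlaps nothing
    Interval("del_3", 11500, 15000),  # only partially overlaps toy_RD_B
]

result = match_deletions(toy_deletions, toy_rds)
print(result)
```

Requiring the overlap to be large relative to both intervals keeps a long deletion that merely clips the edge of a region from being counted as that region.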

Keywords: MTBC; Mycobacterium tuberculosis complex; RD; comparative genomics; deletions; large sequence polymorphisms; regions of difference; structural variants.

Figures

FIG 1
Maximum-likelihood phylogeny of MTBC species. Maximum-likelihood phylogenetic tree of 721 genomes, inferred using 30,166 nonrecombinant core genome SNPs. The scale bar indicates the number of nucleotide substitutions per site. The tree is rooted on M. canettii (branch length is omitted).
FIG 2
Characteristics of deletions in MTBC samples. (a) Deletions per genome distribution among MTBC strains. Each point represents an individual sample. The y axis indicates the number of deletions per sample. The box represents the interquartile range that contains 50% of the values. A line across the box indicates the median. (b) Deletion length distribution among lineages. The y axis shows deletion length, and points represent outliers. The boxes indicate upper and lower quartiles, and the horizontal lines mark the medians. Whiskers indicate maximum and minimum values, excluding outliers. (c) Size distribution of deletions among all samples. The outlier peak is marked with an asterisk. L1 to L7, lineages 1 to 7.
FIG 3
RD distribution across main MTBC phylogenetic units. RDs present in M. tuberculosis H37Rv and absent in the studied lineages are in red (RvD1 and TbD1 are exceptions). The rows represent lineages within M. tuberculosis or MTBC species, and each column is a specific region of difference. Lineages and species in rows are arranged according to their phylogenetic relationship based on SNP analysis. RDs found in this study are marked with asterisks.
FIG 4
Overlapping RDs in different MTBC members. Deletions relative to the M. tuberculosis H37Rv genome are shown in green, blue, purple, and orange; gray arrows indicate genes. Overlapping RDs within Mycobacterium tuberculosis lineages (a), between M. tuberculosis (lineages 1 to 4 and lineage 7) and other MTBC members (b), and within M. tuberculosis lineage 5 and lineage 6 and the animal-adapted clade (c).
FIG 5
Ambiguous and interlineage RDs. Stacked bar plots showing the percentages of studied isolates with (blue) and without (gray) particular deletions. (a) Ambiguous RDs; (b) interlineage RDs.
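
The box plots in FIG 2 are described in terms of quartiles, medians, whiskers, and outliers. As a minimal sketch of how those summary values can be computed, the example below uses Python's standard `statistics` module. The caption does not say how outliers were defined, so the 1.5 × IQR fence used here is an assumption, and the deletion lengths are toy values, not data from the study.

```python
import statistics


def box_stats(values):
    """Compute box-plot summary values for a list of numbers.

    Outliers are defined by the 1.5 * IQR rule, which is an assumption;
    the figure caption does not state which rule was used.
    """
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inliers = [v for v in values if low_fence <= v <= high_fence]
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inliers),    # smallest non-outlier value
        "whisker_high": max(inliers),   # largest non-outlier value
        "outliers": outliers,
    }


# Toy deletion lengths in base pairs (illustrative only).
toy_lengths = [100, 200, 300, 400, 500, 600, 700, 800, 5000]
summary = box_stats(toy_lengths)
print(summary)
```

With these toy values the single very long deletion falls outside the upper fence, so it is reported as an outlier and the upper whisker stops at the largest remaining value.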

