PLoS One. 2013 Jul 19;8(7):e68731.
doi: 10.1371/journal.pone.0068731. Print 2013.

Lactobacillus paracasei comparative genomics: towards species pan-genome definition and exploitation of diversity

Tamara Smokvina et al. PLoS One. 2013.

Abstract

Lactobacillus paracasei is a member of the normal human and animal gut microbiota and is used extensively in the food industry in starter cultures for dairy products or as probiotics. With the development of low-cost, high-throughput sequencing techniques it has become feasible to sequence many different strains of one species and to determine its "pan-genome". We have sequenced the genomes of 34 different L. paracasei strains and performed a comparative genomics analysis. We analysed genome synteny and content, focussing on the pan-genome, core genome and variable genome. Each genome was shown to contain around 2800–3100 protein-coding genes, and comparative analysis identified over 4200 ortholog groups that comprise the pan-genome of this species, of which about 1800 ortholog groups make up the conserved core. Several factors previously associated with host-microbe interactions, such as pili, cell-envelope proteinase, hydrolases p40 and p75, or the capacity to produce short branched-chain fatty acids (bkd operon), are part of the L. paracasei core genome present in all analysed strains. The variome consists mainly of hypothetical proteins, phages, plasmids, transposon/conjugative elements, and known functions such as sugar metabolism, cell-surface proteins, transporters, CRISPR-associated proteins, and EPS biosynthesis proteins. An enormous variety and variability of sugar utilization gene cassettes were identified, with each strain harbouring between 25 and 53 cassettes, reflecting the high adaptability of L. paracasei to different niches. A phylogenomic tree was constructed based on total genome contents, and together with an analysis of horizontal gene transfer events we conclude that evolution of these L. paracasei strains is complex and not always related to niche adaptation.
The results of this genome content comparison were used, together with high-throughput growth experiments on various carbohydrates, to perform gene-trait matching analysis, in order to link the distribution pattern of a specific phenotype to the presence/absence of specific sets of genes.

Conflict of interest statement

Competing Interests: The authors declare that TS, JvHV and CC work for Danone Research, part of the Danone Group. Danone is selling products that contain Lactobacilli. Danone Research financed part of this study, including subcontracting to NIZO food research (MW, JB) and Microbial Bioinformatics (RS). This does not alter the authors' adherence to all the PLOS ONE policies on sharing data and materials.

Figures

Figure 1. Pan-genome prediction.
The number of pan-genome OGs (blue) and core genome OGs (red) is shown as a function of genomes added to the pan-genome. OGs present in only one annotated genome were not included if they appeared to represent gene fragments or overpredicted small genes.
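The kind of curve shown in Figure 1 can be illustrated with a minimal sketch. It assumes each genome is represented as a set of ortholog group (OG) identifiers; the function name `accumulation_curve`, the strain names and the toy data are hypothetical and are not taken from the study, and the filtering of gene fragments and overpredicted small genes described in the caption is not modelled.

```python
from typing import Dict, List, Set, Tuple


def accumulation_curve(
    og_sets: Dict[str, Set[str]], order: List[str]
) -> List[Tuple[int, int, int]]:
    """Return (genomes added, pan-genome size, core genome size) per step.

    The pan-genome is the union of all OGs seen so far; the core genome
    is the intersection of the OG sets of all genomes added so far.
    """
    pan: Set[str] = set()
    core = None  # becomes a set once the first genome is added
    curve = []
    for n, strain in enumerate(order, start=1):
        ogs = og_sets[strain]
        pan |= ogs
        core = set(ogs) if core is None else core & ogs
        curve.append((n, len(pan), len(core)))
    return curve


# Hypothetical toy data: three strains, five OGs in total.
toy = {
    "strain_A": {"og1", "og2", "og3"},
    "strain_B": {"og1", "og2", "og4"},
    "strain_C": {"og1", "og5"},
}
curve = accumulation_curve(toy, ["strain_A", "strain_B", "strain_C"])
for n, pan_size, core_size in curve:
    print(n, pan_size, core_size)
```

In this toy case the pan-genome grows as genomes are added while the core shrinks, which is the general shape expected of such curves. The result depends on the order in which genomes are added, so in practice one would probably average over many random orders.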
Figure 2. Genetic relatedness of strains.
(A) Phylogenetic tree based on sequence similarity of 183 orthologous genes present in all strains; (B) pan-genome tree based on total genome content. Red = dairy strains; green = plant origin strains; black = human/animal origin strains; blue = unknown origin.
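A tree based on total genome content needs some measure of how different two strains are in their OG repertoire. The caption does not say which measure was used, so the sketch below substitutes the Jaccard distance on OG presence/absence purely as an illustration; the function names and toy data are hypothetical.

```python
from itertools import combinations
from typing import Dict, Set, Tuple


def jaccard_distance(a: Set[str], b: Set[str]) -> float:
    """1 - |a & b| / |a | b|; two empty sets are treated as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def distance_matrix(og_sets: Dict[str, Set[str]]) -> Dict[Tuple[str, str], float]:
    """Symmetric pairwise gene-content distances, keyed by (strain, strain)."""
    names = sorted(og_sets)
    dist = {(name, name): 0.0 for name in names}
    for x, y in combinations(names, 2):
        d = jaccard_distance(og_sets[x], og_sets[y])
        dist[(x, y)] = d
        dist[(y, x)] = d
    return dist


# Hypothetical toy data, not taken from the study.
toy = {
    "strain_A": {"og1", "og2", "og3"},
    "strain_B": {"og1", "og2", "og4"},
    "strain_C": {"og1", "og5"},
}
dist = distance_matrix(toy)
print(dist[("strain_A", "strain_B")])
```

A distance matrix of this kind could then be passed to a hierarchical clustering or neighbour-joining routine to obtain a tree; that step is omitted here.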
Figure 3. Genetic potential of L. paracasei to produce short branched-chain fatty acids from branched-chain α-keto acids (BCKA).
(a) Organization of the bkd operon in the L. casei strains and genetic context in other lactobacilli. The functions encoded by bkd genes (yellow) are: Ptb, Phosphate butyryl-transferase; Buk, Butyrate kinase; BkdD, Dihydrolipoamide dehydrogenase; BkdA, 2-oxoisovalerate dehydrogenase α subunit; BkdB, 2-oxoisovalerate dehydrogenase β subunit; BkdC, Lipoamide acyltransferase component of BKDH complex; PanE, Ketopantoate reductase PanE/ApbA. The locus tags of the respective 8 bkd genes in the reference genomes are: LSEI_1441–1448 in L. casei ATCC 334, LCABL_16640–16710 in L. casei BL23 and LCAZH_1429–1436 in L. casei Zhang. The black arrow and the stem-loop indicate a potential promoter and a ρ-independent terminator, respectively. The genetic environment around the bkd operon of L. casei is conserved among other lactobacilli: orthologous genes are shown by the same colour. PyrAB, Carbamoylphosphate synthase large subunit; PyrD, Dihydroorotate dehydrogenase; PyrF, Orotidine-5′-phosphate decarboxylase; PyrE, Orotate phosphoribosyltransferase; FbpA, Fibronectin-binding protein; hypothetical protein LSEI_1438. (b) Branched-chain amino acid (BAA) catabolism to fatty acids adapted after . BAA are converted into BCKA via a BAA-amino transferase. The branched-chain α-keto acid dehydrogenase (BKDH) complex is composed of BkdA, BkdB, BkdC and BkdD.
Figure 4. Bar plot of OG presence/absence for the L. paracasei strains ordered according to the reference genomes.
This figure shows all pan-genome OGs found to be present (white bar) or absent (black bar) on the genomes. The box at the bottom contains OGs on contigs which are presumed plasmids. The pan-genome tree is shown at the top. The scale at the left represents pseudoassembly location relative to the reference genomes. A description of highly variable regions is shown at the right. The GC content is presented in the middle (wavy line), ranging from 30–60% (left to right).
Figure 5. Summary of sugar utilization cassettes.
Each row represents the presence (green) or absence (red) of a sugar utilization cassette in the strains listed at the top; D = dairy origin, P = plant origin, M = mammalian origin, U = unknown origin. The putative sugar(s) utilized, the type of transport system, and the number of genes in each cassette are listed in the last three rows. Groups A, B and C refer to the strain groups in Figure 2B. Chromosomal location: A = cassette in sugar island A; B = cassette in sugar island B.
Figure 6. Example of the GTM output.
The first column lists the sugar tested, and the second and third columns indicate the number of strains that grow (positive) or do not grow (negative) on that sugar. Relevant OGs and their annotation are listed in columns four and five. All coloured cells indicate OGs important for the classification of the specified phenotype (at top). Green cells indicate presence of the OG (>75%), red indicates absence of the OG (>75%). OGs that are important for the classification of the phenotype but are not present or absent in a large fraction of the strains are coloured black.
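The colouring rule in this caption can be written out as a small sketch. It assumes the percentages are fractions of the strains within one phenotype class (growing or not growing on a sugar), which the caption does not state explicitly; the function name `cell_colour` and the 0.75 default are illustrative only. Selecting which OGs are "important for the classification" is a separate step that is not modelled here.

```python
from typing import Sequence


def cell_colour(present: Sequence[bool], threshold: float = 0.75) -> str:
    """Colour one cell for an OG already judged relevant to a phenotype.

    `present` holds one flag per strain in the phenotype class:
    True if the OG is present in that strain, False if it is absent.
    """
    if not present:
        raise ValueError("at least one strain is required")
    n = len(present)
    n_present = sum(1 for flag in present if flag)
    n_absent = n - n_present
    if n_present / n > threshold:
        return "green"  # OG present in more than 75% of the strains
    if n_absent / n > threshold:
        return "red"    # OG absent in more than 75% of the strains
    return "black"      # relevant, but neither mostly present nor mostly absent


# Hypothetical example: OG present in 4 of 5 strains that grow on a sugar.
print(cell_colour([True, True, True, True, False]))
```

Because the comparison is strict, an OG present in exactly 75% of the strains falls into the black category under this reading.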
