Review
Genome Biol. 2006;7(4):R34.
doi: 10.1186/gb-2006-7-4-r34. Epub 2006 Apr 26.

The genome of Rhizobium leguminosarum has recognizable core and accessory components

J Peter W Young et al. Genome Biol. 2006.

Abstract

Background: Rhizobium leguminosarum is an alpha-proteobacterial N2-fixing symbiont of legumes that has been the subject of more than a thousand publications. Genes for the symbiotic interaction with plants are well studied, but the adaptations that allow survival and growth in the soil environment are poorly understood. We have sequenced the genome of R. leguminosarum biovar viciae strain 3841.

Results: The 7.75 Mb genome comprises a circular chromosome and six circular plasmids, with 61% G+C overall. All three rRNA operons and 52 tRNA genes are on the chromosome; essential protein-encoding genes are largely chromosomal, but most functional classes occur on plasmids as well. Of the 7,263 protein-encoding genes, 2,056 had orthologs in each of three related genomes (Agrobacterium tumefaciens, Sinorhizobium meliloti, and Mesorhizobium loti), and these genes were over-represented in the chromosome and had above average G+C. Most supported the rRNA-based phylogeny, confirming A. tumefaciens to be the closest among these relatives, but 347 genes were incompatible with this phylogeny; these were scattered throughout the genome but were over-represented on the plasmids. An unexpectedly large number of genes were shared by all three rhizobia but were missing from A. tumefaciens.

Conclusion: Overall, the genome can be considered to have two main components: a 'core', which is higher in G+C, is mostly chromosomal, is shared with related organisms, and has a consistent phylogeny; and an 'accessory' component, which is sporadic in distribution, lower in G+C, and located on the plasmids and chromosomal islands. The accessory genome has a different nucleotide composition from the core despite a long history of coexistence.

Figures

Figure 1
The chromosome and six plasmids of Rlv3841. The plasmids are shown at the same relative scale, and the chromosome at one-fourth of that scale. Circles from outermost to innermost indicate genes in forward and reverse orientation: all genes, membrane proteins (bright green), conserved and unconserved hypotheticals (brown conserved, pale green unconserved), phage and transposons (pink, shown for pRL7 only), and (for the chromosome only) DNA transcription/restriction/helicases (red) and transcriptional regulators (blue). Inner circles indicate deviations in G+C content (black) and G-C skew (olive/maroon). The full list of Sanger Institute standard colors for functional categories is as follows: white = pathogenicity/adaptation/chaperones (shown here in black); dark grey = energy metabolism (glycolysis, electron transport, among others); red = information transfer (transcription/translation + DNA/RNA modification); bright green = surface (inner membrane, outer membrane, secreted, surface structures [lipopolysaccharide, among others]); dark blue = stable RNA; turquoise = degradation of large molecules; pink/purple = degradation of small molecules; yellow = central/intermediary/miscellaneous metabolism; pale green = unknown; pale blue = regulators; orange/brown = conserved hypothetical; dark brown = pseudogenes and partial genes (remnants); light pink = phage/insertion sequence elements; light grey = some miscellaneous information (for example, Prosite) but no function. bp, base pairs; Rlv3841, R. leguminosarum biovar viciae strain 3841.
Figure 2
Distribution of functional classes of genes within replicons. The classes are based on those presented by Riley [86].
Figure 3
Protein-encoding genes on the chromosome and six plasmids of Rlv3841, showing their nucleotide composition. GC3s (G+C content of silent third positions of codons) is a sensitive measure of composition. Symbols indicate whether each gene encodes a quartop protein (with orthologs in A. tumefaciens, S. meliloti, and M. loti) and, if so, which phylogenetic topology it supports (RA-SM denotes the tree that pairs R. leguminosarum with A. tumefaciens, and S. meliloti with M. loti; RM-AS and RS-AM are similarly defined). In addition, the nodulation genes nodOTNMLEFDABCIJ are identified on pRL10. Rlv3841, R. leguminosarum biovar viciae strain 3841.
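As an illustration of the GC3s measure named in this caption, the following is a minimal sketch, not the authors' pipeline. It assumes the common convention that Met (ATG), Trp (TGG), and stop codons are excluded because they have no synonymous third-position alternatives; the function name `gc3s` and the `EXCLUDED` set are hypothetical names chosen for this example.

```python
# Sketch of GC3s: G+C fraction at third positions of synonymous codons.
# Assumption: Met, Trp, and stop codons are excluded (a common convention);
# the source does not state which codons the authors excluded.
EXCLUDED = {"ATG", "TGG", "TAA", "TAG", "TGA"}


def gc3s(cds: str) -> float:
    """Return the G+C fraction at silent third codon positions of a coding sequence."""
    seq = cds.upper()
    gc = 0
    total = 0
    # Walk the sequence codon by codon, ignoring any trailing partial codon.
    for i in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[i:i + 3]
        # Skip excluded codons and codons containing ambiguous bases.
        if codon in EXCLUDED or any(base not in "ACGT" for base in codon):
            continue
        total += 1
        if codon[2] in "GC":
            gc += 1
    return gc / total if total else float("nan")
```

For example, in `ATGGCCAAATAA` only `GCC` and `AAA` are counted, so one of two silent third positions is G or C.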
Figure 4
Detail of part of Figure 3, showing a chromosomal island. The island extends from 855 to 908 kilobases, genes RL0790-RL0841, and is recognizable by low GC3s (G+C content of silent third positions of codons) and absence of quartop genes. RA-SM denotes the tree that pairs R. leguminosarum with A. tumefaciens, and S. meliloti with M. loti; RM-AS and RS-AM are similarly defined.
Figure 5
Dinucleotide compositional analysis of 100-kilobase windows of the genomes of Rlv3841 and A. tumefaciens C58. On the first two axes of a principal components analysis of the symmetrized dinucleotide relative abundance (DRA) of both genomes analyzed jointly, sequences from each chromosome (chr) and plasmid are identified by distinct symbols. PC1 accounts for 48.9% and PC2 for 35.6% of the total variance. Rlv3841, R. leguminosarum biovar viciae strain 3841.
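The symmetrized dinucleotide relative abundance used in this figure can be sketched as follows, assuming the usual definition rho*_XY = f*_XY / (f*_X f*_Y), where the frequencies f* are computed over the sequence together with its reverse complement. The function names are hypothetical, and the windowing and principal components steps are not shown.

```python
from collections import Counter
from itertools import product

# Complement table for unambiguous bases (other characters pass through and are ignored).
COMP = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMP)[::-1]


def symmetrized_dra(window: str) -> dict:
    """Return rho*_XY for all 16 dinucleotides of one sequence window.

    Assumption: counts are pooled over both strands, which is one common
    way to symmetrize; the source does not spell out its exact procedure.
    """
    seq = window.upper()
    mono = Counter()
    di = Counter()
    for strand in (seq, revcomp(seq)):
        mono.update(strand)
        di.update(strand[i:i + 2] for i in range(len(strand) - 1))
    pairs = ["".join(p) for p in product("ACGT", repeat=2)]
    n_mono = sum(mono[b] for b in "ACGT")
    n_di = sum(di[p] for p in pairs)
    if n_mono == 0 or n_di == 0:
        raise ValueError("window contains no unambiguous dinucleotides")
    rho = {}
    for p in pairs:
        expected = (mono[p[0]] / n_mono) * (mono[p[1]] / n_mono)
        rho[p] = (di[p] / n_di) / expected if expected else float("nan")
    return rho
```

Because both strands are pooled, each dinucleotide receives the same value as its reverse complement (for example, AC and GT), so there are only ten distinct values per window.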
Figure 6
Cumulative distribution of the eight-base motif GGGCAGGG in the genome of Rlv3841. The motif is shown in forward and reverse orientation on chromosome and plasmids. Rlv3841, R. leguminosarum biovar viciae strain 3841.
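A cumulative motif count of the kind plotted here could be produced along the following lines. This is a sketch under stated assumptions: the function names, the 0-based start coordinates, and the fixed checkpoint spacing `step` are choices made for this example, not details taken from the source.

```python
from bisect import bisect_left

COMP = str.maketrans("ACGT", "TGCA")


def motif_positions(seq: str, motif: str):
    """Return 0-based start positions of the motif on the forward and reverse strands.

    A reverse-strand occurrence is reported at the position where the
    motif's reverse complement appears in the forward sequence.
    """
    seq = seq.upper()
    motif = motif.upper()
    rc = motif.translate(COMP)[::-1]
    k = len(motif)
    forward, reverse = [], []
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word == motif:
            forward.append(i)
        if word == rc:
            reverse.append(i)
    return forward, reverse


def cumulative_counts(positions, length: int, step: int):
    """Return (checkpoint, hits starting before checkpoint) pairs every `step` bases."""
    return [(x, bisect_left(positions, x)) for x in range(step, length + 1, step)]
```

Plotting the forward and reverse counts against position for each replicon would show any strand bias as diverging slopes.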
Figure 7
Phylogeny of completely sequenced genomes of selected α-proteobacteria. The phylogeny is based on the concatenated sequences of 648 orthologous proteins. Neighbor-Joining method with % bootstrap support indicated. Scale indicates substitutions per site.
Figure 8
Genes on the chromosome and six plasmids of Rlv3841 encoding components of ABC transporter systems. The nucleotide composition (GC3s, i.e. G+C content of silent third positions of codons) of the components is shown. Members of the most abundant families are indicated, defined according to Saier [93]. ABC, ATP-binding cassette; Rlv3841, R. leguminosarum biovar viciae strain 3841.

References

    1. Gage DJ. Infection and invasion of roots by symbiotic, nitrogen-fixing rhizobia during nodulation of temperate legumes. Microbiol Mol Biol Rev. 2004;68:280–300.
    2. Alsmark CM, Frank AC, Karlberg EO, Legault B-A, Ardell DH, Canback B, Eriksson A-S, Naslund AK, Handley SA, Huvet M, et al. The louse-borne human pathogen Bartonella quintana is a genomic derivative of the zoonotic agent Bartonella henselae. Proc Natl Acad Sci USA. 2004;101:9716–9721.
    3. DelVecchio VG, Kapatral V, Redkar RJ, Patra G, Mujer C, Los T, Ivanova N, Anderson I, Bhattacharyya A, Lykidis A, et al. The genome sequence of the facultative intracellular pathogen Brucella melitensis. Proc Natl Acad Sci USA. 2002;99:443–448.
    4. Jumas-Bilak E, Michaux-Charachon S, Bourg G, O'Callaghan D, Ramuz M. Differences in chromosome number and genome rearrangements in the genus Brucella. Mol Microbiol. 1998;27:99–106.
    5. Paulsen IT, Seshadri R, Nelson KE, Eisen JA, Heidelberg JF, Read TD, Dodson RJ, Umayam L, Brinkac LM, Beanan MJ, et al. The Brucella suis genome reveals fundamental similarities between animal and plant pathogens and symbionts. Proc Natl Acad Sci USA. 2002;99:13148–13153.
