Comparative Study
Genome Biol. 2004;5(4):R27.
doi: 10.1186/gb-2004-5-4-r27. Epub 2004 Mar 18.

Comparative genomics of gene-family size in closely related bacteria

Ravindra Pushker et al. Genome Biol. 2004.

Abstract

Background: The wealth of genomic data in bacteria is helping microbiologists understand the factors involved in gene innovation. Among these, the expansion and reduction of gene families appear to have a fundamental role, but the factors influencing gene family size are unclear.

Results: The relative content of paralogous genes in bacterial genomes increases with genome size, largely due to the expansion of gene family size in large genomes. Bacteria undergoing genome reduction display a parallel process of redundancy elimination, by which gene families are reduced to one or a few members. Gene family size is also influenced by sequence divergence and physiological function. Large gene families show wider sequence divergence, suggesting they are probably older, and certain functions (such as metabolite transport mechanisms) are overrepresented in large families. The size of a given gene family is remarkably similar in strains of the same species and in closely related species, suggesting that homologous gene families are vertically transmitted and depend little on horizontal gene transfer (HGT).
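The two quantities compared with genome size above (the share of genes that belong to paralogous families, and the average family size) can be illustrated with a minimal sketch. It assumes genes have already been clustered into families; the function name `paralog_stats`, the gene-to-family mapping, and the choice to average only over families with two or more members are illustrative assumptions, not the authors' pipeline.

```python
from collections import Counter


def paralog_stats(gene_to_family):
    """Return (percent of genes in paralogous families, mean family size).

    gene_to_family: dict mapping a gene identifier to a family identifier
    (hypothetical input; any clustering method could have produced it).
    A family counts as paralogous here if it has at least two members
    in the same genome. Singletons are left out of the mean (assumption).
    """
    total_genes = len(gene_to_family)
    if total_genes == 0:
        raise ValueError("no genes supplied")

    # Number of genes assigned to each family.
    family_sizes = Counter(gene_to_family.values())

    # Keep only families with two or more members.
    multi_member = [n for n in family_sizes.values() if n >= 2]

    percent_in_families = 100.0 * sum(multi_member) / total_genes
    mean_family_size = (
        sum(multi_member) / len(multi_member) if multi_member else 0.0
    )
    return percent_in_families, mean_family_size


# Toy genome: 8 genes, one family of 3, one family of 2, three singletons.
toy_genome = {
    "a": "F1", "b": "F1", "c": "F1",
    "d": "F2", "e": "F2",
    "f": "F3", "g": "F4", "h": "F5",
}
percent, mean_size = paralog_stats(toy_genome)
```

In the toy genome, 5 of 8 genes sit in multi-member families, and the two such families have 3 and 2 members.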

Conclusions: The remarkable preservation of copy numbers in widely different ecotypes indicates a functional role for the different copies rather than simply a back-up role. When different genera are compared, the increase in phylogenetic distance and/or ecological specialization disrupts this preservation, albeit in a gradual manner and maintaining an overall similarity, which also supports this view. HGT can have an important role, however, in nonhomologous gene families, as exemplified by a comparison between saprophytic and enterohemorrhagic strains of Escherichia coli.

Figures

Figure 1
Percentage of genes belonging to paralogous families plotted against genome size in 127 eubacterial genomes. Inset shows the average gene family size versus genome size for the same genomes, except Shigella flexneri, Bordetella pertussis, B. parapertussis and B. bronchiseptica, which contain a high number of IS elements. Some genomes with atypical values are identified: Mpn, Mycoplasma pneumoniae; Mpt, Mycoplasma penetrans; Mga, Mycoplasma gallisepticum; Mlp, Mycobacterium leprae; Pir, Pirellula sp.
Figure 2
Gene family sizes in genomes undergoing reductive evolution compared to a phylogenetically related larger sequenced genome. (a) Mycobacterium leprae (reductive) vs Mycobacterium tuberculosis H37Rv; (b) Shigella flexneri (reductive) vs Escherichia coli K12. Orthologous genes in the genome pairs (identified by amino-acid sequence similarity) are displayed in arbitrary order and plotted against the number of homologs in their own genome (that is, paralogs). Only protein-coding genes are included. IS elements from S. flexneri 2a are excluded.
Figure 3
The number of members in E. coli K12 gene families plotted versus mean sequence identity of pairwise comparisons among the members of each family.
Figure 4
Gene family sizes for homologous genes in groups of strains belonging to the same species, represented as in Figure 2. (a) Chlamydophila pneumoniae strains; (b) Streptococcus pyogenes strains; (c) Escherichia coli strains; (d) Staphylococcus aureus strains. Strain denomination and graph code displayed in the top right-hand corner. Only protein-coding genes are included. Zero on the y-axis indicates single-copy genes; 1 indicates a gene family formed of two members.
Figure 5
Gene family sizes for homologous protein-coding genes in different species of the same genus. (a) Pseudomonas spp.; (b) Bacillus spp.; (c) Difference in the size of equivalent gene families between E. coli K12 and S. typhimurium LT2. Positive values indicate larger families in E. coli; negative values indicate larger families in S. typhimurium. The potG gene family is indicated.
Figure 6
Proportions of assigned functions among genes belonging to families and singletons in B. subtilis and E. coli K12. Gene functions were assigned according to the Clusters of Orthologous Groups (COGs) classification [41]. Extended gene families are considered, in which a gene belongs to a single family only (see Materials and methods).
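The x-axis quantity in Figure 3, the mean sequence identity over all pairwise comparisons within a family, can be sketched as follows. This is a hedged illustration only: it assumes the family members are already aligned to equal length with `-` marking gaps, and it scores identity over columns where neither sequence has a gap. The function names and that scoring convention are assumptions, not the method described in the paper.

```python
from itertools import combinations


def pairwise_identity(seq_a, seq_b):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where either sequence has a gap ('-') are skipped (assumption).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(seq_a, seq_b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def mean_family_identity(aligned_members):
    """Average percent identity over every unordered pair of family members."""
    pairs = list(combinations(aligned_members, 2))
    if not pairs:
        raise ValueError("a family needs at least two members")
    return sum(pairwise_identity(a, b) for a, b in pairs) / len(pairs)


# Toy family of three aligned sequences (made-up data).
toy_family = ["AAAA", "AAAT", "AATT"]
mean_identity = mean_family_identity(toy_family)
```

Plotting the number of members of each family against a value like `mean_identity` would give a scatter of the kind the Figure 3 caption describes.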

References

    1. Ohno S. Evolution by Gene Duplication. New York: Springer; 1970.
    2. Gogarten JP, Olendzenski L. Orthologs, paralogs and genome comparisons. Curr Opin Genet Dev. 1999;9:630–636. doi: 10.1016/S0959-437X(99)00029-5.
    3. Blattner FR, Plunkett G 3rd, Bloch CA, Perna NT, Burland V, Riley M, Collado-Vides J, Glasner JD, Rode CK, Mayhew GF, et al. The complete genome sequence of Escherichia coli K-12. Science. 1997;277:1453–1474. doi: 10.1126/science.277.5331.1453.
    4. Liang P, Labedan B, Riley M. Physiological genomics of Escherichia coli protein families. Physiol Genomics. 2002;9:15–26.
    5. Hooper SD, Berg OG. Duplication is more common among laterally transferred genes than among indigenous genes. Genome Biol. 2003;4:R48. doi: 10.1186/gb-2003-4-8-r48.
