BMC Evol Biol. 2006 Aug 21;6:64.
doi: 10.1186/1471-2148-6-64.

The monosaccharide transporter gene family in land plants is ancient and shows differential subfamily expression and expansion across lineages

Deborah A Johnson et al. BMC Evol Biol. 2006.

Abstract

Background: In plants, tandem, segmental and whole-genome duplications are prevalent, resulting in large numbers of duplicate loci. Recent studies suggest that duplicate genes diverge predominantly through the partitioning of expression and that breadth of gene expression is related to the rate of gene duplication and protein sequence evolution. Here, we use expressed sequence tag (EST) data to study gene duplication and expression patterns in the monosaccharide transporter (MST) gene family across the land plants. In Arabidopsis, there are 53 MST genes that form seven distinct subfamilies. We created profile hidden Markov models of each subfamily and searched EST databases representing diverse land plant lineages to address the following questions: 1) Are homologs of each Arabidopsis subfamily present in the earliest land plants? 2) Do expression patterns among subfamilies and individual genes within subfamilies differ across lineages? 3) Has gene duplication within each lineage resulted in lineage-specific expansion patterns? We also looked for correlations between relative EST database representation in Arabidopsis and similarity to orthologs in early lineages.

Results: Homologs of all seven MST subfamilies were present in land plants at least 400 million years ago. Subfamily expression levels vary across lineages with greater relative expression of the STP, ERD6-like, INT and PLT subfamilies in the vascular plants. In the large EST databases of the moss, gymnosperm, monocot and eudicot lineages, EST contig construction reveals that MST subfamilies have experienced lineage-specific expansions. Large subfamily expansions appear to be due to multiple gene duplications arising from single ancestral genes. In Arabidopsis, one or a few genes within most subfamilies have much higher EST database representation than others. Most highly represented (broadly expressed) genes in Arabidopsis have best match orthologs in early divergent lineages.

Conclusion: The seven subfamilies of the Arabidopsis MST gene family are ancient in land plants and show differential subfamily expression and lineage-specific subfamily expansions. Patterns of gene expression in Arabidopsis, together with the correlation of highly represented genes with best match homologs in early lineages, suggest that broadly expressed genes are often highly conserved and that most genes have more limited expression.

Figures

Figure 1
Maximum likelihood phylogeny of Arabidopsis MST proteins. An unrooted phylogeny of the 53 Arabidopsis MST protein sequences inferred using maximum likelihood. The tree was produced using PHYML with the JTT amino acid substitution model, a discrete gamma model with four categories and an estimated shape parameter of 1.385. Bootstrapping was performed with 100 replicates. Bootstrap values for each subfamily clade are highlighted in yellow. Callouts show available information about the function and expression of some MST genes, from Arabidopsis and other taxa, within each subfamily.
Figure 2
Representation of MST genes in Arabidopsis EST database. This tree is a radial representation of the maximum likelihood protein tree in Figure 1. Bootstrap values have been omitted and branch lengths have been modified to enhance visibility. Branches with yellow highlighting indicate the presence of ESTs in the Arabidopsis thaliana EST database of 415,250 ESTs. Callouts show the total number of ESTs with a best match to each indicated gene locus and the percentage of total subfamily expression levels. Red asterisks indicate genes with best match homologs present in at least one early lineage (Marchantia, Physcomitrella, Selaginella, Ceratopteris, or Pinus).
Figure 3
Expressed MST loci in small EST databases. Radial ML tree of Arabidopsis MST proteins with branches highlighted in yellow to denote the presence of ESTs in one of the small EST databases with a best match to that particular Arabidopsis gene. Callouts label the species name and e-value of the match.
Figure 4
Expressed MST loci in the Physcomitrella patens EST database. Radial ML tree of Arabidopsis MST proteins with branches highlighted in yellow to indicate the presence of EST contigs or singlets in the Physcomitrella patens EST databases with a best match to the indicated Arabidopsis gene. Callouts indicate the number of inferred expressed loci [number of EST contigs (number of ESTs in each contig), number of singlets].
Figure 5
Expressed MST loci in the Pinus taeda EST database. Radial ML tree of Arabidopsis MST proteins with branches highlighted in yellow to indicate the presence of EST contigs or singlets in the Pinus taeda EST database with a best match to the indicated Arabidopsis gene. Callouts indicate the number of inferred expressed loci [number of EST contigs (number of ESTs in each contig), number of singlets].
Figure 6
Expressed MST loci in the Zea mays EST database. Radial ML tree of Arabidopsis MST proteins with branches highlighted in yellow to indicate the presence of EST contigs or singlets in the Zea mays EST database with a best match to the indicated Arabidopsis gene. Callouts indicate the number of inferred expressed loci [number of EST contigs (number of ESTs in each contig), number of singlets].
Figure 7
Expressed MST loci in the Lycopersicon esculentum EST database. Radial ML tree of Arabidopsis MST proteins with branches highlighted in yellow to indicate the presence of EST contigs or singlets in the Lycopersicon esculentum EST database with a best match to the indicated Arabidopsis gene. Callouts indicate the number of inferred expressed loci [number of EST contigs (number of ESTs in each contig), number of singlets].
Figure 8
Lineage divergence times, inferred polyploidy events and number of MST subfamily loci inferred from EST data, presented in phylogenetic context. Phylogeny showing hypothesized relationships among major land plant lineages [70] and approximate divergence times [70], with vertical bars indicating inferred polyploidy events [61]. Colored squares indicate the presence of one or more subfamily homologs within a lineage, numbers within squares indicate the number of expressed loci, and asterisks indicate EST databases with too few ESTs to infer numbers of expressed loci. Species names on selected lineages indicate EST databases searched in this study.
Figure 9
Arabidopsis MST gene duplication events in phylogenetic context. Maximum likelihood phylogeny of Arabidopsis MST protein sequences with segmental duplication events indicated by callouts and tandem duplications indicated by yellow highlighting. Red * symbols indicate two genes with high similarity likely duplicated by segmental duplication unrecognized on the TIGR Arabidopsis Genome Annotation database.
Figure 10
Correlation between high EST database representation of Arabidopsis MST genes and the presence of best match orthologs in one or more early land plant lineages. A bar chart showing the relationship between percent relative subfamily EST database representation and presence of best match orthologs in one or more representatives of five early land plant lineages (Marchantia polymorpha, Physcomitrella patens, Selaginella lepidophylla, Ceratopteris richardii, and Pinus taeda). Arabidopsis genes with high relative subfamily representation and/or best match homologs in the early lineages are included in the chart.
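
The "percent relative subfamily EST database representation" used in Figures 2 and 10 can be read as each gene's share of the ESTs assigned to its subfamily. The following is a minimal sketch of that arithmetic under this reading, not the authors' code. The function name, gene labels and counts are hypothetical placeholders, not data from the study.

```python
from typing import Dict


def relative_representation(est_counts: Dict[str, int]) -> Dict[str, float]:
    """Return each gene's percentage of the total ESTs in one subfamily.

    est_counts maps a gene label to the number of ESTs whose best match
    is that gene. An empty or all-zero subfamily yields 0.0 for every gene.
    """
    total = sum(est_counts.values())
    if total == 0:
        return {gene: 0.0 for gene in est_counts}
    return {gene: 100.0 * n / total for gene, n in est_counts.items()}


# Hypothetical counts for one subfamily (illustrative only, not from the paper).
example_counts = {"geneA": 30, "geneB": 15, "geneC": 5}
shares = relative_representation(example_counts)

# Print genes from most to least represented.
for gene, pct in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{gene}: {pct:.1f}%")
```

Under this reading, a subfamily in which one gene holds most of the share corresponds to the pattern the abstract describes, where one or a few genes have much higher EST database representation than the others.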
