Comparative Study
Genetics. 2006 Nov;174(3):1407-20.
doi: 10.1534/genetics.106.062455. Epub 2006 Sep 1.

Combining bioinformatics and phylogenetics to identify large sets of single-copy orthologous genes (COSII) for comparative, evolutionary and systematic studies: a test case in the euasterid plant clade

Feinan Wu et al. Genetics. 2006 Nov.

Abstract

We report herein the application of a set of algorithms to identify a large number (2869) of single-copy orthologs (COSII), which are shared by most, if not all, euasterid plant species as well as the model species Arabidopsis. Alignments of the orthologous sequences across multiple species enabled the design of "universal PCR primers," which can be used to amplify the corresponding orthologs from a broad range of taxa, including those lacking any sequence databases. Functional annotation revealed that these conserved, single-copy orthologs encode a higher-than-expected frequency of proteins transported and utilized in organelles and a paucity of proteins associated with cell walls, protein kinases, transcription factors, and signal transduction. The enabling power of this new ortholog resource was demonstrated in phylogenetic studies, as well as in comparative mapping between tomato (family Solanaceae) and coffee (family Rubiaceae). The combined results of these studies provide compelling evidence that (1) the ancestral species that gave rise to the core euasterid families Solanaceae and Rubiaceae had a basic chromosome number of x = 11 or 12 and (2) no whole-genome duplication event (i.e., polyploidization) occurred immediately prior to or after the radiation of either Solanaceae or Rubiaceae, contrary to what has been recently suggested.

Figures

Figure 1.—
(A) Phylogenetic tree showing the placement of euasterids relative to other eudicot plant species on the basis of APG II 2003 (Bremer et al. 2003). (B) Phylogenetic relationships of the plant species included in this study. The maximum-likelihood tree was reconstructed using published chloroplast ndhF sequences (each species name is followed by its GenBank accession number). Bootstrap values are placed on the branches. The tree is consistent with previous reports (Chase et al. 1993; Olmstead 1999).
Figure 2.—
(A) Evolution of an ancestral single-copy gene into three single copy orthologs (A1, B1, C1), one in each of the three related species. (B) RBM relationships of single-copy orthologs (A1, B1, C1) from pairwise comparisons of three fully sequenced genomes. (C) Evolution of paralogs, created by ancestral gene duplication, in the genomes of three related species. (D) Application of RBM triangulation method to distinguish orthologs from paralogs from pairwise comparisons of three fully sequenced genomes. (E and F) Application of RBM triangulation in genomes of three related species that have incomplete sequence data sets. Dotted circles indicate paralogous genes missing from the data set. Dashed arrowed lines connect two paralogs that form an erroneous RBM pair.
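The triangulation idea in panels D to F can be illustrated with a minimal sketch. It assumes that RBM denotes a reciprocal best match between two genes and that best-hit tables have already been computed for each pair of species. The function names and the toy gene identifiers are hypothetical, and this is not the authors' implementation. A trio is kept only when all three pairwise RBM relationships agree, so an erroneous RBM pair caused by a missing gene fails to close the triangle.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical input type: query gene -> its best-scoring hit in the other species.
BestHits = Dict[str, str]


def reciprocal_pairs(x_to_y: BestHits, y_to_x: BestHits) -> Set[Tuple[str, str]]:
    """Return (x, y) pairs in which each gene is the other's best hit."""
    return {(x, y) for x, y in x_to_y.items() if y_to_x.get(y) == x}


def rbm_triangles(ab: BestHits, ba: BestHits,
                  bc: BestHits, cb: BestHits,
                  ac: BestHits, ca: BestHits) -> List[Tuple[str, str, str]]:
    """Keep only trios (a, b, c) where a-b, b-c and a-c are all reciprocal best matches."""
    ab_pairs = reciprocal_pairs(ab, ba)
    bc_map = dict(reciprocal_pairs(bc, cb))
    ac_map = dict(reciprocal_pairs(ac, ca))
    triangles = []
    for a, b in sorted(ab_pairs):
        c = bc_map.get(b)
        # The triangle closes only if A's partner in C is the same gene as B's partner in C.
        if c is not None and ac_map.get(a) == c:
            triangles.append((a, b, c))
    return triangles


# Toy data (invented): gene B1 is missing from species B, so A1 pairs with paralog B2.
ab = {"A1": "B2", "A9": "B9"}
ba = {"B2": "A1", "B9": "A9"}
bc = {"B2": "C2", "B9": "C9"}
cb = {"C1": "B2", "C2": "B2", "C9": "B9"}
ac = {"A1": "C1", "A9": "C9"}
ca = {"C1": "A1", "C2": "A1", "C9": "A9"}

triangles = rbm_triangles(ab, ba, bc, cb, ac, ca)
print(triangles)
```

In the toy data the A1/B2 pair is a reciprocal best match only because B1 is absent. It is rejected because A1 pairs with C1 while B2 pairs with C2, whereas the complete A9/B9/C9 trio is retained.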
Figure 3.—
Comparisons of role categorization between the entire Arabidopsis genome and COSII genes based on gene ontology (GO) annotation of the Arabidopsis genes. Stars represent categories showing significant differences (individual P < 0.001 for an overall significance level of 0.05) between COSII genes and the entire Arabidopsis repertoire.
Figure 4.—
Design of universal primers for euasterid I species (UPA) in a COSII group. (A) Multiple alignment of euasterid I species and the corresponding Arabidopsis ortholog. Intron positions of euasterid I species were predicted on the basis of that of the Arabidopsis ortholog. UPAs were designed in conserved portions of exons. (B) iUPAs amplify mostly intronic sequences including <400 bp of the flanking exons, while eUPAs amplify at least 400-bp exonic sequences with or without the intervening intron(s).
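The distinction between the two primer classes in panel B can be written as a small rule. The sketch below is one possible reading of the caption, not the authors' procedure: an amplicon with at least 400 bp of exonic sequence is treated as an eUPA product, and an amplicon with less than 400 bp of exon that is mostly intronic (here assumed to mean more intronic than exonic bases) is treated as an iUPA product. The function name and the "unclassified" fallback are hypothetical.

```python
def classify_upa(exonic_bp: int, intronic_bp: int) -> str:
    """Classify a predicted amplicon by its exonic and intronic content (in bp)."""
    if exonic_bp >= 400:
        # At least 400 bp of exon, with or without intervening introns.
        return "eUPA"
    if intronic_bp > exonic_bp:
        # Less than 400 bp of flanking exon and mostly intronic sequence.
        return "iUPA"
    # Assumption: amplicons that meet neither description are left unclassified.
    return "unclassified"


print(classify_upa(450, 0))
print(classify_upa(120, 600))
```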
Figure 5.—
Use of universal primers (UPAs) to amplify orthologous counterparts from genomic DNA of different solanaceous species and coffee. (A) Amplification by iUPAs for C2_At1g13380. (B) Part of the sequence alignment of amplified sequences by iUPAs for C2_At1g13380. Asterisks indicate identical sites in the multiple sequence alignment.
Figure 6.—
Distribution of COSII genes (Arabidopsis orthologs) across the Arabidopsis genome. Each Arabidopsis chromosome was divided into sequential bins of 100 genes each (including all the predicted nuclear genes in TAIR, Table 1). The number of observed COSII genes per bin was then plotted for all five Arabidopsis chromosomes.
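The binning described in this caption can be sketched as follows, assuming a list of gene identifiers for one chromosome in positional order and a set of COSII ortholog identifiers. The function name and the toy identifiers are hypothetical, and the handling of a short final bin is an assumption.

```python
from typing import Iterable, List, Sequence


def cosii_counts_per_bin(ordered_genes: Sequence[str],
                         cosii_genes: Iterable[str],
                         bin_size: int = 100) -> List[int]:
    """Count COSII genes in sequential bins of `bin_size` genes along a chromosome."""
    cosii = set(cosii_genes)
    return [
        sum(gene in cosii for gene in ordered_genes[start:start + bin_size])
        for start in range(0, len(ordered_genes), bin_size)
    ]


# Toy chromosome with 250 invented gene identifiers; the last bin holds only 50 genes.
chromosome = [f"g{i}" for i in range(250)]
counts = cosii_counts_per_bin(chromosome, {"g5", "g50", "g150", "g249"})
print(counts)
```

Plotting these per-bin counts for each chromosome would give the kind of distribution the figure describes.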
Figure 7.—
(A) Predicted outcomes when mapping COSII orthologs between a species (Solanaceae, tomato) with a hypothetical whole-genome duplication (polyploidy) at the base of its lineage compared with a related taxon (Rubiaceae, coffee) that did not experience such an event. The lineage with a polyploidization event would have all chromosomes (and genes) duplicated, followed by selective gene loss. (B) When a whole-genome duplication event occurred only in the Solanaceae lineage, mapping of COSII genes between coffee and tomato led to a “network of synteny” as described in Ku et al. (2000). Note that no such “network of synteny” has been observed in mapping tomato and coffee with a set of >150 COSII genes—casting doubt on the polyploidy event in the Solanaceae lineage proposed by Blanc and Wolfe (2004). (C) Comparative mapping in coffee of COSII markers located on the short arm of tomato chromosome 7 (top) and long arm of tomato chromosome 7 (bottom). Note the one-to-one relationship between the coffee–tomato syntenous regions, which would be predicted if no polyploidy event had occurred in either the Solanaceae or the Rubiaceae lineage either after or just prior to divergence from their last common ancestor.
Figure 8.—
Use of COSII universal primers (UPAs) for phylogenetics. (A) Phylogenetic relationships among euasterid I species from different genera and families using eUPAs that amplify orthologous exons. The tree is based on concatenated exonic sequences corresponding to 10 COSII genes totaling 3750 bp. (B) Phylogenetic relationships among the species most closely related to the cultivated tomato based on concatenated exonic sequences of 2316 bp from 7 COSII genes. (C) Phylogenetic relationships for the same species, but based on concatenated intronic sequences of 2403 bp from 5 COSII genes.

References

    1. Arabidopsis Genome Initiative, 2000. Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408: 796–815.
    2. Berardini, T. Z., S. Mundodi, L. Reiser, E. Huala, M. Garcia-Hernandez et al., 2004. Functional annotation of the Arabidopsis genome using controlled vocabularies. Plant Physiol. 135: 745–755.
    3. Blanc, G., and K. H. Wolfe, 2004. Widespread paleopolyploidy in model plant species inferred from age distributions of duplicate genes. Plant Cell 16: 1667–1678.
    4. Blanc, G., A. Barakat, R. Guyot, R. Cooke and M. Delseny, 2000. Extensive duplication and reshuffling in the Arabidopsis genome. Plant Cell 12: 1093–1101.
    5. Blanchette, M., E. D. Green, W. Miller and D. Haussler, 2004. Reconstructing large regions of an ancestral mammalian genome in silico. Genome Res. 14: 2412–2423.
