Genome Res. 2006 Jul;16(7):934-46.
doi: 10.1101/gr.4708406. Epub 2006 Jun 7.

Following tetraploidy in an Arabidopsis ancestor, genes were removed preferentially from one homeolog leaving clusters enriched in dose-sensitive genes

Brian C Thomas et al. Genome Res. 2006 Jul.

Abstract

Approximately 90% of Arabidopsis' unique gene content is found in syntenic blocks that were formed during the most recent whole-genome duplication. Within these blocks, 28.6% of the genes have a retained pair; the remaining genes have been lost from one of the homeologs. We create a minimized genome by condensing local duplications to one gene, removing transposons, and including only genes within blocks defined by retained pairs. We use a moving average of retained and non-retained genes to find clusters of retention and then identify the types of genes that appear in clusters at frequencies above expectations. Significant clusters of retention exist for almost all chromosomal segments. Detailed alignments show that, for 85% of the genome, one homeolog was preferentially (1.6x) targeted for fractionation. This homeolog fractionation bias suggests an epigenetic mechanism. We find that islands of retention contain "connected genes," those genes predicted, by the gene balance hypothesis, to be resistant to removal because the products they encode interact with other products in a dose-sensitive manner, creating a web of dependency. Gene families that are overrepresented in clusters include those encoding components of the proteasome/protein modification complexes, signal transduction machinery, ribosomes, and transcription factor complexes. Gene pair fractionation following polyploidy or segmental duplication leaves a genome enriched for "connected" genes. These clusters of duplicate genes may help explain the evolutionary origin of coregulated chromosomal regions and new functional modules.
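The moving-average step mentioned in the abstract can be illustrated with a minimal sketch. It assumes each gene along a homeolog is encoded as 1 (retained) or 0 (non-retained); the function name, the encoding, and the example data are hypothetical and are not taken from the paper's Methods.

```python
def moving_average_retention(retained, window=10):
    """Return the retention frequency for every full window of genes.

    retained: sequence of 0/1 flags in chromosomal gene order
              (1 = retained pair, 0 = non-retained). Hypothetical encoding.
    window:   number of consecutive genes per window.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if len(retained) < window:
        return []
    # Running sum over the first window, then slide one gene at a time.
    total = sum(retained[:window])
    frequencies = [total / window]
    for i in range(window, len(retained)):
        total += retained[i] - retained[i - window]
        frequencies.append(total / window)
    return frequencies


# Illustrative flags only, not data from the paper.
example = [1, 1, 0, 0, 1, 1, 1, 0]
print(moving_average_retention(example, window=4))
```

A value of 1.0 in the output means every gene in that window was retained; runs of high values are candidate clusters of retention.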

Figures

Figure 1.
Partial screenshot of a “cluster” in our Viewer aligning 42–43 kb of the α-syntenous region of chromosomes 2 and 4 anchored on a serine o-acetyltransferase gene pair (bold arrowhead). The colored rectangles are bl2seq HSPs (high scoring pairs) found using standard settings and e-value cutoffs (Inada et al. 2003) noted in the settings box. Black lines connect known α-pairs of genes. The red line connects genes into a pair whose subject genes (exons) were not called by TIGR, and is now called an “Our Additional” (_oa) gene in Supplemental material 1, Column A. The turquoise line connects two groups of syntenous HSPs that required further research to explain; these were eventually called “conserved non-coding sequences” belonging to the gene pair to the left.
Figure 2.
Moving average gene retention frequency (y-axis) in an 80-gene window for each of the five Arabidopsis chromosomes. Chromosomes are represented by all genes encoding protein (Supplemental material 1, Column A), including genes duplicated locally and genes within transposons. The gray bands cover centromeric regions delineated by the most proximal genes with a known mutant phenotype (Meinke et al. 2003).
Figure 3.
Three representative homeolog alignments showing different levels of fractionation bias. (A23) Typical α-region showing significant fractionation bias. (A14) Very significant fractionation bias. (A13) Insignificant fractionation bias. Each diagram is color-coded: retained genes are blue vertical lines, non-retained genes are gray vertical lines, and gaps are white space. The green-red bar above each block denotes the strand of the BLAST HSP, +/+ (green) and +/− (red) using the convention that the lower chromosome number of the pair is defined as intact, with the homeolog inverted to reconstitute synteny. The overfractionated homeolog has fewer genes than the underfractionated homeolog (Table 1), as expected and noted to the right of each alignment. There are no gaps >20 bp in these alignments. Gaps (white space) indicate the disparity between the numbers of non-retained genes on homeolog pairs. Ovals enclose particularly obvious clusters of retained genes that are much closer together than they were in the ancestors. The thin lines crossing over A13 illustrate how homeologous recombination could generate this segmentally scrambled alignment from two precursors displaying fractionation bias.
Figure 4.
Moving average cluster plots for both α11 homeologous chromosomal segments using a 10-gene window. (A) Homeolog 11a. (B) Homeolog 11b. The y-axis is retention frequency; 1.0 means that all 10 genes were retained in that window. The x-axis is the sequence of genes in the Minimized α-region homeolog, as explained in Methods. In α11a, for example, there are 67 retained and 92 non-retained genes (Table 1, Row 11, columns “Genes >95%” and “a”).
Figure 5.
GO categories evaluated for overabundance in clusters of retained genes as compared to retention expectations of the GO category treated independently. GO terms with fewer than six genes in retained clusters were omitted. Linear regression analysis of the scatterplot of all GO data plotted: the x-axis is the number of genes and the y-axis is the number of these genes positioned within retained cluster space. Using column headings from Table 1: X is the “Retained Frequency” and Y is the “Genes >95%, In Retained Clusters.” Points (individual GO terms) above the upper 95% confidence interval line are those terms found significantly more often than expected in clusters of retained genes.
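As a rough illustration of the fractionation bias discussed in the abstract and in Figure 3, the sketch below compares gene-loss counts on two homeologs. The two-sided exact binomial test against an unbiased 50/50 expectation is an assumption made here for illustration; the record does not state which statistic the authors used, and the function name and counts are hypothetical.

```python
from math import comb


def fractionation_bias(lost_a, lost_b):
    """Return (bias ratio, two-sided exact binomial p-value).

    lost_a, lost_b: numbers of genes lost from homeologs a and b
                    (hypothetical inputs). The null hypothesis assumes
                    each loss is equally likely on either homeolog.
    """
    n = lost_a + lost_b
    if n == 0:
        raise ValueError("at least one gene loss is required")
    hi, lo = max(lost_a, lost_b), min(lost_a, lost_b)
    ratio = hi / lo if lo else float("inf")
    # Under p = 0.5 the distribution is symmetric, so the two-sided
    # p-value is twice the upper tail, capped at 1.
    upper_tail = sum(comb(n, k) for k in range(hi, n + 1)) / 2 ** n
    p_value = min(1.0, 2 * upper_tail)
    return ratio, p_value


# Illustrative counts only, chosen to give a 1.6x ratio.
ratio, p = fractionation_bias(160, 100)
print(f"bias = {ratio:.1f}x, p = {p:.4g}")
```

A ratio near 1 with a large p-value would correspond to an alignment with insignificant bias, while a large ratio with a small p-value would correspond to strong preferential loss from one homeolog.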
