Genome Res. 2007 Jul;17(7):1072-81.
doi: 10.1101/gr.6214107. Epub 2007 Jun 7.

Genome-wide comparative analysis of copia retrotransposons in Triticeae, rice, and Arabidopsis reveals conserved ancient evolutionary lineages and distinct dynamics of individual copia families

Thomas Wicker et al. Genome Res. 2007 Jul.

Abstract

Although copia retrotransposons are major components of all plant genomes, the evolutionary relationships between individual copia families and between elements from different plant species are only poorly studied. We used 20 copia families from the large-genome plants barley and wheat to identify 46 families of homologous copia elements from rice and 22 from Arabidopsis, two plant species with much smaller genomes. In total, 599 copia elements were analyzed. Phylogenetic analysis showed that copia elements from the four species can be classified into six ancient lineages that existed before the divergence of monocots and dicots. The six lineages show a surprising degree of conservation in sequence organization and other characteristics across species. Additionally, the phylogenetic data suggest at least one case of horizontal gene transfer between the Arabidopsis and rice lineages. Insertion time estimates for 522 high-copy elements showed that retrotransposons from rice were active at different times in waves of activity lasting 0.5-2 million years, depending on the family, whereas elements from wheat and barley had longer periods of activity. We estimated that half of the rice copia elements are truncated or otherwise rearranged after approximately 790,000 yr, which is almost twice the half-life of Arabidopsis elements. In contrast, wheat and barley copia elements appear to have a massively longer half-life, beyond our ability to estimate from the available data. These findings suggest that genome size can be explained by the specific rate of DNA removal from the genome and the length of active periods of retrotransposon families.

Figures

Figure 1.
Phylogenetic tree of 52 copia families from Triticeae, rice, and Arabidopsis. Names of copia families from Arabidopsis are printed in red, those from rice in blue, and those from Triticeae in green. Subfamilies are indicated by a capital letter at the end of the name. Black uppercase letters refer to the type of primer binding site (PBS) and polypurine tract (PPT) detailed in Figure 2. Asterisks indicate the presence of additional, closely related families that had been omitted due to space constraints. Copy numbers in these cases refer to the total of all represented families (see Supplemental Fig. 1). Bootstrap numbers at the forks indicate how many times the sequences to the right of the fork occurred in the same group of 100 trees. Strong bootstrap values of at least 80 are shown in black. The copy numbers and sequence organization of all families are displayed next to the respective names. Major evolutionary lineages are indicated by curly brackets. A reverse transcriptase sequence from yeast (ScRT) served as outgroup.
Figure 2.
Comparison of primer binding sites (PBS) and polypurine tracts (PPT) of the six evolutionary copia lineages. The tree diagram at the left illustrates the relationship between the six evolutionary lineages as determined through the phylogenetic analysis shown in Figure 1. The terminal two bases at the 5′ end of PBS and at the 3′ end of PPT belong to the 5′ and 3′ LTRs, respectively. Shown are consensus sequences for five main lineages as well as the Ale lineage, which is split into three sublineages due to strong divergence within the Ale lineage. Each PBS/PPT pair is given a “type” index that corresponds to the one in Figure 1. The full sequence alignments are available as Supplemental Figure 2.
Figure 3.
Deletion derivatives in copia elements of the Maximus lineage. Gray areas represent regions conserved between elements with the degree of DNA sequence identity indicated. The low degree of sequence conservation (69%) between the rice Osr9 elements and the Triticeae Barbara elements shows that the loss of the ORF2 region in each family was the result of an independent deletion event.
Figure 4.
Estimated insertion times of high-copy copia families from Triticeae and rice. The families studied are listed in the leftmost column. Numbers of elements analyzed for each family are given in the second column. Individual insertion events are indicated as vertical red lines. The x-axis scale indicates the DNA sequence identity between LTRs of a particular element (top) and the estimated insertion time derived from it using a basic substitution rate of 1.3 × 10⁻⁸ per site per year (bottom). Asterisks indicate whether the distribution is significantly different from a uniform distribution (P < 0.05), before (green) and after (blue) Bonferroni correction.
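
The conversion from LTR identity to insertion time described in the Figure 4 caption can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the commonly used relation T = K / (2r), where K is the uncorrected fraction of differing sites between the two LTRs and r is the substitution rate quoted in the caption. The function name and the absence of any multiple-hit correction are assumptions made for the example.

```python
# Substitution rate quoted in the Figure 4 caption (per site per year).
SUBSTITUTION_RATE = 1.3e-8


def insertion_time_years(ltr_identity, rate=SUBSTITUTION_RATE):
    """Estimate the insertion age of an element from its LTR identity.

    ltr_identity is the fraction of identical sites between the 5' and
    3' LTRs (for example 0.98 for 98% identity).

    Assumption: T = K / (2 * r), where K = 1 - identity is used without
    any correction for multiple substitutions at the same site.
    """
    if not 0.0 < ltr_identity <= 1.0:
        raise ValueError("ltr_identity must be in the interval (0, 1]")
    if rate <= 0.0:
        raise ValueError("rate must be positive")
    divergence = 1.0 - ltr_identity
    # Both LTRs are identical at insertion and then diverge independently,
    # so the observed divergence accumulates at twice the per-site rate.
    return divergence / (2.0 * rate)


if __name__ == "__main__":
    for identity in (1.0, 0.99, 0.974):
        age = insertion_time_years(identity)
        print(f"identity {identity:.3f}: about {age:,.0f} years")
```

Under these assumptions, an element whose LTRs are 97.4% identical would be placed at roughly one million years, which is within the range of the activity waves reported for rice in the abstract.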
Figure 5.
Distribution of copia element insertion times. Estimated insertion times were divided into bins of 100,000 yr. (A) The distribution of rice copia elements can be described as a hyperbolic function based on the assumption that retrotransposon sequences are at least partially removed from the genome at a constant rate. The curve represents the best fit for the starting value (43.6 elements) and the half-life (796,000 yr). (B) Insertion distribution of 87 copia elements from Triticeae (wheat and barley). The distribution is not similar to a hyperbolic distribution.
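
The fitted curve in panel A can be illustrated with a short sketch. It assumes that removal at a constant rate translates into a half-life decay of the form N(t) = N0 × 0.5^(t / half-life), using the starting value and half-life quoted in the caption. The exact functional form used by the authors is not given here, so the formula, the function names, and the choice to evaluate the curve at bin midpoints are all assumptions.

```python
def expected_intact_elements(age_years, start_value=43.6, half_life=796_000.0):
    """Expected number of intact elements per bin at a given insertion age.

    Assumption: constant-rate removal is modeled as half-life decay,
    N(t) = start_value * 0.5 ** (t / half_life). The defaults are the
    best-fit values quoted in the Figure 5 caption.
    """
    if age_years < 0:
        raise ValueError("age_years must be non-negative")
    return start_value * 0.5 ** (age_years / half_life)


def expected_by_bin(n_bins, bin_width=100_000):
    """Evaluate the curve at the midpoint of each 100,000 yr bin.

    Using the midpoint is a choice made for this sketch, not something
    stated in the caption.
    """
    return [
        expected_intact_elements((i + 0.5) * bin_width)
        for i in range(n_bins)
    ]


if __name__ == "__main__":
    for i, count in enumerate(expected_by_bin(10)):
        start = i * 100_000
        print(f"{start:>9,} to {start + 100_000:>9,} yr: {count:5.1f}")
```

With these parameters the expected count falls to half the starting value (21.8) at an age of 796,000 yr, matching the half-life reported for rice elements.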
