Genome Res. 2006 Feb;16(2):260-70.
doi: 10.1101/gr.4361206. Epub 2005 Dec 14.

Identification of transposable elements using multiple alignments of related genomes

Anat Caspi et al. Genome Res. 2006 Feb.

Abstract

Accurate genome-wide cataloging of transposable elements (TEs) will facilitate our understanding of mobile DNA evolution, expose the genomic effects of TEs on the host genome, and improve the quality of assembled genomes. Using the availability of several nearly complete Drosophila genomes and developments in whole genome alignment methods, we introduce a large-scale comparative method for identifying repetitive mobile DNA regions. These regions are highly enriched for transposable elements. Our method has two main features distinguishing it from other repeat-finding methods. First, rather than relying on sequence similarity to determine the location of repeats, the genomic artifacts of the transposition mechanism itself are systematically tracked in the context of multiple alignments. Second, we can derive bounds on the age of each repeat instance based on the phylogenetic species tree. We report results obtained using both complete and draft sequences of four closely related Drosophila genomes and validate our results with manually curated TE annotations in the Drosophila melanogaster euchromatin. We show the utility of our findings in exploring both transposable elements and their host genomes: In the study of TEs, we offer predictions for novel families, annotate new insertions of known families, and show data that support the hypothesis that all known TE families in D. melanogaster were recently active; in the study of the host, we show how our findings can be used to determine shifts in the eu-heterochromatin junction in the pericentric chromosome regions.
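The first feature described in the abstract, tracking the alignment artifacts of transposition rather than sequence similarity, can be illustrated with a toy example. The sketch below is not the authors' implementation. It assumes the alignment is given as equal-length strings with `-` for gaps, and it flags runs of columns where one target genome has sequence while every other genome is gapped. The function name `candidate_insertions`, the `min_len` threshold, and the toy sequences are all hypothetical.

```python
from typing import Dict, List, Tuple


def candidate_insertions(
    alignment: Dict[str, str], target: str, min_len: int = 3
) -> List[Tuple[int, int]]:
    """Return half-open column intervals [start, end) in which `target`
    has bases while every other aligned genome has a gap ('-')."""
    tgt = alignment[target]
    others = [seq for name, seq in alignment.items() if name != target]
    if any(len(seq) != len(tgt) for seq in others):
        raise ValueError("all alignment rows must have the same length")

    regions: List[Tuple[int, int]] = []
    start = None
    # Iterate one column past the end so a run touching the last column is closed.
    for col in range(len(tgt) + 1):
        inside = (
            col < len(tgt)
            and tgt[col] != "-"
            and all(seq[col] == "-" for seq in others)
        )
        if inside and start is None:
            start = col
        elif not inside and start is not None:
            if col - start >= min_len:
                regions.append((start, col))
            start = None
    return regions


# Toy alignment (made-up sequences, for illustration only).
toy = {
    "D. melanogaster": "ACGTTTTTACGA",
    "D. yakuba": "ACG-----ACGA",
    "D. pseudoobscura": "ACG-----ACTA",
    "D. virilis": "ATG-----ACGA",
}
print(candidate_insertions(toy, "D. melanogaster"))
```

A real pipeline would also have to tolerate alignment noise, partial gaps, and draft-quality sequence, none of which this sketch attempts.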

Figures

Figure 1. A replication event with the resulting insertion region and juxtaposed gaps in the multiple sequence alignment of the panortholog subsequences.
Figure 2. An alignment of four Drosophila sequences shown in the K-browser. The species are labeled D. melanogaster, D. yakuba, D. pseudo (D. pseudoobscura), and D. virilis. Each genome track has a conservation score track (pink) and (c) a gap track (gaps are demarcated in gray). The gaps in three genomes support the correct boundaries of the D. melanogaster insertion regions (a1, a2, and b). The insertion regions match the TE annotations (blue) in the BDGP noncoding gene track (d). The tree on the left-hand side depicts the phylogenetic relationships between the species. The diamond shows the branch on which the transposon replications (a1) and (a2) occurred. The “ancient” branch is that on which replication (b) occurred, as indicated by the gaps in D. pseudoobscura and D. virilis but not in D. yakuba.
Figure 3. Distribution of TEs, true positives, false negatives, and new findings along chromosome arms.
Figure 4. Proximal region of chromosome 3L, contrasting distribution of “recent” and “ancient” findings.
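
The age reasoning in the Figure 2 caption can be sketched as a small tree lookup: the genomes that carry an insertion (no gap) should form a clade, and the insertion is then placed on the branch leading to that clade. This is a minimal illustration, not the paper's method. It assumes the species tree is written as nested tuples with the topology (((D. melanogaster, D. yakuba), D. pseudoobscura), D. virilis), and the names `leaves` and `insertion_branch` are hypothetical.

```python
from typing import FrozenSet, Iterable, Optional, Union

Tree = Union[str, tuple]

# Assumed species tree for the four genomes, as nested tuples.
SPECIES_TREE: Tree = ((("mel", "yak"), "pse"), "vir")


def leaves(tree: Tree) -> FrozenSet[str]:
    """Return the set of leaf names under a (sub)tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def insertion_branch(tree: Tree, carriers: Iterable[str]) -> Optional[FrozenSet[str]]:
    """Return the leaf set of the clade whose stem branch hosts the insertion,
    or None if the carriers do not form a clade (no single-event explanation)."""
    carriers = frozenset(carriers)
    node = tree
    # Descend while some child subtree still contains every carrier.
    while not isinstance(node, str):
        child = next((c for c in node if carriers <= leaves(c)), None)
        if child is None:
            break
        node = child
    clade = leaves(node)
    return clade if clade == carriers else None


# Insertions a1/a2 in Figure 2: gaps in the three other genomes.
print(insertion_branch(SPECIES_TREE, {"mel"}))
# Insertion b: gaps in D. pseudoobscura and D. virilis but not in D. yakuba.
print(insertion_branch(SPECIES_TREE, {"mel", "yak"}))
```

Under these assumptions, a carrier set of only D. melanogaster corresponds to the terminal branch, while a carrier set of D. melanogaster plus D. yakuba corresponds to the older branch ancestral to both, which the caption calls “ancient”.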

