BMC Genomics. 2010 Jul 7;11:420.
doi: 10.1186/1471-2164-11-420.

The Pinus taeda genome is characterized by diverse and highly diverged repetitive sequences

Allen Kovach et al. BMC Genomics. 2010.

Abstract

Background: In today's age of genomic discovery, no attempt has been made to comprehensively sequence a gymnosperm genome. The largest genus in the coniferous family Pinaceae is Pinus, whose 110-120 species have extremely large genomes (c. 20-40 Gb, 2N = 24). The size and complexity of these genomes have prompted much speculation as to the feasibility of completing a conifer genome sequence. Conifer genomes are reputed to be highly repetitive, but there is little information available on the nature and identity of repetitive units in gymnosperms. The pines have extensive genetic resources, with approximately 329,000 ESTs from eleven species and genetic maps in eight species, including a dense genetic map of the twelve linkage groups in Pinus taeda.

Results: We present here the Sanger sequence and annotation of ten P. taeda BAC clones and Genome Analyzer II whole genome shotgun (WGS) sequences representing 7.5% of the genome. Computational annotation of ten BACs predicts three putative protein-coding genes and at least fifteen likely pseudogenes in nearly one megabase of sequence. We found three conifer-specific LTR retroelements in the BACs, and tentatively identified at least 15 others based on evidence from the distantly related angiosperms. Alignment of WGS sequences to the BACs indicates that 80% of BAC sequences have similar copies (≥ 75% nucleotide identity) elsewhere in the genome, but only 23% have identical copies (99% identity). The three most common repetitive elements in the genome were identified and, when combined, represent less than 5% of the genome.
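
The coverage figures above (share of BAC sequence with similar or identical copies in the WGS reads) can be illustrated with a minimal sketch. It is not the authors' pipeline: the alignment record layout, the function name `covered_fraction`, and the toy data are all assumptions made for illustration. The idea is to keep only alignments at or above an identity threshold, merge overlapping intervals, and divide the covered bases by the BAC length.

```python
from typing import Iterable, Tuple

# Hypothetical alignment record: (start, end, percent_identity),
# with 0-based, half-open coordinates on the BAC.
Alignment = Tuple[int, int, float]


def covered_fraction(alignments: Iterable[Alignment],
                     bac_length: int,
                     min_identity: float) -> float:
    """Fraction of BAC bases covered by at least one alignment
    whose identity is at or above min_identity."""
    if bac_length <= 0:
        raise ValueError("bac_length must be positive")

    # Keep qualifying, non-empty intervals and sort them by start.
    intervals = sorted(
        (start, end)
        for start, end, identity in alignments
        if identity >= min_identity and end > start
    )

    covered = 0
    cur_start = cur_end = None
    for start, end in intervals:
        if cur_end is None or start > cur_end:
            # Close the previous merged interval, if any, and open a new one.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent: extend the current merged interval.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start

    return covered / bac_length


# Toy data (invented, not from the study): four read alignments on a 100 bp BAC.
toy_alignments = [(0, 40, 80.0), (30, 60, 76.5), (50, 70, 99.2), (90, 100, 60.0)]
similar = covered_fraction(toy_alignments, 100, 75.0)    # threshold for "similar" copies
identical = covered_fraction(toy_alignments, 100, 99.0)  # threshold for "identical" copies
```

With real data, the two thresholds would correspond to the 75% and 99% identity cutoffs reported in the abstract; a large gap between the two fractions is what signals repeats that are abundant but highly diverged.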

Conclusions: This study indicates that the majority of repeats in the P. taeda genome are 'novel' and will therefore require additional BAC or genomic sequencing for accurate characterization. The pine genome contains a very large number of diverged and probably defunct repetitive elements. This study also provides new evidence that sequencing a pine genome using a WGS approach is a feasible goal.

Figures

Figure 1
Pinus taeda BAC12 (clone Pt314B2) illustrates several new trends found in the pine genome. (A) The length of BAC12 is shown along the horizontal axis. Shown above the axis are tracks of annotated genes (dicot and monocot parameters), similarity hits to Repbase [RM (blastx); DNA transposons; Non-LTR retroelements; ERV (endogenous retroviruses); LTR retroelements, copia; LTR retroelements, gypsy], and other elements identified in this study (simple repeats, tandem repeats, ORF elements, pairs of direct repeats, and regions of similarity among BACs). The bottom two tracks indicate WGS coverage at ≥ 75% identity and at ≥ 99% identity. (B) Genes were annotated with both dicot and monocot parameters. The annotations generally differed in gene structure. (C) Coverage is similar between the two coverage tracks for active and relatively abundant retroelements in the pine genome, such as this nested PtIFG7.
Figure 2
Comparison of repeat content among twelve sequenced genomes and Pinus taeda.
