Genetics. 2016 Dec;204(4):1613-1626.
doi: 10.1534/genetics.116.193227. Epub 2016 Oct 28.

Sequence of the Sugar Pine Megagenome

Kristian A Stevens et al. Genetics. 2016 Dec.

Abstract

Until very recently, complete characterization of the megagenomes of conifers remained elusive. Sugar pine (Pinus lambertiana Dougl.) has a highly repetitive, diploid genome of 31 billion bp. It is the largest genome sequenced and assembled to date, and the first from the subgenus Strobus, or white pines, a group that is notable for having the largest genomes among the pines. The genome represents a unique opportunity to investigate genome "obesity" in conifers and white pines. Comparative analysis of P. lambertiana and P. taeda L. reveals new insights on the conservation, age, and diversity of the highly abundant transposable elements, the primary factor determining genome size. As with most North American white pines, the principal pathogen of P. lambertiana is white pine blister rust (Cronartium ribicola J.C. Fischer ex Raben.). Identification of candidate genes for resistance to this pathogen is of great ecological importance. The genome sequence afforded us the opportunity to make substantial progress on locating the major dominant gene for simple resistance hypersensitive response, Cr1. We describe new markers and gene annotation that are both tightly linked to Cr1 in a mapping population and associated with Cr1 in unrelated sugar pine individuals sampled throughout the species' range, creating a solid foundation for future mapping. The genomic variation and annotated candidate genes characterized in our study of the Cr1 region are resources for future marker-assisted breeding efforts as well as for investigations of fundamental mechanisms of invasive disease and evolutionary response.

Keywords: conifer genome; transposable elements; white pine blister rust.

Figures

Figure 1
(A) The phylogeny of major genera within the Pinaceae along with genome size estimates. P. lambertiana falls in the Strobus subgenus. Inference was conducted using Bayesian analysis as implemented in BEAST ver. 2.2.0 (Bouckaert et al. 2014). Gray bars represent the 95% highest posterior density range for the age of the node. Data used for inference were 28 independent nuclear gene regions (see Eckert et al. 2013a,b), sequenced and assembled for representative taxa selected within each taxonomic group [Pinus subg. Pinus: P. taeda; Pinus subg. Strobus: P. lambertiana; Picea: P. abies; Pseudotsuga: P. menziesii (Mirb.) Franco; Larix: L. decidua Mill.; Abies: A. alba Mill.]. Details are presented in the Supplementary Methods in File S1. (B) Illustration of the genome size trends of major genera within Pinaceae. Genome sizes are from the c-values database (Bennett and Leitch 2012). Diamonds mark the estimates of genomes with a reference sequence. Point estimates in each category are shown as short horizontal lines. Species from other genera within the Pinaceae are shown in gray.
Figure 2
Overview of sequencing and assembly strategy for P. lambertiana. Woodcut images used with permission from “The trees of Yosemite; a popular account.” Library of Congress call number QK484.C2 T7 1932.
Figure 3
Comparison of repetitive content between transposable element repeat families in P. lambertiana (top) and P. taeda (bottom).
Figure 4
Annotated scaffolds and elements linked to Cr1. On the left is a tentative map of the Cr1 region of chromosome 11 showing the positions of identified markers. The gene order shown was derived from Harkins et al. (1998) (BC_432_1110 labeled cr1lB, Cr1, OPG_16_950 labeled cr1lA) and Jermstad et al. (2011) (Cr1, scarOPG_16_950, UMN_3258 labeled cr1lC). To the right are five scaffolds and 14 gene annotations that are linked to the Cr1 gene. The evidence of expression of PILA_lg017786 was a single transcript (red bar) assembled from a library constructed from a resistant tree (Supplementary Methods in File S1). Scaffold super6135 is physically linked to scaffold 370413 that harbors cr1lB by two fosmid DiTags.
Figure 5
Multiple alignment of association samples showing the most variable sites, those with 40% or more differences from the consensus. The numbered five-site Cr1-linked motif is seen as two haplotypes, the Cr1r-linked GCGGC and the Cr1R-linked TTACT. One haplotype (SP-K-0142-U.2), transmitted from a Cr1r/Cr1r parent, was genotyped as a putative Cr1R-linked “TTACT” recombinant.

References

    1. Abrusán G., Grundmann N., DeMester L., Makalowski W., 2009. TEclass—a tool for automated classification of unknown eukaryotic transposable elements. Bioinformatics 25(10): 1329–1330.
    2. Ahuja M. R., Neale D. B., 2005. Evolution of genome size in conifers. Silvae Genet. 54(3): 126–137.
    3. American Forests, 2015. This Is It! The Quest for a New Champion Sugar Pine. Available at: http://www.americanforests.org/blog/quest-for-a-new-champion-sugar-pine/.
    4. Bao Z., Eddy S. R., 2002. Automated de novo identification of repeat sequence families in sequenced genomes. Genome Res. 12(8): 1269–1276.
    5. Bennett M. D., Leitch I. J., 2012. Plant DNA C-values database, release 6.0, Dec. 2012. Available at: http://data.kew.org/cvalues/.