BMC Genomics. 2015 Dec 12;16:1057.
doi: 10.1186/s12864-015-2277-7.

Combined de novo and genome guided assembly and annotation of the Pinus patula juvenile shoot transcriptome

Erik A Visser et al. BMC Genomics. 2015.

Abstract

Background: Pines are the most important tree species to the international forestry industry, covering 42 % of the global industrial forest plantation area. One of the most pressing threats to cultivation of some pine species is the pitch canker fungus, Fusarium circinatum, which can have devastating effects in both the field and nursery. Investigation of the Pinus-F. circinatum host-pathogen interaction is crucial for development of effective disease management strategies. As with many non-model organisms, investigation of host-pathogen interactions in pine species is hampered by limited genomic resources. This was partially alleviated through release of the 22 Gbp Pinus taeda v1.01 genome sequence (http://pinegenome.org/pinerefseq/) in 2014. Although the fragmented state of the genome may hamper comprehensive transcriptome analysis, the inherent redundancy of deep RNA sequencing with Illumina short reads can be leveraged to assemble transcripts in the absence of a completed reference sequence. These data can then be integrated with available genomic data to produce a comprehensive transcriptome resource. The aim of this study was to provide a foundation for gene expression analysis of disease response mechanisms in Pinus patula through transcriptome assembly.

Results: Eighteen de novo and two reference-based assemblies were produced for P. patula shoot tissue. Three transcriptome assemblers, Trinity, Velvet/Oases and SOAPdenovo-Trans, were used to maximise diversity and completeness of assembled transcripts. Redundancy in the assembly was reduced using the EvidentialGene pipeline. The resulting 52 Mb P. patula v1.0 shoot transcriptome consists of 52 112 unigenes, 60 % of which could be functionally annotated.

Conclusions: The assembled transcriptome will serve as a major genomic resource for future investigation of P. patula and represents the largest gene catalogue produced to date for this species. Furthermore, this assembly can help detect gene-based genetic markers for P. patula and the comparative assembly workflow could be applied to generate similar resources for other non-model species.

Figures

Fig. 1
Summarised assembly statistics for all preliminary assemblies. Pipt = Pinus patula. (a) Assembly size and length statistics. (b) Transcript N statistics and GC ratio for all assemblies. In each case the right-hand y-axis applies only to the dashed line. The first three Trinity assemblies were de novo assemblies using Dataset 1 with (50 k) and without (75 k) CuffFly, and using Dataset 2 (df). The last two Trinity assemblies are reference-guided assemblies using Dataset 1 (gg) and Dataset 2 (dfgg). For Velvet/Oases and SOAPdenovo-Trans, the numbers indicate the k-mer value used. ORF = open reading frame.
Fig. 2
Assembly statistics for the tr2aacds pipeline merged assembly compared with average assembly statistics for each assembler. Assembly size and length statistics. The dashed y-axis only applies to the dashed line. Unfiltered output assemblies from Trinity, Velvet/Oases and SOAPdenovo-Trans were used.
Fig. 3
Unique orthologous protein groups identified through Tribe-MCL analysis. Left: Comparison of protein family counts for all identified orthologous protein groups between five different plant classifications. Right: Comparison of conifer-specific protein counts between four conifer species. Dicots = Arabidopsis thaliana, Glycine max, Populus trichocarpa, Ricinus communis, Theobroma cacao, Vitis vinifera. Mosses = Selaginella moellendorffii, Physcomitrella patens. Monocots = Oryza sativa, Zea mays. Gymnosperms = Picea abies, Picea sitchensis, Pinus patula, Pinus taeda. Basal = Amborella trichopoda.
Fig. 4
Number of proteins per species for the eight most populated NB-ARC motif-containing gene families identified. Gene families were identified using Tribe-MCL. Left: NB-ARC families most prominent in conifers. Right: NB-ARC families most prominent in angiosperms. Each color represents a different gene family (Additional file 2: Table S2). Family 2 (green bars) for P. taeda and P. patula had 852 and 1 794 members, respectively.
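
The transcript N statistics and GC ratio named in the Fig. 1 caption are standard assembly summary metrics. As an illustration only, the sketch below shows one common way such values may be computed from a set of assembled sequences. The function names nx and gc_ratio and the toy sequences are hypothetical; this is not the authors' code, and the tools or definitions used in the study may differ.

```python
from typing import Iterable, List


def nx(lengths: Iterable[int], x: int = 50) -> int:
    """Return the Nx length (N50 when x=50).

    Nx is taken here as the length L of the shortest sequence in the
    smallest set of longest sequences that together cover at least
    x percent of all assembled bases.
    """
    ordered: List[int] = sorted(lengths, reverse=True)
    total = sum(ordered)
    if total == 0:
        return 0
    running = 0
    for length in ordered:
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= total * x:
            return length
    return 0


def gc_ratio(sequences: Iterable[str]) -> float:
    """Return the fraction of G and C bases across all sequences."""
    gc = 0
    total = 0
    for seq in sequences:
        upper = seq.upper()
        gc += upper.count("G") + upper.count("C")
        total += len(upper)
    return gc / total if total else 0.0


if __name__ == "__main__":
    # Toy transcripts, invented for illustration (not data from the study).
    transcripts = ["GGCC", "AATT", "GCGCAT", "ATATATGC"]
    lengths = [len(t) for t in transcripts]
    print("N50:", nx(lengths, 50))
    print("N90:", nx(lengths, 90))
    print("GC ratio:", gc_ratio(transcripts))
```

Sorting lengths in descending order and accumulating until the chosen percentage of total bases is reached means the same function yields N50, N90 or any other Nx value by changing x.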
