De novo assembly and validation of planaria transcriptome by massive parallel sequencing and shotgun proteomics
- PMID: 21536722
- PMCID: PMC3129261
- DOI: 10.1101/gr.113779.110
Abstract
Freshwater planaria are a very attractive model system for stem cell biology, tissue homeostasis, and regeneration. The genome of the planarian Schmidtea mediterranea has recently been sequenced and is estimated to contain >20,000 protein-encoding genes. However, the characterization of its transcriptome is far from complete. Furthermore, not a single proteome of the entire phylum has been assayed on a genome-wide level. We devised an efficient sequencing strategy that allowed us to de novo assemble a major fraction of the S. mediterranea transcriptome. We then used independent assays and massive shotgun proteomics to validate the authenticity of transcripts. In total, our de novo assembly yielded 18,619 candidate transcripts with a mean length of 1118 nt after filtering. A total of 17,564 candidate transcripts could be mapped to 15,284 distinct loci on the current genome reference sequence. RACE confirmed complete or almost complete 5' and 3' ends for 22/24 transcripts. The frequencies of frame shifts, fusion, and fission events in the assembled transcripts were computationally estimated to be 4.2%-13%, 0%-3.7%, and 2.6%, respectively. Our shotgun proteomics produced 16,135 distinct peptides that validated 4200 transcripts (FDR ≤1%). The catalog of transcripts assembled in this study, together with the identified peptides, dramatically expands and refines planarian gene annotation, demonstrated by validation of several previously unknown transcripts with stem cell-dependent expression patterns. In addition, our robust transcriptome characterization pipeline could be applied to other organisms without genome assembly. All of our data, including homology annotation, are freely available at SmedGD, the S. mediterranea genome database.
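The counts quoted in the abstract can be turned into a few summary ratios. The sketch below is only illustrative arithmetic on those published numbers. The variable names and the choice of ratios are assumptions made here, not part of the authors' pipeline.

```python
# Counts quoted in the abstract (names are illustrative, not the authors').
ASSEMBLED = 18_619          # candidate transcripts after filtering
MAPPED = 17_564             # transcripts mapped to the genome reference
LOCI = 15_284               # distinct loci hit by the mapped transcripts
RACE_CONFIRMED = 22         # transcripts with (almost) complete ends by RACE
RACE_TESTED = 24            # transcripts tested by RACE
PEPTIDES = 16_135           # distinct peptides from shotgun proteomics
PEPTIDE_VALIDATED = 4_200   # transcripts validated by peptides (FDR <= 1%)


def summarize():
    """Return simple ratios derived from the abstract's counts."""
    return {
        # Share of assembled transcripts that map to the genome reference.
        "mapped_fraction": MAPPED / ASSEMBLED,
        # Average number of mapped transcripts per distinct locus.
        "transcripts_per_locus": MAPPED / LOCI,
        # Share of RACE-tested transcripts with confirmed ends.
        "race_confirmed_fraction": RACE_CONFIRMED / RACE_TESTED,
        # Share of assembled transcripts with peptide evidence.
        "peptide_validated_fraction": PEPTIDE_VALIDATED / ASSEMBLED,
        # Average number of distinct peptides per validated transcript.
        "peptides_per_validated_transcript": PEPTIDES / PEPTIDE_VALIDATED,
    }


summary = summarize()
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

These ratios are descriptive only. In particular, the peptide-validated fraction reflects what shotgun proteomics detected, so it is a lower bound on the share of authentic transcripts rather than an estimate of it.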
Similar articles
- Smed454 dataset: unravelling the transcriptome of Schmidtea mediterranea. BMC Genomics. 2010 Dec 31;11:731. doi: 10.1186/1471-2164-11-731. PMID: 21194483. Free PMC article.
- SmedGD 2.0: The Schmidtea mediterranea genome database. Genesis. 2015 Aug;53(8):535-46. doi: 10.1002/dvg.22872. Epub 2015 Jul 17. PMID: 26138588. Free PMC article.
- Comparative transcriptomic analyses and single-cell RNA sequencing of the freshwater planarian Schmidtea mediterranea identify major cell types and pathway conservation. Genome Biol. 2018 Aug 24;19(1):124. doi: 10.1186/s13059-018-1498-x. PMID: 30143032. Free PMC article.
- Basal bodies across eukaryotes series: basal bodies in the freshwater planarian Schmidtea mediterranea. Cilia. 2016 Mar 19;5:15. doi: 10.1186/s13630-016-0037-1. eCollection 2016. PMID: 26998257. Free PMC article. Review.
- Single-cell transcriptomics in planaria: new tools allow new insights into cellular and evolutionary features. Biochem Soc Trans. 2022 Oct 31;50(5):1237-1246. doi: 10.1042/BST20210825. PMID: 36281987. Free PMC article. Review.
Cited by
- Towards a bioinformatics of patterning: a computational approach to understanding regulative morphogenesis. Biol Open. 2013 Feb 15;2(2):156-69. doi: 10.1242/bio.20123400. Epub 2012 Nov 26. PMID: 23429669. Free PMC article.
- Transcriptome analysis reveals strain-specific and conserved stemness genes in Schmidtea mediterranea. PLoS One. 2012;7(4):e34447. doi: 10.1371/journal.pone.0034447. Epub 2012 Apr 4. PMID: 22496805. Free PMC article.
- Preparation of the planarian Schmidtea mediterranea for high-resolution histology and transmission electron microscopy. Nat Protoc. 2014 Mar;9(3):661-73. doi: 10.1038/nprot.2014.041. Epub 2014 Feb 20. PMID: 24556788. Free PMC article.
- Transcriptome analysis elucidates key developmental components of bryozoan lophophore development. Sci Rep. 2014 Oct 10;4:6534. doi: 10.1038/srep06534. PMID: 25300304. Free PMC article.
- A de novo assembly of the newt transcriptome combined with proteomic validation identifies new protein families expressed during tissue regeneration. Genome Biol. 2013 Feb 20;14(2):R16. doi: 10.1186/gb-2013-14-2-r16. PMID: 23425577. Free PMC article.