Types and frequencies of sequencing errors in methyl-filtered and high c0t maize genome survey sequences
- PMID: 15299135
- PMCID: PMC520775
- DOI: 10.1104/pp.104.041640
Abstract
The Maize Genome Sequencing Consortium has deposited into GenBank more than 850,000 maize (Zea mays) genome survey sequences (GSSs) generated via two gene enrichment strategies, methylation filtration and high-C₀t (HC) fractionation. These GSSs are a valuable resource for generating genome assemblies and for discovering single nucleotide polymorphisms and nearly identical paralogs. Based on the rate of mismatches between 183 GSSs (105 methylation filtration + 78 HC) and 10 control genes, the rate of sequencing errors in these GSSs is 2.3 × 10⁻³. As expected, many of these errors were derived from insufficient vector trimming and base-calling errors. Surprisingly, however, some errors were due to cloning artifacts. These G·C to A·T transitions are restricted to HC clones; over 40% of HC clones contain at least one such artifact. Because it is not possible to distinguish the cloning artifacts from biologically relevant polymorphisms, HC sequences should be used with caution for the discovery of single nucleotide polymorphisms or paramorphisms. The average rate of sequencing errors was reduced 6-fold (to 3.6 × 10⁻⁴) by applying more stringent trimming parameters. This trimming resulted in the loss of only 11% of the bases (15,469/144,968). Due to redundancy among GSSs, this more stringent trimming reduced coverage of promoters, exons, and introns by only 0%, 1%, and 4%, respectively. Hence, at the cost of a very modest loss of gene coverage, the quality of these maize GSSs can approach Bermuda standards, even prior to assembly.
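The arithmetic behind the quoted figures can be checked with a minimal Python sketch. The four input numbers are taken from the abstract; the function names are hypothetical, and the Phred-scale conversion (Q = -10 log10 of the error probability) is an illustrative addition for readers used to quality scores, not something the abstract itself reports.

```python
import math

# Figures quoted in the abstract.
rate_before = 2.3e-3    # sequencing errors per base, original trimming
rate_after = 3.6e-4     # sequencing errors per base, stringent trimming
bases_lost = 15_469     # bases removed by the stringent trimming
bases_total = 144_968   # denominator quoted alongside the 11% loss


def fold_reduction(before: float, after: float) -> float:
    """Ratio of the error rate before trimming to the rate after."""
    return before / after


def fraction_lost(lost: int, total: int) -> float:
    """Fraction of bases discarded by the trimming."""
    return lost / total


def phred(error_rate: float) -> float:
    """Phred-scaled quality for a per-base error probability.

    Illustrative conversion only; not a value reported in the abstract.
    """
    return -10 * math.log10(error_rate)


fold = fold_reduction(rate_before, rate_after)
lost = fraction_lost(bases_lost, bases_total)

print(f"fold reduction in error rate: {fold:.1f}")
print(f"fraction of bases lost: {lost:.1%}")
print(f"approx. Phred before/after: {phred(rate_before):.1f} / {phred(rate_after):.1f}")
```

The unrounded values come out at roughly 6.4-fold and 10.7%, consistent with the rounded "6-fold" and "11%" given in the abstract.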
Similar articles
- Utility of different gene enrichment approaches toward identifying and sequencing the maize gene space. Plant Physiol. 2004 Oct;136(2):3023-33. doi: 10.1104/pp.104.043323. Epub 2004 Aug 6. PMID: 15299128. Free PMC article.
- Enrichment of gene-coding sequences in maize by genome filtration. Science. 2003 Dec 19;302(5653):2118-20. doi: 10.1126/science.1090047. PMID: 14684821.
- Polymorphism of PCR-based markers targeting exons, introns, promoter regions, and SSRs in maize and introns and repeat sequences in oat. Genome. 2001 Dec;44(6):1065-76. PMID: 11768210.
- Maize DNA-sequencing strategies and genome organization. Genome Biol. 2004;5(5):223. doi: 10.1186/gb-2004-5-5-223. Epub 2004 Apr 16. PMID: 15128439. Free PMC article. Review.
- Sequencing the maize genome. Curr Opin Plant Biol. 2004 Apr;7(2):102-7. doi: 10.1016/j.pbi.2004.01.010. PMID: 15003207. Review.
Cited by
- Sugarcane genome sequencing by methylation filtration provides tools for genomic research in the genus Saccharum. Plant J. 2014 Jul;79(1):162-72. doi: 10.1111/tpj.12539. Epub 2014 Jun 17. PMID: 24773339. Free PMC article.
- SNP discovery via 454 transcriptome sequencing. Plant J. 2007 Sep;51(5):910-8. doi: 10.1111/j.1365-313X.2007.03193.x. Epub 2007 Jul 27. PMID: 17662031. Free PMC article.
- Sequencing and analysis of the gene-rich space of cowpea. BMC Genomics. 2008 Feb 27;9:103. doi: 10.1186/1471-2164-9-103. PMID: 18304330. Free PMC article.
- Quality assessment of maize assembled genomic islands (MAGIs) and large-scale experimental verification of predicted genes. Proc Natl Acad Sci U S A. 2005 Aug 23;102(34):12282-7. doi: 10.1073/pnas.0503394102. Epub 2005 Aug 15. PMID: 16103354. Free PMC article.
- Gene mapping via bulked segregant RNA-Seq (BSR-Seq). PLoS One. 2012;7(5):e36406. doi: 10.1371/journal.pone.0036406. Epub 2012 May 7. PMID: 22586469. Free PMC article.