Rapid hybrid de novo assembly of a microbial genome using only short reads: Corynebacterium pseudotuberculosis I19 as a case study
- PMID: 21620904
- DOI: 10.1016/j.mimet.2011.05.008
Abstract
Due to the advent of the so-called Next-Generation Sequencing (NGS) technologies, the amount of monetary and temporal resources required for whole-genome sequencing has been reduced by several orders of magnitude. Sequence reads can be assembled either by anchoring them directly onto an available reference genome (classical reference assembly) or by concatenating them by overlap (de novo assembly). The latter strategy is preferable because it tends to maintain the architecture of the genome sequence. However, depending on the NGS platform used, the shortness of the reads causes tremendous problems in the subsequent genome assembly phase, impeding closure of the entire genome sequence. To address the problem, we developed a multi-pronged hybrid de novo strategy combining De Bruijn graph and Overlap-Layout-Consensus methods, which was used to assemble from short reads the entire genome of Corynebacterium pseudotuberculosis strain I19, a bacterium of immense importance in veterinary medicine that causes Caseous Lymphadenitis in ruminants, principally ovines and caprines. Briefly, contigs were assembled de novo from the short reads, and a reference genome was used only to orient them by anchoring. Remaining gaps were closed using iterative anchoring of short reads by craning to gap flanks. Finally, we compare the genome sequence assembled using our hybrid strategy to a classical reference assembly using the same data as input and show that, even when a reference genome is available, it pays off to use the hybrid de novo strategy rather than a classical reference assembly, because more genome sequences are preserved using the former.
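The gap-closing step can be illustrated with a toy example. The abstract does not specify the algorithm, so the following is only a minimal sketch under stated assumptions: it greedily extends the left flank of a gap with the read that has the longest suffix-prefix overlap, and stops once the growing contig overlaps the right flank. The function names, the `min_overlap` and `max_rounds` parameters, and the toy sequence are hypothetical and are not taken from the paper. Real data would also need handling of sequencing errors, reverse complements, and repeats.

```python
def suffix_prefix_overlap(left, right, min_overlap):
    """Length of the longest suffix of `left` that equals a prefix of `right`.

    Returns 0 when no overlap of at least `min_overlap` bases exists.
    """
    for n in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:n]):
            return n
    return 0


def close_gap(left_flank, right_flank, reads, min_overlap=4, max_rounds=100):
    """Greedy, illustrative gap closure (not the authors' implementation).

    Repeatedly anchors a read to the end of the growing contig and appends
    the read's overhanging bases, until the contig overlaps the right flank.
    Returns the joined sequence, or None if the gap could not be closed.
    """
    contig = left_flank
    for _ in range(max_rounds):
        # Stop as soon as the contig end reaches into the right flank.
        n = suffix_prefix_overlap(contig, right_flank, min_overlap)
        if n:
            return contig + right_flank[n:]

        # Pick the read with the longest overlap that still adds new bases.
        best_read, best_n = None, 0
        for read in reads:
            n = suffix_prefix_overlap(contig, read, min_overlap)
            if best_n < n < len(read):
                best_read, best_n = read, n

        if best_read is None:
            return None  # no read anchors to the current flank
        contig += best_read[best_n:]
    return None


# Toy input (hypothetical): a 36-base sequence with a 12-base gap in the middle
# and error-free 8-base reads sampled every 3 bases across the gap region.
genome = "ATGGCGTACCTTAGACGGATTCAAGCTGTCCGAATC"
reads = [genome[i:i + 8] for i in range(6, 24, 3)]
closed = close_gap(genome[:12], genome[24:], reads)
print(closed == genome)
```

Choosing the longest overlap at each round is a deliberately conservative choice for this sketch: it adds the fewest new bases per step, which reduces the chance of jumping across a short repeat.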
Copyright © 2011 Elsevier B.V. All rights reserved.