Effort required to finish shotgun-generated genome sequences differs significantly among vertebrates
- PMID: 20064230
- PMCID: PMC2827409
- DOI: 10.1186/1471-2164-11-21
Abstract
Background: The approaches for shotgun-based sequencing of vertebrate genomes are now well-established, and have resulted in the generation of numerous draft whole-genome sequence assemblies. In contrast, the process of refining those assemblies to improve contiguity and increase accuracy (known as 'sequence finishing') remains tedious, labor-intensive, and expensive. As a result, the vast majority of vertebrate genome sequences generated to date remain at a draft stage.
Results: To date, our genome sequencing efforts have focused on comparative studies of targeted genomic regions, requiring sequence finishing of large blocks of orthologous sequence (average size 0.5-2 Mb) from various subsets of 75 vertebrates. This experience has provided a unique opportunity to compare the relative effort required to finish shotgun-generated genome sequence assemblies from different species, which we report here. We found that the sequence assemblies generated for the same orthologous regions from various vertebrates vary substantially in their misassemblies and, in particular, in the frequency and characteristics of their sequence gaps. As a consequence, the work required to finish different species' sequences varied greatly. Applying the same standardized finishing methods provided a novel opportunity to "assay" characteristics of genome sequences among many vertebrate species. Many of the problems we encountered during sequence finishing reflect unique architectural features of a particular vertebrate's genome, which in some cases may have important functional and/or evolutionary implications. Finally, based on our analyses, we have improved our procedures to overcome some of these problems and to increase the overall efficiency of the sequence-finishing process, although significant challenges remain.
Conclusion: Our findings have important implications for the eventual finishing of the draft whole-genome sequences that have now been generated for a large number of vertebrates.