Genome Biol. 2009;10(9):R94.
doi: 10.1186/gb-2009-10-9-r94. Epub 2009 Sep 11.

De novo genome sequence assembly of a filamentous fungus using Sanger, 454 and Illumina sequence data

Scott Diguistini et al. Genome Biol. 2009.

Abstract

Sequencing-by-synthesis technologies can reduce the cost of generating de novo genome assemblies. We report a method for assembling draft genome sequences of eukaryotic organisms that integrates sequence information from different sources, and demonstrate its effectiveness by assembling an approximately 32.5 Mb draft genome sequence for the forest pathogen Grosmannia clavigera, an ascomycete fungus. We also developed a method for assessing draft assemblies using Illumina paired end read data and demonstrate how we are using it to guide future sequence finishing. Our results demonstrate that eukaryotic genome sequences can be accurately assembled by combining Illumina, 454 and Sanger sequence data.

Figures

Figure 1
Assembly process overview. Overview of the process for producing de novo assemblies.
Figure 2
Consensus sequence quality. The proportion of 454 read data within the total read collection affected the number of small insertions and deletions (indels) based on analysis of 7,169 unique EST-to-genome alignments. The relative proportions of insertions (blue) and deletions (orange) in the assembly sequence are shown in the inset pie chart. Assemblies are described in Tables 1 and 2; those including 454 read data were assembled with Forge; the Illumina-only assembly was generated with Velvet.
Figure 3
Comparison of Forge Sanger/454/Illumina assemblies against GCgb1. Alignments of scaffolds greater than 100 kb from (a) 'Sanger/454/IlluminaDA' (approximately 24 Mb on 80 scaffolds) and (b) 'Sanger/454/IlluminaPA' (approximately 28.7 Mb on 46 scaffolds), plotted on the y-axis against the manually finished genome sequence (GCgb1) on the x-axis.
Figure 4
Assessing the discovery of unique read information between the Illumina and 454 platforms. (a) Raw reads were processed into overlapping 28-bp k-mers, and any k-mer that varied from all other k-mers by at least 1 bp was accepted as new sequence information. The analysis was done separately for unique k-mers and those that occurred at least twice (2× k-mers). (b) MAQ was then used to map these k-mers to the reference genome sequence and the rate at which new coverage was generated was plotted against the number of k-mers examined.
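
The k-mer step described in the Figure 4 caption can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names (`kmers`, `count_kmers`, `split_by_multiplicity`) are hypothetical, and it assumes that "unique k-mers" means the set of distinct k-mers and that "2× k-mers" means the subset seen at least twice. The mapping step with MAQ is not reproduced here.

```python
from collections import Counter


def kmers(read, k=28):
    """Yield every overlapping k-mer of length k in a read."""
    for i in range(len(read) - k + 1):
        yield read[i:i + k]


def count_kmers(reads, k=28):
    """Count how often each k-mer occurs across all reads."""
    counts = Counter()
    for read in reads:
        counts.update(kmers(read.upper(), k))
    return counts


def split_by_multiplicity(counts, min_count=2):
    """Return (distinct k-mers, k-mers seen at least min_count times).

    Assumption: 'distinct' stands in for the caption's unique k-mers and
    'repeated' for its 2x k-mers.
    """
    distinct = set(counts)
    repeated = {kmer for kmer, n in counts.items() if n >= min_count}
    return distinct, repeated


if __name__ == "__main__":
    # Toy reads and a small k so the result is easy to inspect by hand;
    # the caption uses k = 28 on raw Illumina and 454 reads.
    toy_reads = ["ACGTAC", "CGTACG"]
    counts = count_kmers(toy_reads, k=4)
    distinct, repeated = split_by_multiplicity(counts)
    print(len(distinct), sorted(repeated))
```

Restricting the analysis to k-mers seen at least twice is a common way to reduce the influence of sequencing errors, since an erroneous base usually produces k-mers that occur only once.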
