Genomics Inform. 2012 Mar;10(1):1-8.
doi: 10.5808/GI.2012.10.1.1. Epub 2012 Mar 31.

Survey of the Applications of NGS to Whole-Genome Sequencing and Expression Profiling

Jong-Sung Lim et al. Genomics Inform. 2012 Mar.

Abstract

DNA sequence variation analysis and gene expression profiling are now widely used in genome biology and genetics. Their application to genome studies has advanced particularly with the introduction of the next-generation DNA sequencer (NGS) systems Roche/454 and Illumina/Solexa, along with bioinformatic analysis technologies for whole-genome de novo assembly, expression profiling, DNA variation discovery, and genotyping. Massive whole-genome shotgun paired-end and mate paired-end sequencing data are both important for constructing a de novo assembly of a novel genome. Effective de novo assembly requires DNA sequence information from multiplatform NGS, with at least 2× and 30× depth of genome coverage from Roche/454 and Illumina/Solexa, respectively. Massive short-read data from the Illumina/Solexa system alone are sufficient to discover DNA variation, which reduces the cost of DNA sequencing. Whole-genome expression profile data are useful for genome systems biology because they quantify the expressed RNAs of a whole-genome transcriptome in a given tissue sample. Hybrid mRNA sequences from Roche/454 and Illumina/Solexa are more powerful for finding novel genes through de novo assembly in any whole-genome sequenced species. Coverage of 20× and 50× of the estimated transcriptome using Roche/454 and Illumina/Solexa, respectively, is effective for creating novel expressed reference sequences. However, an average of only 30× transcriptome coverage with Illumina/Solexa short reads is enough to quantify expression against a reference expressed sequence tag sequence.

Keywords: NGS; de novo assembly; expression profiling; multiplatform; resequencing; whole genome.
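
The coverage figures quoted in the abstract can be turned into read counts with the standard relation depth = (number of reads × read length) / genome size. The sketch below is illustrative only: the genome size and the two read lengths are assumed values chosen for the example, not figures taken from the article.

```python
import math


def coverage_depth(num_reads, read_length, genome_size):
    """Average depth of coverage: total sequenced bases / genome size."""
    return num_reads * read_length / genome_size


def reads_needed(genome_size, read_length, target_depth):
    """Smallest number of reads that reaches the target average depth."""
    return math.ceil(target_depth * genome_size / read_length)


if __name__ == "__main__":
    genome_size = 1_000_000_000  # assumed 1 Gb genome (hypothetical)
    # Assumed read lengths (hypothetical): 400 bp long reads, 100 bp short reads.
    long_reads = reads_needed(genome_size, read_length=400, target_depth=2)
    short_reads = reads_needed(genome_size, read_length=100, target_depth=30)
    print("long reads for 2x depth:", long_reads)
    print("short reads for 30x depth:", short_reads)
```

The same two functions apply to a transcriptome if the estimated transcriptome size is passed in place of the genome size.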

Figures

Fig. 1
Stringency of whole-genome DNA shotgun sequencing for a novel genome. Whole-genome shotgun sequencing for contig construction (left of the figure): single-end or paired-end sequencing of fragments from a whole-genome DNA shotgun library, prepared with average fragment sizes of 200-800 bp for the Roche/454 or Illumina/Solexa systems. In general, the total amount of DNA sequence produced is 15-20× depth of genome coverage with Roche/454 or 60× with Illumina/Solexa. Whole-genome shotgun mate paired-end sequencing for scaffold construction (right of the figure): sequencing of mate paired-end fragments from a whole-genome DNA shotgun library, prepared with average fragment sizes of 2-40 kb for the next-generation DNA sequencer. A sequencing amount of more than 20× depth of genome coverage is effective for scaffold construction.
Fig. 2
Integrated pipeline for de novo assembly of novel genome sequencing data. The scheme consists of filtering the data to remove low-quality and short reads, building initial assemblies with various software and comparing the resulting contigs, constructing hybrid contigs with the MIRA assembler, and ordering contigs with SSPACE software for scaffold construction.
Fig. 3
View of single nucleotide polymorphism (SNP) discovery by mapping short reads from Illumina/Solexa to a reference sequence with MAQ software (A) and CLC software (B). (A) Short reads (35 bp per read) from the soybean genome mapped completely to the soybean reference sequence. The MAQ software provides a consensus sequence of the sequenced genotype from the short reads, with the raw reads aligned to the reference sequence. (B) CLC software is useful for counting reads with DNA variations at each position.
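
The idea of counting reads with DNA variations at each position can be sketched as a simple pileup. This is a toy illustration, not the algorithm used by MAQ or CLC: it assumes ungapped alignments given as (start, read) pairs, ignores base and mapping qualities, and the function names and the `min_depth` and `min_alt_fraction` thresholds are hypothetical.

```python
from collections import Counter


def pileup_counts(reference, alignments):
    """Count the bases observed at each reference position.

    alignments: iterable of (start, read) pairs, where start is the 0-based
    reference position of the first base of an ungapped read.
    """
    counts = [Counter() for _ in reference]
    for start, read in alignments:
        for offset, base in enumerate(read):
            pos = start + offset
            if 0 <= pos < len(reference):
                counts[pos][base] += 1
    return counts


def candidate_snps(reference, alignments, min_depth=3, min_alt_fraction=0.7):
    """Return (pos, ref_base, alt_base, alt_count, depth) for candidate SNPs."""
    snps = []
    for pos, counter in enumerate(pileup_counts(reference, alignments)):
        depth = sum(counter.values())
        if depth < min_depth:
            continue  # too few reads to judge this position
        alt_counts = {b: n for b, n in counter.items() if b != reference[pos]}
        if not alt_counts:
            continue  # every read agrees with the reference
        alt_base, alt_n = max(alt_counts.items(), key=lambda kv: kv[1])
        if alt_n / depth >= min_alt_fraction:
            snps.append((pos, reference[pos], alt_base, alt_n, depth))
    return snps


if __name__ == "__main__":
    reference = "ACGTACGTAC"
    alignments = [(0, "ACGTTC"), (2, "GTTCGT"), (3, "TTCGTA"), (4, "ACGTAC")]
    for snp in candidate_snps(reference, alignments):
        print(snp)
```

Production tools additionally weigh base quality, mapping quality, and indels, so the thresholds here are placeholders only.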
Fig. 4
A scheme of transcriptome expression analysis through massively parallel signature sequencing (MPSS) technology and bioinformatics: identification of expressed genes through hybrid de novo assembly of Roche/454 and Illumina/Solexa data (left), and expression-level profiling by mapping the Illumina/Solexa sequences to the expressed sequence tag reference.
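
Expression-level profiling by mapping reads to an expressed sequence tag reference ultimately reduces to counting reads per reference sequence and normalizing the counts. The sketch below uses RPKM (reads per kilobase per million mapped reads), a common normalization that the article does not name, so its use here is an assumption; the function name, reference identifiers, and counts are hypothetical.

```python
def rpkm(read_counts, lengths_bp):
    """Reads per kilobase of reference per million mapped reads.

    read_counts: dict mapping reference ID -> number of mapped reads
    lengths_bp:  dict mapping reference ID -> reference length in bp
    """
    total_mapped = sum(read_counts.values())
    if total_mapped == 0:
        return {ref_id: 0.0 for ref_id in read_counts}
    return {
        ref_id: count * 1e9 / (lengths_bp[ref_id] * total_mapped)
        for ref_id, count in read_counts.items()
    }


if __name__ == "__main__":
    # Hypothetical mapped-read counts and reference lengths.
    counts = {"est_0001": 100, "est_0002": 300}
    lengths = {"est_0001": 1000, "est_0002": 2000}
    for ref_id, value in rpkm(counts, lengths).items():
        print(ref_id, value)
```

Dividing by reference length keeps long transcripts from appearing more highly expressed simply because they collect more reads.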
