PLoS One. 2011 May 4;6(5):e19379.
doi: 10.1371/journal.pone.0019379.

A robust, simple genotyping-by-sequencing (GBS) approach for high diversity species

Robert J Elshire et al. PLoS One.

Abstract

Advances in next-generation technologies have driven the costs of DNA sequencing down to the point that genotyping-by-sequencing (GBS) is now feasible for high-diversity, large-genome species. Here, we report a procedure for constructing GBS libraries based on reducing genome complexity with restriction enzymes (REs). This approach is simple, quick, extremely specific, highly reproducible, and may reach important regions of the genome that are inaccessible to sequence capture approaches. By using methylation-sensitive REs, repetitive regions of genomes can be avoided and lower copy regions targeted with two- to three-fold higher efficiency. This tremendously simplifies computationally challenging alignment problems in species with high levels of genetic diversity. The GBS procedure is demonstrated with maize (IBM) and barley (Oregon Wolfe Barley) recombinant inbred populations where roughly 200,000 and 25,000 sequence tags were mapped, respectively. An advantage in species like barley that lack a complete genome sequence is that a reference map need only be developed around the restriction sites, and this can be done in the process of sample genotyping. In such cases, the consensus of the read clusters across the sequence tagged sites becomes the reference. Alternatively, for kinship analyses in the absence of a reference genome, the sequence tags can simply be treated as dominant markers. Future application of GBS to breeding, conservation, and global species and population surveys may allow plant breeders to conduct genomic selection on a novel germplasm or species without first having to develop any prior molecular tools, or conservation biologists to determine population structure without prior knowledge of the genome or diversity in the species.
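
As an illustration of treating sequence tags as dominant markers, the sketch below scores each tag as present or absent per sample and compares two samples by the fraction of tags they share. The Jaccard index used here is an assumption chosen for simplicity, not the similarity measure of the paper, and the sample names and tag labels are hypothetical toy data.

```python
def tag_presence(samples):
    """Turn per-sample tag lists into presence/absence profiles (sets)."""
    return {name: set(tags) for name, tags in samples.items()}


def jaccard(a, b):
    """Shared tags divided by all tags seen in either sample."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical toy data: tag labels stand in for real sequence tags.
samples = {
    "s1": ["TAG_A", "TAG_B", "TAG_C"],
    "s2": ["TAG_A", "TAG_B", "TAG_D"],
    "s3": ["TAG_E"],
}

profiles = tag_presence(samples)
sim_12 = jaccard(profiles["s1"], profiles["s2"])  # 2 shared tags out of 4 distinct
sim_13 = jaccard(profiles["s1"], profiles["s3"])  # no shared tags
```

No reference genome is needed for this comparison, since only the presence or absence of each tag is used.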

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. GBS adapters, PCR and sequencing primers.
(a) Sequences of double-stranded barcode and common adapters. Adapters are shown ligated to ApeKI-cut genomic DNA. Positions of the barcode sequence and ApeKI overhangs are shown relative to the insert DNA; (b) Sequences of PCR primer 1 and paired end sequencing primer 1 (PE-1). Binding sites for flowcell oligonucleotide 1 and barcode adapter are indicated; (c) Sequences of PCR primer 2 and paired end sequencing primer 2 (PE-2). Binding sites for flowcell oligonucleotide 2 and common adapter are indicated.
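
Because the barcode sits directly upstream of the ApeKI overhang in the ligated product, reads can in principle be assigned to samples by their leading bases. The following is a minimal sketch under that assumption: it expects each read to start with a barcode followed by the cut-site remnant CWGC (CAGC or CTGC). The barcode sequences, sample names, and the `demultiplex` helper are hypothetical, and no sequencing-error tolerance is modeled.

```python
# Remnant of the ApeKI site (G^CWGC) expected right after the barcode.
OVERHANG_REMNANTS = ("CAGC", "CTGC")


def demultiplex(read, barcodes):
    """Return (sample, insert_sequence) or None if the read cannot be assigned.

    `barcodes` maps barcode sequence -> sample name (hypothetical values).
    Longer barcodes are tried first so that a short barcode which is a
    prefix of a longer one does not match by accident.
    """
    for bc in sorted(barcodes, key=len, reverse=True):
        if read.startswith(bc):
            rest = read[len(bc):]
            if rest.startswith(OVERHANG_REMNANTS):
                return barcodes[bc], rest
    return None


barcodes = {"ACGT": "sample_1", "TTGCA": "sample_2"}  # toy barcodes

hit = demultiplex("ACGTCAGCTTTT", barcodes)
miss = demultiplex("ACGTGGGG", barcodes)  # barcode present, remnant absent
```

Requiring the remnant after the barcode is a cheap sanity check that the read really begins at a ligated cut site.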
Figure 2. Steps in GBS library construction.
Note: Up to 96 DNA samples can be processed simultaneously. (1) DNA samples, barcode, and common adapter pairs are plated and dried; (2–3) samples are then digested with ApeKI and adapters are ligated to the ends of genomic DNA fragments; (4) T4 ligase is inactivated by heating and an aliquot of each sample is pooled and applied to a size exclusion column to remove unreacted adapters; (5) appropriate primers with binding sites on the ligated adapters are added and PCR is performed to increase the fragment pool; (6–7) PCR products are cleaned up and fragment sizes of the resulting library are checked on a DNA analyzer (BioRad Experion® or similar instrument). Libraries without adapter dimers are retained for DNA sequencing.
Figure 3. Fragment size distributions of a virtual ApeKI digest of the maize genome and unique (single-copy) ApeKI sequence tags from the maize IBM mapping population.
Note that for size bins on the x-axis “50” denotes a bin of size 1–50 bp, “100” denotes a bin of size 51–100 bp, etc. The reference genome employed for the maize virtual digest was B73 RefGen v1.
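
The binning convention above can be illustrated with a small virtual digest. This is a sketch, not the authors' pipeline: it assumes ApeKI recognizes GCWGC (W = A or T) and cuts after the first G, it works on a single strand of a made-up toy sequence, and it also counts the two terminal fragments. The function names are hypothetical.

```python
import re
from collections import Counter

# Lookahead so that overlapping sites (e.g. GCAGCAGC) are all found.
APEKI_SITE = re.compile(r"(?=GC[AT]GC)")


def virtual_digest(seq):
    """Return fragment lengths after cutting at every G^CWGC site."""
    seq = seq.upper()
    cuts = [m.start() + 1 for m in APEKI_SITE.finditer(seq)]
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


def bin_label(length, width=50):
    """Label a length by its bin's upper edge: 1-50 -> 50, 51-100 -> 100."""
    return ((length + width - 1) // width) * width


def size_histogram(lengths, width=50):
    """Count fragments per size bin."""
    return Counter(bin_label(n, width) for n in lengths)


# Toy sequence with two ApeKI sites (GCAGC and GCTGC).
toy = "A" * 10 + "GCAGC" + "T" * 60 + "GCTGC" + "A" * 5
fragments = virtual_digest(toy)
histogram = size_histogram(fragments)
```

Labeling each bin by its upper edge matches the x-axis convention described in the caption.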
Figure 4. Coefficient of variation of GBS reads per sequencing channel for sequential sequencing runs.
Each flow cell comprised 6 or 7 sequencing channels. Large boxes represent the standard deviation of the number of reads per sample; whiskers denote minimum and maximum values; small squares are the median values; and lines extending across the boxes are the means for each run. Flow cells are ordered sequentially by run date; number 1 is the first sequencing run and number 11 denotes the last run. The GBS read data from the maize IBM population are contained in flow cell 1. The large variation in reads per sample from this flow cell was due to inconsistent pipetting during robotic liquid handling. Subsequent adjustments to our robotic protocols improved evenness among samples (see flow cells 2–11).
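
For reference, the coefficient of variation is the standard deviation of the read counts divided by their mean. The sketch below computes it for the samples in one channel. It assumes the population standard deviation, since the caption does not say which estimator was used, and the read counts are made-up numbers.

```python
import statistics


def coefficient_of_variation(reads_per_sample):
    """Standard deviation divided by mean for one channel's samples."""
    mean = statistics.mean(reads_per_sample)
    if mean == 0:
        raise ValueError("mean read count is zero; CV is undefined")
    # Assumption: population standard deviation (pstdev), not sample (stdev).
    return statistics.pstdev(reads_per_sample) / mean


even_channel = [100, 100, 100]   # hypothetical, perfectly even pooling
uneven_channel = [50, 150]       # hypothetical, uneven pooling

cv_even = coefficient_of_variation(even_channel)
cv_uneven = coefficient_of_variation(uneven_channel)
```

A lower value indicates more even read representation among the pooled samples.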
Figure 5. Distribution of reads across 43 barcoded samples in a single flow cell lane for the Oregon Wolfe Barley population.
Figure 6. Barley GBS marker validation using a single DH line (OWB003).
Upright triangles denote positions of markers on the reference genetic map and downward triangles indicate GBS reads mapped in this study. Multiple sequence reads are stacked and colors indicate chromosomal segments in OWB003 originating from dominant (blue) or recessive (red) parental lines.
