2014 Jan 2:15:1.
doi: 10.1186/1471-2164-15-1.

A genome-wide association study of seed protein and oil content in soybean

Eun-Young Hwang et al. BMC Genomics.

Abstract

Background: Association analysis is an alternative to conventional family-based methods for detecting the location of gene(s) or quantitative trait loci (QTL), and it provides relatively high resolution in defining the genome position of a gene or QTL. Seed protein and oil concentration are quantitative traits determined by the interaction among many genes with small to moderate genetic effects and by their interaction with the environment. In this study, a genome-wide association study (GWAS) was performed to identify QTL controlling seed protein and oil concentration in 298 soybean germplasm accessions exhibiting a wide range of seed protein and oil content.

Results: A total of 55,159 single nucleotide polymorphisms (SNPs) were genotyped using various methods, including Illumina Infinium and GoldenGate assays, and 31,954 markers with minor allele frequency >0.10 were used to estimate linkage disequilibrium (LD) in heterochromatic and euchromatic regions. In euchromatic regions, the mean LD (r²) rapidly declined to 0.2 within 360 Kbp, whereas in heterochromatic regions the mean LD declined to 0.2 at 9,600 Kbp. The GWAS results identified 40 SNPs in 17 different genomic regions significantly associated with seed protein. Of these, the five SNPs with the highest associations and seven adjacent SNPs were located in the 27.6-30.0 Mbp region of Gm20. A major seed protein QTL has been previously mapped to the same location and potential candidate genes have recently been identified in this region. The GWAS results also detected 25 SNPs in 13 different genomic regions associated with seed oil. Of these markers, seven SNPs had a significant association with both protein and oil.
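As an illustration of the two quantities named above, the following minimal sketch computes a minor allele frequency filter and the pairwise LD statistic r². It is not the authors' pipeline: it assumes each accession is an inbred line whose genotype at a biallelic SNP can be coded 0 or 1, and the function names and toy genotypes are hypothetical.

```python
from typing import Sequence


def minor_allele_frequency(alleles: Sequence[int]) -> float:
    """MAF of a biallelic SNP coded 0/1, one value per inbred line (assumption)."""
    freq = sum(alleles) / len(alleles)
    return min(freq, 1.0 - freq)


def r_squared(a: Sequence[int], b: Sequence[int]) -> float:
    """LD r^2 between two biallelic loci scored on the same lines.

    r^2 = D^2 / (pA(1-pA) pB(1-pB)), where D = pAB - pA*pB.
    """
    n = len(a)
    p_a = sum(a) / n
    p_b = sum(b) / n
    p_ab = sum(1 for x, y in zip(a, b) if x == 1 and y == 1) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    return d * d / denom if denom > 0 else 0.0


# Toy genotypes for ten hypothetical lines (illustrative values only).
snps = {
    "snp1": [0, 0, 0, 0, 0, 1, 1, 1, 1, 1],
    "snp2": [0, 0, 0, 0, 1, 1, 1, 1, 1, 1],
    "snp3": [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
}

# Keep only markers with MAF > 0.10, the cutoff stated in the abstract.
kept = [name for name, g in snps.items() if minor_allele_frequency(g) > 0.10]
ld = r_squared(snps["snp1"], snps["snp2"])
```

With these toy values, `snp3` is dropped by the filter because its rare allele appears in only one of the ten lines.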

Conclusions: This research indicated that GWAS not only identified most of the previously reported QTL controlling seed protein and oil, but also resulted in narrower genomic regions than the regions reported as containing these QTL. The narrower GWAS-defined genome regions will allow more precise marker-assisted allele selection and will expedite positional cloning of the causal gene(s).

Figures

Figure 1
The mean level of LD in heterochromatic and euchromatic chromosome regions. The mean LD was estimated using all pairs of loci located within 20 Mbp of physical distance. The X-axis indicates the distance between marker pairs and the Y-axis indicates LD level. The green and purple lines respectively denote mean D’ and mean r² in euchromatic regions, and the blue and red lines respectively denote mean D’ and mean r² in heterochromatic regions.
Figure 2
Seed protein and oil concentration in the GRIN database vs. that determined in this study. Seed protein (A) and seed oil (B) concentrations of the soybean germplasm accessions, respectively, reported in the GRIN database vs. the percentage determined in this study from seed harvested from two-replicate trials conducted at two locations (Beltsville, MD; and Lincoln, NE) in 2003. Blue bars are data from the GRIN database and red bars are data from this study.
Figure 3
GWAS for seed protein and oil concentration. Manhattan plots depicting the extent of the association of 31,954 SNPs, dispersed as shown over the 20 soybean chromosomes, with (A) mean seed protein content and (B) mean seed oil content of the soybean accessions, respectively. The -log P value is a measure of the degree to which a SNP is associated with the trait. SNP spikes topped by a red asterisk (*) denote a genomic region aligning with the location of a previously reported QTL, whereas those topped with the letter N denote that the region may harbor a heretofore unreported QTL. The vertical yellow bars spanning both graphs denote specific markers or a few markers in close proximity that exhibit significant association with both seed protein and oil content.
Figure 4
The candidate region of the major seed protein QTL on Gm20. Plots depicting an 8.4 Mbp region of Gm20 that is suspected of harboring a candidate gene responsible for a major pleiotropic soybean seed protein/oil QTL. Panel A) depicts the 12 potential candidate genes (Glyma names) and the physical positions of the genes indicated with green vertical lines as reported by Bolon et al. [36] in the 8.4 Mbp region that, based upon the GWAS findings in this report, can be narrowed to a 2.4 Mbp region that harbors only six (yellow-highlighted) of the 12 candidate genes. The physical positions of the 12 SNPs associated with protein content are indicated with blue vertical lines. Panel B) depicts the extent of LD in this region based on r² (in black) and Panel C) depicts LD based on the D’ (in red). Note: The original 12 gene models whose positions are shown in the 8.4 Mbp region in the top panel were differentially expressed in a pair of high vs. low seed protein soybean near-isogenic lines; see Bolon et al. [36] for additional details.
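The LD decay curves described for Figure 1 can be approximated by averaging pairwise r² within distance bins and reporting the first bin whose mean falls to the 0.2 level. The sketch below is one simple way to do this, not the procedure used in the study; the bin width, function names, and the (distance, r²) pairs are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple


def mean_r2_by_bin(pairs: Iterable[Tuple[float, float]], bin_kbp: int) -> Dict[int, float]:
    """Average r^2 of marker pairs grouped by inter-marker distance (Kbp).

    Returns a mapping from each bin's starting distance to its mean r^2.
    """
    sums: Dict[int, float] = defaultdict(float)
    counts: Dict[int, int] = defaultdict(int)
    for distance_kbp, r2 in pairs:
        b = int(distance_kbp // bin_kbp)
        sums[b] += r2
        counts[b] += 1
    return {b * bin_kbp: sums[b] / counts[b] for b in sorted(sums)}


def decay_distance(binned: Dict[int, float], threshold: float = 0.2) -> Optional[int]:
    """Start of the first distance bin whose mean r^2 is at or below threshold."""
    for start in sorted(binned):
        if binned[start] <= threshold:
            return start
    return None


# Made-up (distance in Kbp, r^2) pairs, for illustration only.
toy_pairs = [
    (50, 0.9), (80, 0.7),
    (150, 0.5), (180, 0.3),
    (250, 0.22), (290, 0.12),
    (350, 0.1),
]

binned = mean_r2_by_bin(toy_pairs, bin_kbp=100)
distance = decay_distance(binned)
```

Running the same functions separately on euchromatic and heterochromatic marker pairs would give one decay distance per region type, which is the comparison the figure reports.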

References

    1. Knowler WC, Williams RC, Pettitt DJ, Steinberg AG. Gm3;5,13,14 and type 2 diabetes mellitus: an association in American Indians with genetic admixture. Am J Hum Genet. 1988;43(4):520–526.
    2. Thornsberry JM, Goodman MM, Doebley J, Kresovich S, Nielsen D, Buckler ES IV. Dwarf8 polymorphisms associate with variation in flowering time. Nat Genet. 2001;28(3):286–289. doi: 10.1038/90135.
    3. Palaisa KA, Morgante M, Williams M, Rafalski A. Contrasting effects of selection on sequence diversity and linkage disequilibrium at two phytoene synthase loci. Plant Cell. 2003;15(8):1795–1806. doi: 10.1105/tpc.012526.
    4. Wilson LM, Whitt SR, Ibanez AM, Rocheford TR, Goodman MM, Buckler ES IV. Dissection of maize kernel composition and starch production by candidate gene association. Plant Cell. 2004;16(10):2719–2733. doi: 10.1105/tpc.104.025700.
    5. Urbany C, Stich B, Schmidt L, Simon L, Berding H, Junghans H, Niehoff KH, Braun A, Tacke E, Hofferbert HR, et al. Association genetics in Solanum tuberosum provides new insights into potato tuber bruising and enzymatic tissue discoloration. BMC Genomics. 2011;12:7–20. doi: 10.1186/1471-2164-12-7.
