Theor Appl Genet. 2019 Apr;132(4):1179-1193.
doi: 10.1007/s00122-018-3271-7. Epub 2018 Dec 26.

Genetic diversity patterns and domestication origin of soybean

Soon-Chun Jeong et al. Theor Appl Genet. 2019 Apr.

Abstract

Genotyping data from a comprehensive Korean soybean collection, obtained using a large SNP array, were used to clarify the global distribution patterns of soybean and to address its evolutionary history. Understanding the diversity and evolution of a crop is an essential step in implementing a strategy to expand its germplasm base for crop improvement research. Accessions intensively collected from Korea, a small but central region in the geographic distribution of soybean, were genotyped to provide sufficient data to address population genetic questions. After removing natural hybrids and duplicated or redundant accessions, we obtained a non-redundant set comprising 1957 domesticated and 1079 wild accessions for population structure analyses. Our analysis demonstrates that, while wild soybean germplasm will require additional sampling from diverse indigenous areas to expand the germplasm base, the current domesticated soybean germplasm is saturated in terms of genetic diversity. We then showed that our genome-wide polymorphism map enabled us to detect genetic loci underlying flower color, seed-coat color, and domestication syndrome. A representative soybean set consisting of 194 accessions was divided into one domesticated subpopulation and four wild subpopulations that could be traced back to their geographic collection areas. Population genomics analyses suggested that the monophyletic group of domesticated soybeans likely originated in a region of Japan. The results were further substantiated by a phylogenetic tree constructed from domestication-associated single nucleotide polymorphisms identified in this study.

Conflict of interest statement

The authors declare that they have no conflict of interest.

Data accessibility

SNP genotype data are listed in Table S1 and are publicly available at Korean Soya Base (http://koreansoyabase.org/Data_Resource/).

Figures

Fig. 1
Population structure of the 4234 genotyped soybean accessions. a Principal components of SNP variation. PC1 and PC2 indicate the scores of principal components 1 and 2, respectively. PC1 and PC2 explained 15.6% and 2.7% of the variance in the data, respectively. Glycine max, G. soja, and hybrids are shown by red, blue, and green dots, respectively. The majority of Korean accessions cluster together within the dashed ellipses. b ADMIXTURE plots. The accessions were divided into three groups: G. max, G. soja, and their hybrids. c Distribution of the number of redundant accession groups that showed < 1.25% inconsistencies between the SNP calls. G. max and G. soja are shown by white and gray boxes, respectively. d Geographic distribution of the collection sites for G. soja accessions
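The redundancy criterion in the Fig. 1c caption (accessions whose SNP calls disagree at fewer than 1.25% of sites) can be illustrated with a minimal sketch. The source does not say how pairs were merged into groups, so the single-linkage grouping below is an assumption, as are the function names, the use of `None` for a missing call, and the toy genotypes.

```python
from itertools import combinations

THRESHOLD = 0.0125  # "< 1.25% inconsistencies" from the Fig. 1c caption


def mismatch_rate(calls_a, calls_b):
    """Fraction of SNPs with differing calls, among SNPs called in both."""
    compared = mismatched = 0
    for a, b in zip(calls_a, calls_b):
        if a is None or b is None:  # None marks a missing call (assumption)
            continue
        compared += 1
        if a != b:
            mismatched += 1
    return mismatched / compared if compared else float("nan")


def redundant_groups(genotypes, threshold=THRESHOLD):
    """Group accessions linked by pairwise mismatch rates below threshold.

    Single-linkage grouping via union-find is an illustrative choice,
    not the authors' documented procedure.
    """
    parent = {name: name for name in genotypes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in combinations(genotypes, 2):
        if mismatch_rate(genotypes[a], genotypes[b]) < threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in genotypes:
        groups.setdefault(find(name), []).append(name)
    # Only groups with more than one member count as redundant.
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)
```

With single linkage, two accessions can land in the same group through a chain of near-identical intermediates even if their own mismatch rate exceeds the threshold; a stricter rule would require every pair in a group to pass.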
Fig. 2
Population structures of 1957 domesticated and 1079 wild soybean accessions in the 3036 non-redundant soybean accession set. a Principal components (PC) of SNP variation in the domesticated population. The plots show the first three principal components. The countries of collection or improvement status of the soybean accessions in a and c are represented by two-letter codes: CN, China; IP, improved breeding line; JP, Japan; ND, not determined; NK, North Korea; RS, Russia; and SK (KR), South Korea. b Scree plot of the PC number and their contribution to variance from principal component analysis of the domesticated accessions. c Principal components of SNP variation in the wild population. The plots show the first three principal components. A cluster of accessions from Jeju Island is indicated by a dashed ellipse. d Scree plot of the PC number and their contribution to variance from principal component analysis of the wild accessions
Fig. 3
Genome-wide association scans for 3036 soybean accessions for flower color, seed-coat color, and domestication. a Manhattan plot for flower color. The solid horizontal line denotes the Bonferroni-adjusted significance threshold. Chromosomal regions of known genes (T, I, R, W1) or loci (PD05 and qSW) are indicated by dashed vertical lines. b Manhattan plot for seed-coat color. c Manhattan plot for domestication. d Local Manhattan plot (top) and LD heatmap (bottom) surrounding the T locus on chromosome 6. Dashed lines indicate the region of the T locus. Physical locations (kb) are indicated under the Manhattan plot. e Local Manhattan plot (top) and LD heatmap (bottom) surrounding the qSW locus on chromosome 17. A bar indicates the region of the qSW locus
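The Bonferroni-adjusted threshold mentioned in the Fig. 3 caption divides the family-wise error rate by the number of tests, and on a Manhattan plot it appears as a horizontal line at the negative log10 of that cutoff. The sketch below shows the arithmetic only; the error rate of 0.05 and the test counts are assumptions, since the caption states neither.

```python
import math


def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value cutoff that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def manhattan_line(alpha, n_tests):
    """Height of the threshold line on a -log10(p) axis."""
    return -math.log10(bonferroni_threshold(alpha, n_tests))


def significant(p_values, alpha=0.05):
    """Indices of tests whose p-value falls below the Bonferroni cutoff.

    alpha=0.05 is an assumed default, not a value taken from the paper.
    """
    cutoff = bonferroni_threshold(alpha, len(p_values))
    return [i for i, p in enumerate(p_values) if p < cutoff]
```

For a hypothetical scan of 100,000 SNPs at an error rate of 0.05, `manhattan_line(0.05, 100_000)` places the line at about 6.3 on the -log10(p) axis.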
Fig. 4
Identification of the domestication center of G. max. a, b Principal components plots of SNP variation. PC1, PC2, and PC3 indicate the scores of principal components 1, 2, and 3, respectively. PC1, PC2, and PC3 explained 12.0, 5.2, and 2.6% of the variance in the data, respectively. Countries of collection of the soybean accessions and species names are represented by two-letter codes: CN, China; JP, Japan; NK, North Korea; RS, Russia; SK, South Korea; Gm, G. max; and Gs, G. soja. A putative hybrid, PI 549046, is labeled. c Population structure of 50 G. max (Gm) and 144 G. soja (Gs-I, Gs-II, Gs-III, and Gs-IV) accessions inferred using ADMIXTURE. Each color represents one population. PI 549046 showed ~ 20% of ancestral genomic fractions from G. max. d Geographic distribution of the four G. soja subgroups. Gs-I is red, Gs-II green, Gs-III orange, and Gs-IV blue. e–g Neighbor-joining phylogenetic trees of 194 soybean accessions based on the SNPs genotyped by the 180 K AXIOM SoyaSNP array, with evolutionary distances measured by the p distance. The taxa used in the neighbor-joining trees and bootstrap values from 1000 bootstrap replications at branches are described in Fig. S9. e Phylogenetic tree based on 117,095 SNPs. f Phylogenetic tree based on 108,899 SNPs, which are weakly or not significantly associated with domestication traits. g Phylogenetic tree based on 8196 SNPs, which are very significantly associated with domestication traits. PI 549046 from group Gs-I clusters between Gm and Gs-IV, likely because of the contribution of an ancestral genomic fraction from Gm
