Plant J. 2022 Apr;110(1):114-128.
doi: 10.1111/tpj.15658. Epub 2022 Feb 10.

Fine mapping and cloning of the major seed protein quantitative trait loci on soybean chromosome 20

Christina E Fliege et al. Plant J. 2022 Apr.

Abstract

Soybean is the most important source of protein meal worldwide, and the quantitative trait locus (QTL) cqSeed protein‐003 on chromosome 20 exerts the greatest additive effect of any protein QTL mapped in the crop. Through genetic mapping and candidate gene downregulation, we identified an insertion/deletion variant in Glyma.20G085100 as the likely cause of this important QTL.

Soybean [Glycine max (L.) Merr.] is a unique crop species because it has high levels of both protein and oil in its seed. Of the many quantitative trait loci (QTL) controlling soybean seed protein content, alleles of the cqSeed protein-003 QTL on chromosome 20 exert the greatest additive effect. The high-protein allele exists in both cultivated and wild soybean (Glycine soja Siebold & Zucc.) germplasm. Our objective was to fine map this QTL to enable positional cloning of its underlying causative gene(s). Fine mapping was achieved by developing and testing a series of populations in which the chromosomal region surrounding the segregating high- versus low-protein alleles was gradually narrowed, using marker-based detection of recombination events. The resultant 77.8 kb interval was directly sequenced from a G. soja source and compared with the reference genome to identify structural and sequence polymorphisms. An insertion/deletion (indel) variant detected in Glyma.20G085100 was found to have near-perfect +/− concordance with the high/low-protein allele genotypes inferred for this QTL in parents of published mapping populations. The indel structure was consistent with an evolutionarily recent insertion of a TIR transposon into the gene in the low-protein lineage. Seed protein was significantly greater in two independent transgenic events expressing an RNAi hairpin downregulation element than in control null segregant lineages. We conclude that a transposon insertion within Glyma.20G085100, a gene encoding a CCT domain protein, accounts for the high/low seed protein alleles of the cqSeed protein-003 QTL.

Keywords: Glycine max (L.) Merr.; QTL; fine mapping; gene cloning; seed protein; soybean.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Figure 1
Molecular characterization of the cqSeed protein‐003 genetic interval. (a) Diagram showing the position of assembled scaffolds within the interval, and annotated genes. The region to the right of the interval contains a repetitive sequence that does not code for protein. (b) Expanded view of Glyma.20G085100, showing the 321‐bp insertion in Williams 82 relative to the PI 468916 sequence. (c) Diagram showing the primer combination used to develop a codominant marker for this polymorphism. (d) Agarose gel showing genotyping of part of the population of soybean accessions in Table 3 using the marker.
Figure 2
Population biology of the insertion/deletion (indel) polymorphism at Glyma.20G085100. (a) Linkage disequilibrium (LD) around the Glyma.20G085100 locus. Left panel: R² LD values around the locus were calculated for a population of diverse accessions (Tables 3 and S4) using data from the SoySNP50k array (Song et al. 2013); note the region of LD surrounding the Glyma.20G085100 gene, indicated by the arrow. Right panel: the indel genotype, determined with a polymerase chain reaction‐based marker detecting the insertional polymorphism in the CCT‐domain gene (CCT marker), is also included. The indel locus is not in strong LD with the surrounding markers. (b) Principal components analysis of the accessions studied here for protein content. The first and third principal components were calculated from whole‐genome SoySNP50k array marker information for each accession and plotted, with the points representing accessions colored by genotype at the four single nucleotide polymorphisms (SNPs) in (a) that are in LD with the Glyma.20G085100 locus, plus the Glyma.20G085100 indel marker. Accessions with the PI 468916 sequence across the locus are denoted in red and those with the Williams 82 sequence in blue. Lines were also detected that were homozygous or heterozygous for the PI 468916 version of the indel in Glyma.20G085100 but carried a flanking marker haplotype identical to Williams 82; these are denoted in light blue or green. The broad distribution of green points indicates likely reversion by transposon excision.
Figure 3
Sequence of the insertion/deletion polymorphism. The 321‐bp indel polymorphism in the Glyma.20G085100 gene responsible for the cqSeed protein‐003 quantitative trait locus. (a) A 321‐bp sequence with transposon similarity is present in the Williams 82 genome but not in the genome of the high‐protein accession PI 468916. (b) Predicted impact of the indel shown in (a) on the intron–exon structure of the Glyma.20G085100 transcript. Internal coding exons are shown as gray blocks (Exon 1 is identical and not shown), the terminal exon as a blue block, and the polyadenylation signal as a green diamond. Positions are in base pairs from the transcription start site, as in (a). The orange box shows the region of the insertion. (c) Multiple alignment of related CCT domain proteins. The published protein sequence of Glyma.20G085100 is shown (†) along with our annotation of the sequence from PI 468916 (‡) and a re‐annotated version of the Williams 82 sequence using the same methods as used for the PI 468916 sequence (§), aligned to related proteins in GenBank. Note that the sequence of PI 468916 is closely conserved with related proteins.
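
Figure 2a reports R² LD values between SNP markers and the indel marker. As a rough illustration only, one common way to estimate r² from unphased genotypes is the squared Pearson correlation between allele dosages (0, 1, or 2 copies of one allele per accession). The sketch below assumes that coding. The function name `genotype_r2` and the toy genotypes are hypothetical and not taken from the paper, and the text here does not state how the published values were computed.

```python
def genotype_r2(locus_a, locus_b):
    """Squared Pearson correlation between two loci coded as allele dosages.

    Each list holds one value per accession: 0, 1, or 2 copies of a chosen
    allele, or None for a missing call. Accessions with a missing call at
    either locus are skipped.
    """
    pairs = [(a, b) for a, b in zip(locus_a, locus_b)
             if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two accessions with calls at both loci")

    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n

    # Sums of squares and cross-products; the 1/n factors cancel in the ratio.
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)

    if var_a == 0 or var_b == 0:
        return float("nan")  # a monomorphic locus carries no LD information
    return (cov * cov) / (var_a * var_b)


# Toy data (hypothetical): four accessions, homozygous calls only.
snp = [0, 2, 0, 2]
indel_linked = [0, 2, 0, 2]    # tracks the SNP perfectly
indel_unlinked = [0, 0, 2, 2]  # independent of the SNP

print(genotype_r2(snp, indel_linked))
print(genotype_r2(snp, indel_unlinked))
```

Under this definition, a marker that tracks its neighbors perfectly gives a value of 1, while a marker whose genotype is independent of its neighbors, as the caption describes for the indel, gives a value near 0.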

References

    1. American Soybean Association (2019) 2019 SoyStats. Available at https://soygrowers.com/wp-content/uploads/2019/10/Soy-Stats-2019_FNL-Web... (Verified 1 June 2021).
    1. Bandillo, N., Jarquin, D., Song, Q., Nelson, R., Cregan, P., Specht, J. et al. (2015) A population structure and genome‐wide association analysis on the USDA soybean germplasm collection. Plant Genome, 8, 1–13.
    1. Bolon, Y.T., Joseph, B., Cannon, S.B., Graham, M.A., Diers, B.W., Farmer, A.D. et al. (2010) Complementary genetic and genomic approaches help characterize the linkage group I seed protein QTL. BMC Plant Biology, 10, 41.
    1. Brummer, E.C., Graef, G.L., Orf, J., Wilcox, J.R. & Shoemaker, R.C. (1997) Mapping QTL for seed protein and oil content in eight soybean populations. Crop Science, 37, 370–378.
    1. Brzostowski, L.F., Pruski, T.I., Specht, J.E. & Diers, B.W. (2017) Impact of seed protein alleles from three soybean sources on seed composition and agronomic traits. Theoretical and Applied Genetics, 130, 2315–2326.