PLoS One. 2025 Dec 5;20(12):e0335227.
doi: 10.1371/journal.pone.0335227. eCollection 2025.

Updated resource of 180K soybean SNP genotyping array based on the T2T reference genome

Ji-Hun Hwang et al. PLoS One.

Abstract

Single nucleotide polymorphism (SNP) genotyping has revolutionized crop improvement by enabling high-resolution genomic analyses and accelerating breeding programs. In soybean (Glycine max (L.) Merr.), a globally important legume crop, existing genotyping data for 180,961 SNP markers from the Korean soybean core collection were generated using the outdated Williams 82 reference genome version 1 (Wm82.v1), which contains numerous assembly gaps and misassemblies that limit genomic resolution. While high-quality reference genomes including Wm82.v4 and Wm82.v6 (telomere-to-telomere assembly) are now available, the valuable existing SNP array data have not been integrated with these improved genomic resources. Here we show successful remapping of the 180K SNP array data to both Wm82.v4 and Wm82.v6 reference genomes through sequence-based alignment of flanking regions. We extracted flanking sequences from SNP marker positions in Wm82.v1 and mapped them to the newer reference versions based on sequence similarity, excluding markers with mapping failures, allele mismatches, low-identity alignments, or multiple mappings, which resulted in the successful mapping of 175,202 and 175,763 markers to Wm82.v4 and Wm82.v6, respectively. We also remapped genotype data from 927 soybean accessions (497 USDA-GRIN accessions and 430 Korean core collection accessions) to both reference versions. This updated SNP dataset provides the soybean research community with a comprehensive genomic resource that leverages both existing genotyping investments and state-of-the-art reference genome assemblies for enhanced crop improvement and genomic studies.
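The filtering step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Hit` record, the `classify_markers` function, the 95% identity cutoff, and the order in which the exclusion criteria are checked are all assumptions made for illustration, since the abstract names the four exclusion categories but gives no thresholds or implementation details. The `retention_percent` helper simply applies arithmetic to the marker counts reported above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Hit:
    """One alignment of a marker's flanking sequence to the new assembly (hypothetical record)."""
    marker_id: str
    chrom: str
    pos: int         # SNP position in the new assembly
    identity: float  # percent identity of the flanking-sequence alignment
    ref_base: str    # base found at the SNP position in the new assembly


def classify_markers(markers, hits, min_identity=95.0):
    """Assign each marker to 'lifted' or to one of four exclusion categories.

    markers: dict mapping marker_id -> set of expected alleles, e.g. {"A", "G"}
    hits: iterable of Hit records
    min_identity: assumed cutoff; the abstract does not state the value used
    """
    by_marker = defaultdict(list)
    for hit in hits:
        by_marker[hit.marker_id].append(hit)

    status = {}
    for marker_id, alleles in markers.items():
        marker_hits = by_marker.get(marker_id, [])
        if not marker_hits:
            status[marker_id] = "mapping_failure"
        elif len(marker_hits) > 1:
            status[marker_id] = "multiple_mappings"
        elif marker_hits[0].identity < min_identity:
            status[marker_id] = "low_identity"
        elif marker_hits[0].ref_base not in alleles:
            status[marker_id] = "allele_mismatch"
        else:
            status[marker_id] = "lifted"
    return status


def retention_percent(lifted, total):
    """Percentage of array markers retained after liftover."""
    return 100.0 * lifted / total


# Toy input, invented purely to exercise each category.
toy_markers = {
    "m1": {"A", "G"},
    "m2": {"C", "T"},
    "m3": {"A", "T"},
    "m4": {"G", "C"},
    "m5": {"A", "C"},
}
toy_hits = [
    Hit("m1", "Gm01", 1000, 100.0, "A"),  # clean single hit
    Hit("m2", "Gm02", 2000, 99.0, "C"),   # two hits for the same marker
    Hit("m2", "Gm12", 2500, 99.0, "C"),
    Hit("m3", "Gm03", 3000, 80.0, "A"),   # below the assumed identity cutoff
    Hit("m4", "Gm04", 4000, 100.0, "T"),  # base is not one of the expected alleles
    # m5 has no hit at all
]
toy_status = classify_markers(toy_markers, toy_hits)

# Marker counts taken from the abstract.
TOTAL_MARKERS = 180_961
v4_retention = retention_percent(175_202, TOTAL_MARKERS)
v6_retention = retention_percent(175_763, TOTAL_MARKERS)
print(f"Wm82.v4: {v4_retention:.2f}%  Wm82.v6: {v6_retention:.2f}%")
```

Applied to the reported counts, the helper shows that roughly 97% of the 180,961 array markers were retained in each newer assembly, with slightly more retained in Wm82.v6 than in Wm82.v4.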

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Overview of SNP markers with liftover failures.

Fig 2. Venn diagram of the lifted SNP marker across the various versions of Wm82 genomes.

Fig 3. Comprehensive visualization of the gene distribution and lifted SNP marker landscape in Wm82.v6.
The inner green heatmap represents gene density along the chromosomes, while the adjacent orange histogram illustrates the density of lifted SNP markers.

Fig 4. Visualization of SNP array markers with liftover failures.
The plot is based on the Wm82.v1 genome, with scaffold marker information excluded. (a) Markers with liftover failure in all genome versions. (b) Markers with liftover failure only in Wm82.v4. (c) Markers with liftover failure only in Wm82.v6. (d) Gene density heatmap. (e) Density heatmap of Class I transposons in Wm82.v1. (f) Density heatmap of Class II transposons in Wm82.v1. (g) Genomic regions with <90% identity between Wm82.v1 and Wm82.v4. (h) Genomic regions with <90% identity between Wm82.v1 and Wm82.v6.

Fig 5. Visualization of genic regions for the liftover analysis comparing Wm82.v4 and Wm82.v6.
Each block on the track represents the position of the gene. Gray and pink lines between tracks indicate syntenic regions, while the red line on the Wm82.v6 track marks the position of the genic marker.

Fig 6. Visualization of intergenic marker regions for the liftover analysis comparing Wm82.v4 and Wm82.v6.
The semi-transparent green blocks indicate zoomed-in regions. Gray regions between tracks indicate syntenic regions. The red lines on the track indicate TE nested intergenic markers specific to Wm82.v6, while the blue lines represent those present in both Wm82.v4 and Wm82.v6.
