Sci Rep. 2024 Jul 30;14(1):17588.
doi: 10.1038/s41598-024-67790-4.

Analyzing Medicago spp. seed morphology using GWAS and machine learning

Jacob Botkin et al. Sci Rep. 2024.

Abstract

Alfalfa is widely recognized as an important forage crop. To understand the morphological characteristics and genetic basis of seed morphology in alfalfa, we screened 318 Medicago spp., including 244 Medicago sativa subsp. sativa (alfalfa) and 23 other Medicago spp., for seed area size, length, width, length-to-width ratio, perimeter, circularity, the distance between the intersection of length and width (IS) and the center of gravity (CG), and seed darkness and red-green-blue (RGB) intensities. The results revealed phenotypic diversity and correlations among the tested accessions. Based on the phenotypic data of M. sativa subsp. sativa, a genome-wide association study (GWAS) was conducted using single nucleotide polymorphisms (SNPs) called against the Medicago truncatula genome. Genes in proximity to associated markers were detected, including CPR1, MON1, a PPR protein, and Wun1 (threshold of 1E-04). Machine learning models were used to validate the GWAS results and to identify additional marker-trait associations for potentially complex traits. Marker S7_33375673, upstream of Wun1, was the most important predictor variable for red color intensity and highly important for brightness. Fifty-two markers were identified in coding regions. Along with the strong correlations observed between seed morphology traits, these genes will facilitate understanding of the genetic basis of seed morphology in Medicago spp.

Keywords: Medicago sativa; Alfalfa; Area size; GWAS; Machine learning; RGB; Seed color; Seed morphology.
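
The abstract lists circularity and length-to-width ratio among the measured traits but does not state how they were computed. As a minimal sketch, assuming the common image-analysis definition of circularity, 4π × area / perimeter², the two shape descriptors could be derived from basic measurements as follows. The function name and the example values are hypothetical and are not taken from the study.

```python
import math


def shape_descriptors(area, perimeter, length, width):
    """Return circularity and length-to-width ratio for one seed outline.

    Assumption: circularity = 4 * pi * area / perimeter**2, which equals 1
    for a perfect circle and decreases as the outline becomes less circular.
    """
    if perimeter <= 0 or width <= 0:
        raise ValueError("perimeter and width must be positive")
    circularity = 4.0 * math.pi * area / perimeter ** 2
    return {"circularity": circularity, "length_to_width": length / width}


# Hypothetical outlines (arbitrary units), for illustration only.
radius = 1.0
circle = shape_descriptors(
    area=math.pi * radius ** 2,
    perimeter=2.0 * math.pi * radius,
    length=2.0 * radius,
    width=2.0 * radius,
)
# A 4 x 1 rectangle stands in for an elongated outline.
elongated = shape_descriptors(area=4.0, perimeter=10.0, length=4.0, width=1.0)

print(circle)
print(elongated)
```

Under this definition the circular outline scores close to 1 and the elongated one scores lower, which matches the direction of the "most circular" and "least circular" comparisons in the figure captions.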

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
Scatter plots of seed area, circularity, and brightness across Medicago spp. (a) M. scutellata has the largest seed area, followed by M. ciliaris, M. arborea, and M. bonarotiana. Conversely, M. sativa subsp. caerulea and M. monspeliaca exhibit the smallest seed area across tested Medicago spp. (b) M. orbicularis displays the most circular shape, followed by M. cancellata and M. sativa subsp. falcata. In contrast, M. truncatula, M. murex, and M. littoralis have the least circular shapes. (c) PI 619434 (M. sativa subsp. sativa) is the brightest accession across all Medicago spp., followed by M. truncatula, M. littoralis, M. arabica, and other M. sativa subsp. sativa accessions. M. ciliaris and M. muricoleptis present the darkest seeds.
Figure 2
Comparison of seed area sizes for accessions PI 197356, PI 287999, PI 386287, and PI 604218. (a) PI 197356 (Medicago scutellata) has the largest seed area size among all evaluated Medicago spp. (b) PI 287999 (Medicago monspeliaca) has the smallest seed area size among all evaluated Medicago spp. (c) PI 386287 has the largest seed area size within Medicago sativa subsp. sativa. (d) PI 604218 has the smallest seed area size within Medicago sativa subsp. sativa. The scale bars indicate 1 cm.
Figure 3
Comparison of seed brightness for PI 660361, PI 498767, PI 619434, and PI 468014. (a) PI 660361 (Medicago truncatula) has the second brightest seeds across all evaluated Medicago spp. accessions. (b) PI 498767 (Medicago ciliaris) has the darkest seeds among all evaluated Medicago spp. (c) PI 619434 has the brightest seeds among all evaluated Medicago spp. (d) PI 468014 has the darkest seeds within Medicago sativa subsp. sativa. The scale bars indicate 1 cm.
Figure 4
Scatter plots, histograms, and heatmaps reveal correlations among seed morphology traits based on Pearson’s r.
Figure 5
Biplot of the PCA of seed morphological traits in Medicago spp. PC1 and PC2 are displayed, representing the two principal components that explain 81% of the variance in the data.
Figure 6
Partial contribution of seed morphology-related traits to principal components in Medicago spp. The chart displays the partial contribution of each seed morphology-related trait to the first three principal components (PC1, PC2, and PC3).
Figure 7
Pearson correlation of predictor variable importance for trained machine learning models. (a) The relative importance of 8377 predictor variables (markers) was calculated for 11 phenotypic traits using three machine learning models: random forest (RF), support vector machine (SVM), and extreme gradient boosting (XGB). (b) Clusters of highly correlated traits with correlation values. Pearson correlation scores were colored with red indicating 1 and blue indicating −1.
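
Figures 4 and 7 both rely on Pearson's r, the second one applied to vectors of marker importance from different models. The following is a minimal sketch of that calculation using only the Python standard library. The importance values are invented for illustration and do not come from the study, and the helper name `pearson_r` is hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Square roots of the sums of squared deviations.
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sd_x * sd_y)


# Hypothetical relative importance of five markers under two models.
rf_importance = [0.40, 0.25, 0.20, 0.10, 0.05]
xgb_importance = [0.35, 0.30, 0.15, 0.15, 0.05]

r = pearson_r(rf_importance, xgb_importance)
print(f"Pearson r between the two importance vectors: {r:.3f}")
```

A value near 1 would indicate that the two models rank the same markers as important, a value near −1 would indicate opposite rankings, and a value near 0 would indicate little linear agreement.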
