Front Plant Sci. 2023 Jul 5:14:1196134.
doi: 10.3389/fpls.2023.1196134. eCollection 2023.

Genome-wide genotyping data renew knowledge on genetic diversity of a worldwide alfalfa collection and give insights on genetic control of phenology traits

Marie Pégard et al. Front Plant Sci. 2023.

Abstract

The dependence of China and Europe on imported protein threatens the food self-sufficiency of these regions. Growing more legumes could reduce this dependence, including alfalfa, which is the highest protein producer under temperate climates. To create productive and high-value varieties, the use of large genetic diversity combined with genomic evaluation could improve current breeding programs. To study alfalfa diversity, we used a set of 395 alfalfa accessions (i.e. populations), mainly from Europe, North and South America and China, with fall dormancy ranging from 3 to 7 on a scale of 11. Five breeders provided additional material (617 accessions) that was compared with this set. All accessions were genotyped using Genotyping-by-Sequencing (GBS) to obtain SNP allele frequencies. These genomic data were used to describe genetic diversity and identify genetic groups. The accessions were phenotyped for phenology traits (fall dormancy and flowering date) at two locations (Lusignan in France, Novi Sad in Serbia) from 2018 to 2021. QTL were detected with a Multi-Locus Mixed Model (mlmm). The quality of genomic prediction for each trait was then assessed by cross-validation, testing GBLUP, Bayesian Ridge Regression (BRR), and Bayesian Lasso methods. A genetic structure with seven groups was found. Most of these groups were related to the geographical origin of the accessions and showed that European and American material is genetically distinct from Chinese material. Several QTL associated with fall dormancy were found, and most of these were linked to genes. In our study, the infinitesimal methods (GBLUP and BRR) showed a higher prediction quality than the Bayesian Lasso, and genomic prediction achieved high (>0.75) predicting abilities in some cases. Our results are encouraging for alfalfa breeding, showing that high genomic prediction quality can be achieved.
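
As an illustration of the cross-validation step described in the abstract, the following minimal sketch fits a ridge regression on marker allele frequencies and scores it by k-fold cross-validation. It is not the authors' pipeline: ridge regression on marker effects is used here as a stand-in for GBLUP (the two are generally regarded as equivalent under standard assumptions), the predicting ability is taken as a Pearson correlation for brevity, and the function names, the penalty `lam`, the number of folds and the toy data are all assumptions made for the example.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ridge_fit(X, y, lam):
    """Estimate marker effects from (X'X + lam I) beta = X'y on centred data."""
    n, p = len(X), len(X[0])
    mu_x = [sum(row[j] for row in X) / n for j in range(p)]
    mu_y = sum(y) / n
    Xc = [[row[j] - mu_x[j] for j in range(p)] for row in X]
    xtx = [[sum(Xc[i][j] * Xc[i][k] for i in range(n)) + (lam if j == k else 0.0)
            for k in range(p)] for j in range(p)]
    xty = [sum(Xc[i][j] * (y[i] - mu_y) for i in range(n)) for j in range(p)]
    return mu_x, mu_y, solve(xtx, xty)

def ridge_predict(model, X):
    """Predict phenotypes of new accessions from their allele frequencies."""
    mu_x, mu_y, beta = model
    return [mu_y + sum((row[j] - mu_x[j]) * beta[j] for j in range(len(beta)))
            for row in X]

def pearson(u, v):
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / (var_u * var_v) ** 0.5

def cross_validate(X, y, lam, k, seed):
    """Mean predicting ability over k folds; each fold is held out once."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    abilities = []
    for fold in (idx[i::k] for i in range(k)):
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = ridge_fit([X[i] for i in train], [y[i] for i in train], lam)
        predicted = ridge_predict(model, [X[i] for i in fold])
        abilities.append(pearson(predicted, [y[i] for i in fold]))
    return sum(abilities) / k

# Toy data (hypothetical): 50 accessions, 8 markers, and a deliberately
# noise-free additive trait, so the predicting ability should be close to 1.
rng = random.Random(1)
X = [[rng.random() for _ in range(8)] for _ in range(50)]
effects = [rng.gauss(0, 1) for _ in range(8)]
y = [sum(f * e for f, e in zip(row, effects)) for row in X]
ability = cross_validate(X, y, lam=1e-8, k=5, seed=0)
```

With real data the number of markers far exceeds the number of accessions, so the equivalent kinship-based (GBLUP) form and a much larger penalty would normally be preferred over this marker-based form.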

Keywords: GWAS; alfalfa; genetic diversity; genomic prediction; phenology.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1
Genotyping quality and linkage disequilibrium observed among the accessions. (A) Histogram of the percentage of missing values per accession; the vertical red line marks the applied threshold of 80% missing values per accession. (B) Number of SNPs available depending on the percentage of missing values allowed per SNP. The horizontal lines are the thresholds (1%, 5%, 20% and 50%), and the vertical bars represent the number of SNPs obtained with the corresponding threshold. The numbers in the right part of the graph indicate the number of SNPs from 0% to 50% missing values. (C) SNP density per chromosome along the genome, estimated as the number of markers in a window of 500 kb for the 227 092 SNPs obtained with a threshold of 5% missing values per SNP. (D) Linkage disequilibrium (LD) between the 227 092 SNPs, estimated with a squared partial correlation. The LD was plotted for SNP distances of less than 20 000 bp. The color scale represents the point density, from black for a low density to yellow for the highest density.
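
The filtering and density steps summarized in panels (A) to (C) can be sketched as follows. This is only an illustration under stated assumptions: the caption gives the thresholds (80% per accession, 5% per SNP) and the 500 kb window, but the order of the two filters, the treatment of values exactly at a threshold, the data layout and all function and chromosome names are hypothetical.

```python
from collections import Counter

def missing_rate(values):
    """Fraction of missing entries (None) in a list of allele frequencies."""
    return sum(v is None for v in values) / len(values)

def filter_genotypes(freqs, max_missing_accession=0.80, max_missing_snp=0.05):
    """freqs maps accession name -> list of allele frequencies (None = missing).

    Assumed order: drop accessions above the accession threshold first,
    then keep SNPs whose missing rate among the kept accessions is within
    the SNP threshold. Returns (kept accession names, kept SNP indices).
    """
    kept_acc = [name for name, row in freqs.items()
                if missing_rate(row) <= max_missing_accession]
    if not kept_acc:
        return [], []
    n_snps = len(freqs[kept_acc[0]])
    kept_snps = [j for j in range(n_snps)
                 if missing_rate([freqs[name][j] for name in kept_acc])
                 <= max_missing_snp]
    return kept_acc, kept_snps

def snp_density(positions, window=500_000):
    """Count SNPs per (chromosome, window index) for (chromosome, bp) pairs."""
    return Counter((chrom, pos // window) for chrom, pos in positions)

# Toy example with hypothetical accessions and positions.
toy = {
    "acc_A": [0.1, None, 0.3, 0.2, 0.5],
    "acc_B": [0.2, 0.4, 0.1, 0.3, 0.6],
    "acc_C": [None, None, None, None, None],
}
kept_accessions, kept_snp_indices = filter_genotypes(toy)
density = snp_density([("chr1", 100), ("chr1", 499_999),
                       ("chr1", 500_000), ("chr2", 1_200_000)])
```

In the toy example the fully missing accession is dropped, and the SNP that is missing in one of the two remaining accessions is removed by the 5% threshold.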
Figure 2
Clustering of the cultivated material based on a Principal Component Analysis performed on a genotyping dataset of 89 216 SNPs without missing data. The analysis was done on 395 accessions, and a Discriminant Analysis of Principal Components (DAPC) revealed seven groups (ellipses) linked to the geographic region of origin or registration.
Figure 3
Projection of the 617 European breeders’ accessions on the seven groups (ellipses) obtained from a DAPC analysis using a genotyping dataset of 89 216 SNPs without missing data. A particular point shape represents each breeder.
Figure 4
Principal Component Analysis (PCA) based on the phenotypic values; the accessions are colored by DAPC group. The traits are related to Flowering date (FD) and autumn dormancy scored from different measurements in autumn: Dormancy (D), Dry Matter Yield (F-DMY), plant height (PH), Speed of elongation (SE) for two years: 2019 (X19.X) and 2020 (X20.X) in two locations: Lusignan (.L) in France and Novi Sad (.N) in Serbia. F-DMY without letter or number is the Dry Matter Yield adjusted for year and location effects. The biggest points represent the centroids of each cluster.
Figure 5
Boxplot of the phenotypic values per group for all the traits related to Flowering date (FD) and autumn dormancy scored from different measurements in autumn. Dormancy (D), Dry Matter Yield (F-DMY), plant height (PH), Speed of elongation (SE) for two years: 2019 (X19.X) and 2020 (X20.X) in two locations: Lusignan (.L) in France and Novi Sad (.N) in Serbia. F-DMY without letter or number is the Dry Matter Yield measured in autumn adjusted for year and location effects.
Figure 6
Impact of the model on the predicting ability (y-axis) for phenology traits. Three models were tested: genomic best linear unbiased prediction (GBLUP), Bayesian Ridge Regression (BRR) and Bayesian Lasso (lasso). The error bars are the standard deviation estimated on 10 repetitions. Traits were: Dormancy (D), Dry Matter Yield (F-DMY), plant height (PH), Speed of elongation (SE) for two years: 2019 (X19.X) and 2020 (X20.X) in two locations: Lusignan (.L) in France and Novi Sad (.N) in Serbia, and Flowering date (FD). F-DMY without letter or number is the Dry Matter Yield adjusted for year and location effects.
Figure 7
Impact of the training population size on the predicting ability for five traits (Flowering date (FD), Dormancy (D), Dry Matter Yield (F-DMY), plant height (PH), Speed of elongation (SE)) recorded in Lusignan in 2019. The x-axis shows the number of accessions used to train the model, which predicts a validation population of 100 accessions. The accessions were drawn at random from all clusters. The middle solid line is the average predicting ability over ten repetitions, estimated as the Spearman correlation between the phenotype and its prediction; the two solid lines above and below it show the standard deviation. Lines are colored by trait.
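
The predicting ability in this figure is a Spearman correlation averaged over repetitions. A minimal sketch of that calculation is given below; the function names are hypothetical, tied values are given average ranks (the usual convention, assumed here), and the sample standard deviation is used although the caption does not say which estimator was applied.

```python
from statistics import mean, stdev

def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(observed, predicted):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(observed), average_ranks(predicted)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

def summarise(abilities):
    """Mean and sample standard deviation over repetitions."""
    return mean(abilities), stdev(abilities)

# Hypothetical phenotypes and predictions for five validation accessions.
rho = spearman([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
```

One value of `spearman` would be computed per repetition and training size, and `summarise` then gives the center line and the band drawn around it.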
Figure 8
Impact of the training population composition on the predicting ability. The y-axis represents the predicting ability obtained when no accessions from the same group were in the training population minus the predicting ability obtained when some accessions from the same group were in the training population. A negative value means that the predicting ability obtained with accessions of the same group in the training population was greater than the predicting ability when no accession of the same group was in the training population. The error bars are the standard deviation estimated on 10 repetitions. The traits are shown on the x-axis: Flowering date (FD), Dormancy (D), Dry Matter Yield (F-DMY), plant height (PH), Speed of elongation (SE) for two years: 2019 (X19.X) and 2020 (X20.X) in two locations: Lusignan (.L) in France and Novi Sad (.N) in Serbia. F-DMY without letter or number is the Dry Matter Yield adjusted for year and location effects.
