Review
Brief Bioinform. 2025 May 1;26(3):bbaf211.
doi: 10.1093/bib/bbaf211.

Advances in multi-trait genomic prediction approaches: classification, comparative analysis, and perspectives

Alain J Mbebi et al. Brief Bioinform.

Abstract

Traits in any organism are not independent but show considerable integration, observed in the form of couplings and trade-offs. Improvement in one trait may therefore affect other traits, often in an undesired direction. To account for this problem, crop breeding increasingly relies on multi-trait genomic prediction (MT-GP) approaches that leverage the availability of genetic markers from different populations along with advances in high-throughput precision phenotyping. While significant progress has been made in jointly modeling multiple traits using a variety of statistical and machine learning approaches, there is no systematic comparison of the advantages and shortcomings of the existing classes of MT-GP models. Here, we fill this knowledge gap by first classifying the existing MT-GP models and briefly summarizing their general principles, modeling assumptions, and potential limitations. We then perform an extensive comparative analysis with 10 traits measured in an Oryza sativa diversity panel, using cross-validation scenarios relevant to breeding practice. Finally, we discuss directions that can enable the building of next-generation MT-GP models to address pressing challenges in crop breeding.

Keywords: breeding; crop improvement; deep learning; genomic prediction; machine learning; multi-trait.

Conflict of interest statement

All authors declare that they have no conflict of interest.

Figures

Figure 1
Schematic overview of genomic selection (GS). Shown are the main steps of GS, starting with the collection of phenotypic and genotypic data from a training population (e.g. inbreds or hybrids). Depending on the prediction objective and the sample size, different cross-validation (CV) schemes are used with the collected data to train the predictive models; these models are subsequently used to determine genomic estimated breeding values (GEBVs). The GEBVs are then applied to a testing population that is only genotyped, from which individuals with the desired performance are selected without the need for direct phenotyping. Briefly, in k-fold CV, the population under consideration is partitioned into k folds of approximately equal size; the model is trained on k - 1 folds while the remaining fold is used for validation, and this is repeated until each fold has been used as the validation set. Leave-one-out CV is similar, except that a single individual is used for validation in each round. In contrast, CV0, CV00, CV1, and CV2 are employed in multi-environment settings and correspond, respectively, to the prediction of seen genotypes in unseen environments, unseen genotypes in unseen environments, unseen genotypes in seen environments, and genotypes seen in some environments that are to be predicted in other seen environments.
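The CV schemes named in this caption can be sketched in a few lines of Python. This is an illustration only, not the authors' code: the function names `k_fold_splits` and `cv_scheme`, the random seed, and the toy genotype and environment labels are all assumptions introduced here.

```python
import random


def k_fold_splits(n_individuals, k, seed=0):
    """Yield (train_idx, valid_idx) pairs for k-fold CV.

    Hypothetical helper: individuals are shuffled once, dealt into k folds
    whose sizes differ by at most one, and each fold serves as the
    validation set exactly once.
    """
    if not 2 <= k <= n_individuals:
        raise ValueError("k must be between 2 and n_individuals")
    indices = list(range(n_individuals))
    random.Random(seed).shuffle(indices)
    folds = [indices[f::k] for f in range(k)]
    for f in range(k):
        valid = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, valid


def cv_scheme(test_pair, training_pairs):
    """Name the multi-environment scheme a (genotype, environment) test pair falls under."""
    genotype, environment = test_pair
    seen_genotype = any(g == genotype for g, _ in training_pairs)
    seen_environment = any(e == environment for _, e in training_pairs)
    if seen_genotype and seen_environment:
        return "CV2"   # genotype seen elsewhere, predicted in another seen environment
    if seen_environment:
        return "CV1"   # unseen genotype, seen environment
    if seen_genotype:
        return "CV0"   # seen genotype, unseen environment
    return "CV00"      # unseen genotype, unseen environment


# 5-fold CV on 10 individuals; leave-one-out is the special case k == n.
splits = list(k_fold_splits(n_individuals=10, k=5))
loo_splits = list(k_fold_splits(n_individuals=10, k=10))

# Toy training records (labels are made up for illustration).
training = [("G1", "E1"), ("G1", "E2"), ("G2", "E1")]
schemes = {pair: cv_scheme(pair, training)
           for pair in [("G2", "E2"), ("G3", "E1"), ("G1", "E3"), ("G3", "E3")]}
```

Treating leave-one-out as k-fold CV with k equal to the number of individuals keeps a single code path for both schemes.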
Figure 2
Comparison of predictabilities for MT-GP methods and a baseline GP method on a rice data set. We used five MT-GP models (MT-BMORS, MT-MOR, MT-SVD, MT-PLS, and MT-DL) and the single-trait baseline ST-GBLUP to predict the levels of five metabolites (i.e. mr1198, mr1234, mr1246, mr1268, and mr1418; see Metabolic traits section for full description) as well as five yield-related traits (i.e. yield, GW, HD, PSSR, and PH). The predictability is computed as the average Pearson correlation coefficient between observed and predicted values for the ten traits in the validation set, based on 20 repetitions of 5- and 10-fold CV for, respectively, CV-A (a and b), CV-B (d and e), and CV-C (c). The average accuracy obtained from the repeated CVs is reported as the height of the bars, along with the standard errors. Panels a and b correspond to the CV schemes in which models were trained on Indica and Japonica to predict traits in Indica and Japonica accessions, respectively. In contrast, panels d and e correspond to the scenarios in which models were trained on Indica data to predict performance in Japonica, and on Japonica data to predict performance in Indica, respectively. Finally, panel c concerns the random split, in which a varying proportion of the combined Indica/Japonica samples is used to predict the remaining mixed Indica and Japonica samples.
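As a rough sketch of how a predictability value of this kind can be computed, the snippet below averages per-fold Pearson correlations and attaches a standard error. The function names and toy numbers are hypothetical, and the standard error formula (sample standard deviation of the per-fold correlations divided by the square root of their count) is an assumption, since the caption does not state how the standard errors were obtained.

```python
import math
import statistics


def pearson(observed, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_o = statistics.fmean(observed)
    mean_p = statistics.fmean(predicted)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_o * var_p)


def predictability(fold_results):
    """Mean Pearson r and its standard error over validation folds.

    fold_results holds one (observed, predicted) pair per validation fold,
    pooled over all CV repetitions. The standard error is computed as
    stdev / sqrt(n), which is an assumption of this sketch.
    """
    correlations = [pearson(obs, pred) for obs, pred in fold_results]
    mean_r = statistics.fmean(correlations)
    if len(correlations) > 1:
        std_err = statistics.stdev(correlations) / math.sqrt(len(correlations))
    else:
        std_err = 0.0
    return mean_r, std_err


# Toy example with two validation folds (numbers are made up).
toy_folds = [
    ([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]),   # perfectly correlated
    ([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]),   # perfectly anti-correlated
]
mean_r, std_err = predictability(toy_folds)
```

In a multi-trait setting the same calculation would be run per trait, with the observed and predicted values of each trait taken from the validation folds.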
