Genome Biol. 2025 Apr 4;26(1):84.
doi: 10.1186/s13059-025-03544-3.

Genomic prediction with kinship-based multiple kernel learning produces hypothesis on the underlying inheritance mechanisms of phenotypic traits

Daniele Raimondi et al. Genome Biol.

Abstract

Background: Genomic prediction encompasses the techniques used in agricultural technology to predict the genetic merit of individuals towards valuable phenotypic traits. It is related to Genome Interpretation in humans, which models the individual risk of developing disease traits. Genomic prediction is dominated by linear mixed models, such as the Genomic Best Linear Unbiased Prediction (GBLUP), which computes kinship matrices from SNP array data, while Genome Interpretation applications to clinical genetics rely mainly on Polygenic Risk Scores.

Results: In this article, we exploit the positive semidefiniteness of the kinship matrices conventionally used in GBLUP to propose a novel Genomic Multiple Kernel Learning method (GMKL), in which the kinship matrices corresponding to Additive, Dominant, and Epistatic Inheritance Mechanisms are used as kernels in support vector machines. We apply GMKL to both genomic prediction and Genome Interpretation. We benchmark GMKL on simulated cattle phenotypes, showing that it outperforms the classical GBLUP predictors for genomic prediction. Moreover, we show that GMKL ranks the kinship kernels representing different inheritance mechanisms according to their compatibility with the observed data, allowing it to produce hypotheses on the normally unknown inheritance mechanisms generating the target phenotypes. We then apply GMKL to prediction on two inflammatory bowel disease (IBD) cohorts with more than 6500 samples in total, consistently obtaining results suggesting that epistasis might have a relevant, although underestimated, role in IBD.
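The idea of using kinship matrices as kernels in a support vector machine can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' implementation: the VanRaden-style additive kinship, the heterozygosity-based dominance kinship, the Hadamard products used as epistatic kernels, the trace normalization, the unweighted mean combination, and all function names are illustrative choices. Each kernel is a Gram matrix or an elementwise product of Gram matrices, so it is positive semidefinite and can be passed to scikit-learn's `SVC` with `kernel="precomputed"`.

```python
import numpy as np
from sklearn.svm import SVC


def additive_kinship(genotypes):
    """Additive kinship from a (samples x SNPs) matrix of 0/1/2 allele counts."""
    p = genotypes.mean(axis=0) / 2.0               # allele frequency per SNP
    z = genotypes - 2.0 * p                        # centre by expected allele count
    return z @ z.T / (2.0 * np.sum(p * (1.0 - p)))


def dominance_kinship(genotypes):
    """Dominance kinship based on centred heterozygosity indicators (assumed coding)."""
    p = genotypes.mean(axis=0) / 2.0
    het_expected = 2.0 * p * (1.0 - p)             # expected heterozygosity per SNP
    w = (genotypes == 1).astype(float) - het_expected
    return w @ w.T / np.sum(het_expected * (1.0 - het_expected))


def build_kernels(genotypes):
    """Hypothetical kernel set: A, D, and Hadamard products as epistatic kernels."""
    a = additive_kinship(genotypes)
    d = dominance_kinship(genotypes)
    kernels = {"A": a, "D": d, "AxA": a * a, "AxD": a * d, "DxD": d * d}
    n = genotypes.shape[0]
    # Rescale so every kernel has an average diagonal of 1 before combining.
    return {name: k * n / np.trace(k) for name, k in kernels.items()}


def mean_kernel(kernels):
    """Unweighted average of the kernels (one simple way to combine them)."""
    return np.mean(list(kernels.values()), axis=0)


def fit_predict(kernel, labels, train_idx, test_idx):
    """Train an SVM on the train-by-train block and predict from the test-by-train block."""
    model = SVC(kernel="precomputed", C=1.0)
    model.fit(kernel[np.ix_(train_idx, train_idx)], labels[train_idx])
    return model.predict(kernel[np.ix_(test_idx, train_idx)])


# Toy data only: random genotypes and alternating labels, not a real cohort.
rng = np.random.default_rng(0)
genotypes = rng.integers(0, 3, size=(60, 200)).astype(float)
labels = np.arange(60) % 2
kernels = build_kernels(genotypes)
combined = mean_kernel(kernels)
predictions = fit_predict(combined, labels, np.arange(40), np.arange(40, 60))
```

In this sketch the kinship matrices are computed over all samples at once, so allele frequencies include the test individuals; whether that is acceptable depends on the evaluation protocol.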

Conclusions: We show that GMKL performs similarly to GBLUP, but it can formulate biological hypotheses about inheritance mechanisms, such as suggesting that epistasis influences IBD.
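One generic way to rank kernels by their compatibility with observed labels is centered kernel alignment between each kernel and the label outer product. The sketch below is an assumption-laden illustration, not the scoring used in the paper: the function names, the toy "signal" and "noise" kernels, and the use of alignment scores directly as a ranking are all hypothetical. It also assumes the labels contain both classes, since a constant label vector would make the centered target zero.

```python
import numpy as np


def center_kernel(kernel):
    """Double-centre a kernel matrix: H K H with H = I - 11^T / n."""
    n = kernel.shape[0]
    h = np.eye(n) - np.ones((n, n)) / n
    return h @ kernel @ h


def centered_alignment(kernel_a, kernel_b):
    """Cosine similarity between two centred kernels under the Frobenius inner product."""
    a = center_kernel(kernel_a)
    b = center_kernel(kernel_b)
    return np.sum(a * b) / (np.linalg.norm(a) * np.linalg.norm(b))


def rank_kernels(kernels, labels):
    """Sort kernels by alignment with the label kernel y y^T, highest first."""
    y = np.asarray(labels, dtype=float)
    target = np.outer(y, y)
    scores = {name: centered_alignment(k, target) for name, k in kernels.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy example: one kernel built from the labels, one built from random features.
rng = np.random.default_rng(1)
y = np.where(np.arange(40) % 2 == 0, 1.0, -1.0)
features = rng.normal(size=(40, 100))
toy_kernels = {
    "signal": np.outer(y, y) + 0.1 * np.eye(40),   # strongly label-related
    "noise": features @ features.T,                # unrelated to the labels
}
ranking = rank_kernels(toy_kernels, y)
```

A kernel that ranks high under such a score is more compatible with the labels, which is the kind of evidence that could motivate a hypothesis about the corresponding inheritance mechanism; it does not by itself establish that mechanism.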

Conflict of interest statement

Declarations. Ethics approval and consent to participate: Concerning the IBDSNP dataset, IBD cases and non-IBD controls from the in-house dataset were collected as part of the IBD genetics study (CCare), initiated by the IBD unit at University Hospitals Leuven. All participants provided written informed consent, and the study received ethical approval from the Ethics Committee Research UZ/KU Leuven (S53684). Samples and data are stored in a coded, anonymized biobank and database. The IBDWES dataset is available from dbGaP (Inflammatory Bowel Disease Exome Sequencing Study, dbGaP Study Accession: phs001076.v1.p1). Access to the data can be requested through dbGaP. Consent for publication: All authors gave their consent to the publication. Competing interests: The authors declare no competing interests.

Figures

Fig. 1
Figure showing the comparison between the nine methods we benchmarked on the CATTLE dataset. The MEAN, FH, and CKA GMKL approaches using one to five kernel matrices are respectively shown in shades of green, red, and blue. The gray bars show the additive GBLUP model (light) and the optimal GBLUP model (dark), which always uses the optimal set of kinship matrices. Phenotypes 0 to 4 involve only A and D effects, with Pheno:0 being 100% additive, Pheno:2 being 50/50%, and Pheno:4 being purely Dominant. Phenotypes 5–9 also include epistatic effects. They are composed of base A and D components of 33% each, plus a 34% epistatic component that is additive-additive (Pheno:5), additive-dominant (Pheno:6), or dominant-dominant (Pheno:7). Pheno:8 contains a mixture of all effects
Fig. 2
Figure showing the balanced accuracy (BAC), area under the ROC curve (AUC), and area under the precision-recall curve (AUPRC) for the benchmark of our GMKL approaches with GBLUP and other ML methods on the IBDSNP dataset
Fig. 3
Figure showing the comparison between the nine methods we benchmarked on the IBDWES dataset. The MEAN, FH, and CKA GMKL approaches using one to five kernel matrices are respectively shown in shades of green, red, and blue. The black to gray bars show the additive GBLUP models, while the bottom bars show the performance obtained by the neural network methods proposed in [50]

References

    1. Raimondi D, Corso M, Fariselli P, Moreau Y. From genotype to phenotype in Arabidopsis thaliana: in-silico genome interpretation predicts 288 phenotypes from sequencing data. Nucleic Acids Res. 2022;50(3):e16–e16.
    2. Raimondi D, Orlando G, Verplaetse N, Fariselli P, Moreau Y. Towards genome interpretation: Computational methods to model the genotype-phenotype relationship. Front Bioinforma. 2022;2:1098941.
    3. CAGI, the Critical Assessment of Genome Interpretation, establishes progress and prospects for computational genetic variant interpretation methods. Genome Biol. 2024;25(1):53.
    4. Andreoletti G, Pal LR, Moult J, Brenner SE. Reports from the fifth edition of CAGI: The Critical Assessment of Genome Interpretation. Hum Mutat. 2019;40(9):1197–201.
    5. Daneshjou R, Wang Y, Bromberg Y, Bovo S, Martelli PL, Babbi G, et al. Working toward precision medicine: Predicting phenotypes from exomes in the Critical Assessment of Genome Interpretation (CAGI) challenges. Hum Mutat. 2017;38(9):1182–92.