Genomic prediction with kinship-based multiple kernel learning produces hypothesis on the underlying inheritance mechanisms of phenotypic traits
- PMID: 40181452
- PMCID: PMC11969835
- DOI: 10.1186/s13059-025-03544-3
Abstract
Background: Genomic prediction encompasses the techniques used in agricultural technology to predict the genetic merit of individuals towards valuable phenotypic traits. It is related to Genome Interpretation in humans, which models the individual risk of developing disease traits. Genomic prediction is dominated by linear mixed models, such as the Genomic Best Linear Unbiased Prediction (GBLUP), which computes kinship matrices from SNP array data, while Genome Interpretation applications to clinical genetics rely mainly on Polygenic Risk Scores.
Results: In this article, we exploit the positive semidefiniteness of the kinship matrices conventionally used in GBLUP to propose a novel Genomic Multiple Kernel Learning method (GMKL), in which the kinship matrices corresponding to additive, dominant, and epistatic inheritance mechanisms are used as kernels in support vector machines, and we apply it to both genomic prediction and Genome Interpretation. We benchmark GMKL on simulated cattle phenotypes, showing that it outperforms the classical GBLUP predictors for genomic prediction. Moreover, we show that GMKL ranks the kinship kernels representing different inheritance mechanisms according to their compatibility with the observed data, allowing it to produce hypotheses on the normally unknown inheritance mechanisms generating the target phenotypes. We then apply GMKL to prediction in two inflammatory bowel disease (IBD) cohorts with more than 6500 samples in total, consistently obtaining results suggesting that epistasis might have a relevant, although underestimated, role in IBD.
Conclusions: We show that GMKL performs similarly to GBLUP, but it can formulate biological hypotheses about inheritance mechanisms, such as suggesting that epistasis influences IBD.
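The idea of treating kinship matrices as kernels and ranking them against the observed phenotype can be illustrated with a minimal sketch. It is not the authors' GMKL implementation. It assumes a VanRaden-style additive relationship matrix, a heterozygosity-based dominance matrix, and an additive-by-additive epistatic kernel taken as the elementwise (Hadamard) square of the additive matrix. In place of the paper's multiple kernel learning procedure, it uses centered kernel-target alignment as a simple stand-in for ranking kernels. The toy genotypes, the phenotype, and all function and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 60 individuals x 200 SNPs coded as 0/1/2 allele counts.
n_individuals, n_snps = 60, 200
freqs = rng.uniform(0.1, 0.9, size=n_snps)
genotypes = rng.binomial(2, freqs, size=(n_individuals, n_snps)).astype(float)

# Hypothetical binary phenotype driven by purely additive SNP effects.
effects = rng.normal(size=n_snps)
score = genotypes @ effects
y = np.where(score > np.median(score), 1.0, -1.0)


def additive_kinship(g):
    """VanRaden-style additive relationship matrix (assumed coding)."""
    p = g.mean(axis=0) / 2.0                 # empirical allele frequencies
    z = g - 2.0 * p                          # center each SNP column
    return z @ z.T / (2.0 * np.sum(p * (1.0 - p)))


def dominance_kinship(g):
    """Dominance relationship matrix from heterozygosity indicators (assumed coding)."""
    p = g.mean(axis=0) / 2.0
    het = 2.0 * p * (1.0 - p)                # expected heterozygosity per SNP
    w = (g == 1).astype(float) - het         # heterozygote indicator, centered
    return w @ w.T / np.sum(het * (1.0 - het))


def centered_alignment(k, labels):
    """Centered kernel-target alignment between kernel k and the labels."""
    n = len(labels)
    c = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    kc = c @ k @ c
    yc = labels - labels.mean()
    target = np.outer(yc, yc)
    return np.sum(kc * target) / (np.linalg.norm(kc) * np.linalg.norm(target))


G = additive_kinship(genotypes)
kernels = {
    "additive": G,
    "dominance": dominance_kinship(genotypes),
    # The Hadamard product of PSD matrices is PSD (Schur product theorem).
    "epistatic": G * G,
}

# Alignment scores, normalized into kernel weights that sum to one.
alignments = {name: centered_alignment(k, y) for name, k in kernels.items()}
total = sum(alignments.values())
weights = {name: a / total for name, a in alignments.items()}

# A convex combination of PSD kernels is itself a valid (PSD) kernel.
combined = sum(weights[name] * kernels[name] for name in kernels)

for name in sorted(weights, key=weights.get, reverse=True):
    print(f"{name}: weight = {weights[name]:.3f}")
```

The combined matrix could then be passed to any kernel method that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`. The printed weights give one way to read off which inheritance mechanism is most compatible with the phenotype under these assumptions. A full multiple kernel learning method would instead learn the weights jointly with the classifier.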
© 2025. The Author(s).
Conflict of interest statement
Declarations. Ethics approval and consent to participate: Concerning the IBDSNP dataset, IBD cases and non-IBD controls from the in-house dataset were collected as part of the IBD genetics study (CCare), initiated by the IBD unit at University Hospitals Leuven. All participants provided written informed consent, and the study received ethical approval from the Ethics Committee Research UZ/KU Leuven (S53684). Samples and data are stored in a coded, anonymized biobank and database. The IBDWES dataset is available from dbGaP (Inflammatory Bowel Disease Exome Sequencing Study, dbGaP Study Accession: phs001076.v1.p1). Access to the data can be requested through dbGaP. Consent for publication: All authors gave their consent to the publication. Competing interests: The authors declare no competing interests.
