Clade Distillation for Genome-wide Association Studies
- PMID: 40795253
- PMCID: PMC12667359
- DOI: 10.1093/genetics/iyaf158
Abstract
Testing inferred haplotype genealogies for association with phenotypes has been a longstanding goal in human genetics given their potential to detect association signals driven by allelic heterogeneity (when multiple causal variants modulate a phenotype) in both coding and noncoding regions. Recent scalable methods for inferring locus-specific genealogical trees along the genome, or representations thereof, have made substantial progress towards this goal; however, the problem of testing these trees for association with phenotypes has remained unsolved due to the growth in the number of clades with increasing sample size. To address this issue, we introduce several practical improvements to the kalis ancestry inference engine, including a general optimal checkpointing algorithm for decoding hidden Markov models, thereby enabling efficient genome-wide analyses. We then propose LOCATER, a powerful new procedure based on the recently proposed Stable Distillation framework, to test local tree representations for trait association. Although LOCATER is demonstrated here in conjunction with kalis, it may be used for testing output from any ancestry inference engine, regardless of whether such engines return discrete tree structures, relatedness matrices, or some combination of the two at each locus. Using simulated quantitative phenotypes, our results indicate that LOCATER achieves substantial power gains over traditional single marker testing, ARG-Needle, and window-based testing in cases of allelic heterogeneity, while also improving causal region localization. These findings suggest that genealogy-based association testing will be a fruitful approach for gene discovery, especially for signals driven by multiple ultra-rare variants.
Keywords: ancestral recombination graph; checkpointing; quadratic form; stable distillation.
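The abstract mentions checkpointing for decoding hidden Markov models but does not describe the algorithm. As a rough illustration of the general idea only, the sketch below stores the forward vector at evenly spaced positions and recomputes the positions in between when they are needed in reverse order, trading extra computation for lower memory. It is not the optimal checkpointing scheme proposed in the paper, and it does not use the kalis API; the function names (`forward_step`, `forward_all`, `forward_checkpointed`), the `spacing` parameter, and the toy two-state model are all assumptions made for this example.

```python
def forward_step(alpha, trans, emit, symbol):
    """One normalized forward update for a discrete HMM (hypothetical helper)."""
    k = len(alpha)
    new = [emit[j][symbol] * sum(alpha[i] * trans[i][j] for i in range(k))
           for j in range(k)]
    total = sum(new)
    return [x / total for x in new]


def forward_init(pi, emit, symbol):
    """Normalized forward vector at the first position."""
    a = [pi[i] * emit[i][symbol] for i in range(len(pi))]
    total = sum(a)
    return [x / total for x in a]


def forward_all(pi, trans, emit, obs):
    """Reference version: keeps every forward vector, so memory grows with len(obs)."""
    out = [forward_init(pi, emit, obs[0])]
    for symbol in obs[1:]:
        out.append(forward_step(out[-1], trans, emit, symbol))
    return out


def forward_checkpointed(pi, trans, emit, obs, spacing):
    """Yield (t, alpha_t) for t = n-1, ..., 0 using uniformly spaced checkpoints.

    Only every `spacing`-th forward vector is kept from the first pass; each
    block is then recomputed from its checkpoint when the reverse sweep reaches it.
    """
    n = len(obs)
    checkpoints = {}
    alpha = forward_init(pi, emit, obs[0])
    for t in range(n):
        if t > 0:
            alpha = forward_step(alpha, trans, emit, obs[t])
        if t % spacing == 0:
            checkpoints[t] = alpha
    # Visit blocks right to left, rebuilding each one from its stored checkpoint.
    for start in sorted(checkpoints, reverse=True):
        end = min(start + spacing, n)
        block = [checkpoints[start]]
        for t in range(start + 1, end):
            block.append(forward_step(block[-1], trans, emit, obs[t]))
        for offset in range(len(block) - 1, -1, -1):
            yield start + offset, block[offset]
```

With `n` positions and a spacing near the square root of `n`, this keeps on the order of `sqrt(n)` vectors in memory at the cost of roughly one extra forward pass. A backward recursion could consume the yielded vectors in the same reverse order to form posterior quantities.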
© The Author(s) 2025. Published by Oxford University Press on behalf of The Genetics Society of America.
Conflict of interest statement
None declared.
