This is a preprint.
Improved Identification of Large-effect Rare Genetic Variants using Haplotype Aggregated Allele-specific Expression Data
- PMID: 41445643
- PMCID: PMC12723776
- DOI: 10.64898/2025.12.16.25341855
Abstract
Allele-specific expression (ASE) outlier detection is a powerful tool for identifying genes affected by large-effect rare genetic regulatory variants, but it suffers from data sparsity and noisy signal in low-count genes. Genome phasing can be used to aggregate ASE signal along haplotypes, alleviating both sparsity and noise. Yet statistical tools for using haplotype-level ASE data in rare variant interpretation are lacking. Here, we present ANEVA-h, a method that quantifies the amount of genetic variation in gene expression from haplotype-level ASE data in a population, enabling more accurate and comprehensive detection of regulatory effects. We apply ANEVA-h to GTEx project data, along with a compatible dosage outlier test, and show a more than 2-fold increase in the number of testable genes, a reduction in spurious outlier calls, and improved enrichment for rare high-impact variants. In clinical cohorts of neuromuscular and congenital heart disease, it enhances gene prioritization and identifies candidate diagnoses missed by DROP-MAE and ANEVA. Finally, we analyze globally diverse populations to characterize the impact of ancestry background in the reference and test populations. We provide the tools and data necessary to facilitate integration of haplotype-level ASE outlier testing into rare variant interpretation pipelines.
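The haplotype aggregation idea mentioned in the abstract can be illustrated with a minimal sketch. This is not ANEVA-h and not the authors' code: the `SnpCounts` type, the function names, and the use of a plain exact binomial test against a 50:50 expectation are all assumptions made for illustration. The sketch also assumes each read is counted at only one SNP, which real pipelines have to enforce separately.

```python
from dataclasses import dataclass
from math import comb


@dataclass
class SnpCounts:
    """Hypothetical per-SNP ASE record for one gene in one individual."""
    ref_count: int     # reads carrying the reference allele
    alt_count: int     # reads carrying the alternate allele
    ref_on_hap1: bool  # phase: True if the reference allele lies on haplotype 1


def aggregate_haplotype_counts(snps):
    """Sum allele counts across phased SNPs into two haplotype-level totals."""
    hap1 = hap2 = 0
    for s in snps:
        if s.ref_on_hap1:
            hap1 += s.ref_count
            hap2 += s.alt_count
        else:
            hap1 += s.alt_count
            hap2 += s.ref_count
    return hap1, hap2


def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value: total probability of all outcomes
    that are no more likely than the observed count k."""
    def pmf(i):
        return comb(n, i) * p**i * (1 - p) ** (n - i)

    observed = pmf(k)
    # Small relative tolerance so outcomes tied with the observed one are included.
    total = sum(pmf(i) for i in range(n + 1) if pmf(i) <= observed * (1 + 1e-9))
    return min(1.0, total)


# Three sparsely covered SNPs in one gene; each alone has too few reads to test.
snps = [
    SnpCounts(ref_count=3, alt_count=1, ref_on_hap1=True),
    SnpCounts(ref_count=0, alt_count=4, ref_on_hap1=False),
    SnpCounts(ref_count=2, alt_count=2, ref_on_hap1=True),
]
hap1, hap2 = aggregate_haplotype_counts(snps)
p_value = binomial_two_sided_p(hap1, hap1 + hap2)
```

Pooling turns three SNPs with four reads each into a single haplotype-level observation of twelve reads, which is the sense in which aggregation reduces sparsity. A population-aware method such as the one described above would replace the fixed 50:50 null with a gene-specific expectation of how much allelic imbalance is normal.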
Conflict of interest statement
P.M. was supported by the National Institutes of Health under award number R01GM140287. T.L. is an advisor to and owns equity in Variant Bio. A.T. is a co-founder and equity shareholder of GeneXwell Inc. and an advisor to InsideTracker.