Brief Bioinform. 2024 Nov 22;26(1):bbae624.
doi: 10.1093/bib/bbae624.

BICEP: Bayesian inference for rare genomic variant causality evaluation in pedigrees

Cathal Ormond et al. Brief Bioinform. 2024.

Abstract

Next-generation sequencing is widely applied to the investigation of pedigree data for gene discovery. However, identifying plausible disease-causing variants within a robust statistical framework is challenging. Here, we introduce BICEP: a Bayesian inference tool for rare variant causality evaluation in pedigree-based cohorts. BICEP calculates the posterior odds that a genomic variant is causal for a phenotype based on the variant cosegregation as well as a priori evidence such as deleteriousness and functional consequence. BICEP can correctly identify causal variants for phenotypes with both Mendelian and complex genetic architectures, outperforming existing methodologies. Additionally, BICEP can correctly down-weight common variants that are unlikely to be involved in phenotypic liability in the context of a pedigree, even if they have reasonable cosegregation patterns. The output metrics from BICEP allow for the quantitative comparison of variant causality within and across pedigrees, which is not possible with existing approaches.

Keywords: Bayes factor; Bayesian inference; next-generation sequencing; pedigree; posterior odds of causality; prior odds of causality.

Figures

Figure 1
Schematic representation of BICEP. The prior odds of causality (PriorOC) are generated from genomic annotation information, independent of the pedigree data. The BF is generated based on the sequencing data, the pedigree structure, and the phenotypes. These are then considered on the base 10 logarithmic scale (logPriorOC and logBF) and summed to give the posterior odds of causality on the base 10 logarithmic scale (logPostOC). The dashed lines in the BF and logBF plots indicate the maximum achievable value for the pedigree (i.e. perfect cosegregation with the phenotype with complete penetrance and no phenocopies).
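The relationship in the Figure 1 caption can be sketched in a few lines of Python. This is only an illustration of the arithmetic (logPostOC = logPriorOC + logBF on the base 10 scale), not BICEP's implementation; the function name `log_post_oc` and the numeric inputs are hypothetical and do not come from the paper.

```python
import math


def log_post_oc(prior_oc: float, bayes_factor: float) -> float:
    """Combine prior odds and a Bayes factor on the log10 scale.

    Hypothetical helper: it mirrors the caption's description
    (logPostOC = logPriorOC + logBF) and nothing more.
    """
    if prior_oc <= 0 or bayes_factor <= 0:
        raise ValueError("prior odds and Bayes factor must be positive")
    log_prior_oc = math.log10(prior_oc)   # logPriorOC
    log_bf = math.log10(bayes_factor)     # logBF
    return log_prior_oc + log_bf          # logPostOC


# Illustrative (made-up) values: weak prior odds, strong cosegregation evidence.
example = log_post_oc(prior_oc=0.001, bayes_factor=1000.0)
```

Because the two terms are summed on the log scale, a large Bayes factor from cosegregation can be offset by low prior odds, which is consistent with the abstract's point about down-weighting common variants that cosegregate reasonably well.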
Figure 2
BICEP applied to the cardiac pedigree. The top 50 variants out of 32 010 ranked by the logPostOC (top panel) from BICEP in the congenital heart defect pedigree. The causal G296S missense variant in GATA4 is shaded and ranked first overall. Also shown for each variant are the logPriorOC (bottom panel) and the logBF (middle panel). The horizontal dashed line in the logBF plot represents the maximum achievable logBF in the pedigree.
Figure 3
BICEP applied to the three schizophrenia pedigrees. The output metrics for the top five variants ranked by the logPostOC from BICEP in each of the three schizophrenia pedigrees (K1494, K1524, K1546). The three pedigree-private variants previously prioritized using the IBS-filtering approach [33] are shaded in each pedigree. The horizontal dashed line represents the maximum achievable logBF in each pedigree.
Figure 4
BICEP and pVAAST applied to the CEPH 1463 pedigree with the synthetic phenotypes. Output scores for the pseudo-causal variants, including (A) the logPostOC from BICEP and (B) the gene-based composite likelihood ratio test score from pVAAST. For ease of presentation, the logPostOC scores were capped below at −2, indicated with an asterisk. Variants are coloured by their allele frequency (AF): Common (AF ≥ 5%), low frequency (1% ≤ AF < 5%), and rare (AF < 1%). Genes are ordered according to genomic position.
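The presentation choices in the Figure 4 caption (allele frequency classes and the lower cap on displayed scores) can be expressed as a short Python sketch. The thresholds and the cap of −2 are taken from the caption; the function names `af_class` and `cap_below` are hypothetical and are not part of BICEP or pVAAST.

```python
from typing import Tuple


def af_class(af: float) -> str:
    """Bin an allele frequency (as a proportion) using the caption's cut-offs."""
    if not 0.0 <= af <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    if af >= 0.05:
        return "common"          # AF >= 5%
    if af >= 0.01:
        return "low frequency"   # 1% <= AF < 5%
    return "rare"                # AF < 1%


def cap_below(score: float, floor: float = -2.0) -> Tuple[float, bool]:
    """Return the score to display and whether it was capped at the floor.

    The boolean corresponds to the asterisk used in the figure.
    """
    was_capped = score < floor
    return (floor if was_capped else score, was_capped)
```

Returning the flag alongside the capped value keeps the original information available, so a plot can mark capped variants rather than silently showing them at the floor.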

References

    1. Glahn DC, Nimgaonkar VL, Raventos H. et al. Rediscovering the value of families for psychiatric genetics research. Mol Psychiatry 2019;24:523–35. 10.1038/s41380-018-0073-x.
    2. Thomas DC, Yang Z, Yang F. Two-phase and family-based designs for next-generation sequencing studies. Front Genet 2013;4:276.
    3. Jiao X, Ke H, Qin Y. et al. Molecular genetics of premature ovarian insufficiency. Trends Endocrinol Metab 2018;29:795–807. 10.1016/j.tem.2018.07.002.
    4. Similuk MN, Yan J, Ghosh R. et al. Clinical exome sequencing of 1000 families with complex immune phenotypes: toward comprehensive genomic evaluations. J Allergy Clin Immunol 2022;150:947–54. 10.1016/j.jaci.2022.06.009.
    5. Kuhlen M, Taeubner J, Brozou T. et al. Family-based germline sequencing in children with cancer. Oncogene 2019;38:1367–80. 10.1038/s41388-018-0520-9.