[Preprint]. 2023 Nov 5:2023.11.03.565574.
doi: 10.1101/2023.11.03.565574.

Biobank-scale inference of multi-individual identity by descent and gene conversion

Sharon R Browning et al. bioRxiv.

Abstract

We present a method for efficiently identifying clusters of identical-by-descent haplotypes in biobank-scale sequence data. Our multi-individual approach enables much more efficient collection and storage of identity by descent (IBD) information than approaches that detect and store pairwise IBD segments. Our method's computation time, memory requirements, and output size scale linearly with the number of individuals in the dataset. We also present a method for using multi-individual IBD to detect alleles changed by gene conversion. Application of our methods to the autosomal sequence data for 125,361 White British individuals in the UK Biobank detects more than 9 million converted alleles. This is 2900 times more alleles changed by gene conversion than were detected in a previous analysis of familial data. We estimate that more than 250,000 sequenced probands and a much larger number of additional genomes from multi-generational family members would be required to find a similar number of alleles changed by gene conversion using a family-based approach.

Figures

Figure 1: Transitivity of IBD.
A. The coalescent tree relationship at a given point in the genome is shown for haplotypes h1, h2, and h3, which are mutually IBD at this location. Haplotypes h1 and h2 have common ancestor X, while haplotypes h1 and h3, and haplotypes h2 and h3, have common ancestor Y. B. Pairwise IBD status (black for IBD, white for non-IBD) is shown for the three pairs of haplotypes along a region of the chromosome around the focal position (denoted *). The IBD extends to either side of the focal point until reaching a point of recombination on one of the ancestral lineages. Although the IBD sharing between haplotypes h1 and h2, and between haplotypes h2 and h3, is long and may exceed a pre-defined length threshold, the IBD between haplotypes h1 and h3 is relatively short and may not meet the length threshold for pairwise IBD sharing.
Figure 2: IBD transitivity with and without trimming IBS segments.
IBD and IBS in a genomic region are shown for the three pairings of three haplotypes (h1, h2, and h3). A. The IBD between haplotypes h1 and h2 is derived from a different recent common ancestor than the IBD between haplotypes h2 and h3. IBS that is not due to the recent common ancestors is incorrectly called as IBD at the ends of the IBD segments. As a result, transitivity leads to a region of IBS being incorrectly called as IBD between haplotypes h1 and h3. B. A trim is applied to the ends of the pairwise IBS regions, and no IBD is called between haplotypes h1 and h3.
Figure 3: IBD cluster sizes in the UK Biobank White British autosomal sequence data.
Cluster size is shown on the x-axis for cluster sizes of ≤ 3 in the left panel and ≥ 3 in the right panel. The y-axis shows the proportion of haplotypes that are in IBD clusters having that size.
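
The transitivity and trimming ideas in Figures 1 and 2 can be illustrated with a small sketch. This is not the authors' algorithm: it is a toy union-find example under stated assumptions. The function names (`trim`, `clusters_at`), the segment coordinates, and the trim length are all hypothetical, and coordinates are in arbitrary units. It shows only that merging overlapping pairwise segments at a focal position places h1 and h3 in one cluster even without a direct h1-h3 segment, and that trimming segment ends can remove such a link when it rests solely on the segment ends.

```python
from collections import defaultdict


def trim(segment, trim_len):
    """Shrink a (start, end) segment by trim_len at each end.

    Returns None if nothing remains after trimming.
    """
    start, end = segment
    start, end = start + trim_len, end - trim_len
    return (start, end) if start < end else None


def clusters_at(position, pairwise_segments, trim_len=0.0):
    """Cluster haplotypes at one position by transitive closure.

    pairwise_segments maps a haplotype pair (a, b) to a (start, end)
    segment. Two haplotypes are linked if their trimmed segment covers
    the position; clusters are the connected components of those links.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for (a, b), segment in pairwise_segments.items():
        root_a, root_b = find(a), find(b)  # registers both haplotypes
        trimmed = trim(segment, trim_len)
        if trimmed and trimmed[0] <= position <= trimmed[1]:
            if root_a != root_b:
                parent[root_b] = root_a

    groups = defaultdict(set)
    for hap in list(parent):
        groups[find(hap)].add(hap)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical segments: h1-h2 and h2-h3 overlap only near their ends,
# and no h1-h3 segment is supplied.
segments = {("h1", "h2"): (0.0, 5.0), ("h2", "h3"): (4.0, 9.0)}

untrimmed = clusters_at(4.5, segments)               # one cluster of three
trimmed = clusters_at(4.5, segments, trim_len=1.0)   # three singletons
print(untrimmed, trimmed)
```

In this toy input the overlap between the two segments lies entirely within the trimmed ends, so trimming breaks the transitive link between h1 and h3, mirroring panel B of Figure 2. With a longer overlap the link would survive trimming, as in Figure 1.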
