Am J Hum Genet. 2018 Jul 5;103(1):30-44.
doi: 10.1016/j.ajhg.2018.05.008. Epub 2018 Jun 21.

Inferring Identical-by-Descent Sharing of Sample Ancestors Promotes High-Resolution Relative Detection

Monica D Ramstetter et al. Am J Hum Genet.

Abstract

As genetic datasets increase in size, the fraction of samples with one or more close relatives grows rapidly, resulting in sets of mutually related individuals. We present DRUID (deep relatedness utilizing identity by descent), a method that works by inferring the identical-by-descent (IBD) sharing profile of an ungenotyped ancestor of a set of close relatives. Using this IBD profile, DRUID infers relatedness between unobserved ancestors and more distant relatives, thereby combining information from multiple samples to remove one or more generations between the deep relationships to be identified. DRUID constructs sets of close relatives by detecting full siblings and also uses an approach to identify the aunts/uncles of two or more siblings, recovering 92.2% of real aunts/uncles with zero false positives. In real and simulated data, DRUID correctly infers up to 10.5% more relatives than PADRE when using data from two sets of distantly related siblings, and 10.7%-31.3% more relatives given two sets of siblings and their aunts/uncles. DRUID frequently infers relationships either correctly or within one degree of the truth, with PADRE classifying 43.3%-58.3% of tenth degree relatives in this way compared to 79.6%-96.7% using DRUID.

Keywords: cryptic relatedness; identical by descent; pedigree reconstruction; relationship inference.

Figures

Figure 1
Pictorial Depiction of DRUID’s Relatedness and Aunt/Uncle Inference Approaches (A) Genotyped individuals are shown as filled shapes and haplotypes are colored vertical bars below analyzed samples; the dashed line indicates that the number of generations to the most recent common ancestors between the full siblings and the distant relative on the right is unknown. The blue regions in the full siblings represent IBD segments shared with the distant relative on the right. DRUID infers the ungenotyped mother’s IBD profile as the union of her children’s IBD segments. (B) Example haplotype transmissions from grandparents to two full sibling grandchildren (bottom generation) as well as the siblings’ father and aunt, with regions of the same color descended from the corresponding grandparent chromosome (we ignore the gray chromosomes for simplicity). Given genotype data for the two siblings and their aunt, the IBD(011) regions are those where the two siblings are (1) IBD0 with each other, and (2) both are IBD with the aunt, indicated by green boxes. These are positions where the siblings’ ungenotyped father is IBD2 with the aunt.
Figure 2
Using SAMAFS Data, IBD2 Proportions between Pairs of Second Degree Relatives, and IBD(011) Lengths between Two Siblings and One of Their Second Degree Relatives (A) Scaled density showing genome proportions shared IBD2 between SAMAFS second degree relative pairs. Abbreviations: AU, aunt/uncle of a given sample; DC, double cousins; GP, grandparent of a given sample; and HS, half-siblings. (B) Length of genome found to be IBD(011) between two full siblings and various types of second degree relatives from SAMAFS. Abbreviations for the relationship between the siblings and the indicated relative as in (A). Double cousins are filtered based on their IBD2 proportion and therefore not shown. Plot colors are translucent with most relationship types overlapping one another near 0 in both panels.
Figure 3
Exact and Within-One-Degree Relatedness Inference Rates using Simulated Type 2 Pedigree Data (A) Average percent of distantly related sample pairs from the two sibling sets that are inferred as their true degree of relatedness using Refined IBD, PADRE, and DRUID. Raw averages are listed above each bar. Rows of bar plots have the same number of siblings included in both sibling sets, indicated as |S| (left). Columns show results for different degrees of relatedness, with the true degree listed above. (B) As in (A), but shows the average percent of distant relatives inferred to be related as the true degree D or as D ± 1. Error bars are bootstrapped (over complete relative sets) 95% confidence intervals.
Figure 4
Exact and Within-One-Degree Relatedness Inference Rates using Simulated Type 4 Pedigree Data (A) Average percent of distantly related sample pairs from the two sibling sets (bottom generation in Figure S4) that are inferred as their true degree of relatedness using Refined IBD, PADRE, and DRUID. Raw averages are listed above each bar. Rows of bar plots have the same number of siblings included in both sibling sets, indicated as |S| (left). Analyses with different numbers of aunts/uncles |A| included are shown in distinct bars as labeled. Columns show results for different degrees of relatedness between the two sibling sets, with the true degree listed above. Analyses with |A|=0 parallel those for type 2 pedigrees and use siblings only from the type 4 pedigree simulations. (B) As in (A), but shows the average percent of distantly related siblings inferred to be related as the true degree D or as D ± 1. Error bars are bootstrapped (over complete relative sets) 95% confidence intervals.
Figure 5
Exact and Within-One-Degree Relatedness Inference Rates using Real SAMAFS Data (A) Rates of DRUID and PADRE inferring a range of degrees of relatedness between two full sibling sets in SAMAFS that are distantly related to each other. Raw average inference rates are listed above each bar. Analyses consider inclusion of two full sibling sets only (A=0), two sibling sets and one associated aunt/uncle set (A=1), and two sibling sets that each have an aunt/uncle set (A=2), as indicated by bar labels. Number n of analyzed collections of two close relatives sets shown above. (B) As in (A), but shows the average percent of distant relatives inferred to be related as the true degree D or as D ± 1. Error bars are bootstrapped (over complete relative sets) 95% confidence intervals.
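
The two interval operations described in the Figure 1 caption (the union of the children's IBD segments in panel A, and the IBD(011) regions in panel B) can be illustrated with a small sketch. This is not the DRUID implementation: it assumes a simplified representation in which each IBD segment is a (start, end) pair on a single chromosome, and the function names `ancestor_profile` and `ibd011` are hypothetical, chosen here for illustration only.

```python
from typing import List, Tuple

# Hypothetical representation: one IBD segment as (start, end) on a single
# chromosome, in any consistent unit (for example centiMorgans).
Segment = Tuple[float, float]


def merge_segments(segments: List[Segment]) -> List[Segment]:
    """Return the union of the intervals as a sorted, non-overlapping list."""
    merged: List[Segment] = []
    for start, end in sorted(segments):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def intersect_segments(a: List[Segment], b: List[Segment]) -> List[Segment]:
    """Return the regions covered by both interval lists."""
    a, b = merge_segments(a), merge_segments(b)
    out: List[Segment] = []
    i = j = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            out.append((lo, hi))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def ancestor_profile(children_ibd: List[List[Segment]]) -> List[Segment]:
    """Panel A idea: union of each child's IBD segments with a distant relative."""
    return merge_segments([seg for child in children_ibd for seg in child])


def ibd011(sibs_ibd0: List[Segment],
           sib1_aunt: List[Segment],
           sib2_aunt: List[Segment]) -> List[Segment]:
    """Panel B idea: siblings IBD0 with each other, and both IBD with the aunt."""
    return intersect_segments(sibs_ibd0, intersect_segments(sib1_aunt, sib2_aunt))
```

For example, `ancestor_profile([[(10, 30), (50, 60)], [(20, 40)]])` merges the two children's segments into `[(10, 40), (50, 60)]`. A real analysis would work per chromosome and track IBD1 versus IBD2 sharing, which this sketch leaves out.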
