PLoS Genet. 2005 Sep;1(3):e32.
doi: 10.1371/journal.pgen.0010032.

Confounding from cryptic relatedness in case-control association studies

Benjamin F Voight et al. PLoS Genet. 2005 Sep.

Abstract

Case-control association studies are widely used in the search for genetic variants that contribute to human diseases. It has long been known that such studies may suffer from high rates of false positives if there is unrecognized population structure. It is perhaps less widely appreciated that so-called "cryptic relatedness" (i.e., kinship among the cases or controls that is not known to the investigator) might also inflate the false positive rate. Until now there has been little work to assess how serious this problem is likely to be in practice. In this paper, we develop a formal model of cryptic relatedness and study its impact on association studies. We provide simple expressions that predict the extent of confounding due to cryptic relatedness. Surprisingly, these expressions are functions of directly observable parameters. Our analytical results show that, for well-designed studies in outbred populations, the degree of confounding due to cryptic relatedness will usually be negligible. In contrast, studies where there is a sampling bias toward collecting relatives may indeed suffer from excessive rates of false positives. Furthermore, cryptic relatedness may be a serious concern in founder populations that have grown rapidly and recently from a small size. As an example, we analyze the impact of excess relatedness among cases for six phenotypes measured in the Hutterite population.

Conflict of interest statement

Competing interests. The authors have declared that no competing interests exist.

Figures

Figure 1. Coalescence Rates for Pairs of Random Chromosomes (Red) and for Pairs of Chromosomes from Affected Individuals (Green)
Notice that chromosomes from affected individuals have a small excess probability of coalescing very rapidly (i.e., in the most recent ten generations or so). Otherwise, their coalescence rates are essentially like those of random chromosomes. The region at the left-hand side of the graph between the red and green lines represents the excess probability of very recent coalescence among case chromosomes (denoted R in the text). This is what gives rise to the effect of cryptic relatedness. For larger t, the line for cases drops slightly below the line for random individuals, since both distributions integrate to 1. These plots assume an additive genetic model, with λs = 60, the “half”-relationships mating model, and a population size of 2,000. The line for cases was generated under the approximation that the excess relatedness is completely limited to the first n = 10 generations. In this case, the maximum coalescent probability for case chromosomes is 0.00275, when t = 1; R ≈ 0.00334. As expected, the mean coalescence time is ≈ 4,000 generations for both distributions. Alterations in n yield similar results (unpublished data).
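The baseline (red) curve for random chromosomes can be illustrated with a minimal sketch. It assumes a standard Wright-Fisher population of N = 2,000 diploid individuals, in which two random chromosomes coalesce in any given generation with probability 1/(2N). This is an assumption made for illustration only: the caption's "half"-relationships mating model and the excess relatedness among cases are not reproduced here, and the function names are hypothetical.

```python
# Minimal sketch, assuming a Wright-Fisher population (not the paper's
# "half"-relationships mating model). All names here are illustrative.

N_DIPLOID = 2000  # population size quoted in the Figure 1 caption


def coalescence_prob(t, n_diploid=N_DIPLOID):
    """Probability that two random chromosomes coalesce exactly t generations ago.

    Under the assumed model this is geometric with per-generation
    success probability p = 1 / (2N).
    """
    p = 1.0 / (2 * n_diploid)
    return p * (1.0 - p) ** (t - 1)


def cumulative_coalescence_prob(n, n_diploid=N_DIPLOID):
    """Probability that two random chromosomes coalesce within the last n generations."""
    p = 1.0 / (2 * n_diploid)
    return 1.0 - (1.0 - p) ** n


def mean_coalescence_time(n_diploid=N_DIPLOID):
    """Mean of the geometric distribution, 1 / p = 2N generations."""
    return 2.0 * n_diploid


if __name__ == "__main__":
    print("P(coalesce at t = 1):", coalescence_prob(1))
    print("P(coalesce within 10 generations):", cumulative_coalescence_prob(10))
    print("Mean coalescence time:", mean_coalescence_time())
```

Under these assumptions the mean coalescence time is 2N = 4,000 generations, in line with the value quoted in the caption, and the per-generation probability for random chromosomes at t = 1 is 1/(2N) = 0.00025, against which the caption's maximum of 0.00275 for case chromosomes can be compared.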
Figure 2. Cumulative Probability of Coalescence within the Last n Meioses in the Hutterite Founder Population
Each line plots the estimated probability that two chromosomes drawn at random, from different individuals affected with a given phenotype, or from two random control individuals, descend from a single ancestral chromosome within the last n meioses. These estimates are based on the recorded Hutterite genealogy. The x-axis plots the average number of meioses along the two lineages back to the common ancestor. Notice that in the most recent generations, the case samples coalesce at higher rates than do random controls.
