On the number of siblings and p-th cousins in a large population sample
- PMID: 29876645
- DOI: 10.1007/s00285-018-1252-8
Abstract
The number of individuals in a random sample who have close relatives in the same sample is a quantity of interest when designing genome-wide association studies and other cohort-based studies, genetic and non-genetic. In this paper, we develop expressions for the distribution and expectation of the number of p-th cousins in a sample from a population of size N under two diploid Wright-Fisher models. We also develop simple asymptotic expressions for large values of N. For example, the expected proportion of individuals with at least one p-th cousin in a sample of K individuals, for a diploid dioecious Wright-Fisher model, is approximately [Formula: see text]. Our results show that a substantial fraction of individuals in the sample will have at least a second cousin if the sampling fraction (K/N) is on the order of [Formula: see text]. This confirms that, for large cohort samples, relatedness among individuals cannot easily be ignored.
Keywords: Cousins; Dioecious Wright–Fisher model; Pedigree; Siblings; Stirling numbers of the second kind.
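The simplest case of the quantity described in the abstract, full siblings, can be illustrated with a short simulation. The sketch below is not the paper's derivation and does not reproduce its asymptotic formula. It assumes a monogamous dioecious Wright-Fisher generation in which a population of size N forms N/2 couples and each sampled individual independently draws its parental couple uniformly at random; the abstract does not specify its two models at this level of detail. The function names `sibling_fraction` and `sibling_fraction_exact` are hypothetical, introduced only for this illustration.

```python
import random


def sibling_fraction(n_pop, k_sample, n_reps=200, seed=0):
    """Monte Carlo estimate of the expected fraction of sampled individuals
    with at least one full sibling in the same sample.

    Assumption (not taken from the paper): the parental generation forms
    n_pop // 2 monogamous couples, and every sampled individual picks one
    couple uniformly and independently.
    """
    rng = random.Random(seed)
    n_couples = n_pop // 2
    total = 0.0
    for _ in range(n_reps):
        # Parental couple of each sampled individual.
        parents = [rng.randrange(n_couples) for _ in range(k_sample)]
        # Count how many sampled individuals share each couple.
        counts = {}
        for couple in parents:
            counts[couple] = counts.get(couple, 0) + 1
        # An individual has a sibling in the sample if its couple occurs twice or more.
        with_sibling = sum(1 for couple in parents if counts[couple] > 1)
        total += with_sibling / k_sample
    return total / n_reps


def sibling_fraction_exact(n_pop, k_sample):
    """Exact expectation under the same assumed model: a given individual
    shares its parental couple with at least one of the other k_sample - 1."""
    n_couples = n_pop // 2
    return 1.0 - (1.0 - 1.0 / n_couples) ** (k_sample - 1)


if __name__ == "__main__":
    n_pop, k_sample = 10_000, 500
    print("simulated:", sibling_fraction(n_pop, k_sample))
    print("exact:    ", sibling_fraction_exact(n_pop, k_sample))
```

Under this assumed model the simulated and exact values should agree closely, and both grow with the sampling fraction K/N, which is the qualitative behaviour the abstract describes. Extending the sketch to p-th cousins would require tracing ancestors p + 1 generations back and is left out here.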
Similar articles
- Single and simultaneous binary mergers in Wright-Fisher genealogies. Theor Popul Biol. 2018 May;121:60-71. doi: 10.1016/j.tpb.2018.04.001. Epub 2018 Apr 12. PMID: 29655651
- Ancestries of a recombining diploid population. J Math Biol. 2016 Jan;72(1-2):363-408. doi: 10.1007/s00285-015-0886-z. Epub 2015 Apr 30. PMID: 25925241
- The stationary distribution of a sample from the Wright-Fisher diffusion model with general small mutation rates. J Math Biol. 2019 Mar;78(4):1211-1224. doi: 10.1007/s00285-018-1306-y. Epub 2018 Nov 13. PMID: 30426201
- A review on Monte Carlo simulation methods as they apply to mutation and selection as formulated in Wright-Fisher models of evolutionary genetics. Math Biosci. 2008 Feb;211(2):205-25. doi: 10.1016/j.mbs.2007.05.015. Epub 2007 Nov 28. PMID: 18190932 Review.
- Population structure in genetic studies: Confounding factors and mixed models. PLoS Genet. 2018 Dec 27;14(12):e1007309. doi: 10.1371/journal.pgen.1007309. eCollection 2018 Dec. PMID: 30589851 Free PMC article. Review.
Cited by
- Genetic association meta-analysis is susceptible to confounding by between-study cryptic relatedness. bioRxiv [Preprint]. 2025 May 12:2025.05.10.653279. doi: 10.1101/2025.05.10.653279. PMID: 40463146 Free PMC article. Preprint.
- RaPID: ultra-fast, powerful, and accurate detection of segments identical by descent (IBD) in biobank-scale cohorts. Genome Biol. 2019 Jul 25;20(1):143. doi: 10.1186/s13059-019-1754-8. PMID: 31345249 Free PMC article.
- Consumer genomics will change your life, whether you get tested or not. Genome Biol. 2018 Aug 20;19(1):120. doi: 10.1186/s13059-018-1506-1. PMID: 30124172 Free PMC article.
- Limitations of principal components in quantitative genetic association models for human studies. Elife. 2023 May 4;12:e79238. doi: 10.7554/eLife.79238. PMID: 37140344 Free PMC article.
- Rapid detection of identity-by-descent tracts for mega-scale datasets. Nat Commun. 2021 Jun 10;12(1):3546. doi: 10.1038/s41467-021-22910-w. PMID: 34112768 Free PMC article.