Am J Hum Genet. 2014 Nov 6;95(5):553-64.
doi: 10.1016/j.ajhg.2014.10.005. Epub 2014 Oct 30.

PRIMUS: rapid reconstruction of pedigrees from genome-wide estimates of identity by descent

Jeffrey Staples et al. Am J Hum Genet.

Abstract

Understanding and correctly utilizing relatedness among samples is essential for genetic analysis; however, managing sample records and pedigrees can often be error prone and incomplete. Data sets ascertained by random sampling often harbor cryptic relatedness that can be leveraged in genetic analyses for maximizing power. We have developed a method that uses genome-wide estimates of pairwise identity by descent to identify families and quickly reconstruct and score all possible pedigrees that fit the genetic data by using up to third-degree relatives, and we have included it in the software package PRIMUS (Pedigree Reconstruction and Identification of the Maximally Unrelated Set). Here, we validate its performance on simulated, clinical, and HapMap pedigrees. Among these samples, we demonstrate that PRIMUS can verify reported pedigree structures and identify cryptic relationships. Finally, we show that PRIMUS reconstructed pedigrees, all of which were previously unknown, for 203 families from a cohort collected in Starr County, TX (1,890 samples).
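
The family-identification step described above can be illustrated with a minimal sketch. It is not the PRIMUS implementation: the abstract does not specify how relationships are classified, so the sketch assumes that each pair comes with estimated IBD1 and IBD2 proportions, summarizes them as the proportion of the genome shared IBD (IBD1/2 + IBD2), and bins that value with the commonly used power-of-two cutoffs (0.354, 0.177, 0.088 for first-, second-, and third-degree relatives). The function names, the input tuple layout, and the thresholds are all assumptions made for illustration.

```python
from collections import defaultdict

def pi_hat(ibd1, ibd2):
    """Proportion of the genome shared IBD: P(IBD1)/2 + P(IBD2)."""
    return 0.5 * ibd1 + ibd2

def degree(p):
    """Map an IBD proportion to a relationship degree.

    Cutoffs are geometric midpoints between the expected values
    (1, 0.5, 0.25, 0.125); they are an illustrative assumption,
    not the classifier used by PRIMUS.
    Returns 0 for duplicates/identical twins, None beyond third degree.
    """
    if p >= 2 ** -0.5:
        return 0
    if p >= 2 ** -1.5:
        return 1
    if p >= 2 ** -2.5:
        return 2
    if p >= 2 ** -3.5:
        return 3
    return None

def family_networks(pairs, max_degree=3):
    """Group samples into families: connected components of the graph
    whose edges are pairs related at max_degree or closer.

    pairs: iterable of (sample_a, sample_b, ibd1, ibd2) tuples (hypothetical layout).
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b, ibd1, ibd2 in pairs:
        ra, rb = find(a), find(b)  # registers both samples, even if unrelated
        d = degree(pi_hat(ibd1, ibd2))
        if d is not None and d <= max_degree and ra != rb:
            parent[ra] = rb

    groups = defaultdict(set)
    for sample in list(parent):
        groups[find(sample)].add(sample)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])

# Hypothetical pairwise estimates: A-B first degree, B-C second degree,
# C-D effectively unrelated, D-E third degree.
example_pairs = [
    ("A", "B", 1.00, 0.0),
    ("B", "C", 0.50, 0.0),
    ("C", "D", 0.02, 0.0),
    ("D", "E", 0.25, 0.0),
]
families = family_networks(example_pairs)
```

Each resulting group is a candidate family network; enumerating and scoring the pedigrees that fit each network is the part of the method this sketch does not attempt.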

Figures

Figure 1
A Summary of the PRIMUS Reconstructions for 1,000 Simulated Pedigrees
All simulated uniform size-20 (A) and uniform size-40 (B) pedigrees with up to 20% missing samples were reconstructed with PRIMUS. We ran 100 simulations for each size and percentage of missing samples. For each simulation, we determined where the true pedigree fell among the ranked reconstruction results. Each bar displays the proportion of the 100 simulations corresponding to the five reconstruction outcomes defined as follows: “highest scoring” means that the true pedigree was the highest-scoring pedigree; “among highest scoring” means that PRIMUS output contained more than one possible pedigree and that the true pedigree was tied with one or more other pedigrees for the highest-scoring pedigree; “among scored” indicates that the true pedigree was not the highest-scoring pedigree but was among the pedigrees generated by PRIMUS; “partial reconstruction” means that the complete reconstruction resulted in too many possible pedigrees, ran out of memory, or took longer than 36 hr to run, and as a result only a partial reconstruction using first-degree relationships was generated; and “missing” indicates that PRIMUS reconstructed one or more possible pedigrees but that the true pedigree was not among them.
Figure 2
A UW CMG Pedigree Correctly Reconstructed by PRIMUS in 9 s
PRIMUS used chip-based genotype data to verify this clinically ascertained pedigree, which included the presence of five individuals for whom no genetic data were available (individuals marked with diagonal lines) and a cycle that occurred because individual III-3 had children with both III-2 and III-4.
Figure 3
Two Reported EOCOPD Study Pedigrees Verified by PRIMUS
(A) This pedigree was the only pedigree generated by PRIMUS. (B) This pedigree was tied with five other pedigrees for the highest-scoring pedigree.
Figure 4
Two of the Six EOCOPD Study Pedigrees Corrected by PRIMUS
The reported pedigrees are depicted above (A and C), and the corrected pedigrees are shown below (B and D). Reported pedigree A has a nonpaternity error, so individuals II-2 and II-3 are actually half siblings rather than full siblings in the correct pedigree B. Pedigree B was the top-ranked pedigree in the PRIMUS output. Reported pedigree C contains not only a nonpaternity error that caused individual III-1 to be incorrectly reported as a full sibling of III-2 and III-3 but also a sample swap that caused individual II-3’s DNA to be swapped for DNA of an individual from an entirely different pedigree. Corrected pedigree D was the only pedigree generated by PRIMUS. The investigators have independently confirmed the corrected pedigrees.
Figure 5
Relationship-Prediction Accuracies for Simulated Pedigrees with RELPAIR or PRIMUS
For this comparison, we used half-sibling size-20 pedigrees with 0%–40% missing samples to test pairwise relationship-prediction accuracy. For PRIMUS, we tested whether the relationships in the highest-ranked pedigree matched the true simulated relationships. For RELPAIR, we used the method employed by Pemberton et al. to obtain the prediction and compared that to the true simulated relationship. A second-degree relationship prediction is correct if the predicted relationship type matches the true relationship type. A third-degree relationship prediction is correct if the predicted relationship degree matches the true relationship degree. A distantly related or unrelated prediction is correct if the true relationship is more than a third-degree relationship.
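
The three correctness rules in the Figure 5 caption can be written out as a short sketch. The `Relationship` record, its field names, and the relationship labels are hypothetical. The caption does not say how first-degree predictions were scored, so the sketch assumes they are judged by relationship type, like second-degree predictions.

```python
from typing import NamedTuple, Optional

class Relationship(NamedTuple):
    degree: Optional[int]  # None means unrelated
    kind: str              # e.g. "half-sibling", "avuncular", "first-cousin"

def is_distant_or_unrelated(rel: Relationship) -> bool:
    """True if the relationship is more distant than third degree."""
    return rel.degree is None or rel.degree > 3

def prediction_is_correct(pred: Relationship, true: Relationship) -> bool:
    """Apply the caption's rules, keyed on the predicted relationship."""
    if is_distant_or_unrelated(pred):
        # Correct if the true relationship is beyond third degree.
        return is_distant_or_unrelated(true)
    if pred.degree == 3:
        # Third degree: only the degree has to match.
        return true.degree == 3
    # Second degree: the relationship type has to match.
    # First degree is handled the same way (assumption, not stated in the caption).
    return pred.degree == true.degree and pred.kind == true.kind

half_sib = Relationship(2, "half-sibling")
avuncular = Relationship(2, "avuncular")
cousin = Relationship(3, "first-cousin")
great_grandparent = Relationship(3, "great-grandparent")
fourth_degree = Relationship(4, "distant")
unrelated = Relationship(None, "unrelated")

results = [
    prediction_is_correct(half_sib, half_sib),
    prediction_is_correct(half_sib, avuncular),
    prediction_is_correct(cousin, great_grandparent),
    prediction_is_correct(unrelated, fourth_degree),
]
```

Under these rules a half-sibling prediction for an avuncular pair counts as an error even though the degree is right, whereas any third-degree prediction for a third-degree pair counts as correct.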
