Am J Hum Genet. 2014 Nov 6;95(5):553-64.
doi: 10.1016/j.ajhg.2014.10.005. Epub 2014 Oct 30.

PRIMUS: rapid reconstruction of pedigrees from genome-wide estimates of identity by descent

Jeffrey Staples et al. Am J Hum Genet. 2014.

Abstract

Understanding and correctly utilizing relatedness among samples is essential for genetic analysis; however, managing sample records and pedigrees can often be error prone and incomplete. Data sets ascertained by random sampling often harbor cryptic relatedness that can be leveraged in genetic analyses for maximizing power. We have developed a method that uses genome-wide estimates of pairwise identity by descent to identify families and quickly reconstruct and score all possible pedigrees that fit the genetic data by using up to third-degree relatives, and we have included it in the software package PRIMUS (Pedigree Reconstruction and Identification of the Maximally Unrelated Set). Here, we validate its performance on simulated, clinical, and HapMap pedigrees. Among these samples, we demonstrate that PRIMUS can verify reported pedigree structures and identify cryptic relationships. Finally, we show that PRIMUS reconstructed pedigrees, all of which were previously unknown, for 203 families from a cohort collected in Starr County, TX (1,890 samples).
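As a rough illustration of the first step described in the abstract (identifying families from pairwise identity-by-descent estimates), the sketch below groups samples into connected components of a relatedness graph. It is not PRIMUS code: the function name `family_networks`, the input layout (a dict mapping sample pairs to the estimated proportion of the genome shared IBD), and the cutoff value are all assumptions made for this example. The cutoff is placed halfway between the expected sharing of third-degree relatives (0.125) and fourth-degree relatives (0.0625).

```python
from collections import defaultdict

# Assumed cutoff (not taken from the paper): midpoint between the expected
# IBD proportion of third-degree (0.125) and fourth-degree (0.0625) relatives.
THIRD_DEGREE_CUTOFF = 0.09375


def family_networks(pairwise_ibd, cutoff=THIRD_DEGREE_CUTOFF):
    """Group samples into families.

    pairwise_ibd: dict mapping (sample_a, sample_b) to the estimated
    proportion of the genome shared IBD (hypothetical input layout).
    Returns a list of sorted sample lists, largest family first.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path compression.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), proportion in pairwise_ibd.items():
        root_a, root_b = find(a), find(b)  # registers both samples
        if proportion >= cutoff and root_a != root_b:
            parent[root_a] = root_b  # merge the two families

    groups = defaultdict(set)
    for sample in list(parent):
        groups[find(sample)].add(sample)
    return sorted((sorted(g) for g in groups.values()),
                  key=lambda g: (-len(g), g))


if __name__ == "__main__":
    estimates = {
        ("s1", "s2"): 0.50,  # first-degree level of sharing
        ("s2", "s3"): 0.25,  # second-degree level of sharing
        ("s3", "s4"): 0.02,  # below the cutoff, treated as unrelated
        ("s4", "s5"): 0.13,  # third-degree level of sharing
    }
    for family in family_networks(estimates):
        print(family)
```

Each resulting group would then be the unit on which candidate pedigrees are reconstructed and scored; that reconstruction step is beyond the scope of this sketch.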

Figures

Figure 1
A Summary of the PRIMUS Reconstructions for 1,000 Simulated Pedigrees
All simulated uniform size-20 (A) and uniform size-40 (B) pedigrees with up to 20% missing samples were reconstructed with PRIMUS. We ran 100 simulations for each size and percentage of missing samples. For each simulation, we determined where the true pedigree fell among the ranked reconstruction results. Each bar displays the proportion of the 100 simulations corresponding to the five reconstruction outcomes defined as follows: “highest scoring” means that the true pedigree was the highest-scoring pedigree; “among highest scoring” means that PRIMUS output contained more than one possible pedigree and that the true pedigree was tied with one or more other pedigrees for the highest-scoring pedigree; “among scored” indicates that the true pedigree was not the highest-scoring pedigree but was among the pedigrees generated by PRIMUS; “partial reconstruction” means that the complete reconstruction resulted in too many possible pedigrees, ran out of memory, or took longer than 36 hr to run, and as a result only a partial reconstruction using first-degree relationships was generated; and “missing” indicates that PRIMUS reconstructed one or more possible pedigrees but that the true pedigree was not among them.
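The five outcome categories in the Figure 1 caption can be read as a small decision procedure. The sketch below is one way to express it, assuming that each reconstruction run yields a list of (pedigree identifier, score) pairs, that a higher score is better, and that a flag records whether only a partial reconstruction was produced. The function name `classify_outcome` and this input layout are hypothetical and are not part of PRIMUS.

```python
def classify_outcome(scored_pedigrees, true_id, partial=False):
    """Assign one simulation to a Figure 1 outcome category.

    scored_pedigrees: list of (pedigree_id, score) pairs (hypothetical
    layout; a higher score is assumed to be better).
    true_id: identifier of the true simulated pedigree.
    partial: True if only a first-degree partial reconstruction was made.
    """
    if partial:
        return "partial reconstruction"
    scores = dict(scored_pedigrees)
    if true_id not in scores:
        return "missing"
    best = max(scores.values())
    if scores[true_id] < best:
        return "among scored"
    # The true pedigree has the top score; check whether it is tied.
    tied = sum(1 for s in scores.values() if s == best)
    return "highest scoring" if tied == 1 else "among highest scoring"


if __name__ == "__main__":
    print(classify_outcome([("p1", 0.9), ("p2", 0.4)], "p1"))
    print(classify_outcome([("p1", 0.9), ("p2", 0.9)], "p1"))
    print(classify_outcome([("p1", 0.9), ("p2", 0.4)], "p2"))
    print(classify_outcome([("p1", 0.9)], "p3"))
    print(classify_outcome([], "p1", partial=True))
```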
Figure 2
A UW CMG Pedigree Correctly Reconstructed by PRIMUS in 9 s
PRIMUS used chip-based genotype data to verify this clinically ascertained pedigree, which included five individuals for whom no genetic data were available (individuals marked with diagonal lines) and a cycle that occurred because individual III-3 had children with both III-2 and III-4.
Figure 3
Two Reported EOCOPD Study Pedigrees Verified by PRIMUS
(A) This pedigree was the only pedigree generated by PRIMUS. (B) This pedigree was tied with five other pedigrees for the highest-scoring pedigree.
Figure 4
Two of the Six EOCOPD Study Pedigrees Corrected by PRIMUS
The reported pedigrees are depicted above (A and C), and the corrected pedigrees are shown below (B and D). Reported pedigree A has a nonpaternity error, so individuals II-2 and II-3 are actually half siblings rather than full siblings in the correct pedigree B. Pedigree B was the top-ranked pedigree in the PRIMUS output. Reported pedigree C contains not only a nonpaternity error that caused individual III-1 to be incorrectly reported as a full sibling of III-2 and III-3 but also a sample swap that caused individual II-3’s DNA to be swapped for DNA of an individual from an entirely different pedigree. Corrected pedigree D was the only pedigree generated by PRIMUS. The investigators have independently confirmed the corrected pedigrees.
Figure 5
Relationship-Prediction Accuracies for Simulated Pedigrees with RELPAIR or PRIMUS
For this comparison, we used half-sibling size-20 pedigrees with 0%–40% missing samples to test pairwise relationship-prediction accuracy. For PRIMUS, we tested whether the relationships in the highest-ranked pedigree matched the true simulated relationships. For RELPAIR, we used the method employed by Pemberton et al. to obtain the prediction and compared that to the true simulated relationship. A second-degree relationship prediction is correct if the predicted relationship type matches the true relationship type. A third-degree relationship prediction is correct if the predicted relationship degree matches the true relationship degree. A distantly related or unrelated prediction is correct if the true relationship is more distant than third degree.
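The correctness rules in the Figure 5 caption can be written as a short check. The sketch below is illustrative only. It assumes a relationship is encoded as a degree (an integer, or `None` for unrelated) plus a type label such as "half-sibling" or "avuncular"; the function name `prediction_is_correct` and this encoding are hypothetical. The caption does not state a rule for first-degree predictions, so requiring a type match there is an assumption of this example.

```python
def prediction_is_correct(pred_degree, pred_type, true_degree, true_type):
    """Apply the Figure 5 scoring rules to one pair of samples.

    Degrees are integers (1, 2, 3, ...) or None for unrelated pairs;
    types are free-text labels (hypothetical encoding).
    """
    def beyond_third(degree):
        return degree is None or degree > 3

    if beyond_third(pred_degree):
        # Distantly related or unrelated prediction: correct when the true
        # relationship is more distant than third degree.
        return beyond_third(true_degree)
    if pred_degree == 3:
        # Third-degree prediction: only the degree has to match.
        return true_degree == 3
    # Second-degree prediction: the relationship type has to match.
    # Applying the same rule to first-degree predictions is an assumption.
    return true_degree == pred_degree and pred_type == true_type


if __name__ == "__main__":
    print(prediction_is_correct(2, "half-sibling", 2, "half-sibling"))
    print(prediction_is_correct(2, "half-sibling", 2, "avuncular"))
    print(prediction_is_correct(3, "first-cousin", 3, "great-grandparental"))
    print(prediction_is_correct(None, "unrelated", 5, "distant"))
```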

