Hum Mutat. 2017 Sep;38(9):1266-1276.
doi: 10.1002/humu.23265. Epub 2017 Jun 19.

Matching phenotypes to whole genomes: Lessons learned from four iterations of the personal genome project community challenges

Binghuang Cai et al. Hum Mutat. 2017 Sep.

Abstract

The advent of next-generation sequencing has dramatically decreased the cost of whole-genome sequencing and increased the viability of its application in research and clinical care. The Personal Genome Project (PGP) provides unrestricted access to genomes of individuals and their associated phenotypes. This resource enabled the Critical Assessment of Genome Interpretation (CAGI) to create a community challenge to assess the bioinformatics community's ability to predict traits from whole genomes. In the CAGI PGP challenge, researchers were asked to predict whether an individual had a particular trait or profile based on their whole genome. Several approaches were used to assess submissions, including ROC AUC (area under the receiver operating characteristic curve), probability rankings, the number of correct predictions, and statistical significance simulations. Overall, we found that prediction of individual traits is difficult, relying on a strong knowledge of trait frequency within the general population, whereas matching genomes to trait profiles relies heavily upon a small number of common traits including ancestry, blood type, and eye color. When a rare genetic disorder is present, profiles can be matched when one or more pathogenic variants are identified. Prediction accuracy has improved substantially over the last 6 years due to improved methodology and a better understanding of features.
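
To make the assessment measures named in the abstract concrete, the following is a minimal Python sketch of how a ROC AUC and a simulation-based significance estimate could be computed for one set of trait predictions. It is an illustration under stated assumptions, not the assessors' actual procedure: the abstract does not describe how the significance simulations were run, so label shuffling is used here as one plausible stand-in, and the function names (`roc_auc`, `permutation_p_value`) and the toy data are hypothetical.

```python
import random


def roc_auc(labels, scores):
    """ROC AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def permutation_p_value(labels, scores, n_iter=2000, seed=0):
    """Estimate how often randomly shuffled labels reach an AUC at least
    as high as the observed one. Assumption: label shuffling stands in
    for the unspecified simulation used in the challenge assessment."""
    rng = random.Random(seed)
    observed = roc_auc(labels, scores)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(shuffled)
        if roc_auc(shuffled, scores) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_iter + 1)


# Toy data (hypothetical): 1 = individual has the trait, 0 = does not;
# scores are a predictor's submitted probabilities.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.5, 0.1, 0.7]

auc = roc_auc(labels, scores)
p_value = permutation_p_value(labels, scores)
print(f"AUC = {auc:.4f}, permutation p-value = {p_value:.4f}")
```

The pairwise formulation of AUC is quadratic in the number of individuals, which is acceptable for challenge-sized cohorts; a rank-based implementation would be preferable for larger data sets.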

Keywords: biomedical informatics; community challenge; critical assessment; genome; genome interpretation; open consent; personal genome project (PGP); phenotype.

Figures

FIGURE 1. ROC AUC curves of the four submissions of PGP 2011
FIGURE 2. PR curves of the four submissions of PGP 2011
FIGURE 3. AUC comparison of simulative results and submission for the best submission from Group1 of PGP 2011
FIGURE 4. ROC AUC curves of the five submissions of PGP 2015
FIGURE 5. PR curves of the five submissions of PGP 2015
FIGURE 6. AUC comparison of simulative results and submission for the best submission from Group1 of PGP 2015
