Commun Biol. 2024 Feb 19;7(1):174.
doi: 10.1038/s42003-023-05708-y.

The frequency of pathogenic variation in the All of Us cohort reveals ancestry-driven disparities


Eric Venner et al. Commun Biol. 2024.

Erratum in

Abstract

Disparities in the data underlying clinical genomic interpretation are an acknowledged problem, but there is a paucity of data demonstrating them. The All of Us Research Program is collecting data, including whole-genome sequences, health records, and surveys, for at least a million participants with diverse ancestry and access to healthcare, representing one of the largest biomedical research repositories of its kind. Here, we examine pathogenic and likely pathogenic variants that were identified in the All of Us cohort. The European ancestry subgroup showed the highest overall rate of pathogenic variation, with 2.26% of participants having a pathogenic variant. Other ancestry groups had lower rates of pathogenic variation, including 1.62% for the African ancestry group and 1.32% for the Latino/Admixed American ancestry group. Pathogenic variants were most frequently observed in genes related to Breast/Ovarian Cancer or Hypercholesterolemia. Variant frequencies in many genes were consistent with the data from the public gnomAD database, with some notable exceptions resolved using gnomAD subsets. Differences in pathogenic variant frequency observed between ancestral groups generally indicate ascertainment biases in knowledge about those variants, but some deviations may be indicative of differences in disease prevalence. This work will allow precision medicine efforts to target the disparities it reveals.


Conflict of interest statement

E.V. owns shares in Codified Genomics, a provider of genetic interpretation software. All BCM-affiliated authors declare that Baylor Genetics is a BCM affiliate that derives revenue from genetic testing. All other authors declare no competing interests.

Figures

Fig. 1. Pathogenic variants by ancestry.
Using a database of known pathogenic mutations and annotations for rare predicted loss-of-function (pLoF) variants, we searched the beta release of the All of Us cohort on the Researcher Workbench for pathogenic variants. Figure 1a shows the rates of pathogenic variation, broken down by predicted genetic ancestry group. Error bars show the 95% confidence intervals for the total set of pathogenic variants (including both VIP P/LP variants and rare pLoF variants). Figure 1b shows the breakdown of pathogenic variants by disease area. The blue line and bar depict the rate of pathogenic and likely pathogenic (P/LP) variants, the gray bar depicts the rate of novel pLoF variants, and the orange bar depicts pLoF variants that were known pathogenic variants at the time of analysis. The yellow line shows the total variants in each ancestry group.
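The 95% confidence intervals on a carrier rate, as drawn by the error bars in Figure 1a, can be illustrated with a short sketch. The source does not say which interval method was used, so the Wilson score interval below is an assumption, and the counts are hypothetical values chosen only so that the rate comes out near the 2.26% reported for the European ancestry group; they are not taken from the paper.

```python
import math


def wilson_interval(k, n, z=1.959964):
    """Wilson score interval for a binomial proportion k/n.

    z defaults to the two-sided 95% normal quantile.
    This is an assumed method, not necessarily the one used in the paper.
    """
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("need n > 0 and 0 <= k <= n")
    p = k / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Hypothetical counts for illustration only (not from the paper).
carriers, participants = 226, 10_000
low, high = wilson_interval(carriers, participants)
print(f"rate = {carriers / participants:.2%}, 95% CI = [{low:.2%}, {high:.2%}]")
```

With the same observed rate, a smaller ancestry group gives a wider interval, which is worth keeping in mind when comparing the smaller groups in the figure.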
Fig. 2. Relative positive rates for All of Us vs gnomAD.
This figure shows the relative frequencies of previously curated pathogenic or likely pathogenic variants in the All of Us cohort and gnomAD, broken down by gene and ancestry group. Overall, the frequencies of pathogenic variants are highly concordant; most genes show very small differences relative to gnomAD. Ancestries are shown as dark blue for African, orange for Latino/Admixed American, gray for East Asian, yellow for European, and light blue for Other.
Fig. 3. Comparisons to gnomAD subsets.
Though there is a high level of concordance between the rates of pathogenic variants in All of Us and gnomAD, in some cases there are differences specific to a gene and ancestry group. For example, in participants with African ancestry, the rate of pathogenic variants in BRCA2 diverges from gnomAD (a). However, when the non-cancer subgroup of gnomAD is used, the rates are much more similar. A similar situation is seen in the Latino/Admixed American ancestry group with LDLR (b), where using the non-TopMed portion of gnomAD brings the rates much closer.

