Review
Annu Rev Biomed Data Sci. 2022 Aug 10:5:321-339.
doi: 10.1146/annurev-biodatasci-122220-112550. Epub 2022 May 16.

Importance of Including Non-European Populations in Large Human Genetic Studies to Enhance Precision Medicine

Dan Ju et al. Annu Rev Biomed Data Sci.

Abstract

One goal of genomic medicine is to uncover an individual's genetic risk for disease, which generally requires data connecting genotype to phenotype, as done in genome-wide association studies (GWAS). While there may be clinical promise to employing prediction tools such as polygenic risk scores (PRS), it currently stands that individuals of non-European ancestry may not reap the benefits of genomic medicine because of underrepresentation in large-scale genetics studies. Here, we discuss why this inequity poses a problem for genomic medicine and the reasons for the low transferability of PRS across populations. We also survey the ancestry representation of published GWAS and investigate how estimates of ancestry diversity in GWAS participants might be biased. We highlight the importance of expanding genetic research in Africa, one of the most underrepresented regions in human genomics research, and discuss issues of ethics, resources, and technology for equitable advancement of genomic medicine.

Keywords: Africa; GWAS; diversity; human genomics; polygenic risk score; precision medicine.

Figures

Figure 1
(a) The cumulative number of European and non-European ancestry studies and participants each year from the GWAS Catalog. (b) The percentage per year of participants of European ancestry and of studies and publications that used exclusively European ancestry individuals. The estimates of numbers of individuals here were calculated without accounting for repeated sampling.
Figure 2
(a) The number of participants in individual height GWAS over time. Points are colored by whether the participants in the study were of European ancestry and assigned a shape based on whether the data are from a major cohort or consortium. (b) The proportion of all participants of height GWAS that are of European and non-European ancestry based on (top) a naïve (i.e., not accounting for repeated sampling of individuals) estimate from the GWAS Catalog and (bottom) an estimate after removing repeatedly sampled individuals across height studies. (c) Proportion of all unique participants from height GWAS for each GWAS Catalog ancestry category other than European. Abbreviations: BBJ, BioBank Japan; GIANT, Genetic Investigation of Anthropometric Traits; GWAS, genome-wide association study; UKB, UK Biobank.
Figure 3
World map of cohorts with genetic and phenotypic data that are part of the International HundredK+ Cohorts Consortium (IHCC). Countries with cohorts are highlighted and individual cohorts are sized according to the number of individuals enrolled. For cohorts with ongoing enrollment, empty circles are drawn according to the target number of individuals. Abbreviations: AHRI, Africa Health Research Institute Population Cohort; AMORIS, Apolipoprotein Mortality Risk Study; BBJ, BioBank Japan; BioVU, Biobank Vanderbilt University; CanPath, Canadian Partnership for Tomorrow’s Health; CHOP, Children’s Hospital of Philadelphia Biorepository; CKB, China Kadoorie Biobank; CPS-II, Cancer Prevention Study II; CPS-II Nutr., CPS-II Nutrition Cohort; CTS, California Teachers Study; DNBC, Danish National Birth Cohort; ECHO, Environmental Influences on Child Health Outcomes Cohort; EGP, Estonian Genome Project; ELGH, East London Genes and Health; ELSA, Estudo Longitudinal de Saúde do Adulto; EPIC, European Prospective Investigation into Cancer, Chronic Diseases, Nutrition and Lifestyle; Geisinger, Geisinger MyCode Community Health Initiative; GenEngland, Genomics England/100,000 Genomes Project; GS, Generations Study; HN, Healthy Nevada; HUNT, Trøndelag Health Study; IsraelGen, Israel Genome Project; JPHC, Japan Public Health Center–Based Prospective Study; KBP, Korea Biobank Project; KCPS-II, Korean Cancer Prevention Study; KoGES, Korean Genome and Epidemiology Study; KPRB, Kaiser Permanente Research Bank; LIFEPATH, Lifecourse Biological Pathways Underlying Social Differences in Healthy Aging Study; MAUCO, Maule Cohort; MC, Malaysian Cohort; MEC, Multiethnic Cohort Study; MoBa, Norwegian Mother and Child Cohort Study; MVP, Million Veteran Program; NFLC, Norwegian Family-Based Life Course Study; NHS, Nurses’ Health Study; NHSII, Nurses’ Health Study II; NICCC, National Israeli Cancer Control Center; NLGP, Newfoundland 100K Genome Project; NSHDS, Northern Sweden Health and Disease Study; NTR, Netherlands Twin Registry; PLCO, Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial; QB, Qatar Biobank; SE-NETWORK, South(east) Asian Cohorts NETWORK; SNPMP, Singapore National Precision Medicine Program; TWB, Taiwan Biobank; UKB, UK Biobank; UKBDC, UK Blood Donor Cohorts; WHI, Women’s Health Initiative.
Figure 4
The sample sizes of the largest GWAS, at the time of writing, for (a) Alzheimer’s disease, (b) asthma, (c) BMI, (d) coronary artery disease, (e) systolic blood pressure, (f) schizophrenia, and (g) type 2 diabetes, across ancestry categories based on the GWAS Catalog ancestry ontology. Abbreviations: AA, African American or Afro-Caribbean; AMR, Native American; BMI, body mass index; EAS, East Asian; EUR, European; GME, Greater Middle Eastern; GWAS, genome-wide association studies; HLA, Hispanic or Latin American; OC, Oceanian; SA, South Asian; SEA, Southeast Asian; SSA, sub-Saharan African.
