Cell Genom. 2023 Aug 8;3(9):100378.
doi: 10.1016/j.xgen.2023.100378. eCollection 2023 Sep 13.

Whole-genome sequencing across 449 samples spanning 47 ethnolinguistic groups provides insights into genetic diversity in Nigeria

Esha Joshi et al. Cell Genom.

Abstract

African populations have been drastically underrepresented in genomics research, and failure to capture the genetic diversity across the numerous ethnolinguistic groups (ELGs) found on the continent has hindered the equity of precision medicine initiatives globally. Here, we describe the whole-genome sequencing of 449 Nigerian individuals across 47 unique self-reported ELGs. Population structure analysis reveals genetic differentiation among our ELGs, consistent with previous findings. From the 36 million SNPs and insertions or deletions (indels) discovered in our dataset, we provide a high-level catalog of both novel and medically relevant variation present across the ELGs. These results emphasize the value of this resource for genomics research, with added granularity by representing multiple ELGs from Nigeria. Our results also underscore the potential of using these cohorts with larger sample sizes to improve our understanding of human ancestry and health in Africa.

Keywords: Nigeria; genomics; non-communicable diseases; population genetics; precision medicine; whole-genome sequencing.

Conflict of interest statement

E.J., A.B., J.P., A.Y., O.O., D.A., E.D., O.O., G.E.-E., A.E.-O., and C.O’D. were employed by 54gene, Inc., at the time this research was conducted. Funding for this study was provided by 54gene, Inc. C.O’D. is currently employed at insitro, San Francisco, CA 94080, USA. insitro had no involvement in the design or implementation of the work presented here. E.D., J.P., A.Y., and A.E.-O. are current employees of Syndicate Bio.

Figures

Graphical abstract

Figure 1. Overview of collection locations and regional designations within Nigeria. Additional details of the sample collection framework are discussed elsewhere.

Figure 2. Collection sites in Nigeria where individuals of the 54gene dataset were sampled. (A) States of origin for collected samples. Marker size is proportional to the number of individuals collected. All states are listed in Table S1. (B) Reported ethnolinguistic group and state of origin for the 15 most prevalent groups. Marker size is proportional to the number of individuals sampled.

Figure 3. Variant counts across ELGs in the 54gene dataset and population groups in the NYGC AFR cohort. (A) 54gene cohort, top 15 ancestries by subject count, known (present in dbSNP build 154) vs. unknown (not present in dbSNP build 154). (B) 54gene cohort, top 15 ancestries by subject count, rare (MAF < 0.1%), uncommon (MAF ≥ 0.1% and < 0.5%), or common (MAF ≥ 0.5%) in gnomAD AFR. (C) NYGC cohort, known (present in dbSNP build 154) vs. unknown. (D) NYGC cohort, rare, uncommon, or common in gnomAD AFR (bounds are the same as in B).

Figure 4. Population structure analysis using ADMIXTURE of ethnolinguistic groups listed in Table 1, alongside select populations from the 1000 Genomes Project (10 random samples from each of: African Caribbean in Barbados [ACB]; African Ancestry in Southwest USA [ASW]; Utah residents [CEPH] with Northern and Western European ancestry [CEU]; Esan in Nigeria [ESN]; Gambian in Western Division, The Gambia - Mandinka [GWD]; Luhya in Webuye, Kenya [LWK]; Mende in Sierra Leone [MSL]; Toscani in Italy [TSI]; Yoruba in Ibadan, Nigeria [YRI]). HGDP populations included were Yoruba in Nigeria (Yoruba) and Mozabite in Mzab, Algeria (Mozabite). Total sample size was n = 422.

Figure 5. Principal-component plot of ethnolinguistic groups listed in Table 1, in addition to Esan from 54gene and the 1000 Genomes Project and Yoruba from the 1000 Genomes Project and HGDP. An additional version of this plot with all ethnolinguistic groups is shown in Figures S3 and S4.
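
The frequency bins named in the Figure 3 caption (rare, uncommon, common by MAF in gnomAD AFR) can be illustrated with a short sketch. This is not the authors' code: the function name `classify_maf` is hypothetical, and the sketch assumes the MAF is supplied as a fraction (0.001 for 0.1%) rather than a percentage.

```python
def classify_maf(maf: float) -> str:
    """Assign a minor allele frequency (as a fraction) to a Figure 3 bin.

    Bounds follow the caption: rare is MAF < 0.1%, uncommon is
    0.1% <= MAF < 0.5%, and common is MAF >= 0.5%.
    """
    # A minor allele frequency cannot exceed 0.5 by definition.
    if not 0.0 <= maf <= 0.5:
        raise ValueError(f"MAF must be between 0 and 0.5, got {maf}")
    if maf < 0.001:   # below 0.1%
        return "rare"
    if maf < 0.005:   # 0.1% up to, but not including, 0.5%
        return "uncommon"
    return "common"   # 0.5% and above


if __name__ == "__main__":
    for example in (0.0005, 0.001, 0.004, 0.005, 0.2):
        print(example, classify_maf(example))
```

Note that both lower bounds are inclusive, so a variant at exactly 0.1% falls in the uncommon bin and one at exactly 0.5% in the common bin.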
