PLoS One. 2022 Sep 28;17(9):e0264657.
doi: 10.1371/journal.pone.0264657. eCollection 2022.

GWAS in the southern African context

Yolandi Swart et al. PLoS One. 2022.

Abstract

Researchers generally adjust for the possible confounding effect of population structure by including global ancestry proportions or the top principal components as covariates. Alternatively, they conduct admixture mapping to increase the power to detect variants with an ancestry effect. These approaches are sufficient in simple admixture scenarios; however, populations from southern Africa can be complex multi-way admixed populations. Duan et al. (2018) first described local ancestry adjusted allelic (LAAA) analysis as a robust method for discovering association signals while producing minimal false positive hits. Their simulation study, however, was limited to a two-way admixed population. Realizing that their findings might not translate to other admixture scenarios, we simulated a three-way and a five-way admixed population to compare the LAAA model to other models commonly used in genome-wide association studies (GWAS). We found that, given our admixture scenarios, the LAAA model identified the most causal variants in most of the phenotypes we tested across both the three-way and five-way admixed populations. The LAAA model also produced a high number of false positive hits, which was potentially caused by the ancestry effect size that we assumed. Considering the extent to which the results of the various models differed, and considering that the source of a given association is unknown, we recommend that researchers use multiple GWAS models when analysing populations with complex ancestry.
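To illustrate why adjusting for global ancestry matters, the following is a minimal toy sketch and not the authors' pipeline. It assumes a hypothetical two-way admixed sample in which the global ancestry proportion shifts both the allele frequency of a SNP and the phenotype, while the SNP itself has no causal effect. All parameter values (sample size, allele frequencies, effect size) and function names are illustrative assumptions. The adjusted estimate is obtained by regressing both genotype and phenotype on ancestry and then regressing the residuals on each other, which gives the same SNP coefficient as a joint regression on genotype and ancestry.

```python
import random


def slope(x, y):
    """Ordinary least-squares slope of y on x (one predictor plus intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


def residuals(x, y):
    """Residuals of y after regressing it on x (one predictor plus intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = slope(x, y)
    return [yi - my - b * (xi - mx) for xi, yi in zip(x, y)]


def simulate(n=2000, seed=1):
    """Toy two-way admixed sample; every number here is an assumed value."""
    rng = random.Random(seed)
    ancestry, genotype, phenotype = [], [], []
    for _ in range(n):
        a = rng.random()                 # global ancestry proportion in [0, 1)
        p = 0.1 + 0.8 * a                # allele frequency depends on ancestry
        g = int(rng.random() < p) + int(rng.random() < p)  # genotype 0, 1 or 2
        y = 2.0 * a + rng.gauss(0.0, 1.0)  # phenotype depends on ancestry only
        ancestry.append(a)
        genotype.append(g)
        phenotype.append(y)
    return ancestry, genotype, phenotype


ancestry, genotype, phenotype = simulate()

# Standard model: phenotype ~ genotype (no ancestry adjustment).
unadjusted = slope(genotype, phenotype)

# Ancestry-adjusted model: regress ancestry out of both variables first.
adjusted = slope(residuals(ancestry, genotype), residuals(ancestry, phenotype))

print(f"unadjusted SNP effect: {unadjusted:.3f}")
print(f"ancestry-adjusted SNP effect: {adjusted:.3f}")
```

In this sketch the unadjusted estimate is inflated by ancestry alone, whereas the adjusted estimate should be close to zero. Local ancestry models such as LAAA go further by using ancestry at the tested locus, which this toy example does not attempt to reproduce.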

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. The demographic histories of the simulated populations (Nama and SAC).
Thick solid lines with open arrowheads indicate an ancestor-to-descendant relation, dashed lines indicate an admixture pulse, and faint solid lines with closed arrowheads indicate continuous migration. The two admixture events for the Nama are indicated by the dashed lines with arrowheads (purple and light pink). The one admixture event for the SAC is indicated by the four solid lines with open arrowheads (green, dark pink, light pink and light brown).
Fig 2. Overview of the methods used to simulate genotypes using the software msprime and phenotypes using the software PhenotypeSimulator.
Fig 3. Comparison of true positive hits (left) and false positive hits (right) for the Nama with inferred local ancestry used in GWAS models.
The average hits for three runs with different causal SNPs are shown. The simulated phenotypes are denoted as “phenotype-ancestral source of association”, e.g., LAAA-LWK means the LAAA phenotype with the LWK ancestral component as the ancestral source of association. The various GWAS models used are indicated in different colours: grey represents the APA model, yellow the GA model, blue the LA model, green the LAAA model and orange the Standard model.
Fig 4. Comparison of true positive hits (left) and false positive hits (right) for the Nama with true local ancestry used in GWAS models.
The average hits for three runs with different causal SNPs are shown. The simulated phenotypes are denoted as “phenotype-ancestral source of association”, e.g., LAAA-LWK means the LAAA phenotype with the LWK ancestral component as the ancestral source of association. The various GWAS models used are indicated in different colours: grey represents the APA model, yellow the GA model, blue the LA model, green the LAAA model and orange the Standard model.
Fig 5. Comparison of true positive hits (left) and false positive hits (right) for the SAC with inferred local ancestry used in GWAS models.
The average hits for three runs with different causal SNPs are shown. The simulated phenotypes are denoted as “phenotype-ancestral source of association”, e.g., LAAA-CHB means the LAAA phenotype with the CHB ancestral component as the ancestral source of association. The various GWAS models used are indicated in different colours: grey represents the APA model, yellow the GA model, blue the LA model, green the LAAA model and orange the Standard model.
Fig 6. Comparison of true positive hits (left) and false positive hits (right) for the SAC with true local ancestry used in GWAS models.
The average hits for three runs with different causal SNPs are shown. The simulated phenotypes are denoted as “phenotype-ancestral source of association”, e.g., LAAA-CHB means the LAAA phenotype with the CHB ancestral component as the ancestral source of association. The various GWAS models used are indicated in different colours: grey represents the APA model, yellow the GA model, blue the LA model, green the LAAA model and orange the Standard model.
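
The averaging described in the captions of Figs 3 to 6 can be sketched as follows. This is a hedged illustration only: it assumes that a true positive is a significant SNP that is one of the simulated causal SNPs and a false positive is any other significant SNP, which may differ from the exact definition used in the paper. The SNP identifiers and function names are hypothetical.

```python
def count_hits(significant, causal):
    """Return (true positives, false positives) for one simulation run.

    Assumption: a hit is a true positive only if the significant SNP is
    itself one of the simulated causal SNPs.
    """
    significant, causal = set(significant), set(causal)
    true_positives = len(significant & causal)
    false_positives = len(significant - causal)
    return true_positives, false_positives


def average_hits(runs):
    """Average true and false positive counts over several runs."""
    counts = [count_hits(sig, causal) for sig, causal in runs]
    n = len(counts)
    mean_tp = sum(tp for tp, _ in counts) / n
    mean_fp = sum(fp for _, fp in counts) / n
    return mean_tp, mean_fp


# Three hypothetical runs, each with its own set of causal SNPs:
# (SNPs reported as significant, SNPs simulated as causal)
runs = [
    ({"rs1", "rs2", "rs9"}, {"rs1", "rs2", "rs3"}),  # 2 true, 1 false
    ({"rs4"}, {"rs4", "rs5"}),                       # 1 true, 0 false
    ({"rs6", "rs7", "rs8"}, {"rs6"}),                # 1 true, 2 false
]

mean_tp, mean_fp = average_hits(runs)
print(f"mean true positives: {mean_tp:.2f}, mean false positives: {mean_fp:.2f}")
```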
