Mol Biol Evol. 2019 Apr 1;36(4):852-860.
doi: 10.1093/molbev/msz008.

LFMM 2: Fast and Accurate Inference of Gene-Environment Associations in Genome-Wide Studies

Kevin Caye et al. Mol Biol Evol. 2019.

Abstract

Gene-environment association (GEA) studies are essential to understand the past and ongoing adaptations of organisms to their environment, but those studies are complicated by confounding due to unobserved demographic factors. Although the confounding problem has recently received considerable attention, the proposed approaches do not scale with the high dimensionality of genomic data. Here, we present a new estimation method for latent factor mixed models (LFMMs) implemented in an upgraded version of the corresponding computer program. We developed a least-squares estimation approach for confounder estimation that provides a single framework for several categories of genomic data, not restricted to genotypes. The new algorithm is several orders of magnitude faster than existing GEA approaches and than our previous version of the LFMM program. In addition, the new method outperforms other fast approaches based on principal component or surrogate variable analysis. We illustrate the use of the program with analyses of the 1000 Genomes Project data set, leading to new findings on adaptation of humans to their environment, and with analyses of DNA methylation profiles providing insights on how tobacco consumption could affect DNA methylation in patients with rheumatoid arthritis. Software availability: Software is available in the R package lfmm at https://bcm-uga.github.io/lfmm/.

Keywords: confounding factors; ecological genomics; gene-environment association; local adaptation; statistical methods.
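
The least-squares idea in the abstract can be illustrated with a minimal sketch. It assumes a model Y = X B^T + W + noise, where Y is an n x p response matrix (for example genotypes), X is an n x d matrix of environmental variables, W is a rank-K latent (confounder) matrix, and a ridge penalty lam is placed on the effect sizes B. The function name `lfmm_ridge_sketch`, the parameter names, and the simulated data are all hypothetical; this is not the code of the lfmm R package, which remains the reference implementation.

```python
import numpy as np

def lfmm_ridge_sketch(Y, X, K, lam=1e-5):
    """Ridge least-squares fit of Y ~ X B^T + W with rank(W) <= K (illustrative only)."""
    n, d = X.shape
    # Full SVD of X: Q is an n x n orthogonal matrix, sigma holds min(n, d) singular values.
    Q, sigma, _ = np.linalg.svd(X, full_matrices=True)
    # Profiling out B under a ridge penalty shrinks the directions spanned by X.
    d_lam = np.ones(n)
    d_lam[:sigma.size] = np.sqrt(lam / (lam + sigma**2))
    # Best rank-K approximation of the transformed response (truncated SVD).
    A = d_lam[:, None] * (Q.T @ Y)
    Ua, sa, Vat = np.linalg.svd(A, full_matrices=False)
    A_K = (Ua[:, :K] * sa[:K]) @ Vat[:K, :]
    # Map back to the original coordinates to obtain the latent matrix.
    W = Q @ (A_K / d_lam[:, None])
    # Ridge regression of the de-confounded response on X gives the effect sizes (p x d).
    B = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ (Y - W)).T
    return W, B

# Hypothetical simulated data: 60 individuals, 200 loci, 1 environmental variable, 2 factors.
rng = np.random.default_rng(0)
n, p, d, K = 60, 200, 1, 2
X = rng.normal(size=(n, d))
U = rng.normal(size=(n, K))
V = rng.normal(size=(p, K))
B_true = np.zeros((p, d))
B_true[:10] = 2.0  # the first 10 loci are associated with the environment
Y = X @ B_true.T + U @ V.T + rng.normal(scale=0.5, size=(n, p))

W_hat, B_hat = lfmm_ridge_sketch(Y, X, K)
```

The sketch relies on two SVDs and one small linear solve rather than on iterative sampling, which is one plausible reason a least-squares formulation can be fast; the actual speedups reported for LFMM 2.0 come from the published software, not from this toy code.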

Figures

Fig. 1.
Base 10 logarithm of the ratio of runtimes for LFMM 1.5 and LFMM 2.0. A value of 5 means that LFMM 2.0 runs 10^5 times faster than LFMM 1.5. (A) n = 100 individuals, (B) n = 400, (C) n = 1,000; p is the total number of markers in the simulation.
Fig. 2.
True discovery rate and F-score as a function of confounding intensity. Three fast methods are considered: CATE, LFMM 2.0 and PCA. All methods were applied with K = 8 factors as determined by a PCA screeplot. The F-score is the harmonic mean of the true discovery rate (precision) and power.
Fig. 3.
Human GEA study. Association study based on genomic data from the 1000 Genomes Project database and climatic data from the Worldclim database. (A) Latent factors estimated by LFMM 2.0. (B) Target genes corresponding to top hits of the GEA analysis (expected FDR level of 5%). The highlighted genes correspond to functional variants. (C) Predictions obtained from the VEP program.
Fig. 4.
EWAS of RA and smoking. Fisher’s scores for CpG sites showing significant association with RA and smoking in at least two of three approaches (PCA, CATE, LFMM 2.0).
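
Two of the quantities named in the captions above are simple to compute, and a short sketch may make them concrete: the base 10 logarithm of a runtime ratio (Fig. 1) and the F-score as the harmonic mean of precision and power (Fig. 2). The function names and the numeric inputs below are hypothetical illustrations, not values taken from the study.

```python
import math

def log10_runtime_ratio(t_old, t_new):
    """Base 10 logarithm of the ratio of two runtimes (same time unit)."""
    return math.log10(t_old / t_new)

def f_score(precision, power):
    """Harmonic mean of precision (true discovery rate) and power."""
    if precision + power == 0:
        return 0.0  # convention when both quantities are zero
    return 2 * precision * power / (precision + power)

# Hypothetical timings: an old run of 100,000 s against a new run of 1 s.
ratio = log10_runtime_ratio(100_000.0, 1.0)

# Hypothetical method with precision 0.9 and power 0.6.
score = f_score(0.9, 0.6)
```

Because the harmonic mean is dominated by the smaller of its two arguments, a method scores well on the F-score only when precision and power are both high.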
