Nat Genet. 2025 Jun;57(6):1402-1417.
doi: 10.1038/s41588-025-02100-w. Epub 2025 May 21.

Genome-wide association study of long COVID

Vilma Lammi, Tomoko Nakanishi, Samuel E Jones, Shea J Andrews, Juha Karjalainen, Beatriz Cortés, Heath E O'Brien, Ana Ochoa-Guzman, Brian E Fulton-Howard, Martin Broberg, Hele H Haapaniemi, Masahiro Kanai, Matti Pirinen, Axel Schmidt, Ruth E Mitchell, Abdou Mousas, Massimo Mangino, Alicia Huerta-Chagoya, Nasa Sinnott-Armstrong, Elizabeth T Cirulli, Marc Vaudel, Alex S F Kwong, Amit K Maiti, Minttu M Marttila, Daniel C Posner, Alexis A Rodriguez, Chiara Batini, Francesca Minnai, Anna R Dearman, C A Robert Warmerdam, Celia B Sequeros, Thomas W Winkler, Daniel M Jordan, Raimonds Rešcenko, Lorenzo Miano, Jacqueline M Lane, Ryan K Chung, Beatriz Guillen-Guio, Olivia C Leavy, Laura Carvajal-Silva, Kevin Aguilar-Valdés, Erika Frangione, Lindsay Guare, Ekaterina Vergasova, Eirini Marouli, Pasquale Striano, Ummu Afeera Zainulabid, Ashutosh Kumar, Hajar Fauzan Ahmad, Ryuya Edahiro, Shuhei Azekawa, Long COVID Host Genetics Initiative, FinnGen, VA Million Veteran Program, MexGen-COVID Initiative, DBDS Genomic Consortium, GEN-COVID Multicenter Study, PHOSP-COVID Collaborative Group, GENCOV Study, Estonian Biobank Research Team, Shiuh-Wen Luoh, Christian Erikstrup, Ole B V Pedersen, Jordan Lerner-Ellis, Alicia Colombo, Joseph J Grzymski, Makoto Ishii, Yukinori Okada, Noam D Beckmann, Meena Kumari, Ralf Wagner, Iris M Heid, Catherine John, Patrick J Short, Per Magnus, Laura Ansone, Luca V C Valenti, Sulggi A Lee, Louise V Wain, Ricardo A Verdugo, Karina Banasik, Frank Geller, Lude H Franke, Alexander Rakitko, Emma L Duncan, Alessandra Renieri, Konstantinos K Tsilidis, Rafael de Cid, Ahmadreza Niavarani, Erik Abner, Teresa Tusié-Luna, Shefali S Verma, George Davey Smith, Nicholas J Timpson, Ravi K Madduri, Kelly Cho, Mark J Daly, Andrea Ganna, Eva C Schulte, J Brent Richards, Kerstin U Ludwig, Michael Marks-Hultström, Hugo Zeberg, Hanna M Ollila

Vilma Lammi et al. Nat Genet. 2025 Jun.

Abstract

Infections can lead to persistent symptoms and diseases such as shingles after varicella zoster or rheumatic fever after streptococcal infections. Similarly, severe acute respiratory syndrome coronavirus 2 (SARS‑CoV‑2) infection can result in long coronavirus disease (COVID), typically manifesting as fatigue, pulmonary symptoms and cognitive dysfunction. The biological mechanisms behind long COVID remain unclear. We performed a genome-wide association study for long COVID including up to 6,450 long COVID cases and 1,093,995 population controls from 24 studies across 16 countries. We discovered an association of FOXP4 with long COVID, independent of its previously identified association with severe COVID-19. The signal was replicated in 9,500 long COVID cases and 798,835 population controls. Given the transcription factor FOXP4's role in lung physiology and pathology, our findings highlight the importance of lung function in the pathophysiology of long COVID.

Conflict of interest statement

Competing interests: S.B. holds ownership in Intomics A/S, Hoba Therapeutics Aps, Novo Nordisk A/S, Lundbeck A/S, ALK abello A/S, Eli Lilly and Co and holds managing board memberships in Proscion A/S and Intomics A/S. A.B., K.M.S.B., S.W., N.L.W., F.T., E.S. and E.T.C. are employees of Helix. A.D. received an honorarium from Gilead Sciences. A.L.G. and C.J. have funded research collaborations with Orion for collaborative research projects outside the submitted work. T.H. and H.E.O.B. have options in Sano Genetics. P.J.S. is a shareholder of Sano Genetics. T.H.K. has received consulting fees from Albireo, Boehringer Ingelheim, MSD and Falk Pharma. K.U.L. is cofounder and member of the scientific board of LAMPseq Diagnostics GmbH. T.N. has received speaking fees from Boehringer Ingelheim for talks unrelated to this research. M.E.K.N. is a current employee of Novartis Pharma AG. J.B.R.’s institution has received investigator-initiated grant funding from Eli Lilly, GlaxoSmithKline and Biogen for projects unrelated to this research. He is the CEO of 5 Prime Sciences (www.5primesciences.com), which provides research services for biotech, pharma and venture capital companies for projects unrelated to this research. V.F. is an employee of 5 Prime Sciences. C.D.S. reports grants and personal fees from AstraZeneca, Janssen-Cilag and ViiV Healthcare, personal fees and nonfinancial support from BBraun Melsungen, grants, personal fees and nonfinancial support from Gilead Sciences, personal fees from BioNtech, Eli Lilly, Formycon, Pfizer, Roche, Apeiron, GSK, Molecular partners, SOBI, AbbVie, MSD and Synairgen and grants from Cepheid. L.V.W. reports research funding from GlaxoSmithKline, Genentech and Orion Pharma, and consultancy for Galapagos and GlaxoSmithKline, outside of the submitted work. J.W. is a consultant for Roboscreen GmbH, Biogen GmbH, Immungenetics AG, Noselab GmbH, Roche Diagnostics International, Roche Pharma AG, Janssen-Cilag GmbH, Eisai GmbH, Boehringer Ingelheim and Lilly Deutschland GmbH and has received honoraria from Eisai GmbH, Biogen GmbH, AGNP e. V., Veranex, Med Update GmbH, Guangzhou Gloryren Medical Technology (China), Pfizer Pharma GmbH, Fachverband Rheumatologische Fachassistenz e. V., AWO Psychiatrie Akademie gGmbH, Neuroakademie E. V., Beijing Yibai Science und Technology Ltd., Abbott Laboratories GmbH, Lilly Deutschland GmbH, Simon & Kucher and streamedup! GmbH. The other authors declare no competing interests.

Figures

Fig. 1. Geographic overview of studies contributing to the Long COVID HGI.
The 24 studies contributing to the Long COVID HGI data freeze 4 served as the discovery cohorts for the GWAS meta-analyses. Each color represents a meta-analysis with specific case and control definitions. Strict case definition, long COVID after test-verified SARS-CoV-2 infection; broad case definition, long COVID after any SARS-CoV-2 infection; strict control definition, individuals who had SARS-CoV-2 infection but did not develop long COVID; broad control definition, population controls, that is, all individuals in each study who did not meet the long COVID criteria. The size of each diamond shape reflects the effective sample size, and the locations of sample collection are shown (from left to right) in North America, Europe, the Middle East and Asia. For more detailed sample sizes, see Supplementary Table 11.
Fig. 2. Meta-analysis of 11 GWAS studies of long COVID shows an association at the FOXP4 locus.
a, Manhattan plot of long COVID after test-verified SARS-CoV-2 infection (strict case definition, n = 3,018) compared to all other individuals in each dataset (population controls, broad control definition, n = 994,582). A genome-wide significant association with long COVID was found on chromosome 6, upstream of the FOXP4 gene (lead variant rs9367106, chr6: 41,515,652 G:C, GRCh38; P = 1.76 × 10−10, Bonferroni P = 7.06 × 10−10; increased risk with the C allele, OR = 1.63, 95% CI = 1.40–1.89). Horizontal lines indicate genome-wide significance thresholds for IVW meta-analysis before (P < 5 × 10−8, dashed line) and after (P < 1.25 × 10−8) Bonferroni correction over the four long COVID meta-analyses (INCMNSZ = MexGen-COVID Initiative). b, Chromosome 6 lead variant across the contributing studies and ancestries in GWAS meta-analyses of long COVID with strict case definition and broad control definition. Solid lines show the lead variant rs9367106; where it was missing, it was imputed for illustrative purposes by the variant in highest LD with the lead variant, rs12660421 (r = 0.98 in European-ancestry 1000G + HGDP samples; dotted lines). For the imputed variants, β was weighted by multiplying by the LD correlation coefficient (r = 0.98). Center, OR; error bar, 95% CI. Genetic ancestries are marked by colors. MAF varies across ancestries, ranging from 1% to 34% (Supplementary Fig. 4). AFR, African; AMR, Admixed American; EAS, East Asian; EUR, European; UKBB, UK Biobank. (Results for the other three GWAS meta-analyses are shown in Supplementary Figs. 2 and 3a–c.)
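The arithmetic behind the quantities in this caption can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' pipeline: it assumes the reported 95% CI is a symmetric Wald interval on the log-odds scale, it works only from the rounded values printed in the caption, and the helper `ld_weighted_beta` and its example input are hypothetical names introduced here.

```python
import math

# Values quoted from the Fig. 2 caption (rounded as printed there).
GENOME_WIDE_ALPHA = 5e-8   # conventional genome-wide significance threshold
N_META_ANALYSES = 4        # four long COVID meta-analyses
LEAD_P = 1.76e-10          # P value of lead variant rs9367106
OR, CI_LOW, CI_HIGH = 1.63, 1.40, 1.89

# Bonferroni correction: divide the threshold (or multiply the P value)
# by the number of meta-analyses tested.
bonferroni_threshold = GENOME_WIDE_ALPHA / N_META_ANALYSES
bonferroni_p = min(1.0, LEAD_P * N_META_ANALYSES)

# Assumption: the CI is a Wald interval, so on the log-odds scale
# its half-width is 1.96 standard errors.
beta = math.log(OR)
se = (math.log(CI_HIGH) - math.log(CI_LOW)) / (2 * 1.96)
z = beta / se
# Two-sided P value from the standard normal distribution.
p_two_sided = math.erfc(abs(z) / math.sqrt(2))


def ld_weighted_beta(proxy_beta: float, r: float) -> float:
    """Scale a proxy variant's effect by its LD correlation with the lead variant."""
    return proxy_beta * r


print(f"Bonferroni threshold: {bonferroni_threshold:.3g}")
print(f"Bonferroni-adjusted P: {bonferroni_p:.3g}")
print(f"beta = {beta:.3f}, SE = {se:.4f}, z = {z:.2f}, P = {p_two_sided:.2g}")
# Hypothetical proxy effect of 0.50, weighted by r = 0.98 as in panel b.
print(f"LD-weighted proxy beta: {ld_weighted_beta(0.50, 0.98):.3f}")
```

Because the inputs are rounded, the recomputed values only approximate the reported ones; for example, multiplying the rounded P value by four gives a figure close to, but not identical with, the Bonferroni P printed in the caption.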
Fig. 3. The chromosome 6 region (chr6: 41,490,001–41,560,000 (70 kb); FOXP4 locus) in the long COVID GWAS meta-analysis.
Long COVID meta-analysis with strict case (n = 3,018) and broad control (n = 994,582) definition (Fig. 2). X axis shows the position on chromosome 6 (GRCh38). The long COVID lead variant (rs9367106) is depicted with a triangle in each plot. a, Locus zoom plot with each variant colored by effective sample size and showing statistical significance (IVW GWAS meta-analysis −log10 P value) on y axis. b, Each variant colored by statistical significance and showing effect sizes (center, coefficients; error bar, 95% CI on y axis). c, Each variant colored by ancestry and showing LD correlation coefficient (r) with the long COVID lead variant on y axis. d, Ensembl genes in the region (FOXP4 not fully shown; www.ensembl.org).
Fig. 4. FOXP4 expression in the lung.
a, The lead variant rs9367106 was not found in the GTEx dataset, but a proxy variant (rs12660421, chr6: 41,520,640) in high LD (r² = 0.97; the rs12660421-A allele is correlated with the long COVID risk allele rs9367106-C) showed a significant eQTL after multiple testing correction, increasing FOXP4 expression in the lung (P = 5.3 × 10−9, normalized effect size (NES; expression with GA genotype compared to expression with GG, normalized to 0) = 0.56; GTEx v8 lung samples with GG genotype, n = 483, GA genotype, n = 32; https://gtexportal.org/home/snp/rs12660421). For other tissues, see the multitissue eQTL plot in Supplementary Fig. 6. b, Colocalization analysis using eQTL data from GTEx v8 tissue type and long COVID GWAS meta-analysis association data (Supplementary Note). Plots illustrate −log10 P value for long COVID (x axis) and for FOXP4 expression in the lung (y axis), regional association of the FOXP4 locus variants with long COVID (top right) and regional association of the FOXP4 variants with RNA expression measured in the lung in GTEx (bottom right). Variants are colored by 1000 Genomes European-ancestry LD r² with the lead variant (rs12660421) for FOXP4 expression in lung tissue; the most significant long COVID variant overlapping the GTEx v8 dataset (rs9381074) is also annotated. c, Human Protein Atlas RNA single-cell type tissue cluster data (transcript expression levels summarized per gene and cluster) of lung (GSE130148) showing FOXP4 expression in unaffected individuals. The values were visualized using log10(pTPM + 1) values. Each annotation is taken from the clustering results performed in the Human Protein Atlas. pTPM, protein transcripts per million.
Fig. 5. Genetic correlations and MR causal estimates between long COVID and potential risk factors, biomarkers and diseases.
a,b, LD score regression (a, LDSC, top; Supplementary Table 27) and IVW MR (b, fixed-effects model, bottom; Supplementary Table 29 and Supplementary Data) were used for calculating two-sided P values. The size of each colored square corresponds to statistical significance (***P < 0.0001, full-sized square; **P < 0.01, full-sized square; *P < 0.05, full-sized square; P < 0.1, large square; P < 0.5, medium square and P > 0.5, small square; not corrected for multiple comparisons). A full list of traits is provided in Supplementary Table 26. For sample sizes in each long COVID GWAS meta-analysis using strict (S) or broad (B) case and control definitions, see Supplementary Table 11. c, MR scatter plot with effect sizes (β ± s.e.) of each variant on COVID-19 susceptibility (reported SARS-CoV-2 infection) as exposure and long COVID (strict case, broad control definition) as outcome (P (IVW, fixed effects) = 1.8 × 10−7, pleiotropy P = 0.47; Supplementary Table 30). d, Similarly, MR with COVID-19 hospitalization as exposure and long COVID as outcome (P (IVW, fixed effects) = 4.8 × 10−8, pleiotropy P = 0.83; Supplementary Table 30). e, Analysis of shared and unique effects between SARS-CoV-2 infection susceptibility and long COVID using a Bayesian mixture model showed ABO and 3p21.31 rs73062389 as having shared effects (posterior probability > 0.99). The FOXP4 variant association was discovered in the long COVID meta-analyses but also showed an effect on susceptibility to the initial infection, though smaller than its effect on long COVID (Supplementary Table 34). (Effects are shown as β; error bars represent 95% confidence intervals.) f, Similarly, analysis of shared and unique effects between COVID-19 severity and long COVID using a Bayesian mixture model showed the FOXP4 variant with a joint effect (posterior probability > 0.9), differing from the other severity variants due to its larger effect on long COVID (Supplementary Table 35).
BMI, body mass index; CRP, C-reactive protein; eGFR, estimated glomerular filtration rate; ADHD, attention-deficit hyperactivity disorder.
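For readers unfamiliar with the IVW fixed-effects MR estimate referred to in panels b–d, the following Python sketch shows the standard textbook form of the estimator. It is not the authors' code, the function name `ivw_fixed_effects` is introduced here for illustration, and the per-variant effect sizes are hypothetical numbers, not data from the study.

```python
import math
from typing import Sequence, Tuple


def ivw_fixed_effects(
    beta_exposure: Sequence[float],
    beta_outcome: Sequence[float],
    se_outcome: Sequence[float],
) -> Tuple[float, float]:
    """Inverse-variance weighted (fixed-effects) MR estimate and its standard error.

    Equivalent to a weighted regression of outcome effects on exposure effects
    through the origin, with weights 1 / se_outcome**2.
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have the same length")
    if len(beta_exposure) == 0:
        raise ValueError("at least one variant is required")
    numerator = sum(
        bx * by / s**2 for bx, by, s in zip(beta_exposure, beta_outcome, se_outcome)
    )
    denominator = sum(bx**2 / s**2 for bx, s in zip(beta_exposure, se_outcome))
    estimate = numerator / denominator
    standard_error = math.sqrt(1.0 / denominator)
    return estimate, standard_error


# Hypothetical per-variant effects (exposure β, outcome β, outcome s.e.).
bx = [0.10, 0.15, 0.22, 0.30]
by = [0.04, 0.08, 0.10, 0.16]
se_y = [0.02, 0.03, 0.02, 0.04]

estimate, standard_error = ivw_fixed_effects(bx, by, se_y)
z = estimate / standard_error
p_two_sided = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal P value
print(f"IVW estimate = {estimate:.3f}, s.e. = {standard_error:.3f}, P = {p_two_sided:.2g}")
```

The fixed-effects form assumes that every variant estimates the same causal effect; the pleiotropy P values quoted in the caption come from a separate test that this sketch does not implement.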
