Nature. 2023 Jan;613(7944):508-518.
doi: 10.1038/s41586-022-05473-8. Epub 2023 Jan 18.

FinnGen provides genetic insights from a well-phenotyped isolated population

Mitja I Kurki, Juha Karjalainen, Priit Palta, Timo P Sipilä, Kati Kristiansson, Kati M Donner, Mary P Reeve, Hannele Laivuori, Mervi Aavikko, Mari A Kaunisto, Anu Loukola, Elisa Lahtela, Hannele Mattsson, Päivi Laiho, Pietro Della Briotta Parolo, Arto A Lehisto, Masahiro Kanai, Nina Mars, Joel Rämö, Tuomo Kiiskinen, Henrike O Heyne, Kumar Veerapen, Sina Rüeger, Susanna Lemmelä, Wei Zhou, Sanni Ruotsalainen, Kalle Pärn, Tero Hiekkalinna, Sami Koskelainen, Teemu Paajanen, Vincent Llorens, Javier Gracia-Tabuenca, Harri Siirtola, Kadri Reis, Abdelrahman G Elnahas, Benjamin Sun, Christopher N Foley, Katriina Aalto-Setälä, Kaur Alasoo, Mikko Arvas, Kirsi Auro, Shameek Biswas, Argyro Bizaki-Vallaskangas, Olli Carpen, Chia-Yen Chen, Oluwaseun A Dada, Zhihao Ding, Margaret G Ehm, Kari Eklund, Martti Färkkilä, Hilary Finucane, Andrea Ganna, Awaisa Ghazal, Robert R Graham, Eric M Green, Antti Hakanen, Marco Hautalahti, Åsa K Hedman, Mikko Hiltunen, Reetta Hinttala, Iiris Hovatta, Xinli Hu, Adriana Huertas-Vazquez, Laura Huilaja, Julie Hunkapiller, Howard Jacob, Jan-Nygaard Jensen, Heikki Joensuu, Sally John, Valtteri Julkunen, Marc Jung, Juhani Junttila, Kai Kaarniranta, Mika Kähönen, Risto Kajanne, Lila Kallio, Reetta Kälviäinen, Jaakko Kaprio, FinnGen, Nurlan Kerimov, Johannes Kettunen, Elina Kilpeläinen, Terhi Kilpi, Katherine Klinger, Veli-Matti Kosma, Teijo Kuopio, Venla Kurra, Triin Laisk, Jari Laukkanen, Nathan Lawless, Aoxing Liu, Simonne Longerich, Reedik Mägi, Johanna Mäkelä, Antti Mäkitie, Anders Malarstig, Arto Mannermaa, Joseph Maranville, Athena Matakidou, Tuomo Meretoja, Sahar V Mozaffari, Mari E K Niemi, Marianna Niemi, Teemu Niiranen, Christopher J O'Donnell, Ma'en Obeidat, George Okafo, Hanna M Ollila, Antti Palomäki, Tuula Palotie, Jukka Partanen, Dirk S Paul, Margit Pelkonen, Rion K Pendergrass, Slavé Petrovski, Anne Pitkäranta, Adam Platt, David Pulford, Eero Punkka, Pirkko Pussinen, Neha Raghavan, Fedik Rahimov, Deepak Rajpal, Nicole A Renaud, Bridget Riley-Gillis, Rodosthenis Rodosthenous, Elmo Saarentaus, Aino Salminen, Eveliina Salminen, Veikko Salomaa, Johanna Schleutker, Raisa Serpi, Huei-Yi Shen, Richard Siegel, Kaisa Silander, Sanna Siltanen, Sirpa Soini, Hilkka Soininen, Jae Hoon Sul, Ioanna Tachmazidou, Kaisa Tasanen, Pentti Tienari, Sanna Toppila-Salmi, Taru Tukiainen, Tiinamaija Tuomi, Joni A Turunen, Jacob C Ulirsch, Felix Vaura, Petri Virolainen, Jeffrey Waring, Dawn Waterworth, Robert Yang, Mari Nelis, Anu Reigo, Andres Metspalu, Lili Milani, Tõnu Esko, Caroline Fox, Aki S Havulinna, Markus Perola, Samuli Ripatti, Anu Jalanko, Tarja Laitinen, Tomi P Mäkelä, Robert Plenge, Mark McCarthy, Heiko Runz, Mark J Daly, Aarno Palotie

Mitja I Kurki et al. Nature. 2023 Jan.

Erratum in

  • Author Correction: FinnGen provides genetic insights from a well-phenotyped isolated population.
    Kurki MI, et al. Nature. 2023 Mar;615(7952):E19. doi: 10.1038/s41586-023-05837-8. PMID: 36829046. Free PMC article. No abstract available.

Abstract

Population isolates such as those in Finland benefit genetic research because deleterious alleles are often concentrated on a small number of low-frequency variants (0.1% ≤ minor allele frequency < 5%). These variants survived the founding bottleneck rather than being distributed over a large number of ultrarare variants. Although this effect is well established in Mendelian genetics, its value in common disease genetics is less explored [1,2]. FinnGen aims to study the genome and national health register data of 500,000 Finnish individuals. Given the relatively high median age of participants (63 years) and the substantial fraction of hospital-based recruitment, FinnGen is enriched for disease end points. Here we analyse data from 224,737 participants from FinnGen and study 15 diseases that have previously been investigated in large genome-wide association studies (GWASs). We also include meta-analyses of biobank data from Estonia and the United Kingdom. We identified 30 new associations, primarily low-frequency variants, enriched in the Finnish population. A GWAS of 1,932 diseases also identified 2,733 genome-wide significant associations (893 phenome-wide significant (PWS), P < 2.6 × 10⁻¹¹) at 2,496 (771 PWS) independent loci with 807 (247 PWS) end points. Among these, fine-mapping implicated 148 (73 PWS) coding variants associated with 83 (42 PWS) end points. Moreover, 91 (47 PWS) had an allele frequency of <5% in non-Finnish European individuals, of which 62 (32 PWS) were enriched by more than twofold in Finland. These findings demonstrate the power of bottlenecked populations to find entry points into the biology of common diseases through low-frequency, high impact variants.
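
The phenome-wide significance threshold quoted in the abstract can be reproduced with a short calculation. The sketch below assumes the threshold is the conventional genome-wide level of 5 × 10⁻⁸ divided by the 1,932 end points (a Bonferroni-style correction); the abstract gives only the resulting value, so both the 5 × 10⁻⁸ starting point and the function name are assumptions made for illustration.

```python
# Assumption: PWS threshold = conventional genome-wide threshold / number of end points.
GENOME_WIDE_ALPHA = 5e-8  # conventional GWAS significance level (not stated in the abstract)
N_ENDPOINTS = 1932        # number of disease end points, taken from the abstract


def phenome_wide_threshold(alpha: float, n_tests: int) -> float:
    """Bonferroni-style correction: divide the per-test level by the number of tests."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


pws_threshold = phenome_wide_threshold(GENOME_WIDE_ALPHA, N_ENDPOINTS)
print(f"{pws_threshold:.2e}")  # prints 2.59e-11
```

Rounded to two significant figures this gives 2.6 × 10⁻¹¹, which matches the value reported above under the stated assumption.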


Conflict of interest statement

J.C.U.: has received compensation for consulting from Goldfinch Bio and is an employee of Patch Biosciences. K.E.: Consultation fees from Sobi, and Orion corporation. A. Palomäki: consulting fee from Abbvie, Amgen and Pfizer, lecture fee from Pfizer and Sanofi. H. Jacob: employee of AbbVie. F.R.: employee of AbbVie. B.R.-G.: employee of AbbVie. J.W.: employee of AbbVie. A. Matakidou: employee of AstraZeneca. D.S.P.: employee of AstraZeneca. S.P.: employee of AstraZeneca. A. Platt: employee of AstraZeneca. I.T.: employee of AstraZeneca. B.S.: employee of Biogen. C.-Y.C.: employee of Biogen. S.J.: employee of Biogen. H.R.: employee of Biogen. Z.D.: employee of Boehringer Ingelheim. J.-N.J.: employee of Boehringer Ingelheim. M.J.: employee of Boehringer Ingelheim. N.L.: employee of Boehringer Ingelheim. G.O.: employee of Boehringer Ingelheim. S.B.: employee of Bristol Myers Squibb. J. Maranville: employee of Bristol Myers Squibb. R.P.: employee of Bristol Myers Squibb. J.H.: employee of Genentech. R.K.P.: employee of Genentech. M.M.: employee of Genentech. K. Auro: employee of GlaxoSmithKline. M.G.E.: employee of GlaxoSmithKline. D.P.: employee of GlaxoSmithKline. R.Y.: employee of Janssen Biotech. D.W.: employee of Janssen Research & Development. R.R.G.: employee of Maze Therapeutics. E.M.G.: employee of Maze Therapeutics. S.V.M.: employee of Maze Therapeutics. A.H.-V.: employee of Merck Sharp & Dohme LLC, a subsidiary of Merck & Co., Inc., Rahway, NJ, USA. S. Longerich: employee of Merck Sharp & Dohme LLC, a subsidiary of Merck & Co., Inc., Rahway, NJ, USA. N.R.: employee of Merck Sharp & Dohme LLC, a subsidiary of Merck & Co., Inc., Rahway, NJ, USA. J.H.S.: employee of Merck Sharp & Dohme LLC, a subsidiary of Merck & Co., Inc., Rahway, NJ, USA. C.F.: employee of Merck Sharp & Dohme LLC, a subsidiary of Merck & Co., Inc., Rahway, NJ, USA. M.E.K.N.: employee of Novartis. C.J.O.: employee of Novartis. M.O.: employee of Novartis. N.R.: employee of Novartis. R. Siegel: employee of Novartis. Å.K.H.: employee of Pfizer. X.H.: employee of Pfizer. A. Malarstig: employee of Pfizer. K. Klinger: employee of Sanofi. D.R.: employee of Sanofi. J.P.: employee of the Finnish Red Cross Blood Service. A.G.E.: employee of University of Tartu. M.J.D.: founder of Maze Therapeutics. T. Kuopio: lecture fee from Amgen, Roche and MSD. J.A.T.: lecture fee from Blueprint Genetics Finland, and on the advisory board of Novartis Finland (unrelated to this work). P.T.: lecture or consulting fee from Alexion, Roche, Merck, Janssen-Cilag, Novartis and Biogen (unrelated to this work). M. Aavikko: lecturing fee and congress trip reimbursement from Ipsen and Novartis (unrelated to this work). A. Palotie: member of the Pfizer Genetics Scientific Advisory Panel. V.S.: received an honorarium from Sanofi for consulting, and ongoing research collaboration with Bayer (all unrelated to the present study). H. Joensuu: senior consultant for Orion Pharma Orion, Chairman of the Scientific Advisory Board in Neutron Therapeutics, received a fee for acting as the Chair of an ESMO meeting from Deciphera Pharmaceuticals, stock ownership in Orion Pharma and Sartar Therapeutics. The other authors declare no competing interests.

Figures

Fig. 1. FinnGen sample collection and phenotyping.
a, Samples collected from different geographical areas. The map of Finland is divided into major administrative areas. Coloured regions represent the areas of the nine biobanks that provide samples to FinnGen. The Finnish Institute for Health and Welfare (THL), the Blood Service and the Terveystalo biobanks are not regional. The circle size represents relative sample sizes. The number of samples given are those used in the analyses after QC. b, National registries utilized to construct FinnGen end points. The numbers indicate the number of events in each register at the time of FinnGen release. An individual can have multiple diagnoses and can have events from multiple registers contributing to the end point of the individual. c, Sample prevalence of major disease categories in FinnGen. Major diseases for each category were chosen for demonstration purposes (Supplementary Tables 3 and 4). d, Examples of registers used for constructing four selected end points. The y axis represents individuals with matching register code in each register according to FinnGen end point definitions. Each individual can contribute only once to each register but the same individual can be counted in multiple registers. e, Comparison of effect sizes (beta values) in known genome-wide significant loci between four example FinnGen end points and large reference GWAS. The y and x axes represent FinnGen and reference GWAS beta values, respectively. Beta values are aligned to be positive in reference studies. Lines extending from points indicate standard errors of beta values. Regression lines omit the intercept, and two types of regression are provided: unweighted and weighted by pooled standard errors from the two studies. The solid line indicates the identity line, and the dotted and dashed lines indicate the unweighted and weighted regressions, respectively. Sample sizes used for e are given in Supplementary Table 7. Only variants with P < 1 × 10⁻¹⁰ in the reference study were included. A comparison of all 15 diseases is provided in the Supplementary Information. Part a adapted with permission from an original biobank map created by BBMRI.fi.
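
The two regressions described for panel e (no intercept, unweighted and weighted) can be sketched as follows. This is an illustrative sketch only: the caption does not define "pooled standard errors", so the weight 1 / (se_ref² + se_finngen²) is an assumption, the function names are hypothetical, and the numbers are toy values that are not taken from the study.

```python
from typing import List, Optional, Sequence


def slope_through_origin(
    x: Sequence[float],
    y: Sequence[float],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Least-squares slope for y = b * x (no intercept), optionally weighted."""
    if weights is None:
        weights = [1.0] * len(x)
    numerator = sum(w * xi * yi for w, xi, yi in zip(weights, x, y))
    denominator = sum(w * xi * xi for w, xi in zip(weights, x))
    return numerator / denominator


def pooled_se_weights(se_x: Sequence[float], se_y: Sequence[float]) -> List[float]:
    """Assumed weighting: inverse of the summed squared standard errors."""
    return [1.0 / (sx ** 2 + sy ** 2) for sx, sy in zip(se_x, se_y)]


# Toy values for illustration only (not data from the paper).
reference_beta = [0.10, 0.20, 0.30]
finngen_beta = [0.12, 0.18, 0.33]
reference_se = [0.01, 0.02, 0.03]
finngen_se = [0.02, 0.02, 0.05]

unweighted = slope_through_origin(reference_beta, finngen_beta)
weighted = slope_through_origin(
    reference_beta, finngen_beta, pooled_se_weights(reference_se, finngen_se)
)
print(round(unweighted, 3), round(weighted, 3))
```

A slope close to 1 would correspond to the identity line in the plot, meaning that effect sizes agree between FinnGen and the reference GWAS.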
Fig. 2. Comparison of previously unknown and known lead variants in loci identified in the 15 studied diseases.
a, Case prevalence and counts in FinnGen, the EstBB and the UKBB. The phenotypes are sorted on the basis of FinnGen prevalence. b, Distribution of minor AFs in known (red) and new (blue) loci in the NFSEE population. c, Distribution of AF enrichment between Finland and other Northwestern European populations in gnomAD (excluding Estonia and Sweden). The x axis represents enrichment bins. d, AFs of 25 replicated genome-wide significant (in FinnGen discovery) new low-frequency (<5% in NFSEE populations) variants in FinnGen, the EstBB and the UKBB. The dotted line indicates the same variants and no line means absence of the variant in other biobanks.
Fig. 3. Characteristics of unique associations in end points identified in FinnGen.
Characteristics of 493 (73 with coding variants in the credible set) specific associations in 112 (42 end points with coding variants in the credible set) end points identified in FinnGen release 5. Note that 25 of the associations with a coding variant with PIP < 0.05 in credible sets were removed from plots as ‘uncertain to contain coding variant’. a, Distribution of fine-mapping PIP values of the 73 coding variants. b, AF spectrum in associations with and without coding variants in credible sets (CS). c, Proportion of coding variants identified in different AFs (in NFSEE individuals in gnomAD). The numbers above the bars indicate the number of associations within a bin; the y axis indicates the proportion of associations with coding variants in their credible sets. d, Enrichment in Finland as a function of AF in the gnomAD NFSEE population (the enrichment value for variants with an AF of 0 in NFSEE individuals in gnomAD was set to the maximum observed enrichment value of log2(166) = 7.38). The smoothed regression lines of local average enrichment are estimated by local polynomial fitting (loess) and the shaded areas represent 95% confidence intervals of the model fit.
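
The capping rule described for panel d can be sketched as below. The sketch assumes that enrichment is the log2 ratio of the Finnish AF to the gnomAD NFSEE AF, which the caption implies but does not spell out; the function name and the example AF values are hypothetical. Only the cap of log2(166) comes from the caption.

```python
import math

# Cap taken from the caption: the maximum observed enrichment, log2(166).
MAX_LOG2_ENRICHMENT = math.log2(166)


def log2_enrichment(af_finland: float, af_nfsee: float) -> float:
    """Assumed definition: log2(AF in Finland / AF in NFSEE), capped when AF in NFSEE is 0."""
    if af_finland <= 0:
        raise ValueError("af_finland must be positive")
    if af_nfsee == 0:
        # The ratio is undefined, so use the maximum observed value instead.
        return MAX_LOG2_ENRICHMENT
    return math.log2(af_finland / af_nfsee)


# Hypothetical allele frequencies for illustration only.
print(round(log2_enrichment(0.02, 0.005), 2))  # fourfold enrichment, prints 2.0
print(round(log2_enrichment(0.02, 0.0), 2))    # variant absent in NFSEE, prints 7.38
```

Under this definition, "enriched by more than twofold" corresponds to a log2 enrichment above 1.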
Extended Data Fig. 1. FinnGen Age Distribution and Registers.
A) Distribution of the current age (age at the end of the follow-up) and age of death for FinnGen participants. B) Follow-up time and main coding used in each register among FinnGen participants in FinnGen release 5. Abbreviations: CANCER = The Finnish Cancer Registry; DEATH = Cause of death register; INPATIENT = HILMO - Care Register for Health Care: Inpatient hospital visits; OUTPATIENT = HILMO - Care Register for Health Care: Specialty outpatient visits and day surgeries; PURCHASE = Drug Purchases: All prescription drug purchases; REIMBURSEMENT = Drug Reimbursement: entitlements for prescription drug reimbursement for certain chronic diseases.
Extended Data Fig. 2. PCA classification of 224,737 FinnGen participants combined with 1000 Genomes samples (AFR, AMR, EAST, EUR, FIN, SAS).
FinnGen outlier samples were removed as deviating from the bulk of the FinnGen samples.
Extended Data Fig. 3. Comparison of effect sizes between biobanks.
A,B) Effect size (log(OR), beta) comparison of 275 genome-wide significant lead variants identified in FinnGen among 15 analysed diseases in Estonia and UKBB. The sign of beta is aligned to be positive in Estonia and UKBB. C,D) Beta comparison of variants only in known loci. E,F) Beta comparison of novel loci. Dashed lines indicate the identity line and solid lines are the regression lines (red line and text: weighted by pooled standard error of betas).
Extended Data Fig. 4. Enrichment of 493 unique phenome-wide significant associations binned by NFSEE MAF and split by whether 95% credible sets contain a coding variant.
The p-values of the test of difference in average enrichment are shown on the right side of each MAF bin. Lines indicate the 95% confidence interval of the mean enrichment. Number of coding/non-coding variants in each bin: 21/27, 12/35, 11/22, 3/10, 7/24 and 19/277, given in the same order as on the figure x-axis.

References

    1. Lim ET, et al. Distribution and medical impact of loss-of-function variants in the Finnish founder population. PLoS Genet. 2014;10:e1004494. doi: 10.1371/journal.pgen.1004494.
    2. Xue Y, et al. Enrichment of low-frequency functional variants revealed by whole-genome sequencing of multiple isolated European populations. Nat. Commun. 2017;8:15927. doi: 10.1038/ncomms15927.
    3. Zuk O, et al. Searching for missing heritability: designing rare variant association studies. Proc. Natl Acad. Sci. USA. 2014. doi: 10.1073/pnas.1322563111.
    4. Norio R. The Finnish Disease Heritage III: the individual diseases. Hum. Genet. 2003;112:470–526. doi: 10.1007/s00439-002-0877-1.
    5. Auton A, et al. A global reference for human genetic variation. Nature. 2015;526:68–74. doi: 10.1038/nature15393.