Nature. 2019 Aug;572(7769):323-328.

doi: 10.1038/s41586-019-1457-z. Epub 2019 Jul 31.

Exome sequencing of Finnish isolates enhances rare-variant association power

Adam E Locke^#^{1,2,3}, Karyn Meltz Steinberg^#^{2,4}, Charleston W K Chiang^#^{5,6,7}, Susan K Service^#⁵, Aki S Havulinna^{8,9}, Laurel Stell¹⁰, Matti Pirinen^{8,11,12}, Haley J Abel^{2,13}, Colby C Chiang², Robert S Fulton^{2,13}, Anne U Jackson³, Chul Joo Kang², Krishna L Kanchi², Daniel C Koboldt^{2,14,15}, David E Larson^{2,13}, Joanne Nelson², Thomas J Nicholas^{2,16}, Arto Pietilä⁹, Vasily Ramensky^{5,17}, Debashree Ray^{3,18}, Laura J Scott³, Heather M Stringham³, Jagadish Vangipurapu¹⁹, Ryan Welch³, Pranav Yajnik³, Xianyong Yin³, Johan G Eriksson^{20,21,22}, Mika Ala-Korpela^{23,24,25,26,27,28}, Marjo-Riitta Järvelin^{29,30,31,32,33}, Minna Männikkö^{30,34}, Hannele Laivuori^{8,35,36}; FinnGen Project; Susan K Dutcher^{2,13}, Nathan O Stitziel^{2,37}, Richard K Wilson^{2,14,15}, Ira M Hall^{1,2}, Chiara Sabatti^{10,38}, Aarno Palotie^{8,39,40}, Veikko Salomaa⁹, Markku Laakso^{19,41}, Samuli Ripatti^{8,11,40}, Michael Boehnke⁴², Nelson B Freimer⁴³

Affiliations

¹ Department of Medicine, Washington University School of Medicine, St Louis, MO, USA.
² McDonnell Genome Institute, Washington University School of Medicine, St Louis, MO, USA.
³ Department of Biostatistics and Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI, USA.
⁴ Department of Pediatrics, Washington University School of Medicine, St Louis, MO, USA.
⁵ Center for Neurobehavioral Genetics, Jane and Terry Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles, Los Angeles, CA, USA.
⁶ Center for Genetic Epidemiology, Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA.
⁷ Quantitative and Computational Biology Section, Department of Biological Sciences, University of Southern California, Los Angeles, CA, USA.
⁸ Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland.
⁹ National Institute for Health and Welfare, Helsinki, Finland.
¹⁰ Department of Biomedical Data Science, Stanford University, Stanford, CA, USA.
¹¹ Department of Public Health, University of Helsinki, Helsinki, Finland.
¹² Helsinki Institute for Information Technology HIIT and Department of Mathematics and Statistics, University of Helsinki, Helsinki, Finland.
¹³ Department of Genetics, Washington University School of Medicine, St Louis, MO, USA.
¹⁴ The Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA.
¹⁵ Department of Pediatrics, The Ohio State University College of Medicine, Columbus, OH, USA.
¹⁶ USTAR Center for Genetic Discovery and Department of Human Genetics, University of Utah, Salt Lake City, UT, USA.
¹⁷ Federal State Institution "National Medical Research Center for Preventive Medicine" of the Ministry of Healthcare of the Russian Federation, Moscow, Russia.
¹⁸ Departments of Epidemiology and Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD, USA.
¹⁹ Institute of Clinical Medicine, Internal Medicine, University of Eastern Finland, Kuopio, Finland.
²⁰ Department of Public Health Solutions, National Institute for Health and Welfare, Helsinki, Finland.
²¹ Folkhälsan Research Center, Helsinki, Finland.
²² Department of General Practice and Primary Health Care, University of Helsinki, Helsinki and Helsinki University Hospital, Helsinki, Finland.
²³ Systems Epidemiology, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia.
²⁴ Computational Medicine, Faculty of Medicine, University of Oulu and Biocenter Oulu, University of Oulu, Oulu, Finland.
²⁵ NMR Metabolomics Laboratory, School of Pharmacy, University of Eastern Finland, Kuopio, Finland.
²⁶ Population Health Science, Bristol Medical School, University of Bristol, Bristol, UK.
²⁷ Medical Research Council Integrative Epidemiology Unit at the University of Bristol, Bristol, UK.
²⁸ Department of Epidemiology and Preventive Medicine, School of Public Health and Preventive Medicine, Faculty of Medicine, Nursing and Health Sciences, The Alfred Hospital, Monash University, Melbourne, Victoria, Australia.
²⁹ Biocenter Oulu, University of Oulu, Oulu, Finland.
³⁰ Center for Life Course Health Research, Faculty of Medicine, University of Oulu, Oulu, Finland.
³¹ Unit of Primary Health Care, Oulu University Hospital, Oulu, Finland.
³² Department of Epidemiology and Biostatistics, MRC-PHE Centre for Environment and Health, School of Public Health, Imperial College London, London, UK.
³³ Department of Life Sciences, College of Health and Life Sciences, Brunel University London, London, UK.
³⁴ Northern Finland Birth Cohorts, Faculty of Medicine, University of Oulu, Oulu, Finland.
³⁵ Medical and Clinical Genetics, University of Helsinki and Helsinki University Hospital, Helsinki, Finland.
³⁶ Department of Obstetrics and Gynecology, Tampere University Hospital and University of Tampere, Faculty of Medicine and Health Technology, Tampere, Finland.
³⁷ Cardiovascular Division, Department of Medicine, Washington University School of Medicine, St Louis, MO, USA.
³⁸ Department of Statistics, Stanford University, Stanford, CA, USA.
³⁹ Analytical and Translational Genetics Unit (ATGU), Psychiatric & Neurodevelopmental Genetics Unit, Departments of Psychiatry and Neurology, Massachusetts General Hospital, Boston, MA, USA.
⁴⁰ Broad Institute of MIT and Harvard, Cambridge, MA, USA.
⁴¹ Department of Medicine, Kuopio University Hospital, Kuopio, Finland.
⁴² Department of Biostatistics and Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI, USA. boehnke@umich.edu.
⁴³ Center for Neurobehavioral Genetics, Jane and Terry Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles, Los Angeles, CA, USA. nfreimer@mednet.ucla.edu.

^# Contributed equally.

PMID: 31367044
PMCID: PMC6697530
DOI: 10.1038/s41586-019-1457-z

Adam E Locke et al. Nature. 2019 Aug.

Erratum in

Author Correction: Exome sequencing of Finnish isolates enhances rare-variant association power.
Locke AE, Steinberg KM, Chiang CWK, Service SK, Havulinna AS, Stell L, Pirinen M, Abel HJ, Chiang CC, Fulton RS, Jackson AU, Kang CJ, Kanchi KL, Koboldt DC, Larson DE, Nelson J, Nicholas TJ, Pietilä A, Ramensky V, Ray D, Scott LJ, Stringham HM, Vangipurapu J, Welch R, Yajnik P, Yin X, Eriksson JG, Ala-Korpela M, Järvelin MR, Männikkö M, Laivuori H; FinnGen Project; Dutcher SK, Stitziel NO, Wilson RK, Hall IM, Sabatti C, Palotie A, Salomaa V, Laakso M, Ripatti S, Boehnke M, Freimer NB. Nature. 2019 Nov;575(7783):E4. doi: 10.1038/s41586-019-1726-x. PMID: 31686056

Abstract

Exome-sequencing studies have generally been underpowered to identify deleterious alleles with a large effect on complex traits as such alleles are mostly rare. Because the population of northern and eastern Finland has expanded considerably and in isolation following a series of bottlenecks, individuals of these populations have numerous deleterious alleles at a relatively high frequency. Here, using exome sequencing of nearly 20,000 individuals from these regions, we investigate the role of rare coding variants in clinically relevant quantitative cardiometabolic traits. Exome-wide association studies for 64 quantitative traits identified 26 newly associated deleterious alleles. Of these 26 alleles, 19 are either unique to or more than 20 times more frequent in Finnish individuals than in other Europeans and show geographical clustering comparable to Mendelian disease mutations that are characteristic of the Finnish population. We estimate that sequencing studies of populations without this unique history would require hundreds of thousands to millions of participants to achieve comparable association power.
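The sample-size argument above can be illustrated with a rough power calculation. The sketch below is not the authors' method: it assumes a standardized quantitative trait, an additive single-variant test, and a normal approximation to the test statistic, so that the noncentrality is approximately N·2f(1−f)β² for allele frequency f and per-allele effect β (in trait standard deviations). The function names `power` and `required_n` and the example frequencies are hypothetical; the threshold α = 5×10⁻⁷ is the one quoted in the Figure 2 caption.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution


def power(n, maf, beta, alpha=5e-7):
    """Approximate power of a two-sided single-variant test.

    Assumes a standardized trait and an additive model, so the test
    statistic is roughly normal with mean sqrt(n * 2f(1-f)) * |beta|.
    """
    shift = (n * 2 * maf * (1 - maf)) ** 0.5 * abs(beta)
    z_crit = _STD.inv_cdf(1 - alpha / 2)
    return _STD.cdf(shift - z_crit) + _STD.cdf(-shift - z_crit)


def required_n(maf, beta, alpha=5e-7, target_power=0.8):
    """Approximate sample size needed to reach target_power.

    Ignores the negligible wrong-direction tail of the two-sided test.
    """
    z_crit = _STD.inv_cdf(1 - alpha / 2)
    z_pow = _STD.inv_cdf(target_power)
    return (z_crit + z_pow) ** 2 / (2 * maf * (1 - maf) * beta ** 2)


# Hypothetical example: the same allele at 0.1% frequency in one
# population and at 20x that frequency (2%) in another.
n_rare = required_n(maf=0.001, beta=0.5)
n_enriched = required_n(maf=0.02, beta=0.5)
print(f"N at MAF 0.1%: {n_rare:,.0f}")
print(f"N at MAF 2%:   {n_enriched:,.0f}")
print(f"ratio: {n_rare / n_enriched:.1f}")
```

Under these assumptions the required sample size scales as 1/(2f(1−f)), so a rare allele that is 20 times more frequent needs roughly 20 times fewer participants for the same power.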

Conflict of interest statement

Competing interests statements:

VS has participated in a conference trip sponsored by Novo Nordisk and received an honorarium from the same source for participating in an advisory board meeting. He also has an ongoing research collaboration with Bayer Ltd.

HL is a member of the Nordic Expert group unconditionally supported by Gedeon Richter Nordics and has received an honorarium from Orion.

Figures

**Extended Data Fig. 1. Allele frequency comparisons between FinMetSeq and NFE from gnomAD.**
A) Distribution of allelic frequencies between FinMetSeq and gnomAD NFE. The comparison of allele frequencies shows the excess of variants at higher frequency in Finland as a result of the multiple bottlenecks experienced in Finnish population history. B) Proportional site frequency spectra between FinMetSeq and gnomAD NFE by variant annotation class. In general, we find a depletion of the variants in the rarest frequency class, as well as enrichment of variants in the intermediate to common frequency range. The site frequency spectra were down-sampled to 18,000 chromosomes for each dataset. C) Comparison of MAFs for trait-associated variants in FinMetSeq and NFE gnomAD. Plotted in gray background is a 2-D histogram of variants with non-zero allele frequencies in both gnomAD and FinMetSeq but no trait associations. Variants associated with at least one trait are colored and scaled inversely proportional to the logarithm of the association p-value. Variants >10x enriched in FinMetSeq compared to NFE are pink, those <10x enriched are in blue. The dashed line is the line of equal frequency. Two-sided uncorrected P-values are from a regression of trait on the count of alternative allele at each variant. The number of independent individuals used in each point is listed in Supplementary Table 5.

**Extended Data Figure 2. Heritability of and correlations between traits.**
Traits are in the same order, clockwise in A, and left to right and top to bottom in B, following the trait group color key. A) Heritability estimated in 13,342 unrelated individuals (for abbreviations, see Supplementary Table 4; for details, see Supplementary Table 6). B) Heatmap of: 1) absolute Pearson correlations of standardized trait values in upper triangle; 2) absolute values of estimated pairwise genetic correlations in lower triangle. Genetic correlations are estimated in 13,342 unrelated individuals. Values below the diagonal in gray had trait heritability less than 1.5 times the SE of heritability.

**Extended Data Fig. 3. Properties of associations shared between traits.**
A) Shared genomic associations by pairs of traits. For traits x and y, color in row x and column y reflects the number of loci associated with both traits divided by the number of loci associated with trait x. Traits are presented in the same order as in Extended Data Figure 2A, and the side and top color bars reflect trait groups. B) Relationship between estimated genetic correlation and extent of sharing of genetic associations. For each trait-pair, the extent of locus sharing is defined as the number of loci associated with both traits divided by the total number of loci associated with either trait. Analysis using the absolute value of the Pearson correlation of the residual series results in a very similar pattern. The numbers of trait pairs in each x-axis category are as follows: 0-1%: 819; 1-10%: 204; 11-20%: 102; 21-30%: 41; 31-40%: 29; 41-50%: 16; >50%: 13. The bar within each box is the median, the box represents the upper and lower quartiles, whiskers extend to 1.5x the interquartile range, and points represent outliers.

**Extended Data Fig. 4. Gene-based association of extremely rare variants in *APOB* with serum total cholesterol.**
The upper panel shows the distribution of the covariate adjusted and inverse-normal transformed phenotype. The lower panel displays the association statistics for each variant included in the gene-based test along with the trait value for minor allele carriers of each variant (orange triangles). SV.P is the P-value from the analysis of each variant in a single-variant analysis. The number of independent individuals in the analysis is 19,291.

**Extended Data Fig. 5. Gene-based association of rare variants in *SECTM1* with HDL2 cholesterol.**
The upper panel shows the distribution of the covariate adjusted and inverse-normal transformed phenotype. The lower panel displays the association statistics for each variant included in the gene-based test, along with the trait value for minor allele carriers of each variant (orange triangles). SV.P is the P-value from the analysis of each variant in a single-variant analysis. The number of independent individuals in the analysis is 10,984.

**Extended Data Fig. 6. Gene-based association of extremely rare variants in *ALDH1L1* with glycine levels.**
The upper panel shows the distribution of the covariate adjusted and inverse-normal transformed phenotype. The lower panel displays the association statistics for each variant included in the gene-based test, along with the trait value for minor allele carriers of each variant (orange triangles). SV.P is the P-value from the analysis of each variant in a single-variant analysis. The number of independent individuals in the analysis is 8,206.

**Extended Data Fig. 7. Population structure of the FinMetSeq dataset, by region.**
Population structure, by region, from principal components analysis of exome sequencing variant data (MAF > 1%), for 14,874 unrelated individuals with known parental birthplaces. Color indicates individuals with both parents born in the same region; gray indicates individuals with different parental birth regions, or missing information for one parent. Abbreviations for the regions: Usm, Uusimaa; Swf, Southwest Finland; Stk, Satakunta; Khm, Kanta-Hame; Prk, Pirkanmaa; Phm, Paijat-Hame; Kyl, Kymenlaakso; SKa, Southern Karelia; Nka, Northern Karelia; SSv, Southern Savonia; NSv, Northern Savonia; Ctf, Central Finland; SOs, Southern Ostrobothnia; Osb, Ostrobothnia; COs, Central Ostrobothnia; NOs, Northern Ostrobothnia; Kai, Kainuu; Lap, Lapland; X, split parental birthplaces. Large solid circles represent the center of each region.

**Extended Data Fig. 8. Hierarchical clustering tree produced by fineSTRUCTURE.**
We identified 16 subpopulations within the FinMetSeq dataset by applying a haplotype-based clustering algorithm, fineSTRUCTURE, on 2,644 unrelated individuals born by 1955 whose parents were both born in the same municipality (Methods). Each subpopulation is named based on the most common parental birth location among its members, with the following abbreviations: NKa, North Karelia; NSv, North Savonia; SOs, South Ostrobothnia; NOs, North Ostrobothnia; Kai, Kainuu; Lap, Lapland; SuK, Surrendered Karelia. A map of Finland with regions labeled is supplied for reference. If multiple subpopulations share the same location label, the subpopulation is further distinguished with a numeral. NSv3 is used as an internal reference in enrichment analysis. See Supplementary Table 17 for more detailed demographic descriptions of each subpopulation.

**Extended Data Fig. 9. Regional variation in allele frequencies by functional annotation.**
Enrichment of variants by allelic class in regional sub-populations of late settlement Finland (defined in Supplementary Table 17). Each bin represents the ratio of variants in the subpopulation compared to the reference subpopulation (NSv3), after down-sampling the frequency spectra of all populations to 200 chromosomes. Pink cells represent an enrichment (ratio >1), blue cells represent a depletion (ratio <1). Sample sizes, confidence intervals for each enrichment ratio, and their P-values are presented in Supplementary Table 18. The results are consistent with multiple bottlenecks in late settlement Finland, particularly for populations in Lapland and Northern Ostrobothnia.

**Figure 1. Characterization of associations.**
A) Number of genomic loci associated with each trait. Bars are subdivided into common (MAF>1%, dark blue) and rare (MAF≤1%, light blue). B) Relationship between estimated heritability and number of loci detected per trait. Each trait is colored by trait group. Vertical bars indicate ±2 standard errors. The gray line shows the linear regression fit to indicate the general trend. The number of independent individuals used in each point is listed in Supplementary Table 5. Height is the notable outlier.

**Figure 2. Allelic enrichment in the Finnish population and its effect on genetic discovery.**
A) Relationship between MAF and estimated effect size for associations discovered in FinMetSeq. Each variant reaching significance in FinMetSeq is plotted, with associations in Table 1 represented by dark blue points (FinMetSeq MAF) and green points (NFE MAF). Purple lines indicate 80% power curves for sample sizes of 10,000 and 20,000 at α=5×10⁻⁷. B) Same plot as in A, highlighting the variants in Table 1 only reaching significance in the combined analysis.

**Figure 3. Geographical clustering of associated variants.**
A) Example of geographical clustering for a novel trait-associated variant (Table 1). The map shows birth locations of all 113 parents of carriers (orange) and 113 randomly selected parents of non-carriers (blue) of the minor allele for rs780671030 in *ALDH1L1*. B) FDH mutations (N=38) geographically cluster (by parental birthplace) similarly to trait-associated variants (Table 1) that are >10x more frequent in FMS than in NFE (N=12) and more than enriched variants from our combined analysis (N=7). For all variants, carriers clustered more than non-carriers (center line, median; box limits, upper and lower quartiles; whiskers, 1.5 interquartile range; points, outliers).
