Cell Genom. 2021 Dec 8;1(3):100066.
doi: 10.1016/j.xgen.2021.100066.

Machine learning enables new insights into genetic contributions to liver fat accumulation

Mary E Haas et al. Cell Genom.

Abstract

Excess liver fat, called hepatic steatosis, is a leading risk factor for end-stage liver disease and cardiometabolic diseases but often remains undiagnosed in clinical practice because of the need for direct imaging assessments. We developed an abdominal MRI-based machine-learning algorithm to accurately estimate liver fat (correlation coefficients, 0.97-0.99) from a truth dataset of 4,511 middle-aged UK Biobank participants, enabling quantification in 32,192 additional individuals. 17% of participants had predicted liver fat levels indicative of steatosis, and liver fat could not have been reliably estimated based on clinical factors such as BMI. A genome-wide association study of common genetic variants and liver fat replicated three known associations and identified five newly associated variants in or near the MTARC1, ADH1B, TRIB1, GPAM, and MAST3 genes (p < 3 × 10−8). A polygenic score integrating these eight genetic variants was strongly associated with future risk of chronic liver disease (hazard ratio > 1.32 per SD score, p < 9 × 10−17). Rare inactivating variants in the APOB or MTTP genes were identified in 0.8% of individuals with steatosis and conferred more than 6-fold risk (p < 2 × 10−5), highlighting a molecular subtype of hepatic steatosis characterized by defective secretion of apolipoprotein B-containing lipoproteins. We demonstrate that our imaging-based machine-learning model accurately estimates liver fat and may be useful in epidemiological and genetic studies of hepatic steatosis.


Conflict of interest statement

M.E.H. is currently an employee and shareholder of Regeneron Pharmaceuticals. J.P.P. has served as a consultant for Maze Therapeutics. R.L. serves as a consultant or advisory board member for Arrowhead Pharmaceuticals; AstraZeneca; Boehringer-Ingelheim; Bristol Myers Squibb; Celgene; Cirius; CohBar; Galmed; Gemphire; Gilead; Glympse bio; Intercept; Ionis; Inipharma; Merck; Metacrine, Inc.; NGM Biopharmaceuticals; Novo Nordisk; Pfizer; and Viking Therapeutics. In addition, his institution has received grant support from Allergan, Boehringer-Ingelheim, Bristol Myers Squibb, Eli Lilly and Company, Galmed Pharmaceuticals, Genfit, Gilead, Intercept, Janssen, Madrigal Pharmaceuticals, NGM Biopharmaceuticals, Novartis, Pfizer, pH Pharma, and Siemens. He is also co-founder of Liponexus, Inc. A.Y.Z. is an employee of Color Health. J.R.H. was an employee of Color Health and is currently an employee of Maze Therapeutics. K.E.C. serves on the advisory boards of Novo Nordisk and BMS, has consulted for Gilead, and has received grant funding from BMS, Boehringer-Ingelheim, and Novartis. T.G.S. has served as a consultant for Aetion. A.P. is employed as a Venture Partner at GV, a venture capital group within Alphabet; he is also supported by a grant from Bayer AG to the Broad Institute, focused on machine learning for clinical trial design. S.N.F. and P.B. are supported by grants from Bayer AG and IBM applying machine learning in cardiovascular disease. P.B. has served as a consultant to Novartis. P.T.E. is supported by a grant from Bayer AG to the Broad Institute, focused on the genetics and therapeutics of cardiovascular diseases. P.T.E. has also served on advisory boards or consulted for Bayer AG, Quest Diagnostics, MyoKardia, and Novartis. A.V.K. has served as a scientific advisor to Sanofi, Amgen, Maze Therapeutics, Navitor Pharmaceuticals, Sarepta Therapeutics, Verve Therapeutics, Veritas International, Color Health, Third Rock Ventures, and Columbia University (NIH); received speaking fees from Illumina, MedGenome, Amgen, and the Novartis Institute for Biomedical Research; and received sponsored research agreements from the Novartis Institute for Biomedical Research and IBM Research.

Figures

Graphical abstract
Figure 1
Machine learning enables liver fat quantification and clinical and genetic analyses. We developed a machine-learning model using a training set of 4,511 individuals with previously quantified liver fat from the UK Biobank. We applied this model to estimate liver fat in an additional 32,192 individuals in the UK Biobank. Of the 36,703 total individuals with liver fat quantified, 17% met criteria for hepatic steatosis, defined as liver fat content greater than 5.5%. In addition, 1.6% of individuals had liver fat greater than 20% (not shown in the density plot). A common variant GWAS identified eight loci associated at genome-wide significance (p < 5.0 × 10−8), of which five are newly identified relative to previous studies (top Manhattan plot). None of these newly associated variants were identified in a common variant association study of those with liver fat quantified previously (bottom Manhattan plot). A rare variant association study (RVAS) identified inactivating variants in APOB and MTTP significantly (p < 1.2 × 10−5) associated with liver fat and steatosis.
Figure 2
Associations of clinical parameters with liver fat and hepatic steatosis in 36,703 individuals. (A–C) The distribution of liver fat and prevalence of hepatic steatosis according to the presence of (A) electronic health record diagnosis of nonalcoholic fatty liver disease (NAFLD), (B) high waist-to-hip ratio, and (C) clinical categories of obesity. Hepatic steatosis was defined as liver fat greater than 5.5%. High waist-to-hip ratio was defined at time of imaging as greater than 0.9 when male and greater than 0.85 when female. Weight categories were defined using BMI at time of imaging: underweight, BMI < 18.5 kg/m²; normal, 18.5 ≤ BMI < 25 kg/m²; overweight, 25 ≤ BMI < 30 kg/m²; obese, 30 ≤ BMI < 40 kg/m²; severely obese, BMI ≥ 40 kg/m². For boxplots, boxes indicate interquartile range (IQR; 25th–75th percentiles), and whiskers indicate distances of 1.5 IQRs from box limits. For bar plots, error bars indicate upper bounds of 95% CI.
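The cutoffs in this legend can be written as a small classifier. The following is a minimal Python sketch, not the authors' code: the function names are hypothetical, and only the thresholds (liver fat above 5.5%, waist-to-hip ratio above 0.9 or 0.85, and the BMI bands) are taken from the legend.

```python
def has_steatosis(liver_fat_percent: float) -> bool:
    """Hepatic steatosis per the legend: liver fat greater than 5.5%."""
    return liver_fat_percent > 5.5


def high_waist_to_hip(ratio: float, sex: str) -> bool:
    """High waist-to-hip ratio: > 0.9 for males, > 0.85 for females."""
    cutoffs = {"male": 0.9, "female": 0.85}
    return ratio > cutoffs[sex]


def weight_category(bmi: float) -> str:
    """Map BMI (kg/m^2) to the weight categories listed in the legend."""
    if bmi < 18.5:
        return "underweight"
    if bmi < 25:
        return "normal"
    if bmi < 30:
        return "overweight"
    if bmi < 40:
        return "obese"
    return "severely obese"
```

Each lower bound is inclusive and each upper bound exclusive, matching the inequalities in the legend, so a BMI of exactly 25 falls in the overweight band.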
Figure 3
Common variant GWAS of liver fat in 32,974 individuals identifies eight loci. Associations of 9.8 million common (alternate allele frequency > 1%) genetic variants with inverse normal transformed liver fat, quantified from MRI data using machine learning, in 32,974 individuals from the UK Biobank were assessed using linear mixed models. Results of each variant association are shown with chromosome and base pair position of the variant on the x axis and −log10(p value) of the association with liver fat on the y axis. The lead variants at each of 8 genome-wide significant loci are indicated by orange points. A gray line indicates the genome-wide significance threshold (p = 5 × 10−8). See also Figure S4.
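The legend does not say which form of the inverse normal transformation was applied. As an illustration only, here is a rank-based version in Python; the Blom offset of 3/8, the average-rank handling of ties, and the function name are all assumptions, not details from the paper.

```python
from statistics import NormalDist


def inverse_normal_transform(values, c=0.375):
    """Rank-based inverse normal transform (assumed Blom offset c = 3/8).

    Each value is replaced by the standard normal quantile of
    (rank - c) / (n - 2c + 1); tied values share their average rank.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    normal = NormalDist()
    return [normal.inv_cdf((r - c) / (n - 2 * c + 1)) for r in ranks]
```

The transform keeps the ordering of the liver fat values but maps them onto a standard normal scale, which tames the long right tail of a skewed trait before regression.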
Figure 4
Polygenic score comprised of eight common genetic variants associated with risk of liver disease. A single polygenic score for each individual was calculated by additively combining the 8 common lead genome-wide association study (GWAS) variants identified in Figure 3 via the number of liver-fat-increasing variants present in each individual, each weighted by its GWAS effect size estimate. (A) Associations between the polygenic score and incident disease occurrence after UK Biobank enrollment were assessed using a Cox proportional hazards model in 361,852 individuals who were not included in the discovery GWAS of imaging data and who did not have prevalent liver disease at time of enrollment, adjusting for age at enrollment, age at enrollment squared, sex, the first 10 principal components of genetic variation, and genotyping array. Hazard ratios (HRs) of incident disease per SD increase in the polygenic score are shown; error bars represent 95% confidence interval (CI). (B) Rates of incident disease in each decile of the polygenic score are shown; error bars represent 95% CI. See also Table S10.
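The additive score described in this legend amounts to a weighted allele count. The sketch below shows one way to compute it in Python; the variant labels and weights are placeholders, not the paper's lead variants or effect sizes, and using the sample standard deviation for the per-SD scale is an assumption.

```python
from statistics import mean, stdev


def polygenic_score(dosages, weights):
    """Sum of liver-fat-increasing allele counts, each times its GWAS weight.

    dosages: variant label -> number of increasing alleles carried (0, 1, or 2)
    weights: variant label -> effect size estimate for that allele
    Variants missing from `dosages` are treated as zero copies (an assumption).
    """
    return sum(weights[v] * dosages.get(v, 0) for v in weights)


def standardize(scores):
    """Rescale scores to mean 0 and SD 1, so effects read as 'per SD'."""
    mu = mean(scores)
    sd = stdev(scores)  # sample SD; the paper's choice is not stated here
    return [(s - mu) / sd for s in scores]


# Placeholder weights for two hypothetical variants (illustrative only).
example_weights = {"variant_1": 0.1, "variant_2": 0.2}
example_person = {"variant_1": 2, "variant_2": 1}
example_score = polygenic_score(example_person, example_weights)
```

Standardizing the scores across the cohort is what allows a hazard ratio to be reported per SD increase, as in panel A.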
Figure 5
RVAS of liver fat in 18,013 individuals. A total of 18,013 unrelated individuals with exome sequencing data and liver fat estimation available were grouped according to whether they carried any rare variant predicted to inactivate a given gene (carriers) or not (noncarriers). Rare inactivating variants were defined as predicted to cause premature truncation of a protein (nonsense), insertions or deletions that scramble protein translation (frameshift), or disruption of the messenger RNA splicing process (splice site) with an alternate allele frequency of less than 0.1%. Association of carrier status for each gene with inverse normal transformed liver fat, quantified from MRI data using machine learning, was assessed using linear regression. Genes with fewer than 10 inactivating variant carriers were excluded to increase the likelihood of having sufficient statistical power to detect an effect, resulting in 4,156 genes in the analysis and a significance threshold of p = 1.2 × 10−5 (0.05/4,156 genes tested). (A) Quantile-quantile (QQ) plot showing the expected p values of each gene from a uniform distribution on the x axis and the corresponding observed p values of each gene on the y axis. A gray line indicates the significance threshold (observed p = 1.20 × 10−5). (B) Liver fat distribution in carriers and noncarriers of inactivating variants in the APOB or MTTP genes. A gray line indicates the hepatic steatosis threshold (liver fat = 5.5%). (C) Prevalence of hepatic steatosis in carriers and noncarriers of inactivating variants in APOB or MTTP. For boxplots, boxes indicate IQR (25th–75th percentiles), and whiskers indicate distances of 1.5 IQRs from box limits. For bar plots, error bars indicate upper bounds of 95% CI. See also Tables S11 and S12.
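The gene filter and significance threshold in this legend are simple arithmetic, sketched here in Python. The function names and the example gene labels are hypothetical; only the minimum of 10 carriers, the 4,156 genes tested, and the 0.05 family-wise level come from the legend.

```python
def genes_to_test(carrier_counts, min_carriers=10):
    """Keep genes with at least `min_carriers` inactivating-variant carriers."""
    return [gene for gene, n in carrier_counts.items() if n >= min_carriers]


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# Hypothetical carrier counts, for illustration only.
example_counts = {"GENE_A": 9, "GENE_B": 10, "GENE_C": 42}
kept = genes_to_test(example_counts)

# Threshold reported in the legend: 0.05 / 4,156 genes tested.
threshold = bonferroni_threshold(0.05, 4156)
```

Dividing 0.05 by 4,156 gives roughly 1.2 × 10−5, the value marked by the gray line in panel A.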


Grants and funding