BMC Genomics. 2022 Sep 21;23(1):663.
doi: 10.1186/s12864-022-08884-z.

Predicting nicotine metabolism across ancestries using genotypes

James W Baurley et al. BMC Genomics.

Abstract

Background: There is a need to match characteristics of tobacco users with cessation treatments and risks of tobacco-attributable diseases such as lung cancer. The rate at which the body metabolizes nicotine has proven to be an important predictor of these outcomes. Nicotine metabolism is primarily catalyzed by the enzyme cytochrome P450 2A6 (CYP2A6), and CYP2A6 activity can be measured as the nicotine metabolite ratio (NMR), the ratio of two nicotine metabolites: trans-3'-hydroxycotinine to cotinine. Measurements of these metabolites are only possible in current tobacco users and vary by biofluid source, timing of collection, and protocols; unfortunately, this has limited their use in clinical practice. The NMR depends highly on genetic variation near CYP2A6 on chromosome 19 as well as ancestry, environmental, and other genetic factors. Thus, we aimed to develop prediction models of nicotine metabolism using genotypes and basic individual characteristics (age, gender, height, and weight).
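As a minimal illustration of the ratio defined above, the NMR can be computed from two metabolite concentrations measured in the same units. The function name and the example concentrations are hypothetical and are not taken from the study.

```python
def nicotine_metabolite_ratio(hydroxycotinine: float, cotinine: float) -> float:
    """Return the NMR: trans-3'-hydroxycotinine divided by cotinine.

    Both concentrations must be expressed in the same units.
    """
    if cotinine <= 0:
        raise ValueError("cotinine must be positive; the NMR is undefined otherwise")
    if hydroxycotinine < 0:
        raise ValueError("hydroxycotinine cannot be negative")
    return hydroxycotinine / cotinine


# Hypothetical concentrations, for illustration only.
example_nmr = nicotine_metabolite_ratio(hydroxycotinine=60.0, cotinine=200.0)
```

A higher ratio indicates faster conversion of cotinine to trans-3'-hydroxycotinine, and therefore higher CYP2A6 activity.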

Results: We identified four multiethnic studies with nicotine metabolites and DNA samples. We constructed a 263-marker panel by filtering genome-wide association scans of the NMR in each study. We then applied seven machine learning techniques to train models of nicotine metabolism on the largest and most ancestrally diverse dataset (N=2239). The models were then validated using the other three studies (total N=1415). Using cross-validation, we found that the correlations between the observed and predicted NMR ranged from 0.69 to 0.97 depending on the model. When predictions were averaged in an ensemble model, the correlation was 0.81. The ensemble model generalized well in the validation studies across ancestries, despite differences in the measurement of NMR between studies, with correlations of 0.52 for African ancestry, 0.61 for Asian ancestry, and 0.46 for European ancestry. The most influential predictors of NMR, identified in more than two models, were rs56113850, rs11878604, and 21 other genetic variants near CYP2A6, as well as age and ancestry.
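The ensemble step described above, averaging the predictions of several models and correlating the average with the observed NMR, can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the correlation is assumed to be Pearson's (the abstract does not name the coefficient), and the numbers are invented.

```python
from math import sqrt
from statistics import fmean


def ensemble_predict(per_model_predictions):
    """Average predictions across models, giving one value per individual.

    per_model_predictions: list of equal-length lists, one list per model.
    """
    n = len(per_model_predictions[0])
    if any(len(p) != n for p in per_model_predictions):
        raise ValueError("every model must predict for the same individuals")
    return [fmean(values) for values in zip(*per_model_predictions)]


def pearson_correlation(x, y):
    """Pearson correlation between two equal-length sequences."""
    mean_x, mean_y = fmean(x), fmean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Invented values for three hypothetical models and four individuals.
observed = [0.20, 0.35, 0.50, 0.80]
predictions = [
    [0.25, 0.30, 0.55, 0.70],
    [0.15, 0.40, 0.45, 0.75],
    [0.20, 0.38, 0.52, 0.65],
]
ensemble = ensemble_predict(predictions)
r = pearson_correlation(observed, ensemble)
```

An unweighted mean is the simplest way to combine models; it is a reasonable choice when the individual models perform similarly.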

Conclusions: We have developed an ensemble of seven models for predicting the NMR across ancestries from genotypes, age, gender, and BMI. These models were validated using three datasets and are associated with nicotine dosages. Knowledge of how an individual metabolizes nicotine could be used to help select the optimal path to reducing or quitting tobacco use, as well as to evaluate the risks of tobacco use.

Keywords: Machine learning; Nicotine biomarkers; Nicotine metabolism; Polygenic risk score; Smoking cessation; Statistical learning.

Conflict of interest statement

JWB and CME are members and employees of BioRealm LLC. AWB is an employee of Oregon Research Institute and ORI Community and Evaluation Services, and serves as a Scientific Advisor and Consultant to BioRealm LLC. JWB, CSM and AWB are co-inventors on a related patent application “Biosignature Discovery for Substance Use Disorder Using Statistical Learning”, assigned to BioRealm LLC. BioRealm LLC offers genotyping and data analysis services. Other authors declare that they have no competing interests.

Figures

Fig. 1
Assessment of the seven models in the training data (MEC). Models were trained using projection pursuit regression (PPR), partial least squares (PLS), support vector machine with a linear kernel (SVM_lin), elastic net (GLMNET), random forests (RF), support vector machine with a radial basis function kernel (SVM_rad_sig), and gradient boosting machine (GBM). Model performance was assessed using mean absolute error (MAE), root mean squared error (RMSE), and R squared. The boxplots summarize these metrics across 100 cross-validation datasets. Performance was similar across the models, justifying the use of an average of predictions in the ensemble model.
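The three metrics named in the Fig. 1 caption can be computed as in the sketch below. The function name and data are hypothetical. R squared is taken here to be the coefficient of determination (1 minus the ratio of residual to total sum of squares); the caption does not say which definition was used, and some software reports the squared correlation instead.

```python
from math import sqrt
from statistics import fmean


def regression_metrics(observed, predicted):
    """Return MAE, RMSE, and R squared for paired observed/predicted values."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    residuals = [o - p for o, p in zip(observed, predicted)]
    mae = fmean(abs(e) for e in residuals)
    rmse = sqrt(fmean(e ** 2 for e in residuals))
    mean_observed = fmean(observed)
    ss_res = sum(e ** 2 for e in residuals)
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    r_squared = 1.0 - ss_res / ss_tot  # assumed definition, see lead-in
    return {"MAE": mae, "RMSE": rmse, "R2": r_squared}


# Invented values, for illustration only.
metrics = regression_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
```

In a cross-validation setting, this function would be applied once per held-out fold, and the resulting values summarized across folds.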
Fig. 2
Observed versus predicted NMR values for the training (MEC) and validation (CENIC, HSS, and METS) data. The predicted NMR is the average of the predictions from the seven models (i.e., the ensemble model). The correlation between these values is displayed at the upper left of each scatterplot. The distributions of the NMR differed across studies, yet the correlations were still strong in the validation datasets.
Fig. 3
Summary of the most important candidate variables in the NMR models. Variable importance was ranked for each of the seven models trained on the MEC data. The number of models in which each variable appeared in the top 20 was counted. Asian ancestry and rs56113850 were influential variables in all the models.
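The tally described in the Fig. 3 caption, counting how many models rank each variable among their most important, can be sketched as follows. The function name, model names, and importance scores are hypothetical; only the idea of counting top-ranked appearances comes from the caption.

```python
from collections import Counter


def top_k_counts(importance_by_model, k=20):
    """Count how many models place each variable in their top k by importance.

    importance_by_model: dict mapping model name -> {variable: importance}.
    """
    counts = Counter()
    for scores in importance_by_model.values():
        ranked = sorted(scores, key=scores.get, reverse=True)
        counts.update(ranked[:k])
    return counts


# Invented importance scores for two hypothetical models.
importance = {
    "model_a": {"rs56113850": 0.9, "age": 0.4, "rs11878604": 0.2},
    "model_b": {"rs56113850": 0.7, "rs11878604": 0.6, "age": 0.1},
}
counts = top_k_counts(importance, k=2)
```

Variables whose count equals the number of models are those ranked highly by every model.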
