Pain. 2018 Jul;159(7):1366-1381.
doi: 10.1097/j.pain.0000000000001222.

Machine-learned analysis of the association of next-generation sequencing-based human TRPV1 and TRPA1 genotypes with the sensitivity to heat stimuli and topically applied capsaicin

Dario Kringel et al. Pain. 2018 Jul.

Abstract

Heat pain and its modulation by capsaicin vary among subjects in experimental and clinical settings. A plausible cause is a genetic component, of which TRPV1 ion channels, by their response to both heat and capsaicin, are primary candidates. However, TRPA1 channels can heterodimerize with TRPV1 channels and carry genetic variants reported to modulate heat pain sensitivity. To address the role of these candidate genes in capsaicin-induced hypersensitization to heat, pain thresholds acquired before and after topical application of capsaicin and TRPA1/TRPV1 exomic sequences derived by next-generation sequencing were assessed in n = 75 healthy volunteers; the genetic information comprised 278 loci. Gaussian mixture modeling indicated 2 phenotype groups with high or low capsaicin-induced hypersensitization to heat. Unsupervised machine learning implemented as swarm-based clustering hinted at differences in the genetic pattern between these phenotype groups. Several methods of supervised machine learning, implemented as random forests, adaptive boosting, k-nearest neighbors, naive Bayes, support vector machines, and, for comparison, binary logistic regression, predicted the phenotype group association consistently better when based on the observed genotypes than when using a random permutation of the exomic sequences. Of note, TRPA1 variants were more important for correct phenotype group association than TRPV1 variants. This indicates a role of the TRPA1 and TRPV1 next-generation sequencing-based genetic pattern in the modulation of the individual response to heat-related pain phenotypes. When considering earlier evidence that topical capsaicin can induce neuropathy-like quantitative sensory testing patterns in healthy subjects, implications for future analgesic treatments with transient receptor potential inhibitors arise.

Conflict of interest statement

Sponsorships or competing interests that may be relevant to content are disclosed at the end of this article.

Figures

Figure 1.
Patterns of the TRPA1 (chromosome 8: X8) and TRPV1 (chromosome 17: X17) genotypes observed in n = 75 healthy volunteers of Caucasian ethnicity for whom phenotype data of the heat hypersensitization after capsaicin application were available. The heat plot shows the occurrence of variants (columns) per subject (rows). The genetic information is color coded as the number of nonreference alleles found at the respective locus in the respective sample: white, 0 nonreference alleles (wild-type genotype); green, heterozygous; and blue, 2 nonreference alleles. Thus, the individual genotypes are given by the vectors (rows) associated with each subject (subject counts at the right of each panel). The bar plot at the left shows the phenotype group association, with gray indicating Gaussian #1 and black indicating Gaussian #2 in Figure 2. The original genotype information (left) was permuted to obtain a negative control data set for the association of genotypes with phenotypes, and sorted in descending order of alleles at each gene locus to obtain a positive control data set for the genotype–phenotype association. The figure has been created with the R software package (version 3.4.1 for Linux; http://CRAN.R-project.org/) using the library “gplots” (Warnes et al., https://cran.r-project.org/package=gplots).
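The genotype coding and the two control data sets described in this caption can be illustrated with a small sketch. It assumes that the negative control shuffles each locus (column) independently across subjects and that the positive control sorts each column in descending order; the caption does not specify the exact procedure, and the toy matrix and the function names `permuted_control` and `sorted_control` are hypothetical.

```python
import random

# Rows are subjects, columns are loci. Each entry is the number of
# nonreference alleles: 0 = wild type, 1 = heterozygous, 2 = homozygous variant.
# Toy values for illustration only, not data from the study.
genotypes = [
    [0, 1, 2],
    [1, 0, 0],
    [2, 1, 0],
    [0, 2, 1],
]

def columns(matrix):
    """Return the matrix as a list of column lists."""
    return [list(col) for col in zip(*matrix)]

def rows(cols):
    """Rebuild a row-major matrix from a list of column lists."""
    return [list(row) for row in zip(*cols)]

def permuted_control(matrix, seed=0):
    """Negative control (assumed scheme): shuffle every locus across subjects."""
    rng = random.Random(seed)
    cols = columns(matrix)
    for col in cols:
        rng.shuffle(col)
    return rows(cols)

def sorted_control(matrix):
    """Positive control (assumed scheme): sort every locus in descending order."""
    return rows([sorted(col, reverse=True) for col in columns(matrix)])

negative = permuted_control(genotypes)
positive = sorted_control(genotypes)
```

Both controls preserve the allele counts per locus; only their assignment to subjects changes.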
Figure 2.
Original heat pain thresholds (HPTs) and distribution of the effects of capsaicin. Top: One-dimensional scatter plot of the observed individual heat pain sensitivity (dots; raw data). In the upper half (green dots), the values acquired at baseline are shown, whereas in the lower half, the values acquired after topical application of capsaicin are shown (blue dots). Bottom: The distribution of the capsaicin effects, obtained from the z-transformed HPTs according to the QST standard procedure (formula image not reproduced here) and shown as a probability density function (PDF) estimated by means of the Pareto density estimation (PDE; black line) overlaid on a histogram, could be fitted using a Gaussian mixture model (GMM; formula image not reproduced here) with M = 2 modes. The fit is shown as a red line and the M = 2 mixture components are indicated as differently colored dashed lines (G #1–#2). The Bayesian boundary between the Gaussians is indicated as a perpendicular magenta line. At the right side, a quantile–quantile (QQ) plot is shown comparing the observed distribution of the capsaicin effects (ordinate) with the distribution expected from the GMM (abscissa). The blue dots symbolize the quantiles of observed data vs predicted data and the red line indicates identity, ie, the agreement between the data distribution expected from the model and the observed data distribution. The close vicinity of the dots to this line indicates a satisfactory fit of the data by the GMM. The figure has been created using the R software package (version 3.4.1 for Linux; http://CRAN.R-project.org/); in particular, the dot plot was drawn using the R library “beeswarm” (Eklund A, https://cran.r-project.org/package=beeswarm) and the GMM plots were obtained using our package “AdaptGauss” (https://cran.r-project.org/package=AdaptGauss). QST, quantitative sensory testing.
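As a rough illustration of how a 2-mode Gaussian mixture and its Bayesian boundary assign a subject to a phenotype group, the sketch below uses the generic mixture form, a weighted sum of normal densities, and assigns a value to the mode with the larger weighted density. The weights, means, and standard deviations are hypothetical and are not the fitted parameters from the study; the function names are likewise illustrative and do not reflect the AdaptGauss API.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def gmm_pdf(x, weights, means, sds):
    """Mixture density: weighted sum of the component densities."""
    return sum(w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds))

def assign_mode(x, weights, means, sds):
    """Bayes decision: index of the component with the largest weighted density."""
    scores = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)]
    return scores.index(max(scores))

# Hypothetical parameters for illustration only (weights sum to 1).
weights = [0.6, 0.4]
means = [0.0, 3.0]
sds = [1.0, 1.0]

low_group = assign_mode(0.5, weights, means, sds)   # falls under Gaussian #1
high_group = assign_mode(2.5, weights, means, sds)  # falls under Gaussian #2
```

With these made-up parameters, the boundary lies where the two weighted densities are equal, which is slightly to the right of the midpoint between the means because the first component carries more weight.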
Figure 3.
Dot plot of the results of the χ2-based genotype association tests for d = 278 loci at the TRPA1 (left panel) and TRPV1 (right panel) genes. The α levels before (red) and after (blue) correction for multiple testing according to Bonferroni are indicated as horizontal lines. A distribution differing between phenotypes above the uncorrected α level was observed for the variants X8.72934391.SNV and X8.72969263.SNV (Table 2). The figure has been created using the R software package (version 3.4.1 for Linux; http://CRAN.R-project.org/) and the package “qqman” (https://cran.r-project.org/package=qqman).
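The Bonferroni correction mentioned in this caption divides the α level by the number of tests. The sketch below assumes an uncorrected α of 0.05, which the caption does not state, and assumes the horizontal lines are drawn on a −log10(p) scale, as is common for this kind of plot; both are assumptions, not details confirmed by the source.

```python
import math

def bonferroni_alpha(alpha, n_tests):
    """Bonferroni-corrected significance level for n_tests comparisons."""
    return alpha / n_tests

alpha = 0.05   # assumed uncorrected level; not stated in the caption
n_loci = 278   # number of loci tested, as given in the caption

corrected = bonferroni_alpha(alpha, n_loci)

# Positions of the two threshold lines on an assumed -log10(p) axis.
uncorrected_line = -math.log10(alpha)
corrected_line = -math.log10(corrected)
```

Under these assumptions, a variant can lie above the uncorrected line while still falling below the much stricter corrected line.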
Figure 4.
Data structure found in the TRPA1/TRPV1 NGS genotypes and its relation with the phenotypes. Top: Visualization of high-dimensional data consisting of d = 31 gene loci analyzed in n = 75 subjects. The data were projected onto a 2-dimensional grid using a parameter-free projection polar swarm, Pswarm. During the learning phase, the DataBots were allowed to adaptively adjust their location on the grid close to DataBots carrying data with similar features, with a successively decreasing search radius. When the algorithm ends, the DataBots become projected points. To enhance the emergence of data structures on this projection, a generalized U matrix displaying the distance in the high-dimensional space was added as a third dimension to this visualization. The U matrix was colored in hypsometric colors, making the visualization appear as a geographical map with brown (up to snow-covered) heights and green valleys with blue lakes. Watersheds indicate borderlines between different groups of subjects according to their genotype patterns. The data points are colored according to the emerging 2-cluster structure. Bottom left: Ward clustering of the projected data clearly indicated 2 clusters using the Manhattan distance. Bottom center: Heat plot of the pattern of genetic variants (columns) per subject (rows), grouped for the data structure of the genetic information. The 75 × 31 matrix is a visualization of high-dimensional data consisting of d = 31 gene loci analyzed in n = 75 subjects. The allele occurrences are shown color coded as the number of nonreference alleles found at the respective locus in the respective sample: white, 0 nonreference alleles (wild-type genotype); green, heterozygous; and blue, 2 nonreference alleles. Bottom right: Subjects belonging to the different genotype clusters were unevenly distributed across the phenotype clusters, ie, assignment to the 2 Gaussian modes in the distribution of capsaicin effects (Fig. 2), at a statistical significance level of P < 0.05 (Fisher exact test). The mosaic plots represent the contingency table of the genotype vs phenotype class structure (membership sizes given as numbers in the fields of the mosaic). The figure has been created using the R software package (version 3.4.1 for Linux; http://CRAN.R-project.org/), in particular the libraries “DatabionicSwarm” (M. Thrun, https://cran.r-project.org/package=DatabionicSwarm) and “gplots” (Warnes G et al., https://cran.r-project.org/package=gplots). NGS, next-generation sequencing.
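The Fisher exact test on a 2 × 2 genotype-cluster vs phenotype-group table can be sketched with the hypergeometric distribution. The two-sided rule used here (sum the probabilities of all tables that are no more likely than the observed one) is a common convention and is assumed rather than taken from the source; the cell counts are hypothetical and are not the counts shown in the mosaic plot.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x counts in the top-left cell,
        # given fixed row and column totals.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum all tables that are at most as probable as the observed one
    # (small relative tolerance guards against floating-point ties).
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_obs * (1 + 1e-9))

# Hypothetical counts: rows = genotype clusters, columns = phenotype groups.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
```

A p-value below 0.05 from such a test would correspond to the uneven distribution across phenotype clusters that the caption reports.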
Figure 5.
Radar plot of the balanced accuracy of different classifiers (random forests, adaptive boosting, k-nearest neighbors, naive Bayes, support vector machines, and logistic regression) in detecting membership in the group with a high response to capsaicin-induced hypersensitization against heat pain stimuli (Gaussian #2 in Fig. 2). The classification performance was assessed in 1000 model runs using Monte-Carlo resampling with splits into 2/3 of the data (new training data) and 1/3 (new test data). The performance measures are shown comparatively for the results obtained on the original TRPV1/TRPA1 NGS genotype and capsaicin sensitivity phenotype class data set, on data constructed to provide a negative control by permuting the genotypes, and on data constructed to provide a positive control by sorting the genotype information in descending order of alleles at each gene locus (Table 3). The plot shows the balanced accuracies in a spider web form. Each category, ie, machine-learning method, has a separate axis, scaled from 0% to 100% balanced accuracy. The axes are arranged evenly around a 360° circle, and the values of each series are connected with lines indicating the results obtained with each of the 3 data sets, each in a different color. The figure has been created using the R software package (version 3.4.1 for Linux; http://CRAN.R-project.org/) with the “radarchart” function provided in the library “fmsb” (Nakazawa M, https://cran.r-project.org/package=fmsb). NGS, next-generation sequencing.
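The resampling scheme in this caption (repeated random 2/3 vs 1/3 splits, scored by balanced accuracy) can be sketched as follows. A 1-nearest-neighbor rule with Manhattan distance stands in for the classifiers used in the study, the toy data are perfectly separable by construction, and the negative control is built here by shuffling the class labels rather than the genotypes; all of these are simplifying assumptions, and the function names are hypothetical.

```python
import random

def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls over the classes present in y_true."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == cls]
        recalls.append(sum(y_pred[i] == cls for i in idx) / len(idx))
    return sum(recalls) / len(recalls)

def nearest_neighbor_predict(train_x, train_y, x):
    """1-nearest-neighbor prediction using the Manhattan distance."""
    dists = [sum(abs(a - b) for a, b in zip(row, x)) for row in train_x]
    return train_y[dists.index(min(dists))]

def monte_carlo_balanced_accuracy(x, y, runs=1000, seed=0):
    """Average balanced accuracy over repeated random 2/3 vs 1/3 splits."""
    rng = random.Random(seed)
    n = len(x)
    n_train = (2 * n) // 3
    scores = []
    for _ in range(runs):
        order = list(range(n))
        rng.shuffle(order)
        train, test = order[:n_train], order[n_train:]
        train_x = [x[i] for i in train]
        train_y = [y[i] for i in train]
        preds = [nearest_neighbor_predict(train_x, train_y, x[i]) for i in test]
        scores.append(balanced_accuracy([y[i] for i in test], preds))
    return sum(scores) / len(scores)

# Toy genotype vectors and phenotype classes (illustration only).
x = [[0, 0, 0]] * 15 + [[2, 2, 2]] * 15
y = [0] * 15 + [1] * 15

observed = monte_carlo_balanced_accuracy(x, y, runs=200)

# Negative control: break the genotype-phenotype link by shuffling the labels.
shuffled_y = y[:]
random.Random(1).shuffle(shuffled_y)
negative_control = monte_carlo_balanced_accuracy(x, shuffled_y, runs=200)
```

Balanced accuracy is used instead of plain accuracy because it stays near 50% for an uninformative classifier even when the two phenotype groups differ in size.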
Figure 6.
Importance of single-gene loci among the TRPA1 (chromosome 8: X8) and TRPV1 (chromosome 17: X17) genotypes for the random-forests–based classification into the 2 capsaicin hypersensitization phenotype groups (Fig. 2). The stripchart shows the importance of each gene locus, measured as the mean decrease in the classification accuracy when the respective feature is omitted from random-forests building. The figure has been created using the R software package (version 3.4.1 for Linux; http://CRAN.R-project.org/). SNV, single-nucleotide variation.
