Transcription-based prediction of response to IFNbeta using supervised computational methods

Sergio E Baranzini¹, Parvin Mousavi, Jordi Rio, Stacy J Caillier, Althea Stillman, Pablo Villoslada, Matthew M Wyatt, Manuel Comabella, Larry D Greller, Roland Somogyi, Xavier Montalban, Jorge R Oksenberg

PMID: 15630474
PMCID: PMC539058
DOI: 10.1371/journal.pbio.0030002

PLoS Biol. 2005 Jan;3(1):e2. doi: 10.1371/journal.pbio.0030002. Epub 2004 Dec 28.

Affiliation

¹ Department of Neurology, School of Medicine, University of California, San Francisco, USA. sebaran@cgl.ucsf.edu

Abstract

Changes in cellular functions in response to drug therapy are mediated by specific transcriptional profiles resulting from the induction or repression in the activity of a number of genes, thereby modifying the preexisting gene activity pattern of the drug-targeted cell(s). Recombinant human interferon beta (rIFNbeta) is routinely used to control exacerbations in multiple sclerosis patients with only partial success, mainly because of adverse effects and a relatively large proportion of nonresponders. We applied advanced data-mining and predictive modeling tools to a longitudinal 70-gene expression dataset generated by kinetic reverse-transcription PCR from 52 multiple sclerosis patients treated with rIFNbeta to discover higher-order predictive patterns associated with treatment outcome and to define the molecular footprint that rIFNbeta engraves on peripheral blood mononuclear cells. We identified nine sets of gene triplets whose expression, when tested before the initiation of therapy, can predict the response to interferon beta with up to 86% accuracy. In addition, time-series analysis revealed potential key players involved in a good or poor response to interferon beta. Statistical testing of a random outcome class and tolerance to noise was carried out to establish the robustness of the predictive models. Large-scale kinetic reverse-transcription PCR, coupled with advanced data-mining efforts, can effectively reveal preexisting and drug-induced gene expression signatures associated with therapeutic effects.

Figures

**Figure 1. Nonsupervised Two-Way Hierarchical Clustering of Samples at T = 0**
A clear aggregation of samples cannot be seen by this technique. The first column indicates the type of responder to which each sample belongs (red, good; blue, poor).

**Figure 2. Accuracy Ranges of the Three-Gene Predictive Model of IFNβ Response**
After the initial data split into training and test sets, using IBIS on the training set only, nine best-performing triplets were identified. The triplet of *Caspase 2, Caspase 10,* and *FLIP* resulted in an accuracy rate of 86% correct prediction on the blind test set resulting from the original split. To minimize the effect of fortuitous initial data division in the accuracy outcome, an extra 100 data splits were performed as a coarse approximation of the possible ranges of accuracies in which this gene triplet could result. A histogram of prediction accuracy over the 100 trials for the gene triplet composed of *Caspase 2, Caspase 10,* and *FLIP* is shown as an example of classification and prediction of response to IFNβ at T = 0. A red Gaussian curve encompasses the distribution, where the mean prediction accuracy was 87.9%, with a maximum of 100% (in 11 cases) and a minimum of 64.3% (in two cases). The broken blue line indicates the tenth percentile (78.6%). No major differences were found when we performed the same classification/prediction strategy in 500 random splits of the data.
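The repeated-split check described in this caption can be sketched as follows. This is a minimal illustration, not the authors' pipeline: a nearest-centroid classifier stands in for IBIS, the data are synthetic, and the function names, the test-set size of 14, the class balance, and the seeds are all assumptions made for the example.

```python
import random
import statistics


def fit_centroids(X, y):
    """Stand-in classifier (not IBIS): mean expression vector per class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [statistics.fmean(col) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Return the class whose centroid is closest to sample x."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(x, center))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def repeated_split_accuracies(X, y, n_splits=100, test_size=14, seed=0):
    """Test-set accuracy for each of n_splits random train/test divisions."""
    rng = random.Random(seed)
    indices = list(range(len(X)))
    accuracies = []
    for _ in range(n_splits):
        rng.shuffle(indices)
        test_idx, train_idx = indices[:test_size], indices[test_size:]
        model = fit_centroids([X[i] for i in train_idx], [y[i] for i in train_idx])
        hits = sum(predict(model, X[i]) == y[i] for i in test_idx)
        accuracies.append(hits / test_size)
    return accuracies


def summarize(accuracies):
    """Mean, extremes, and tenth percentile of the accuracy distribution."""
    return {
        "mean": statistics.fmean(accuracies),
        "min": min(accuracies),
        "max": max(accuracies),
        "p10": statistics.quantiles(accuracies, n=10)[0],
    }


def make_synthetic(n_per_class=26, n_genes=3, shift=3.0, seed=1):
    """Hypothetical data: two Gaussian classes, NOT the study's measurements."""
    rng = random.Random(seed)
    good = [[rng.gauss(0.0, 1.0) for _ in range(n_genes)] for _ in range(n_per_class)]
    poor = [[rng.gauss(shift, 1.0) for _ in range(n_genes)] for _ in range(n_per_class)]
    return good + poor, ["good"] * n_per_class + ["poor"] * n_per_class


X, y = make_synthetic()
print(summarize(repeated_split_accuracies(X, y)))
```

Reporting the tenth percentile alongside the mean, as the caption does, gives a conservative lower bound on accuracy that is less sensitive to one lucky or unlucky split than the minimum or maximum.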

**Figure 3. Best-Scoring Gene Triplet by F-Test Analysis**
Notably, as observed with IBIS, *Caspase 10* was also the single best discriminant (p = 1.87 × 10⁻⁴) variable, but the second and third best scoring genes by F-test (IL12Rb2, IL4Ra) did not seem to add any significant predictive power. The mean prediction accuracy for the test set of samples was 65.6% (tenth percentile, 57.1%), well below that observed for the triplet derived from IBIS *(Caspase 2, Caspase 10,* and *FLIP)* shown in Figure 2. This suggests that F-test could efficiently capture individual linear separators but cannot identify and prioritize the nonlinear combinations of genes discovered by IBIS that ultimately provide the most predictive accuracy and robustness.
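The limitation noted in this caption can be illustrated with a small sketch of per-gene F-test ranking. The gene names and expression values below are hypothetical, and the code is a generic one-way ANOVA F-statistic, not the authors' implementation. A gene whose poor responders sit at intermediate values has equal class means, so its F-statistic is zero even though the classes are perfectly separable by a nonlinear rule.

```python
import statistics


def f_statistic(groups):
    """One-way ANOVA F-statistic for a list of groups of values.

    Assumes at least two groups and nonzero within-group variance.
    """
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.fmean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - statistics.fmean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def rank_genes_by_f(expression, labels):
    """Rank genes by F-statistic; expression maps gene -> values aligned with labels."""
    classes = sorted(set(labels))
    scores = {}
    for gene, values in expression.items():
        groups = [[v for v, lab in zip(values, labels) if lab == c] for c in classes]
        scores[gene] = f_statistic(groups)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: four good responders followed by four poor responders.
labels = ["good"] * 4 + ["poor"] * 4
expression = {
    # Linear separator: poor responders have uniformly higher values.
    "gene_linear": [1, 2, 1, 2, 5, 6, 5, 6],
    # Nonlinear pattern: poor responders at intermediate values only.
    "gene_nonlinear": [0, 0, 6, 6, 3, 3, 3, 3],
}

for gene, score in rank_genes_by_f(expression, labels):
    print(gene, round(score, 2))
```

A univariate ranking of this kind would place `gene_nonlinear` last, which is consistent with the caption's observation that the F-test captures linear separators but not nonlinear combinations.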

**Figure 4. Training Dataset Performance of the Three Genes from the Top Predictive Model of IFNβ Response**
One-, two-, and three-dimensional IBIS searches were conducted independently on the same training dataset. Each chart shows a two-colored background, corresponding to regions predictive of good response (red) and poor response (blue). Each colored dot corresponds to an individual sample (red, good responder; blue, poor responder). (A–C) One-dimensional IBIS predictive models. High values of *Caspase 10* are associated with poor response according to a linear relationship. In contrast, *Caspase 2* levels are associated with poor response at intermediate values, suggesting a nonlinear relationship. *FLIP* expression is associated with good responders at low values, again depicting a linear relationship. The highest cross-validation accuracy score for a single gene predictor was 73% *(Caspase 10)*. (D–F) Two-dimensional IBIS predictive models. Each of the three possible pairs of this classifier was tested. Linear and nonlinear combinatorial predictive relationships were revealed, specifically, a nonlinear predictive relationship associating poor response with high values of *Caspase 10* and intermediate values of *Caspase 2,* a nonlinear relationship associating good response with high values of *FLIP* and either low or high (but not intermediate) values of *Caspase 2,* and a linear relationship associating poor response with low values of *FLIP* and high values of *Caspase 10*. The highest cross-validation score was obtained for the *Caspase 2/Caspase 10* pair according to a nonlinear, quadratic distribution (85% accuracy). (G) Three-dimensional IBIS predictive model. The shapes identified in the 1D and 2D distributions were optimized by the 3D model, providing a better separation of good and poor responders.
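The idea of searching one-, two-, and three-gene models independently can be sketched as an exhaustive scan over gene subsets, each scored by cross-validation. This is only an illustration under stated assumptions: a nearest-centroid classifier with leave-one-out cross-validation stands in for IBIS (whose models and scoring are not reproduced here), and the gene names `g1`, `g2`, `g3` and their values are hypothetical. For scale, a 70-gene panel yields `math.comb(70, 3)` = 54,740 candidate triplets.

```python
import itertools
import math
import statistics


def loo_accuracy(rows, labels):
    """Leave-one-out accuracy of a nearest-centroid stand-in classifier."""
    hits = 0
    for i, (x, truth) in enumerate(zip(rows, labels)):
        centroids = {}
        for c in set(labels):
            # Class centroid computed without the held-out sample i.
            members = [r for j, (r, lab) in enumerate(zip(rows, labels))
                       if j != i and lab == c]
            if members:
                centroids[c] = [statistics.fmean(col) for col in zip(*members)]
        guess = min(centroids, key=lambda c: math.dist(x, centroids[c]))
        hits += guess == truth
    return hits / len(rows)


def best_subsets(expression, labels, size, top=9):
    """Score every gene subset of the given size; return the top scorers."""
    scored = []
    for subset in itertools.combinations(sorted(expression), size):
        # One row per sample, restricted to the genes in this subset.
        rows = list(zip(*(expression[g] for g in subset)))
        scored.append((loo_accuracy(rows, labels), subset))
    scored.sort(key=lambda item: (-item[0], item[1]))
    return scored[:top]


# Hypothetical toy data: four good responders followed by four poor responders.
labels = ["good"] * 4 + ["poor"] * 4
expression = {
    "g1": [1, 2, 1, 2, 5, 6, 5, 6],
    "g2": [3, 1, 4, 1, 3, 1, 4, 1],
    "g3": [0, 5, 0, 5, 0, 5, 0, 5],
}

for size in (1, 2, 3):
    print(size, best_subsets(expression, labels, size, top=1))
```

Because a centroid-based stand-in draws only linear boundaries, it would miss the intermediate-value patterns the caption describes for *Caspase 2*; a classifier with quadratic decision surfaces would be needed to capture those.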

**Figure 5. Test Dataset Performance of the Top Three-Gene Predictive Model of IFNβ Response**
The same probability model generated from the training dataset (see Figure 4G) provides the background shading of volumes predictive of good response (red) and poor response (blue). Three samples are identified with arrows and followed along different graphical representations. (A and B) The two rotations of the full 3D model show that all good responder samples are correctly classified. (C) Projection of the full model onto one of the possible 2D surfaces is provided as an aid to visualization. (D–F) Two-dimensional IBIS predictive models. Three samples are identified with arrows and followed along different graphical representations. If prediction had been performed in only two dimensions, a higher number of misclassifications would have occurred. For example, the 2D model built using only *Caspase 2/FLIP* (D) could not resolve the good responding sample identified by a cyan arrow, whereas it correctly resolves the good responding sample shown by the orange arrow. The model built using *Caspase 10/FLIP* (E), in contrast, acts oppositely and can resolve the good responding sample shown by the cyan arrow and not the sample shown by the orange arrow. Both of these samples are correctly resolved by the 2D model built using *Caspase 2/Caspase 10* (F); however, this model is unable to resolve the poor responding sample identified by the yellow arrow, whereas one of the previous models (E) was able to do this. As demonstrated in the full 3D model view from (A) and (B), as well as the projection of the model (C), all the labeled poor and good responding patients are correctly classified. Although 2D models show high predictive capabilities, all three genes are needed to increase the classification accuracy of the IBIS model.

**Figure 6. Characteristic Gene Expression Profiles of Good and Poor Responders to IFNβ over Time**
(A) An unsupervised hierarchical clustering representation of the weighted difference between the average expression of good and poor responders. For each gene, the obtained differences were log normalized and multiplied by the F-statistic from an ANOVA (responder effect) run previously (shown in [B]). The “heat” colored bar represents the absolute value of this difference. With the exception of *MX1* (indicated by an arrow), all genes showing a significant difference in expression between the two groups of patients were automatically arranged in only two clusters (framed in blue). (B) List of all genes showing a significant responder effect along with their F-statistic and p-values. Genes that were part of any triplet showing more than 80% prediction accuracy at T = 0 are shown in bold. (C) A continuous representation of the longitudinal average expression of two representative genes for good (^) and poor (•) responders. *TRADD* shows two widely parallel curves, indicative of a significant difference in the expression averages, correlating with its profile (#) observed in the clustering shown in (A). In contrast, *GATA3* displays two almost overlapping curves, consistent with its shading (*) in the clustering in (A).

**Figure 7. IFNβ-Induced Changes in Gene Expression over Time**
(A) An unsupervised hierarchical clustering representation of the weighted difference in gene expression at each time point versus baseline. For each gene, the obtained differences were log normalized and multiplied by the F-statistic from an ANOVA (time effect) run previously (shown in [B]). The “heat” colored bar represents the absolute value of this difference. With the exception of *IFNAR1* (arrow), all genes showing a significantly different expression in at least one time point with respect to baseline were arranged in the same cluster (framed in blue). (B) List of all genes showing a significant time effect along with their F-statistic and p-values. Genes that were part of any triplet showing more than 80% prediction accuracy at T = 0 are in bold. (C) A continuous representation of the longitudinal average expression of two representative genes over all samples. *MX1* (^) shows a marked departure from T = 0 and remains elevated for the rest of the observed period. This correlates well with the shading (#) displayed in the clustering shown in (A). In contrast, *IRF6* (•) displays an almost flat curve, consistent with its color in the clustering (*).

Grants and funding

1R01 AI42911/AI/NIAID NIH HHS/United States
