Conditional permutation importance revisited

Dries Debeer et al. BMC Bioinformatics. 2020 Jul 14;21(1):307.
doi: 10.1186/s12859-020-03622-2.

Abstract

Background: Random forest based variable importance measures have become popular tools for assessing the contributions of the predictor variables in a fitted random forest. In this article we reconsider a frequently used variable importance measure, the Conditional Permutation Importance (CPI). We argue and illustrate that the CPI corresponds to a more partial quantification of variable importance and suggest several improvements in its methodology and implementation that enhance its practical value. In addition, we introduce the threshold value in the CPI algorithm as a parameter that can make the CPI more partial or more marginal.

Results: By means of extensive simulations, where the original version of the CPI is used as the reference, we examine the impact of the proposed methodological improvements. The simulation results show how the improved CPI methodology increases the interpretability and stability of the computations. In addition, the newly proposed implementation decreases the computation times drastically and is more widely applicable. The improved CPI algorithm is made freely available as an add-on package to the open-source software R.

Conclusion: The proposed methodology and implementation of the CPI are computationally faster and lead to more stable results. They have a beneficial impact on practical research by making random forest analyses more interpretable.

Keywords: Conditional permutation importance; R; Random forest.
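
The improved CPI described above is distributed as the R package permimp. The following is a minimal, hedged sketch of how such a computation might look, assuming the permimp() function with conditional and threshold arguments as documented in the package and a conditional-inference forest fitted with party; the data set and settings are illustrative placeholders, not taken from the article.

    ## Minimal sketch: CPI via the permimp package (argument names should be
    ## checked against the installed package version; data set is illustrative)
    library(party)    # cforest(), cforest_unbiased()
    library(permimp)  # permimp()

    aq <- na.omit(airquality)
    rf <- cforest(Ozone ~ ., data = aq,
                  controls = cforest_unbiased(ntree = 500, mtry = 2))

    ## Conditional permutation importance; the threshold s makes the measure
    ## more partial (smaller s) or more marginal (s = 1 gives the original PI)
    cpi <- permimp(rf, conditional = TRUE, threshold = 0.95)
    cpi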


Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Permutation scheme for the original PI (left) and for the CPI (right). In the permutation scheme of the original PI (left) the values of Xk are permuted against both Y and Z(−k). In the permutation scheme of the CPI (right) the values of Xk are permuted against Y conditionally on the values of Z(−k)
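
As a conceptual illustration of the two permutation schemes in Fig. 1 (this is not the package's internal code), the CPI can be mimicked by permuting Xk only within cells of a partition of Z(−k), whereas the original PI permutes Xk across the whole sample. The quartile-based partition below is purely for illustration.

    ## Conceptual sketch of the two permutation schemes (illustrative only)
    set.seed(1)
    n  <- 100
    z  <- rnorm(n)                # stands in for Z(-k)
    xk <- z + rnorm(n, sd = 0.5)  # predictor of interest, dependent on Z(-k)

    ## Original PI: permute Xk against both Y and Z(-k)
    xk_marginal <- sample(xk)

    ## CPI: permute Xk only within cells of a partition of Z(-k)
    cells <- cut(z, breaks = quantile(z, 0:4 / 4), include.lowest = TRUE)
    xk_conditional <- ave(xk, cells, FUN = sample)

    ## The within-cell permutation largely preserves the Xk-Z(-k) dependence
    cor(xk, z); cor(xk_marginal, z); cor(xk_conditional, z)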
Fig. 2
Tree-growing (left) and conditional permutation (CPI; right) predictor space partitioning. In the tree-growing partitioning (left), only one split completely splits the predictor space; all the other splits are conditional on the previous split(s). In the partitioning used for the permutation scheme of the CPI, where the predictor of interest is Xk=X1 and Z(−k)(s)=X2 (right), the splits with respect to X2 are extended to completely split the predictor space (dashed lines), while the splits with respect to X1 are ignored (gray lines)
Fig. 3
Computation speed: party vs. permimp implementation. Fifteen RFs (using ntree = 1000 and mtry = 10) were fit to a peptide-binding data set [5, 36]. The data set includes 310 observations of 105 predictors for a binary outcome. The CPI according to the old and the current party implementation, as well as the permimp implementation, was computed for two threshold values: s = .20 and s = .95. The average computation speeds for the three implementations are shown as the blue and red bars
Fig. 4
Non-linear dependencies: party vs. permimp implementation. Data sets were sampled with five, nine, or seventeen uniformly distributed (min = -3, max = 3) predictors. An additional predictor was created by squaring one of the uniform predictors, so that all but two predictors were independent and two showed a perfect quadratic relation. This resulted in two quadratically related predictors plus either four, eight, or sixteen independent predictors. Sample size was either N = 500, 1000, or 2500, and either no, half, or all of the predictors had a linear impact on the continuous outcome variable. The dependence tests within the party and the permimp implementation (i.e., a χ2-test based on the tree-growing split points) were applied only to the two predictors with the perfect quadratic relation. The proportions of p-values lower than .05 within the party and permimp implementations are indicated in blue and pink, respectively. The dashed line corresponds to a proportion of .05
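
The contrast between the two kinds of dependence tests can be sketched as follows. This is a hedged, conceptual illustration only: in permimp the cut points come from the tree-growing split points rather than from quartiles, and the correlation-based test merely stands in for the party behaviour described in Fig. 5.

    ## Conceptual sketch: why a correlation test can miss a quadratic relation
    set.seed(2)
    x1 <- runif(500, -3, 3)
    x2 <- x1^2                     # perfectly dependent, but linear correlation ~ 0

    ## Correlation-based test: the linear correlation is near zero
    cor.test(x1, x2)$p.value

    ## Chi-squared test on discretized predictors (quartiles as stand-in
    ## split points): detects the non-linear dependence
    b1 <- cut(x1, quantile(x1, 0:4 / 4), include.lowest = TRUE)
    b2 <- cut(x2, quantile(x2, 0:4 / 4), include.lowest = TRUE)
    chisq.test(table(b1, b2))$p.value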
Fig. 5
Sample size dependence: party vs. permimp implementation. Data sets were sampled with pairwise correlations of either ρ = .00, .05, .10, or .25 between 18 normally distributed predictors. Sample size was either N = 500, 1000, or 2500, and either no, half, or all of the predictors had a linear impact on the continuous outcome variable. The dependence tests within the party implementation (correlation t-test) and the permimp implementation (χ2-test based on the tree-growing split points) were applied. The proportions of p-values lower than .05 within the party and permimp implementations are indicated in blue and pink, respectively. The dashed line corresponds to a proportion of .05
Fig. 6
Stability of the CPI: party vs. permimp implementation. Data were generated according to a linear model with 12 uncorrelated, normally distributed predictors; the regression coefficients were 5, 5, 2, 0, -5, -5, -2, 0, 0, 0, 0, and 0. In each of the 1000 replications, 1000 observations were sampled and a regression RF with 1000 trees was fit, after which the CPI was computed with a threshold value of s = 0.5. The mean CPI (lines) as well as the region between the first and third quartile (shaded area) of the CPI distribution across the 1000 replications are depicted for every predictor, both for the party and the permimp implementation of the CPI, in blue and red, respectively. As a reference, the unconditional PI, which corresponds to the CPI with a threshold value of s = 1 regardless of the implementation, is depicted in black. Because all predictors are independent, the PI and the CPI demonstrate the same pattern
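
A single replication of this setup could be reconstructed roughly as below. This is a hedged sketch based only on the caption; details it does not state (the error standard deviation, mtry, and the permimp defaults used for the unconditional PI) are assumptions.

    ## One replication of the Fig. 6 simulation (rough reconstruction;
    ## unstated settings such as the error SD are assumptions)
    library(party)
    library(permimp)

    set.seed(3)
    n    <- 1000
    beta <- c(5, 5, 2, 0, -5, -5, -2, 0, 0, 0, 0, 0)
    X    <- matrix(rnorm(n * 12), ncol = 12,
                   dimnames = list(NULL, paste0("X", 1:12)))
    dat  <- data.frame(y = as.vector(X %*% beta) + rnorm(n), X)

    rf <- cforest(y ~ ., data = dat,
                  controls = cforest_unbiased(ntree = 1000))

    ## CPI with threshold s = 0.5, and the unconditional PI (s = 1) as reference
    cpi <- permimp(rf, conditional = TRUE, threshold = 0.5)
    upi <- permimp(rf, conditional = FALSE)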

References

    1. Breiman L. Random forests. Mach Learn. 2001;45(1):5–32.
    2. Breiman L, Cutler A. Technical report: Random forests manual v4. UC Berkeley; 2003. https://www.stat.berkeley.edu/~breiman/Using_random_forests_v4.0.pdf.
    3. Ishwaran H, et al. Variable importance in binary regression trees and forests. Electron J Stat. 2007;1:519–37.
    4. Ishwaran H, Kogalur UB, Gorodeski EZ, Minn AJ, Lauer MS. High-dimensional variable selection for survival data. J Am Stat Assoc. 2010;105(489):205–17.
    5. Strobl C, Kneib T, Augustin T, Zeileis A. Conditional variable importance for random forests. BMC Bioinformatics. 2008;9(1):307.
