CNN_varIMP_permute.Rd
CNN permutation variable importance. Permutation importance is measured by the decrease in a model score (e.g., Mean Decrease Accuracy (MDA) or Mean Decrease in RMSE) when a variable is randomly shuffled n times.
CNN_varIMP_permute(optmodel, feature_names = NULL, train_y = NULL, train_x = NULL, type = c("difference", "ratio"), nsim = 1, sample_size = NULL, sample_frac = NULL, verbose = FALSE, progress = "none", parallel = FALSE, paropts = NULL, ...)
Argument | Description |
---|---|
optmodel | The optimal model used to estimate variable importance |
feature_names | The names of the variables |
train_y | The Y variable (dependent variable) used in regression |
train_x | The independent variable dataset |
type | Type of comparison between the permuted and baseline metrics: "difference" or "ratio" |
nsim | Number of permutations |
sample_size | The sample size used to do permutation feature importance |
sample_frac | The sample fraction/proportion |
verbose | Whether to print progress messages or warnings |
progress | Type of progress display; defaults to "none" |
parallel | Whether to use parallel computation |
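The `sample_size` and `sample_frac` arguments both describe drawing a subset of the training rows before permuting. A minimal sketch of that idea follows, assuming the two arguments are mutually exclusive and that leaving both `NULL` means all rows are used; the helper `subsample_rows()` is hypothetical and not part of this package.

```r
# Hypothetical helper: choose the row indices used for permutation importance.
# Assumption: at most one of sample_size / sample_frac is supplied.
subsample_rows <- function(n, sample_size = NULL, sample_frac = NULL) {
  if (!is.null(sample_size) && !is.null(sample_frac)) {
    stop("Specify only one of sample_size or sample_frac.")
  }
  if (!is.null(sample_frac)) {
    sample_size <- round(n * sample_frac)  # convert a proportion to a row count
  }
  if (is.null(sample_size)) {
    return(seq_len(n))                     # no subsampling: use every row
  }
  sample.int(n, size = sample_size)        # random rows, without replacement
}

set.seed(1)
train_x <- data.frame(x1 = rnorm(100), x2 = rnorm(100))
train_y <- rnorm(100)

idx   <- subsample_rows(nrow(train_x), sample_frac = 0.25)
x_sub <- train_x[idx, , drop = FALSE]  # predictors for the sampled rows
y_sub <- train_y[idx]                  # matching responses
```

Subsampling mainly trades precision for speed: fewer rows make each permutation cheaper but the importance estimates noisier.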
In this implementation, the best model is determined first, and its performance metrics on the original (unpermuted) variables serve as the baseline. The performance metrics are then recomputed with the best model on the training set after each variable is permuted. Permuting a variable breaks its relationship with the target, so the drop in the model score indicates how much the model depends on that variable.
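The procedure can be illustrated with a small, self-contained sketch. It is not the package's internal code: a linear model stands in for the CNN, RMSE is the assumed score, and `permute_importance()` and `rmse()` are hypothetical helpers. The "difference" and "ratio" comparisons are assumed to mean permuted minus baseline and permuted divided by baseline, respectively.

```r
set.seed(42)
n <- 200
train_x <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
# x1 matters most, x2 a little, x3 is pure noise
train_y <- 3 * train_x$x1 + 0.5 * train_x$x2 + rnorm(n, sd = 0.1)

# Stand-in for the fitted "optimal model" (assumption: any model with predict())
dat   <- cbind(train_x, y = train_y)
model <- lm(y ~ ., data = dat)

rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

permute_importance <- function(model, x, y,
                               type = c("difference", "ratio"), nsim = 1) {
  type <- match.arg(type)
  # Baseline score on the original, unpermuted data
  baseline <- rmse(y, predict(model, newdata = x))
  # For each variable, shuffle that column nsim times and average the score
  permuted <- sapply(names(x), function(v) {
    mean(replicate(nsim, {
      x_perm <- x
      x_perm[[v]] <- sample(x_perm[[v]])  # shuffle one column only
      rmse(y, predict(model, newdata = x_perm))
    }))
  })
  importance <- if (type == "difference") {
    permuted - baseline   # increase in RMSE after shuffling
  } else {
    permuted / baseline   # relative increase in RMSE
  }
  list(importance = importance, permuted = permuted, baseline = baseline)
}

res <- permute_importance(model, train_x, train_y, type = "difference", nsim = 5)
print(res$importance)
```

Because RMSE is an error measure, a larger value after shuffling means the model relied more on that variable; for an accuracy-type score the sign of the difference would be reversed.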
Returns a list of scores, including the decrease in accuracy of the CNN model, the permutation metrics, and the baseline metrics.
Fisher, Aaron, Cynthia Rudin, and Francesca Dominici. “Model Class Reliance: Variable importance measures for any machine learning model class, from the ‘Rashomon’ perspective.” http://arxiv.org/abs/1801.01489 (2018).
qinxinghu@gmail.com