CNN variable importance using permutation

Computes permutation variable importance for a CNN model. Permutation importance is measured by the decrease in a model score (e.g., Mean Decrease Accuracy (MDA) or Mean Decrease in RMSE) when a variable is randomly shuffled nsim times.

CNN_varIMP_permute(optmodel, feature_names = NULL, train_y = NULL, train_x = NULL, type = c("difference", "ratio"), nsim = 1, sample_size = NULL, sample_frac = NULL, verbose = FALSE, progress = "none", parallel = FALSE, paropts = NULL, ...)

Arguments

optmodel	The optimal model used to estimate variable importance
feature_names	The names of the variables
train_y	The Y variable (dependent variable) used in regression
train_x	The independent variable dataset
type	Type of comparison: "difference" or "ratio"
nsim	Number of permutations
sample_size	The sample size used to do permutation feature importance
sample_frac	The sample fraction/proportion
verbose	Whether to print progress messages or warnings
progress	Show progress
parallel	Whether to use parallel computation

Details

In this implementation, the best model is determined first and its performance metrics on the original (unpermuted) variables are used as the baseline. The performance metrics are then recomputed with the same best model on the training set after each variable is permuted. Permuting a variable breaks its relationship with the target, so the drop in the model score indicates how much the model depends on that variable.
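
The procedure above can be illustrated with a minimal sketch in base R. This is not the package's implementation: perm_importance_sketch and rmse are hypothetical helpers, a linear model stands in for the CNN, train_x is assumed to be a data frame, and RMSE is assumed as the score (so importance here is an increase in error rather than a decrease in accuracy).

```r
# Hypothetical helper: root mean squared error
rmse <- function(y, yhat) sqrt(mean((y - yhat)^2))

# Hypothetical sketch of permutation importance (not the package code)
perm_importance_sketch <- function(model, train_x, train_y, nsim = 1,
                                   type = c("difference", "ratio"),
                                   predict_fun = predict) {
  type <- match.arg(type)
  # Baseline score from the unpermuted data
  baseline <- rmse(train_y, predict_fun(model, train_x))
  importance <- sapply(names(train_x), function(v) {
    # Score after shuffling column v, averaged over nsim shuffles
    permuted <- mean(replicate(nsim, {
      x_perm <- train_x
      x_perm[[v]] <- sample(x_perm[[v]])
      rmse(train_y, predict_fun(model, x_perm))
    }))
    # Compare the permuted score with the baseline
    if (type == "difference") permuted - baseline else permuted / baseline
  })
  list(importance = importance, baseline = baseline)
}

# Usage: y depends on x1 only, so x1 should rank far above x2
set.seed(1)
dat <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
dat$y <- 3 * dat$x1 + rnorm(200, sd = 0.1)
fit <- lm(y ~ x1 + x2, data = dat)
res <- perm_importance_sketch(fit, dat[, c("x1", "x2")], dat$y, nsim = 5)
res$importance
```

With type = "difference" an unimportant variable scores near 0, while with type = "ratio" it scores near 1; the ratio form is easier to compare across models whose baseline errors differ.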

Value

Returns a list of scores, including the CNN model's decrease in accuracy, the permutation metrics, and the baseline metrics.

References

Fisher, Aaron, Cynthia Rudin, and Francesca Dominici. “Model Class Reliance: Variable importance measures for any machine learning model class, from the ‘Rashomon’ perspective.” http://arxiv.org/abs/1801.01489 (2018).

Author

qinxinghu@gmail.com
