Motivation: In life sciences, interpretability of machine learning models is as important as their prediction accuracy. Linear models are probably the most frequently used methods for assessing feature relevance, despite their relative inflexibility. However, in the past years effective estimators of feature relevance have been derived for highly complex or non-parametric models such as support vector machines and RandomForest (RF) models. Recently, it has been observed that RF models are biased in such a way that categorical variables with a large number of categories are preferred.

Results: In this work, we introduce a heuristic for normalizing feature importance measures that can correct the feature importance bias. The method is based on repeated permutations of the outcome vector for estimating the distribution of measured importance for each variable in a non-informative setting. The P-value of the observed importance provides a corrected measure of feature importance. We apply our method to simulated data and demonstrate that (i) non-informative predictors do not receive significant P-values, (ii) informative variables can successfully be recovered among non-informative variables and (iii) P-values computed with permutation importance (PIMP) are very helpful for deciding the significance of variables, and therefore improve model interpretability. Furthermore, PIMP was used to correct RF-based importance measures for two real-world case studies.
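The permutation scheme described above can be sketched as follows. This is a minimal illustration of the idea, not the authors' implementation: it assumes a scikit-learn `RandomForestClassifier` as the model and its Gini-based `feature_importances_` as the importance measure, and it uses a simple empirical P-value rather than the parametric null distributions considered in the paper.

```python
# Sketch of the PIMP idea: permute the outcome vector many times,
# refit the model on each permuted outcome, and compare each feature's
# observed importance against its null distribution of importances.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def pimp_p_values(X, y, n_permutations=50, random_state=0):
    """Empirical P-values for RF feature importances (illustrative)."""
    rng = np.random.default_rng(random_state)
    rf = RandomForestClassifier(n_estimators=100, random_state=random_state)

    # Importance of each feature on the real outcome.
    observed = rf.fit(X, y).feature_importances_

    # Null distribution: importances under repeated outcome permutations,
    # which break any association between predictors and outcome.
    null = np.empty((n_permutations, X.shape[1]))
    for i in range(n_permutations):
        y_perm = rng.permutation(y)
        null[i] = rf.fit(X, y_perm).feature_importances_

    # Empirical P-value: fraction of null importances at least as large
    # as the observed one (with the usual +1 correction).
    return (1 + (null >= observed).sum(axis=0)) / (n_permutations + 1)
```

On simulated data with one informative predictor among several noise predictors, the informative one should receive a small P-value while the noise predictors do not, mirroring points (i) and (ii) of the abstract.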