Permutation importance

Permutation importance is a method for estimating variable importance: the values of a single feature are randomly shuffled, and the resulting degradation in model performance is recorded. The logic is brutal and elegant: if a feature is genuinely important to the model, then breaking its relationship with the target by permuting its values should cause a sharp increase in prediction error. If the feature is irrelevant, the permutation should have little effect. The method is model-agnostic in principle but most commonly applied to tree-based ensembles such as random forests, where it can be computed on the out-of-bag samples without requiring a separate validation set.
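The procedure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `permutation_importance`, the choice of mean squared error as the metric, the number of repeats, and the toy model and data are all hypothetical. In practice the rows passed in would be held-out or out-of-bag samples rather than training data.

```python
import random

def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Return, per feature, the mean increase in MSE after shuffling that column.

    predict: callable mapping a list of rows to a list of predictions.
    X: list of rows (each a list of floats); y: list of targets.
    """
    rng = random.Random(seed)
    baseline = mean_squared_error(y, predict(X))
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle column j only; every other column is left untouched.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mean_squared_error(y, predict(X_perm)) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

# Toy data (assumed for illustration): the target depends on feature 0 only;
# feature 1 is pure noise.
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(500)]
y = [3.0 * row[0] for row in X]

def toy_model(rows):
    # Hypothetical fitted model that has learned the true relationship.
    return [3.0 * r[0] for r in rows]

scores = permutation_importance(toy_model, X, y)
print(scores)
```

Because the toy model never reads feature 1, shuffling that column leaves every prediction unchanged and its score is exactly zero, while shuffling feature 0 produces a large increase in error. Averaging over several shuffles reduces the variance that any single random permutation introduces.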

The method has a known and often ignored vulnerability: when features are correlated, a model can spread its reliance across them, so permuting one feature leaves much of its information available through its correlated partners, and the importance score of each is systematically underestimated. A feature that is genuinely causal but collinear with another feature may appear unimportant, while a feature that is merely a proxy may appear dominant. This is not a technical bug but a conceptual limitation: permutation importance measures the model's dependence on a feature, not the feature's causal relevance to the target. The two are conflated at the user's peril, and the field's casual use of importance scores as explanatory tools is a recurring epistemic hazard in applied machine learning.
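The dilution effect can be reproduced with a deliberately extreme toy case, sketched below under stated assumptions: feature 1 is an exact copy of feature 0, and two hypothetical models (`model_single` and `model_split`, both invented for illustration) make identical predictions on the original data but distribute their weight differently. The helper `importance_of` is likewise an illustrative name, not a library function.

```python
import random

def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def importance_of(predict, X, y, j, n_repeats=20, seed=0):
    """Mean increase in MSE when column j is shuffled."""
    rng = random.Random(seed)
    baseline = mse(y, predict(X))
    total = 0.0
    for _ in range(n_repeats):
        col = [row[j] for row in X]
        rng.shuffle(col)
        X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, col)]
        total += mse(y, predict(X_perm)) - baseline
    return total / n_repeats

data_rng = random.Random(1)
signal = [data_rng.gauss(0, 1) for _ in range(1000)]
# Feature 1 is an exact copy of feature 0 (the extreme case of correlation).
X = [[s, s] for s in signal]
y = [2.0 * s for s in signal]

def model_single(rows):
    # Relies on feature 0 alone.
    return [2.0 * r[0] for r in rows]

def model_split(rows):
    # Spreads the same total weight over both copies.
    return [r[0] + r[1] for r in rows]

imp_single = importance_of(model_single, X, y, 0)
imp_split = importance_of(model_split, X, y, 0)
print(imp_single, imp_split)
```

Both models fit the data perfectly, yet feature 0 scores about four times lower under `model_split`, because the intact copy in feature 1 still carries half of the weight after the shuffle. The underlying signal is equally relevant to the target in both cases; only the model's internal allocation differs, which is exactly what the score reflects.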

Permutation importance is commonly used as a criterion for feature selection and is often compared to Shapley values, though the two methods rest on different theoretical foundations. While permutation importance measures how much model performance is lost when a feature's signal is destroyed, Shapley values distribute the model's prediction among features according to cooperative game theory. The choice between them is not merely technical; it reflects a deeper disagreement about what it means for a feature to be 'important.'
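For contrast, the Shapley attribution of a single prediction can be computed exactly when the number of features is small. The sketch below is illustrative only and rests on an assumption that is one of several in use: a feature that is "absent" from a coalition is replaced by a fixed baseline value. The function `shapley_values`, the toy model, and the inputs are hypothetical.

```python
from itertools import combinations
from math import factorial

def shapley_values(f, x, baseline):
    """Exact Shapley values for the prediction f(x).

    Assumption: features outside a coalition take their baseline value.
    """
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return f(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            for S in combinations(others, k):
                # Standard Shapley weight for a coalition of size k.
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += weight * (value(set(S) | {i}) - value(set(S)))
        phi.append(total)
    return phi

def toy_model(z):
    # Hypothetical model: additive term in feature 0, interaction between
    # features 0 and 1, and no dependence on feature 2.
    return 2.0 * z[0] + z[0] * z[1]

x = [1.0, 2.0, 5.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)  # approximately [3.0, 1.0, 0.0]
```

The attributions sum to the difference between the prediction at `x` and the prediction at the baseline, and the interaction term is split evenly between the two features involved. Unlike a permutation importance score, which summarizes a loss in performance over a whole dataset, this decomposition concerns one prediction and makes no reference to the target at all. The exact computation enumerates every coalition and so scales exponentially in the number of features.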