Chapter 5: Shapley

Shapley values originate from cooperative game theory and aim to divide a payout fairly among players. This chapter briefly explains Shapley values in game theory and then adapts them to interpretable machine learning (IML), which yields the method SHAP.

§5.01: Introduction to Shapley Values


Shapley values originate from cooperative game theory and provide a method to fairly distribute a total payout among collaborating players. In machine learning, this concept is adapted to explain individual model predictions by treating features as “players” that cooperate to produce a prediction.

Shapley Values in Game Theory

Axioms of Fair Payouts

The Shapley value assigns a fair payout \(\phi_j\) to each player \(j \in P\). It is the unique payout rule that satisfies the following axioms for every value function \(v\):

  1. Efficiency: The total payout \(v(P)\) is fully allocated to the players:

    \[\sum_{j \in P} \phi_j = v(P)\]
  2. Symmetry: Indistinguishable players \(j, k\), i.e. players with \(v(S \cup \{j\}) = v(S \cup \{k\})\) for all \(S \subseteq P \setminus \{j, k\}\), receive equal shares: \(\phi_j = \phi_k\).
  3. Null Player: A player who contributes nothing to any coalition, i.e. \(v(S \cup \{j\}) = v(S)\) for all \(S \subseteq P \setminus \{j\}\), receives nothing: \(\phi_j = 0\).
  4. Additivity: For two separate games with value functions \(v_1, v_2\), define a combined game with \(v(S) = v_1(S) + v_2(S)\) for all \(S \subseteq P\). Then:

    \[\phi_{j, v_1 + v_2} = \phi_{j, v_1} + \phi_{j, v_2}\]

    i.e. the payout in the combined game equals the sum of the payouts in the two separate games.
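The axioms can be checked on a small example. The sketch below uses the standard coalition-weighted Shapley formula, \(\phi_j = \sum_{S \subseteq P \setminus \{j\}} \frac{|S|!\,(|P| - |S| - 1)!}{|P|!} \left( v(S \cup \{j\}) - v(S) \right)\), which is not restated in this section. The three-player game and the names `shapley_values` and `payouts` are hypothetical and chosen only for illustration: players A and B are interchangeable, and player C never adds value.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, v):
    """Exact Shapley values by enumerating every coalition S that excludes j."""
    n = len(players)
    phi = {}
    for j in players:
        others = [p for p in players if p != j]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                S = frozenset(coalition)
                # Marginal contribution of j when joining coalition S.
                total += weight * (v(S | {j}) - v(S))
        phi[j] = total
    return phi


# Hypothetical game (illustrative numbers, not from the text).
payouts = {
    frozenset(): 0,
    frozenset("A"): 10, frozenset("B"): 10, frozenset("C"): 0,
    frozenset("AB"): 30, frozenset("AC"): 10, frozenset("BC"): 10,
    frozenset("ABC"): 30,
}

players = ["A", "B", "C"]
phi = shapley_values(players, lambda S: payouts[frozenset(S)])
print(phi)
print("sum of payouts:", sum(phi.values()), "v(P):", payouts[frozenset("ABC")])
```

In this game the payouts of A and B coincide (symmetry), C receives nothing (null player), and the payouts sum to \(v(P)\) (efficiency). Exact enumeration visits every coalition, so it is only practical for a handful of players.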


§5.02: Shapley Values for Local Explanations


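An important point about computing the marginal contribution of feature \(j\) for observation \(x^{(i)}\): two hybrid data points are compared. In the first, the values of all features preceding \(j\) in the feature order, together with the value of feature \(j\) itself, are taken from \(x^{(i)}\), and the remaining values are taken from a randomly drawn data point. In the second, only the values of the features preceding \(j\) are taken from \(x^{(i)}\), while the value of feature \(j\) and all remaining values are taken from the random point. The marginal contribution is the difference between the model's predictions for these two points.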
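The construction of the two hybrid points can be sketched as a Monte Carlo estimate. This is a minimal sketch under stated assumptions: the feature order is drawn as a random permutation in each iteration, the random point is drawn from a background data set, and the per-iteration differences are averaged. The function name `shapley_sample`, its parameters, and the linear toy model are hypothetical and serve only as an illustration.

```python
import numpy as np


def shapley_sample(predict, x, X_background, j, n_iter=1000, seed=0):
    """Monte Carlo estimate of the Shapley value of feature j at point x."""
    rng = np.random.default_rng(seed)
    p = x.shape[0]
    contributions = np.empty(n_iter)
    for m in range(n_iter):
        # Random data point z and random feature order.
        z = X_background[rng.integers(len(X_background))]
        order = rng.permutation(p)
        pos = int(np.where(order == j)[0][0])
        preceding = order[:pos]  # features that come before j in this order

        # Point 1: preceding features AND feature j from x, the rest from z.
        x_with = z.copy()
        x_with[preceding] = x[preceding]
        x_with[j] = x[j]

        # Point 2: only preceding features from x; j and the rest from z.
        x_without = z.copy()
        x_without[preceding] = x[preceding]

        # Marginal contribution: difference of the two predictions.
        contributions[m] = predict(x_with[None, :])[0] - predict(x_without[None, :])[0]
    return contributions.mean()


# Hypothetical linear model and synthetic background data (assumptions).
w = np.array([2.0, -1.0, 0.5])
predict = lambda X: X @ w + 3.0
x = np.array([1.0, 2.0, 3.0])
X_background = np.random.default_rng(1).normal(size=(200, 3))

estimate = shapley_sample(predict, x, X_background, j=1)
print("estimated Shapley value of feature 1:", estimate)
```

For a linear model each sampled difference reduces to \(w_j (x_j - z_j)\), so the estimate should be close to \(w_j\) times the difference between \(x_j\) and the background mean of feature \(j\). This gives a simple sanity check for the sampling logic.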

Axioms for Fair Attribution

  1. Efficiency: The Shapley values sum to the centered prediction: \(\sum_{j=1}^p \phi_j(x) = \hat{f}(x) - \mathbb{E}_x[\hat{f}(x)]\), i.e. the deviation of the prediction from the average prediction is fully distributed among the features.
  2. Symmetry: Features that contribute identically to every coalition receive equal values.
  3. Null Player: Irrelevant features receive a value of 0, i.e. if \(\hat{f}_{S \cup \{j\}}(x_{S \cup \{j\}}) = \hat{f}_S(x_S)\) for all \(S \subseteq \{1, \dots, p\} \setminus \{j\}\), then \(\phi_j = 0\).
  4. Additivity: Attributions are additive across models, which makes it possible to combine Shapley values for model ensembles.
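These properties can be verified numerically on a toy problem. The sketch below assumes that \(\hat{f}_S(x_S)\) is estimated by fixing the features in \(S\) to their values in \(x\) and averaging the prediction over a background sample for the remaining features, and that \(\mathbb{E}_x[\hat{f}(x)]\) is estimated by the mean background prediction. The models `f1` and `f2`, the synthetic data, and the helper names `value` and `shapley` are hypothetical.

```python
from itertools import combinations
from math import factorial

import numpy as np


def value(predict, x, X_bg, S):
    """Centered value of coalition S: mean prediction with x_S fixed."""
    X = X_bg.copy()
    idx = list(S)
    if idx:
        X[:, idx] = x[idx]  # fix the features in S to their values in x
    return predict(X).mean() - predict(X_bg).mean()


def shapley(predict, x, X_bg):
    """Exact Shapley values by enumerating all feature coalitions."""
    p = len(x)
    phi = np.zeros(p)
    for j in range(p):
        others = [k for k in range(p) if k != j]
        for size in range(p):
            w = factorial(size) * factorial(p - size - 1) / factorial(p)
            for S in combinations(others, size):
                gain = value(predict, x, X_bg, S + (j,)) - value(predict, x, X_bg, S)
                phi[j] += w * gain
    return phi


# Hypothetical models; feature 3 is used by neither of them.
f1 = lambda X: X[:, 0] * X[:, 1] + X[:, 2]
f2 = lambda X: np.sin(X[:, 0]) + 2.0 * X[:, 2]
f_sum = lambda X: f1(X) + f2(X)

X_bg = np.random.default_rng(0).normal(size=(50, 4))
x = np.array([1.0, -2.0, 0.5, 3.0])

phi1, phi2, phi_sum = shapley(f1, x, X_bg), shapley(f2, x, X_bg), shapley(f_sum, x, X_bg)

# Efficiency: the values sum to the centered prediction.
print(np.isclose(phi_sum.sum(), f_sum(x[None, :])[0] - f_sum(X_bg).mean()))
# Null player: the unused feature gets zero attribution.
print(np.isclose(phi_sum[3], 0.0))
# Additivity: attributions of the summed model are the summed attributions.
print(np.allclose(phi_sum, phi1 + phi2))
```

Enumerating all coalitions costs on the order of \(2^p\) model evaluations per feature, so this check is only feasible for a few features.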

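All in all, Shapley values have a strong theoretical foundation in cooperative game theory, produce fair attributions, and provide contrastive explanations by quantifying each feature's role in moving the prediction away from the average prediction. However, they come at a high computational cost, since exact computation requires evaluating exponentially many feature coalitions, and they can be misleading when features are correlated, because replacing feature values with those of a random point can produce unrealistic feature combinations.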