Chapter 11: Advanced Risk Minimization

This chapter revisits the theory of risk minimization, providing a more in-depth analysis of established loss functions and of the connection between empirical risk minimization and maximum likelihood estimation.

§11.01: Risk Minimisers


(Universally) Consistent Learners

Optimal Constant Models


§11.02: Pseudo-Residuals



§11.03: L2 Loss

Risk Minimiser

Optimal Constant Model


§11.06: 0-1 Loss


Risk Minimiser of 0-1 Loss

Decision Boundary of Bayes Optimal Classifier

§11.07: Bernoulli Loss


Risk Minimiser for a probabilistic classifier

Risk Minimiser for a score-based classifier


§11.08: Deep Dive into Logistic Regression

https://github.com/slds-lmu/lecture_sl/raw/main/slides-pdf/slides-advriskmin-logreg-deepdive.pdf


§11.09: Brier Score


The Brier score can be thought of as the $L_2$ loss applied to predicted probabilities $\pi(x)$, i.e.

\[L(y, \pi(x)) = (y - \pi(x))^2\]

The risk minimiser of the binary Brier score is given by:

\[\pi^*(x) = P(Y=1|X=x)\]
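This can be checked numerically. The sketch below is illustrative only: it simulates Bernoulli labels with an assumed true probability of 0.3 (not from the lecture) and scans constant predictions $\pi$ over a grid, confirming that the empirical Brier risk is minimized near $P(Y=1)$, i.e. the sample mean of $y$.

```python
import numpy as np

rng = np.random.default_rng(0)
p_true = 0.3  # assumed true P(Y = 1) for this toy simulation
y = rng.binomial(1, p_true, size=100_000)

def brier_risk(pi, y):
    """Empirical Brier risk of a constant prediction pi."""
    return np.mean((y - pi) ** 2)

# Scan constant predictions on a grid; the minimizer should sit
# near p_true (and coincide, up to grid resolution, with y.mean())
grid = np.linspace(0.0, 1.0, 1001)
risks = [brier_risk(pi, y) for pi in grid]
pi_star = grid[int(np.argmin(risks))]
```

Because the empirical Brier risk is a quadratic in $\pi$, its exact minimizer is the sample mean `y.mean()`, which estimates $P(Y=1)$.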

§11.12: MLE vs ERM 1


Maximum likelihood estimation can equivalently be written as minimization of the negative log-likelihood:

\[\min_\theta -\ell(\theta) = -\sum_{i=1}^n \log p(y^{(i)} \mid x^{(i)}, \theta)\]
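The classic instance of this correspondence is that a Gaussian likelihood with fixed variance yields the $L_2$ empirical risk. The sketch below is an illustrative check under assumed toy data (not from the lecture): for a constant model $y \sim \mathcal{N}(\theta, 1)$, minimizing the negative log-likelihood and minimizing the $L_2$ empirical risk pick out the same $\theta$.

```python
import numpy as np

rng = np.random.default_rng(1)
y = rng.normal(loc=2.0, scale=1.0, size=1000)  # assumed toy data

def nll(theta, y):
    """Negative log-likelihood of the constant model y ~ N(theta, 1)."""
    return -np.sum(-0.5 * np.log(2 * np.pi) - 0.5 * (y - theta) ** 2)

def emp_risk_l2(theta, y):
    """Empirical risk of the same constant model under L2 loss."""
    return np.sum((y - theta) ** 2)

# Both objectives differ only by an additive constant and a positive
# scale factor, so they share the same minimizer: the sample mean.
grid = np.linspace(0.0, 4.0, 4001)
theta_mle = grid[int(np.argmin([nll(t, y) for t in grid]))]
theta_erm = grid[int(np.argmin([emp_risk_l2(t, y) for t in grid]))]
```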

§11.13: MLE vs ERM 2


§11.14: Properties of Loss Functions


§11.15: Bias-Variance Decomposition



§11.16: Bias-Variance Decomposition Deep Dive

https://github.com/slds-lmu/lecture_sl/raw/main/slides-pdf/slides-advriskmin-bias-variance-decomposition-deepdive.pdf