Advanced ML

Chapter 1: Interpretable ML Introduction
This chapter introduces the basic concepts of Interpretable Machine Learning. We focus on supervised learning, explain the different types of explanations, and review the topics of correlation and interaction.

Chapter 2: Interpretable Models
Some machine learning models are already inherently interpretable, e.g. simple LMs, GLMs, GAMs, and rule-based models. These models are briefly summarized and their interpretation clarified.

Chapter 3: Feature Effects
Feature effects indicate the change in prediction due to changes in feature values. This chapter explains the feature effect methods ICE curves, PDPs, and ALE plots.

Chapter 4: Functional Decomposition
This chapter focuses on understanding how ML models make predictions by breaking down their behavior into simpler, interpretable components. This is achieved through the concept of functional decomposition, with specific methods such as classical functional ANOVA (fANOVA) and Friedman's H-statistic.

Chapter 5: Shapley
Shapley values originate from classical game theory and aim to fairly divide a payout between players. This section gives a brief explanation of Shapley values in game theory, followed by their adaptation to IML, resulting in the method SHAP.

Supervised Learning

Chapter 11: Advanced Risk Minimization
This chapter revisits the theory of risk minimization, providing a more in-depth analysis of established losses and the connection between empirical risk minimization and maximum likelihood estimation.

Chapter 12: Multiclass Classification
This chapter treats the multiclass case of classification. Tasks with more than two classes preclude the application of some techniques studied in the binary scenario and require an adaptation of loss functions.

Chapter 13: Information Theory
This chapter covers basic information-theoretic concepts and discusses their relation to machine learning.

Chapter 15: Regularization
Regularization is a vital tool in machine learning to prevent overfitting and foster generalization ability. This chapter introduces the concept of regularization and discusses common regularization techniques in more depth.

Chapter 16: Linear Support Vector Machine
This chapter introduces the linear support vector machine (SVM), a linear classifier that finds decision boundaries by maximizing the margin to the closest data points, possibly allowing for violations to a certain extent.

Chapter 17: Nonlinear Support Vector Machines
Many classification problems warrant nonlinear decision boundaries. This chapter introduces nonlinear support vector machines as a crucial extension of the linear variant.

Chapter 18: Boosting
This chapter introduces boosting as a sequential ensemble method that creates powerful committees from different kinds of base learners; a minimal sketch follows below.
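To make the sequential-ensembling idea from Chapter 18 concrete, the following is a minimal, illustrative sketch (not course code) of gradient boosting for squared-error regression with depth-1 stumps as base learners; the helper names fit_stump and boost are hypothetical, and only numpy is assumed.

import numpy as np

# Minimal sketch of boosting: each round fits a depth-1 "stump" to the
# residuals of the current ensemble and adds it with a small learning rate.
# The names fit_stump and boost are illustrative, not from any library.

def fit_stump(x, residual):
    # Find the threshold split of a 1-D feature that best fits the residual.
    best = None
    for t in np.unique(x)[:-1]:
        left, right = residual[x <= t], residual[x > t]
        pred_l, pred_r = left.mean(), right.mean()
        sse = ((left - pred_l) ** 2).sum() + ((right - pred_r) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, t, pred_l, pred_r)
    _, t, pl, pr = best
    return lambda z: np.where(z <= t, pl, pr)

def boost(x, y, n_rounds=50, lr=0.1):
    # Sequentially fit each base learner to the residuals of the ensemble.
    f0 = y.mean()
    pred = np.full_like(y, f0, dtype=float)
    learners = []
    for _ in range(n_rounds):
        stump = fit_stump(x, y - pred)   # base learner fitted to residuals
        pred = pred + lr * stump(x)      # add it with a small step size
        learners.append(stump)
    return lambda z: f0 + lr * sum(h(z) for h in learners)

rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, 200)
y = np.sin(x) + 0.1 * rng.normal(size=200)
model = boost(x, y)
print("training MSE:", np.mean((model(x) - y) ** 2))

The same additive scheme underlies AdaBoost and gradient boosting with trees; only the loss, the base learner, and the weighting of each round differ.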
Reinforcement Learning

Chapter 1: Introduction & Multi-Armed Bandits
This chapter introduces the fundamental concepts of Reinforcement Learning, including its key characteristics of trial-and-error search and delayed rewards. It also introduces multi-armed bandits, the exploration-exploitation tradeoff, and various methods for action-value estimation.

Chapter 2: Finite Markov Decision Processes
This chapter explores the fundamental concepts of Markov Decision Processes (MDPs), covering the agent-environment interface, goals and rewards, returns and episodes, and policies and value functions.

Chapter 4: Temporal-Difference Learning
This chapter covers Temporal-Difference (TD) learning methods, which combine ideas from Monte Carlo methods and dynamic programming, enabling agents to learn directly from raw experience without a model of the environment.

Chapter 5: n-step Bootstrapping
This section explains n-step bootstrapping techniques, which generalize TD learning by updating value estimates using returns accumulated over multiple steps, balancing bias and variance in learning.

Chapter 6: Function Approximation, Deep Q-Networks, Expected SARSA
This chapter discusses function approximation methods for scaling RL to large state spaces, including Deep Q-Networks (DQN) for learning value functions with neural networks and the Expected SARSA algorithm for stable policy evaluation.

Chapter 7: Policy Gradient Algorithms, REINFORCE, Actor-Critic Algorithms, DPG, Hierarchical RL
This section introduces policy gradient methods for directly optimizing policies, detailing the REINFORCE algorithm, Actor-Critic frameworks, Deterministic Policy Gradient (DPG), and approaches to hierarchical reinforcement learning for complex task decomposition.

Large Language Models

Transformers, Attention, Positional Encoding, BERT, BART, GPT, Pre-Training & Fine-Tuning, Decoding Strategies, Tokenization, Data, Fast Attention Mechanisms, LoRA, Fast Inference Mechanisms; a minimal attention sketch follows below.
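As a closing illustration for the attention topic listed above, here is a minimal sketch of scaled dot-product attention as used in Transformers, assuming a single head and one short sequence; the function names and shapes are illustrative only and are not taken from the course material.

import numpy as np

# Minimal sketch of scaled dot-product attention (single head, one sequence).
# Function names and shapes are illustrative, not from the course material.

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)    # shift for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Q, K, V: (seq_len, d) arrays; returns one output vector per query.
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)              # pairwise query-key similarities
    weights = softmax(scores, axis=-1)         # one distribution per query row
    return weights @ V                         # attention-weighted sum of values

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(5, 8)) for _ in range(3))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (5, 8)

In a full Transformer, Q, K, and V are linear projections of the token embeddings, several heads are computed in parallel and concatenated, and positional information is injected via positional encodings.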