Data mining & decision making

The version you’re consulting is not final. This course description may change. The final version will be published on 1st June.

5.00 credits

30.0 h + 15.0 h

Teacher(s)

Saerens Marco

Language

English
French-friendly

Prerequisites

Artificial Intelligence, as covered by LINFO1361

Main themes

Foundations of Reinforcement Learning (RL)
Multi-armed bandits and exploration/exploitation
Markov Decision Processes (MDP)
Solving with dynamic programming
Monte Carlo methods
Temporal Difference Learning methods (Q-learning)
Deep Reinforcement Learning
Value function approximations (DQN and variants)
Policy gradient methods (REINFORCE, AC, PPO)
Monte Carlo Tree Search
Large Reasoning Models and RL from Human Feedback
Applications to games and simulated environments
Contemporary challenges, limitations, and perspectives of RL

Learning outcomes

At the end of this learning unit, the student is able to:

Given the learning outcomes of the "Master in Computer Science and Engineering" program, this course contributes to the development, acquisition and evaluation of the following learning outcomes:

INFO1.1-3
INFO2.1-4
INFO5.3-4
INFO6.1, INFO6.4, INFO6.5

Given the learning outcomes of the "Master [120] in Computer Science" program, this course contributes to the development, acquisition and evaluation of the following learning outcomes:

SINF1.M4
SINF2.1-4
SINF5.3-4
SINF6.1, SINF6.4, SINF6.5

Students completing this course successfully will be able to:

Model a problem in terms of Markov Decision Processes
Implement classical RL algorithms (Q-Learning, Monte Carlo, etc.)
Understand the challenges of exploration and value function approximation
Implement contemporary RL algorithms (DQN, REINFORCE, PPO, etc.)
Describe how Large Reasoning Models work and their use in RL from Human Feedback
Apply RL to simulated environments (games, control tasks)
Read, understand, and analyze scientific papers in the field of RL
Analyze the performance and limitations of the implemented approaches

Content

General introduction to RL (agent, environment, states, actions, rewards, policy, value functions, convergence).
Multi-armed bandits (exploration/exploitation, ε-greedy, upper confidence bound, softmax, Thompson sampling, regret).
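
As an illustration of the ε-greedy strategy named above, here is a minimal sketch in Python. It assumes Gaussian rewards with unit variance. The arm means, the function name `epsilon_greedy_bandit` and all parameter values are hypothetical choices made for this example, not course material.

```python
import random

def epsilon_greedy_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Run epsilon-greedy on a Gaussian bandit; return estimates and pull counts."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    estimates = [0.0] * n_arms  # sample-average estimate of each arm's mean reward
    counts = [0] * n_arms       # number of times each arm was pulled
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: pick an arm uniformly at random
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # exploit
        reward = rng.gauss(true_means[arm], 1.0)  # assumed reward model
        counts[arm] += 1
        # incremental update of the sample average
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts

# Hypothetical three-armed bandit; the last arm has the highest mean.
estimates, counts = epsilon_greedy_bandit([0.2, 0.5, 0.9])
```

With a fixed ε the agent keeps exploring forever, so its regret grows linearly in the number of steps; decaying ε or using an upper confidence bound are the usual remedies.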

Markov Decision Processes: formalism and dynamics (Markov property, stochastic vs deterministic policies, action-value functions, Bellman equation, optimality).
Solving with dynamic programming (policy evaluation, policy iteration, value iteration).
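
Value iteration, one of the dynamic programming methods listed above, repeatedly applies the Bellman optimality backup until the values stop changing. The sketch below is a minimal Python version under stated assumptions: the three-state MDP, its transition table layout and the name `value_iteration` are hypothetical and chosen only for illustration.

```python
def value_iteration(transitions, gamma=0.9, tol=1e-8):
    """transitions[s][a] is a list of (probability, next_state, reward) triples."""
    values = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            if not actions:  # terminal state: its value stays at 0
                continue
            # Bellman optimality backup: best expected one-step return
            best = max(
                sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)
                for outcomes in actions.values()
            )
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            return values

# Hypothetical chain A -> B -> END with a reward of 1 on reaching END.
mdp = {
    "A": {"go": [(1.0, "B", 0.0)], "stay": [(1.0, "A", 0.0)]},
    "B": {"go": [(1.0, "END", 1.0)], "stay": [(1.0, "B", 0.0)]},
    "END": {},
}
values = value_iteration(mdp)
```

In this example the optimal values are 1 for state B and γ = 0.9 for state A, since the reward is one step further away from A.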

Monte Carlo methods (state-value and action-value estimation, convergence).
Temporal Difference Learning (bootstrapping, TD(0), variance, online learning).

Q-Learning algorithms.
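
To make the Q-learning update concrete, the following is a minimal tabular sketch in plain Python. It assumes a hypothetical corridor environment (start on the left, reward 1 at the right end); the function name `q_learning` and the hyperparameters are illustrative assumptions, not the course's reference implementation.

```python
import random

def q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a corridor; actions are 0 = left, 1 = right."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        state = 0
        while state != n_states - 1:  # the rightmost state is terminal
            if rng.random() < epsilon:
                action = rng.randrange(2)  # explore
            else:
                # exploit, breaking ties at random so an untrained agent does not stall
                best = max(q[state])
                action = rng.choice([a for a in (0, 1) if q[state][a] == best])
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == n_states - 1 else 0.0
            # off-policy TD target: greedy value of the next state
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

q = q_learning()
# Greedy action in each non-terminal state after training.
greedy_policy = [max((0, 1), key=lambda a: q[s][a]) for s in range(len(q) - 1)]
```

The target uses the maximum over next-state actions rather than the action actually taken, which is what makes Q-learning off-policy.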

Function approximation and Deep Q-Networks (gradient, nonlinear approximation, DQN).

Monte Carlo Tree Search and deep variants.
Policy gradient methods (REINFORCE, Actor-Critic, Proximal Policy Optimization).

Introduction to Large Reasoning Models (LRMs) and RL from Human Feedback (RLHF): language modeling, supervised fine-tuning, Direct Preference Optimization (DPO).

Applications to games and simulated environments with the open-source Gymnasium library.

Case studies (Atari, CartPole, LunarLander) and/or practical project on implementation and comparative analysis of methods.

Evaluation methods

Project (30%): design and implementation of an AI based on RL for a situation involving an opponent (stochastic and imperfect-information game). The project will take the form of a friendly competition among students.
Assignment 1 (10%): implementation of a classical RL algorithm.
Assignment 2 (10%): implementation of a deep RL algorithm.
Assignment 3 (10%): reading and critical analysis of a recent paper on RL.
Final exam (40%): the final exam will be comprehensive; it covers the entire material and is open-book.

Other information

Background / prerequisites:

LBIR1304 or LFSAB1105: a course on probability theory and mathematical statistics,
LBIR1200 or LFSAB1101: a course on linear and matrix algebra,
LFSAB1402: a good Python programming course,
A course in multivariate calculus (mathematics).

Online resources

Available on Moodle

Bibliography

Some recommended reference books:

Alpaydin (2004), "Introduction to machine learning". MIT Press.
Bardos (2001), "Analyse discriminante. Application au risque et scoring financier". Dunod.
Bishop (1995), "Neural networks for pattern recognition". Clarendon Press.
Bishop (2006), "Pattern recognition and machine learning". Springer-Verlag.
Bouroche & Saporta (1983), "L'analyse des données". Que Sais-je.
Cornuéjols & Miclet (2002), "Apprentissage artificiel. Concepts et algorithmes". Eyrolles.
Duda, Hart & Stork (2001), "Pattern classification, 2nd ed". John Wiley & Sons.
Dunham (2003), "Data mining. Introductory and advanced topics". Prentice-Hall.
Greenacre (1984), "Theory and applications of correspondence analysis". Academic Press.
Han & Kamber (2005), "Data mining: Concepts and techniques, 2nd ed.". Morgan Kaufmann.
Hand (1981), "Discrimination and classification". John Wiley & Sons.
Härdle & Simar (2003), "Applied multivariate statistical analysis". Springer-Verlag. Available at http://www.quantlet.com/mdstat/scripts/mva/htmlbook/mvahtml.html
Hastie, Tibshirani & Friedman (2001), "The elements of statistical learning". Springer-Verlag.
Johnson & Wichern (2002), "Applied multivariate statistical analysis, 5th ed". Prentice-Hall.
Lebart, Morineau & Piron (1995), "Statistique exploratoire multidimensionnelle". Dunod.
Mitchell (1997), "Machine learning". McGraw-Hill.
Naim, Wuillemin, Leray, Pourret & Becker (2004), "Réseaux bayesiens". Editions Eyrolles.
Nilsson (1998), "Artificial intelligence: A new synthesis". Morgan Kaufmann.
Ripley (1996), "Pattern recognition and neural networks". Cambridge University Press.
Rosner (1995), "Fundamentals of biostatistics, 4th ed". Wadsworth Publishing Company.
Saporta (1990), "Probabilités, analyse des données et statistique". Editions Technip.
Tan, Steinbach & Kumar (2005), "Introduction to data mining". Pearson.
Theodoridis & Koutroumbas (2003), "Pattern recognition, 3rd ed". Academic Press.
Therrien (1989), "Decision, estimation and classification". Wiley & Sons.
Venables & Ripley (2002), "Modern applied statistics with S". Springer-Verlag.
Webb (2002), "Statistical pattern recognition, 2nd ed". John Wiley and Sons.

Faculty or entity

INFO

Programmes / courses offering this learning unit (LU)

Title of the programme

Acronym

Credits

Prerequisites

Learning outcomes

Master [120] in Data Science : Statistic

DATS2M

Master [120] in Chemical and Materials Engineering

KIMA2M

Master [120] in Civil Engineering

GCE2M

Master [120] in Biomedical Engineering

GBIO2M

Master [120] in Forests and Natural Areas Engineering

BIRF2M

Master [120] in Environmental Bioengineering

BIRE2M

Master [120] in Mechanical Engineering

MECA2M

Master [120] in Electrical Engineering

ELEC2M

Master [120] in Physical Engineering

FYAP2M

Master [120] in Chemistry and Bioindustries

BIRC2M

Master [120] in Computer Science and Engineering

INFO2M

Master [120] in Computer Science

SINF2M

Master [120] in Electro-mechanical Engineering

ELME2M

Master [120] in Mathematical Engineering

MAP2M

Master [120] in Data Science Engineering

DATE2M

Certificat d'université : Statistique et science des données (15/30 crédits)

STAT2FC

Master [120] in Agricultural Bioengineering

BIRA2M

Master [120] in Data Science: Information Technology

DATI2M

Master [120] in Energy Engineering

NRGY2M