We are opening several reinforcement learning positions at Inria Scool. Please also take a look at this page.


  1. One 18-month postdoc position on recommender systems in the context of the Pl@ntNet Inria research project. The focus is on estimation/improvement of annotation expert users by means of contextual multi-armed bandit approaches, with a unique opportunity to work with real data. Please apply here. The preferred starting date is early spring 2024.

Please contact O-A. Maillard by email with 3 of your main publications, your CV, a motivation letter and a recommendation letter. The positions can be filled as soon as possible.


Within the Chaire of Artificial Intelligence on Reinforcement Learning (AppRenf project, R-PILOTE-19-004-APPRENF), we are opening a fully funded PhD position starting in fall 2022. The topic is Real-life Challenges for Reinforcement Learning. The proposal is part of the Fondation I-SITE ULNE within the project PILOTE from cluster HumAIn@Lille. This is a PhD in Machine Learning, more specifically in Reinforcement Learning. See this document for further details. To apply, please contact O-A. Maillard by email with your CV and a motivation letter.

Interns (summer 2023)

Below is a list of (funded) internship proposals; please contact me if you are interested. We expect students with a solid mathematical background, specifically in statistics, information theory and/or dynamical systems. For obvious reasons, I only sketch the topics below; please contact me for details.

  • Risk in sequential decision making, within the Three-Risk-Proof Sequential Decision Making (3R-SDM) project, linked to the Inrae-Inria call on Environmental Risks (filled)
  • Bandit tools for MDP theory: in particular, we want to revisit LSPI, MCTS and VI by adapting modern optimal bandit strategies (filled)
  • Revisiting Regression Trees, Random Forests and variable selection with multi-armed bandits to handle finite-time error bounds.
  • Use-case: Analysis of on-farm sequential data for decision making and recommender systems. This is a unique opportunity to analyse state-action-reward trajectories coming from real agriculture experiments.
  • Farm-gym v2: We are continuing the development of the Atari of farming, fully oriented towards RL.

These internships are intended for Master 2 or outstanding Master 1 students, and generally open the possibility of starting a PhD later on. If you are interested, go ahead and contact me directly.


If you want to apply for a PhD, I strongly encourage you to read (a substantial part of) the following books and lecture notes:


Lecture Notes


