Open positions

We are opening several reinforcement learning positions at Inria Scool. Please also take a look at this page.


  1. One postdoc position is about investigating the challenges of Reinforcement Learning for real-life systems. The challenges are inspired by applications of sequential decision making to fields like agroecology and healthcare. This includes advancing questions related to Causal-RL, Contextual-RL and Robust-RL. The postdoc can work more on the theory side or more on the applied side, according to their own taste. On top of traditional RL theory, we are expected to investigate an exciting line of research involving the formalization and modeling of questions that go beyond mainstream RL.
  2. Another open postdoc position is within the Chair of Artificial Intelligence AppRenf, and consists of advancing core Reinforcement Learning.
    A solid background in Reinforcement Learning or Dynamical systems and Statistics is required.

Please contact O-A. Maillard by email with three of your main publications, a CV, a motivation letter and a recommendation letter. The positions can be filled as soon as possible.


Within the French national Chair of Artificial Intelligence on Reinforcement Learning (AppRenf project, R-PILOTE-19-004-APPRENF), we are opening a fully funded PhD position starting in fall 2022. The topic is Real-life Challenges for Reinforcement Learning. The proposal is part of the Fondation I-SITE ULNE within the project PILOTE from cluster HumAIn@Lille. This is a PhD in Machine Learning, more specifically in Reinforcement Learning. See this document for further details. To apply, please contact O-A. Maillard by email with a CV and a motivation letter.

Interns (summer 2022)

Below is a list of (funded) internship proposals; please contact me if you are interested. We expect students with a solid mathematical background, specifically in statistics, information theory and/or dynamical systems. I only sketch the topics below for obvious reasons; please contact me for details.

  • [URGENT] Risk in sequential decision making, within the Three-Risk-Proof Sequential Decision Making (3R-SDM) project, linked to the Inrae-Inria call on Environmental Risks.
  • Bandit tools for MDP theory: In particular, we want to revisit LSPI, MCTS and VI by adapting modern optimal bandit strategies.
  • Revisiting Regression Trees, Random Forests and variable selection with multi-armed bandits, to handle finite-time error bounds.
  • Use-case: Analysis of on-farm sequential data for decision making and recommender systems. This is a unique opportunity to analyse state-action-reward trajectories coming from real agriculture experiments.
  • Farm-gym v2: We are continuing the development of the Atari of farming, fully oriented towards RL.

These are intended for Master 2 or outstanding Master 1 students, and generally open up the possibility of starting a PhD later on. If you are interested, go ahead and contact me directly.


If you want to apply for a PhD, I strongly encourage you to read (a substantial part of) the following books and lecture notes:


Lecture Notes

